## NBA Fouls – Substitutions and Discussion

This is part 4 of my series on DeMarcus Cousins and how NBA players accrue personal fouls.
Part 3 can be found here.
Part 2 can be found here.
Part 1 can be found here.

I strongly recommend reading parts 2 and 3 before continuing, as each part of this series builds on the previous ones.

Substitutions

A natural question arising from our previous analysis is whether anything can be done to prevent a player from “tilting.” We now show that making quick substitutions can change how a player accrues fouls and reduce his “tilt.” We define a quick substitution (QS) as a substitution that occurs within 30 seconds of a personal foul. While this definition may capture substitutions that are not a reaction to the player committing a foul, we believe it is adequate for the purposes of this paper. Fouls are then classified as happening before or after the QS. As a result, in games without a QS, every foul is classified as happening before a hypothetical QS that is never observed. Furthermore, for ease of analysis, we only consider the first QS a player has in a game, even though a player may have more than one.
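The classification described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Event` record, its field names, and the `"FOUL"`/`"SUB_OUT"` labels are hypothetical and do not reflect the actual play-by-play schema, and "within 30 seconds" is read here as "at most 30 seconds after the most recent foul."

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

QS_WINDOW_SECONDS = 30.0  # window taken from the QS definition in the text


@dataclass
class Event:
    game_clock: float  # seconds elapsed in the game (hypothetical field)
    kind: str          # "FOUL" or "SUB_OUT" for the player of interest (hypothetical labels)


def first_quick_sub_time(events: List[Event],
                         window: float = QS_WINDOW_SECONDS) -> Optional[float]:
    """Time of the first substitution at most `window` seconds after a foul, else None."""
    last_foul = None
    for ev in sorted(events, key=lambda e: e.game_clock):
        if ev.kind == "FOUL":
            last_foul = ev.game_clock
        elif ev.kind == "SUB_OUT" and last_foul is not None:
            if ev.game_clock - last_foul <= window:
                return ev.game_clock
    return None


def label_fouls(events: List[Event],
                window: float = QS_WINDOW_SECONDS) -> List[Tuple[float, str]]:
    """Tag each foul 'before' or 'after' the first QS; every foul is 'before' if no QS occurs."""
    qs_time = first_quick_sub_time(events, window)
    labels = []
    for ev in sorted(events, key=lambda e: e.game_clock):
        if ev.kind == "FOUL":
            after = qs_time is not None and ev.game_clock > qs_time
            labels.append((ev.game_clock, "after" if after else "before"))
    return labels
```

Only the first QS is used, matching the simplification in the text; a substitution that comes long after the most recent foul is ignored.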

Table 4 gives the output for survival analysis that includes an indicator for being before or after a quick substitution in the conditional risk set model for DeMarcus Cousins, Al Horford, Robin Lopez, and all centers pooled. The coefficient on QS for all players examined is negative, indicating that a quick substitution is associated with a lower chance that a player will foul at any time after the substitution. However, the difference is not statistically significant for every player. Quick substitutions seem to be associated with a reduction in Al Horford’s foul tendencies, though not significantly, and the effect size is smaller for Horford than for Cousins. The players still have significant positive coefficients for later fouls, indicating that while they may still “tilt,” the QS may mitigate some of it.

Focusing on Cousins, Table 5 displays the survival model output for all fouls before a QS and after a QS side-by-side to facilitate comparison. The analysis of fouls before a quick substitution shows a significant increase in the chance that he commits a foul once he has 3 or 4 fouls. However, after the substitution, the coefficients are smaller, indicating that he is no longer as “tilted.” We visualize this change in foul behavior in Figures 3a and 3b, which show the survival curves before and after a quick substitution. Cousins’s foul tendencies prior to a QS (Figure 3a) are similar to those seen across a whole game (Figure 2a). However, after a QS (Figure 3b), the contrast is much less stark. He does appear to commit his 4th foul faster than his 3rd, but the difference is less pronounced than before the QS.

Al Horford, by contrast, does not seem to be significantly affected by a QS, though throughout we have seen that Horford does not seem to “tilt” as much as other centers in general. Figures 4a and 4b show Horford’s survival curves before and after a quick substitution. While there may be some distinction between the fouls before a QS, it is not as extreme as seen with Cousins, and there is certainly little order after a QS. Al Horford simply does not foul, “tilt,” or get affected by quick substitutions as much as other centers.

Discussion – Further Research

While we focused only on centers for this research, the methods used here can easily be applied to all players in the NBA to identify players who “tilt.” In addition to looking at quick substitutions, it would be interesting to examine other events which may reduce the effect of a “tilting” player, particularly other stoppages of play like timeouts or breaks in a period. We chose to look at substitutions shortly after a foul in the hopes of best capturing a direct coaching reaction to the foul. A timeout following shortly after a foul may also reflect a direct reaction to the foul and is a clear avenue for further analysis. Furthermore, while we only considered personal fouls in this study, it would be interesting to examine how technical fouls play a role in “tilting” players. Technical fouls are especially interesting since they are rarely a part of strategy in the way a normal personal foul can be. Our overall aim is to examine players who are considered by many to be emotional, so how these players, or their teammates, accumulate technical fouls may have an impact on their foul rates and overall “tilt.” Additionally, we only adjusted for time and score, but there are many other factors that could be included, such as the player being guarded (a player may be more likely to “tilt” against players who tend to play more aggressively or are known trash talkers) or the rate at which the player of interest draws fouls (players may become more upset if they feel they are not receiving foul calls on their behalf). Finally, expanding the analysis beyond the select few centers considered here would allow for greater understanding of how different players and positions accrue fouls.

Finally, we did not do any causal inference. Any effects we see are just associations. Proper causal inference analysis is a clear area for further research.

Conclusion

In this analysis, we used a survival model for fouls to show that fouling rates are not always independent of the number of fouls a player has accumulated. Emotional players, such as DeMarcus Cousins, often “tilt”, increasing the likelihood of committing another foul as they accrue more fouls. Our analysis also indicates that quickly substituting a player could influence an emotional player’s foul rate, reducing the likelihood of them picking up another foul.

We cannot say for certain the precise reason why a quick substitution has an effect. It could be that taking a player out of the game gives him time to calm down and become level-headed. However, it may also be related to the common strategy of attacking a player that is in “foul trouble”, often defined as approaching 3 fouls by halftime or 6 by the end of the game. Before the player is substituted, he may be in “foul trouble,” causing the opposing team to attempt to draw a foul against him. After a QS, once the player returns to the game, there is less incentive to attack him since he is no longer in “foul trouble” due to the passage of game time. It may well be that a QS is simply a good indicator that the player is being kept from being attacked. This hypothesis certainly merits further investigation.

While the scope of this paper is somewhat limited, we hope it will encourage others to explore the process by which players accrue fouls. We believe that further research in this area will reveal new insights into how players can remain effective throughout the game, especially if something as simple as a coach making a quick substitution can have such a significant impact. It may not be easy to stop “tilting” entirely, but there are ways to mitigate the effects.

## NBA Fouls – Survival Analysis

This is Part 3 of my series on DeMarcus Cousins and how NBA players accrue personal fouls.
Part 2 can be found here.
Part 1 can be found here.
I strongly recommend reading at least Part 2 before continuing as I reference it.

Survival Analysis

To provide more statistical rigor, we analyze our players using a conditional risk set model for ordered events. This model, first proposed by Prentice, Williams, and Peterson, models the hazard at each foul event time as a function of the current number of fouls accumulated and time since the last foul. The model is flexible and can include other covariates as needed. For this paper, our covariates include the lead or deficit in the score of the player’s team, game time in minutes, and an interaction between the two. We chose these covariates, as we believe that a closer game can have an impact on a player’s fouling rates. We include actual game time in minutes to reflect how close the game is to ending, and to account for potential overtime periods.

Let $X_{ki}$ and $C_{ki}$ be the foul and censoring times for the kth foul (k=1, 2, …, 6) in the ith game, and let $Z_{ki}$ be the vector of covariates for the ith game with respect to the kth foul. We assume $X_{ki}$ and $C_{ki}$ are independent given $Z_{ki}$. We then define the observed time $T_{ki}=\min(X_{ki},C_{ki})$ and let $\beta$ be a vector of unknown regression coefficients. Under the proportional hazards assumption, the hazard function for the kth foul in the ith game is:

$\lambda_{k}(t,Z_{ki})=\lambda_{0}\left(t\right)e^{\beta Z_{ki}}$
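
One consequence of this form is worth writing out, since it is what makes the coefficients interpretable. As a short sketch (the symbols $Z$ and $Z'$ for two arbitrary covariate vectors are introduced here only for illustration), the baseline hazard cancels when two hazards are compared at the same time $t$:

$\frac{\lambda_{k}(t,Z)}{\lambda_{k}(t,Z')}=\frac{\lambda_{0}\left(t\right)e^{\beta Z}}{\lambda_{0}\left(t\right)e^{\beta Z'}}=e^{\beta\left(Z-Z'\right)}$

So a hazard ratio depends only on the difference in covariates and not on $t$, which is why effects are reported as exponentiated coefficients or exponentiated differences of coefficients.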

From Table 2, we can see that the difference in score has a minimal impact on player fouling rates for Cousins, Horford, and Lopez, even after adjusting for game time. Closer games do not seem to cause more fouls to be committed. However, the total game time that has been played does have an impact: as time goes on, it appears that players are less likely to foul. This trend holds true for our three players of interest and for all players when pooled together, which is surprising given the expectation that players are more likely to foul later in the game. Taken together, the analysis shows that, as the game goes on, players are more likely to foul if they have already fouled. If a player has not yet fouled in the game, he is less likely to foul, since game time has a negative relationship with the likelihood of fouling. This trend holds true for all centers we analyzed. These results are in line with what we saw in Figure 1. Moreover, they are likely subject to the same selection bias, which precludes us from seeing every foul in every game.

As before, we can limit our analysis to games where the players had at least 5 fouls, and examine only the first four fouls. Table 3 displays the survival model output for Cousins, Horford, and Lopez when we use the restricted dataset. For all players, fouls 2, 3, and 4 are committed significantly sooner than the prior foul. To find the hazard ratios associated with each foul, we exponentiate the difference in the coefficients, since each coefficient is with respect to the baseline of the 1st foul. For example, when Cousins has 3 fouls he is 405% more likely to commit a foul at any given time than when he only has 2 fouls. Cousins is 303% more likely to commit a foul when he has four fouls compared to when he only has three. Although the hazard ratios increase dramatically with each foul, it is important to keep in mind that the initial probability of fouling at any given moment is low, as the initial foul takes nearly 500 seconds (over 8 minutes) to take place on average for DeMarcus Cousins.
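The "exponentiate the difference" step can be made concrete with a few lines of Python. This is a sketch, not the model output: the coefficients in `example_coefs` are made-up placeholders (the real values are in Table 3), and reading "405% more likely" as a hazard ratio of 5.05 is an interpretation of the wording rather than something the table confirms here.

```python
import math


def hazard_ratio(coef_later: float, coef_earlier: float) -> float:
    """Hazard ratio between two foul counts whose coefficients share the 1st-foul baseline."""
    return math.exp(coef_later - coef_earlier)


def percent_more_likely(ratio: float) -> float:
    """Express a hazard ratio as a percent increase over the comparison level."""
    return (ratio - 1.0) * 100.0


def implied_coef_gap(percent: float) -> float:
    """Coefficient difference implied by a stated percent increase (inverse of the above)."""
    return math.log(1.0 + percent / 100.0)


# Placeholder coefficients keyed by foul number, each relative to the 1st foul.
# These numbers are hypothetical and are NOT the estimates from Table 3.
example_coefs = {1: 0.0, 2: 0.5, 3: 1.2, 4: 1.9}

ratio_3_vs_2 = hazard_ratio(example_coefs[3], example_coefs[2])
print(f"hazard ratio, foul 3 vs 2: {ratio_3_vs_2:.2f}")
print(f"percent more likely: {percent_more_likely(ratio_3_vs_2):.0f}%")
```

Under that reading, a 405% increase corresponds to a coefficient gap of `implied_coef_gap(405)`, which is roughly 1.62 on the log-hazard scale.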

It is interesting to note that the opposite effect happens with game time. As each minute passes in the game, Cousins is only 90% as likely to commit a foul as the previous minute. This trend holds for all players.
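
As a rough worked reading of the 90% figure (a sketch that holds the foul count fixed and ignores any interaction with score; the symbol $\beta_{\text{time}}$ is introduced here for illustration), a per-minute hazard ratio of about 0.90 corresponds to a coefficient of

$\beta_{\text{time}}=\ln\left(0.90\right)\approx-0.105$

and the effect compounds multiplicatively, so over $m$ minutes the hazard is scaled by

$e^{m\beta_{\text{time}}}=0.90^{m}$

For example, $0.90^{10}\approx0.35$, so ten minutes of game time alone would be associated with roughly a third of the original hazard under this simplification.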

From the table, we can see that although all players seem to have this “tilting” behavior, DeMarcus Cousins has a higher likelihood of committing a foul than other players as he accrues fouls. Cousins seems to “tilt” more than the other centers in our analysis. Part of this behavior may be explained by teams attacking players who already have many fouls, attempting to get them in foul trouble. However, we believe that no one factor can tell the complete story.

Part 4 of this series can be found here

## NBA Fouls – Data, basic stats and visualizations

This is part 2 of my series on DeMarcus Cousins and how NBA players accrue personal fouls.
Part 1 can be found here

I’ll be pulling edited sections from the paper I wrote with Udam Saini for the 2017 Sloan Sports Analytics Conference research paper competition. A full, finalized version of the paper will be available at a later date.

The goal of this project is to examine how NBA players accrue fouls and if it is possible to mitigate their foul tendencies through simple coaching decisions. Let’s start with getting some data and looking at basic foul rates.

Data

We examine play-by-play data and box-score data from the NBA for the 2011-2012, 2012-2013, 2013-2014, 2014-2015, and 2015-2016 seasons. This data is publicly available from http://www.nba.com. The play-by-play contains rich event data for each game. The box-score includes data for which players started the game, and which players were on the court at the start of a quarter. Data, in csv format, can be found here.

Using the box-score data and the substitutions in the play-by-play for each game, we can determine the amount of time any given player has actively played in the current game at each event in the play-by-play data. We look only at active player time, rather than game time, to accurately determine how often a player commits a foul. Most discussion of time throughout refers to actual play time; that is, individual playing time for each player. Using player play time should control for substitution patterns, as a player in foul trouble will likely not play until later in the game. If we used game time, it would artificially increase the time between fouls. Additionally, censoring times for each player in each game were generated. For example, if a player only committed 3 fouls in a game, an entry was generated for his 4th foul, with foul time equal to the max player time and an indicator that the foul did not occur. This is important as we need to account for censored fouls in our analysis.
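To make the censoring step concrete, here is a minimal Python sketch of how such entries could be generated for one player in one game. The record layout (`foul_number`, `start`, `stop`, `observed`) is an assumption for illustration and is not the format of the actual dataset; times are seconds of the player's own playing time.

```python
from typing import Dict, List, Union

MAX_FOULS = 6  # a player fouls out on his 6th personal foul

Record = Dict[str, Union[int, float, bool]]


def build_foul_records(foul_times: List[float],
                       max_player_time: float,
                       max_fouls: int = MAX_FOULS) -> List[Record]:
    """One row per observed foul, plus one censored row for the next foul if any remain.

    `start`/`stop` bracket the interval in which the player was at risk of
    committing that foul; `observed` is False for the censored entry.
    """
    records: List[Record] = []
    previous = 0.0
    for k, t in enumerate(sorted(foul_times), start=1):
        records.append({"foul_number": k, "start": previous, "stop": t, "observed": True})
        previous = t
    if len(records) < max_fouls:
        # The next foul never happened: censor it at the player's total playing time.
        records.append({
            "foul_number": len(records) + 1,
            "start": previous,
            "stop": max_player_time,
            "observed": False,
        })
    return records
```

For the example in the text, a player with 3 fouls gets three observed rows and a fourth row marked as not observed, with its foul time set to his maximum playing time.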

For now, let us consider only centers in our analysis, to minimize the effects of differing fouling patterns between NBA positions. Overall, we will further limit ourselves to Al Horford, Andrew Bogut, Brook Lopez, DeMarcus Cousins, Dwight Howard, Marc Gasol, Robin Lopez, and Tyson Chandler.
In our analysis, we will focus on DeMarcus Cousins, Al Horford, and Robin Lopez as these three centers exhibit three distinct trends that we see in other centers that we analyzed. All centers considered share many of the same characteristics in our analysis as well.

Summary Statistics

Even simple analysis and statistics can give us some insight into how NBA players accrue up to 6 personal fouls over the course of a game. Table 1 gives a few summary statistics. Table 1a gives basic statistics for most of DeMarcus Cousins’ fouls from the 2011-2016 seasons. We can see that on average, Cousins commits his 1st personal foul after about 500 seconds (or about 8 minutes and 20 seconds) of personal playing time. By contrast, he commits his 4th foul about 300 seconds (or about 5 minutes) of personal playing time after committing his 3rd foul. Table 1b gives the same statistics for Al Horford, where we see that his 1st foul comes after an average of about 823 seconds while he commits his 4th foul an average of about 311 seconds after his 3rd. From these numbers, it might appear that Horford is more “tilted,” given that his time between fouls shrinks more than Cousins’.

However, the tables also show that Horford had 80 games in which he only recorded a single foul and only 37 games where he recorded 4 or more fouls. By contrast, Cousins had only 11 games with a single foul and 193 games with 4 or more. Because games often end before a player commits all six fouls, many of the foul times are right censored by the end of the game. These foul times are not included in simple summary statistics and therefore merely examining the average time to foul does not accurately reflect all the differences between players or how those players individually accrue fouls.

Visualization – Survival Curves

Next, we visualize foul rates for each player by using Kaplan-Meier survival curves. A survival curve, in general, is used to map the length of time that elapses before an event occurs. Here, the curves give the probability that a player has “survived” to a certain time without committing a particular numbered foul. These curves are useful for understanding how a player accrues fouls while accounting for the total length of time during which a player is followed, and they allow us to compare how the different fouls are accrued.
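For readers who have not seen the estimator before, here is a small, dependency-free Python sketch of the Kaplan-Meier calculation. It is meant only to show the mechanics; the function name and the toy inputs are illustrative, and in practice a survival analysis library would be used to produce curves like the ones in the figures.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(durations: Sequence[float],
                 observed: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct time where an event occurred.

    `durations` are times to foul or to censoring; `observed[i]` is True when the
    foul actually happened and False when the time is censored.
    """
    pairs = sorted(zip(durations, observed))
    n_at_risk = len(pairs)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        events = 0
        removed = 0
        # Handle everything tied at time t together.
        while i < len(pairs) and pairs[i][0] == t:
            events += 1 if pairs[i][1] else 0
            removed += 1
            i += 1
        if events > 0:
            survival *= 1.0 - events / n_at_risk
            curve.append((t, survival))
        # Censored observations leave the risk set without lowering the curve.
        n_at_risk -= removed
    return curve
```

The key point is in the last step: a censored time shrinks the group still at risk but does not count as a foul, which is how the curve accounts for games that end before the foul is observed.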

Figure 1a gives the overall survival curves for Cousins. From the graph, there appears to be some evidence that his time to foul decreases as he accrues fouls, because there is layering between the fouls. While the trend may seem small, it is much starker than that for other centers, as we can see in Figure 1b for Al Horford and Figure 1c for Robin Lopez. Their curves appear much more random. The survival curve for Al Horford’s 6th foul seems abnormal, as does Robin Lopez’s to a smaller extent. This abnormality is likely explained by the small sample sizes for 6 fouls, as seen in Table 1.

I’d like to note here that it is important to use survival curves in this scenario as it accounts for censoring. If we were to just look at the densities of fouls for a given player, we might falsely see a very different trend. Figure 2 shows raw foul densities for DeMarcus Cousins, and there is a clear ordering for the fouls.

As mentioned above, if games were infinitely long, and players continued to play, we would observe every player until he committed his 6th personal foul and was removed from the game. As games are of finite length, many fouls are censored due to the end of follow-up time. Therefore, it makes sense that the 5th and 6th fouls would be subject to sampling bias. For example, if the 5th foul is committed with 4 minutes left in the game, we will never observe a 6th foul that comes 5 minutes later. To help adjust for this censoring, we considered limiting analysis to only games where all 6 fouls were committed. However, this limitation severely restricts the sample sizes for all players. Instead, we will examine games for each player where they committed a minimum of five fouls and limit our analysis to the first four fouls. This foul restriction gives us a larger sample size, though it prevents us from gaining understanding about how players accrue their 5th and 6th fouls.
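The restriction described above amounts to a simple filter. The sketch below assumes a hypothetical layout in which each game maps to a list of the player's foul times in seconds of playing time; the function names and the `min_fouls`/`keep_first` parameters are illustrative only.

```python
from typing import Dict, List


def restrict_games(fouls_by_game: Dict[str, List[float]],
                   min_fouls: int = 5,
                   keep_first: int = 4) -> Dict[str, List[float]]:
    """Keep games with at least `min_fouls` fouls, and only their first `keep_first` fouls."""
    return {
        game_id: sorted(times)[:keep_first]
        for game_id, times in fouls_by_game.items()
        if len(times) >= min_fouls
    }


def to_gap_times(foul_times: List[float]) -> List[float]:
    """Convert cumulative foul times into time elapsed since the previous foul."""
    gaps = []
    previous = 0.0
    for t in sorted(foul_times):
        gaps.append(t - previous)
        previous = t
    return gaps
```

Because every retained game has at least five fouls, each of the first four fouls is actually observed in this restricted dataset, so none of those four foul times is censored.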

Figures 3a, 3b, and 3c show the 5 foul minimum survival curves for Cousins, Horford, and Lopez. Cousins displays much clearer ordering, where the more fouls he accrues, the more likely he is to foul. However, Horford and Lopez show much less distinction between the fouls. Lopez shows some ordering, especially for his 4th foul, but Horford’s curves are fairly random.

Of course, we are not controlling for nearly enough variables and the sample size is sadly limited. Areas for further research will be discussed in full later. For now, however, we have a nice way to model and visualize fouls so we can understand them better moving forward.

Part 3 of this series can be found here

## NBA Fouls – I Love DeMarcus Cousins

This is part 1 of my series on DeMarcus Cousins and how NBA players accrue personal fouls.

If you’ve ever talked about the NBA with me for any significant amount of time, you will know that one of my favorite players is DeMarcus Cousins (my favorite player being, of course, Shaun Livingston). I’ve always liked Boogie for the usual reasons related to his abilities on the court, of course, but also because he has been the focus and inspiration of much of my analytics work for the past year. Before I get into all the details and numbers, I’d like to share the story of how it all came to pass.

I am from Berkeley, California. As such, my home team is the Golden State Warriors. Last school year, during my Christmas break I was able to attend the December 28th 2015 matchup between the Warriors and Sacramento Kings at Oracle Arena. Many people remember that game because the end of the first half featured a three-point shoot out between Stephen Curry and Omri Casspi:

It was incredibly exciting and made for a close game.

A few months later, at the 2016 Sloan Sports Analytics Conference, that sequence came up in a conversation with my friend who was also in attendance.  I mentioned that I was at that game, but that the end of the first half wasn’t what I remembered most about the game. What I remember is this:

It was my first time seeing Cousins get ejected and even from my seat in the upper bowl, I could feel how frustrated and upset he was about the whole ordeal.

My friend commented that “if only Boogie could be, like, 15% less angry – he would be the most dominant player in the game.” Which of course got me thinking – how *would* you quantify how mad DeMarcus Cousins is at any given time?

A potential answer presented itself several months later at the 2017 Joint Statistical Meetings where I attended a session titled “For the Love of the Game: Applications of Statistics in Sports.” In that session, Douglas VanDerwerken presented “Does the Threat of Suspension Curb Dangerous Behavior in Soccer? A Case Study from the Premier League.” This paper (which can be found here for interested readers) showed that as EPL players approach the yellow card limit, and thus face suspension, they are less likely to foul.

Thinking back to that December 28th game, and many additional Kings games I have watched, it seemed to me that Cousins would get heated and “tilted” and play more aggressively and therefore foul more often. I hypothesized that the more Cousins fouls the more likely he was to foul.

He does. But he’s not the only one.

I’ll get into the math/stats in a later post. But here is a general idea of how we can think about this problem. Given there is a fixed amount of time that a given player is on the court, we might expect fouls to follow a Poisson arrival process with inter-arrival times following an Exponential distribution where each foul is independent of the previous fouls. We can consider a survival model, and look at the “failure time” for each foul – in other words the time it takes a player to commit his 1st foul, 2nd foul, etc. If, for example, the time between the 2nd and 3rd foul is significantly longer than the time between the 4th and 5th foul, we would have evidence of some sort of “tilt.” We can model foul rates using a conditional risk set model for ordered events and do some analysis with a stratified Cox model. From there we can try to identify if there are any actions a coach/team can take in order to mitigate increased fouling rates.
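To illustrate what "no tilt" would look like, here is a small simulation sketch in Python. Everything in it is an assumption for illustration: fouls arrive at a constant rate of one per 500 seconds of playing time (a round number, not a fitted value), there is no censoring by the end of the game, and the function name and parameters are made up.

```python
import random
from typing import List


def mean_gaps_under_independence(rate: float,
                                 n_games: int,
                                 n_fouls: int,
                                 seed: int = 0) -> List[float]:
    """Average time between consecutive fouls, by foul number, for a constant-rate process.

    Each gap is drawn independently from an Exponential distribution, which is
    what a Poisson arrival process implies.
    """
    rng = random.Random(seed)
    totals = [0.0] * n_fouls
    for _ in range(n_games):
        for k in range(n_fouls):
            totals[k] += rng.expovariate(rate)
    return [total / n_games for total in totals]


if __name__ == "__main__":
    means = mean_gaps_under_independence(rate=1 / 500, n_games=20000, n_fouls=4)
    for k, m in enumerate(means, start=1):
        print(f"mean wait for foul {k}: {m:.0f} seconds")
```

Under this baseline, the average wait for the 4th foul is about the same as for the 1st. Gaps that shrink as fouls pile up would be a departure from it, which is the pattern described here as "tilt."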

I’ll save the details for later.

Part 2 of this series can be found here