Statistics can be misleading; we all know this. BABIP is no exception. Used incorrectly, BABIP can lead us to some unfortunate conclusions. Before sabermetrics rushed onto the scene with flashy statistics measuring and objectifying everything there is to know about baseball, people relied on common sense to come to some pretty basic conclusions, such as the idea that fast hitters are going to beat out more ground balls for hits than slow hitters. No one is going to argue with that, right? Unfortunately, many people misuse BABIP (Batting Average on Balls In Play) and forget about this basic principle. Tristan Cockroft of ESPN wrote a very helpful piece on understanding BABIP that I highly suggest reading before continuing with this article.
Now that you have a better understanding of BABIP, let’s continue. For today, we are going to focus on how to correctly use BABIP to remind us that a player’s speed will improve their batting average on ground balls in play, and thus their BABIP.
Cockroft touches on finding an expected BABIP (xBABIP) by multiplying a player’s ground balls, line drives, fly balls and bunt hits by the league average BABIP for those categories. I believe this is a good starting point because, as Cockroft has shown, the types of balls a player tends to hit go a long way in determining his BABIP. But I want to take this a step further. I’m going to show you, sabermetrically, that we need to take a player’s speed into greater consideration when determining an expected BABIP.
First, let’s take a look at some of the top BABIPs from 2011 to see how the data might be misinterpreted. Remember that the league average BABIP tends to hover around .300.
| Player | 2011 BABIP |
| Matt Kemp | 0.380 |
| Adrian Gonzalez | 0.380 |
| Emilio Bonifacio | 0.372 |
| Michael Bourn | 0.369 |
| Michael Young | 0.367 |
| Alex Avila | 0.366 |
| Miguel Cabrera | 0.365 |
| Hunter Pence | 0.361 |
| Alex Gordon | 0.358 |
| Dexter Fowler | 0.354 |
| Jose Reyes | 0.353 |
| Ryan Braun | 0.350 |
| Joey Votto | 0.349 |
| Andre Ethier | 0.348 |
Based on this data alone, it would be easy to conclude that all of the above players were very lucky in 2011 and due for significant regression to the league average of about a .300 BABIP in 2012. Now, let’s use Cockroft’s xBABIP formula to get a better idea of how lucky these players were.
| Player | 2011 BABIP | xBABIP | Difference |
| Matt Kemp | 0.380 | 0.307 | 0.073 |
| Adrian Gonzalez | 0.380 | 0.307 | 0.073 |
| Emilio Bonifacio | 0.372 | 0.354 | 0.018 |
| Michael Bourn | 0.369 | 0.356 | 0.013 |
| Michael Young | 0.367 | 0.335 | 0.032 |
| Alex Avila | 0.366 | 0.302 | 0.064 |
| Miguel Cabrera | 0.365 | 0.308 | 0.057 |
| Hunter Pence | 0.361 | 0.291 | 0.070 |
| Alex Gordon | 0.358 | 0.304 | 0.054 |
| Dexter Fowler | 0.354 | 0.312 | 0.042 |
| Jose Reyes | 0.353 | 0.307 | 0.046 |
| Ryan Braun | 0.350 | 0.301 | 0.049 |
| Joey Votto | 0.349 | 0.335 | 0.014 |
| Andre Ethier | 0.348 | 0.326 | 0.022 |
We’re already starting to see that we should expect speedy hitters who hit a lot of ground balls, such as Emilio Bonifacio and Michael Bourn, to have a higher xBABIP than normal. But others, such as the speedy Matt Kemp (who hit ground balls only 36% of the time in 2011), still appear to have been extremely lucky in 2011 based on their xBABIP.
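To make the “Difference” column concrete, here is a minimal Python sketch that recomputes it for three of the players above. The numbers are copied straight from the table; the function name `babip_gap` is my own label for illustration, not something from Cockroft’s piece.

```python
# (BABIP, xBABIP) pairs copied from the 2011 table above.
players = {
    "Matt Kemp": (0.380, 0.307),
    "Emilio Bonifacio": (0.372, 0.354),
    "Michael Bourn": (0.369, 0.356),
}


def babip_gap(babip, xbabip):
    """Actual BABIP minus expected BABIP, rounded to three decimals."""
    return round(babip - xbabip, 3)


gaps = {name: babip_gap(b, x) for name, (b, x) in players.items()}

# Largest gap first: the bigger the gap, the "luckier" the player looks.
for name, gap in sorted(gaps.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {gap:+.3f}")
```

A large positive gap is what the quick reading calls luck. The rest of this article is about how much of that gap speed can explain.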
So I dug deeper. Knowing that the league average for batting average on ground balls in play was .235 in 2011, I hypothesized that fast hitters would have a much higher average on ground balls in play. I took every player who stole 25 or more bases in a season since 2008, along with every player who had at least 500 plate appearances and 0-1 stolen bases. For those two separate groups, I calculated the batting average on ground balls in play (we’ll call it BAGBIP). Here is what I found.
| Year | BAGBIP, 25+ SBs | BAGBIP, 0-1 SBs and 500+ PA |
| 2011 | .292 (22 Players) | .239 (21 Players) |
| 2010 | .298 (25 Players) | .219 (29 Players) |
| 2009 | .290 (28 Players) | .220 (26 Players) |
| 2008 | .290 (23 Players) | .224 (28 Players) |
| AVG | .293 | .226 |
As you can see, players with 25 or more stolen bases have a much higher batting average on ground balls in play than players with 0-1 stolen bases. From here, I tweaked Cockroft’s xBABIP formula, (GB * .235) + (FB * .137) + (LD * .716) + (BUNT * .388), to better reflect the impact of speed on a player’s xBABIP. In place of the .235 ground-ball average Cockroft used, I substituted the .293 average for players who stole 25 or more bases in 2011 and the .226 average for players who had 0-1 stolen bases in 2011.
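For readers who like to see the arithmetic spelled out, the formula can be sketched in Python as below. This assumes GB, FB, LD and BUNT are each expressed as a share of a player’s balls in play (0.50 for 50%), which is how the numbers in this article appear to work out. The function name `xbabip` and the sample profile are made up for illustration and do not describe any real player.

```python
# League-average batting averages by batted-ball type, as quoted in the formula above.
LEAGUE_AVG = {"gb": 0.235, "fb": 0.137, "ld": 0.716, "bunt": 0.388}


def xbabip(rates, gb_avg=LEAGUE_AVG["gb"]):
    """Expected BABIP from batted-ball rates.

    rates: dict with keys "gb", "fb", "ld", "bunt", each assumed to be a
           fraction of balls in play.
    gb_avg: batting average to apply to ground balls; defaults to the
            league-wide .235, override it for fast or slow players.
    """
    return (rates["gb"] * gb_avg
            + rates["fb"] * LEAGUE_AVG["fb"]
            + rates["ld"] * LEAGUE_AVG["ld"]
            + rates["bunt"] * LEAGUE_AVG["bunt"])


# Hypothetical ground-ball-heavy profile (made-up numbers, not a real player).
profile = {"gb": 0.50, "fb": 0.28, "ld": 0.20, "bunt": 0.02}

baseline = xbabip(profile)               # league-average ground-ball term
fast = xbabip(profile, gb_avg=0.293)     # 25 or more stolen bases
slow = xbabip(profile, gb_avg=0.226)     # 0-1 stolen bases

print(f"baseline {baseline:.3f}, fast {fast:.3f}, slow {slow:.3f}")
```

Only the ground-ball term changes, so the size of the adjustment scales with how often a player puts the ball on the ground.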
| Player | 2011 BABIP | xBABIP | New xBABIP |
| Matt Kemp | 0.380 | 0.307 | 0.328 |
| Adrian Gonzalez | 0.380 | 0.307 | 0.302 |
| Emilio Bonifacio | 0.372 | 0.354 | 0.384 |
| Michael Bourn | 0.369 | 0.356 | 0.385 |
| Michael Young | 0.367 | 0.335 | N/A |
| Alex Avila | 0.366 | 0.302 | N/A |
| Miguel Cabrera | 0.365 | 0.308 | N/A |
| Hunter Pence | 0.361 | 0.291 | N/A |
| Alex Gordon | 0.358 | 0.304 | N/A |
| Dexter Fowler | 0.354 | 0.312 | N/A |
| Jose Reyes | 0.353 | 0.307 | 0.331 |
| Ryan Braun | 0.350 | 0.301 | 0.324 |
| Joey Votto | 0.349 | 0.335 | N/A |
| Andre Ethier | 0.348 | 0.326 | 0.322 |
Players marked N/A had neither 25 or more SBs nor 0-1 SBs in 2011.
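As a rough check on the table, swapping only the ground-ball average is the same as adding GB% * (group average - .235) to the original xBABIP. The sketch below applies that to Matt Kemp using the figures quoted in this article (xBABIP of .307, ground balls 36% of the time, fast group). The helper names are hypothetical, and the group cutoffs are my reading of the thresholds described earlier.

```python
LEAGUE_GB_AVG = 0.235  # 2011 league average on ground balls in play
FAST_GB_AVG = 0.293    # players with 25 or more stolen bases
SLOW_GB_AVG = 0.226    # players with 0-1 stolen bases and 500+ plate appearances


def group_gb_avg(stolen_bases, plate_appearances):
    """Return the ground-ball average for a player's speed group.

    Returns None when the player falls in neither group (the N/A rows).
    """
    if stolen_bases >= 25:
        return FAST_GB_AVG
    if stolen_bases <= 1 and plate_appearances >= 500:
        return SLOW_GB_AVG
    return None


def adjust_xbabip(base_xbabip, gb_rate, gb_avg):
    """Shift an existing xBABIP by replacing only its ground-ball term."""
    return base_xbabip + gb_rate * (gb_avg - LEAGUE_GB_AVG)


# Matt Kemp, 2011: original xBABIP .307, 36% ground balls, fast group.
kemp_new = adjust_xbabip(0.307, 0.36, FAST_GB_AVG)
print(round(kemp_new, 3))  # 0.328
```

Kemp’s result lands on the .328 shown in the table, which suggests the adjustment really is confined to the ground-ball term.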
Just within this group, some numbers really jump out at you. At first glance, Emilio Bonifacio and Michael Bourn appeared extremely lucky in 2011 based on their BABIPs compared to the league average of about .300, and they looked like prime candidates to regress in 2012. However, after taking their speed into consideration, they may have actually been slightly unlucky in 2011, with BABIPs lower than their new xBABIPs.
Other players, such as Matt Kemp and Jose Reyes, appear to be headed for some BABIP regression in 2012, but to a lesser extent than first expected. For Adrian Gonzalez and Andre Ethier, the adjusted formula reinforces the case for regression: their new xBABIPs are even lower than the originals.
As I stated at the beginning of this article, stats can be misleading. If you’re simply going to take a quick glance at BABIP and arrive at hasty conclusions regarding a player’s “luck” without considering why their BABIP may have been so high or so low, you might as well ignore BABIP altogether and be better off for it. Before there was BABIP, it was taken for granted that fast hitters beat out ground balls more than slow hitters. BABIP can be a very useful stat in fantasy baseball if you don’t allow it to steal the things you already know.
Thanks for the write up but … my head is spinning.
Nice, fresh perspective. BABIP is obviously very useful to predict regression for pitchers and hitters alike, but as you’ve displayed, looking deeper can reveal still more. The rabbit hole goes very deep!
Good job…
I’m not sure how much of a difference it would have made, but using SB’s as the proxy for speed is potentially troublesome in that they’re not purely a function of speed, but to a degree managerial philosophy as well. Probably wouldn’t have changed the pool too much, but a speed score or something similar may have been a better input.
@ML610:
I considered this, but decided to stick with SB’s as I didn’t want to overcomplicate what is already a complicated subject and potentially lose readers. There is a part two in the works that will take the formula a step further and substituting a speed score for SB’s is one of the potential changes.