Strategy
February 15, 2012

Does BABIP Steal Our Common Sense?

By Michael Caron

Statistics can be misleading; we all know this. BABIP (Batting Average on Balls In Play) is no exception. Used incorrectly, it can lead us to some unfortunate conclusions. Before sabermetrics rushed onto the scene with flashy statistics measuring and objectifying everything there is to know about baseball, people relied on common sense to come to some pretty basic conclusions, such as the idea that fast hitters are going to beat out more ground balls for hits than slow hitters. No one is going to argue with that, right? Unfortunately, many people misuse BABIP and forget about this basic principle. Tristan Cockroft of ESPN wrote a very helpful piece on understanding BABIP that I highly suggest reading before continuing with this article.

Now that you have a better understanding of BABIP, let’s continue. For today, we are going to focus on using BABIP correctly, which means remembering that a player’s speed improves his batting average on ground balls in play, and thus his BABIP.

Cockroft touches on finding an expected BABIP (xBABIP) by multiplying a player’s ground balls, line drives, fly balls and bunt hits by the league average BABIP for those categories. I believe this is a good starting point because, as Cockroft has shown, the types of balls a player tends to hit go a long way in determining his BABIP. But I want to take this a step further. I’m going to show you, sabermetrically, that we need to take a player’s speed into greater consideration when determining an expected BABIP.
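
To make the arithmetic concrete, here is a minimal Python sketch of that kind of expected-BABIP calculation. It assumes each batted-ball type is entered as a share of the player’s balls in play, and it uses the 2011 league averages quoted later in this article as the weights. The function name and the sample profile are hypothetical and are not taken from Cockroft’s piece.

```python
# League-average batting average by batted-ball type (the 2011 figures
# quoted later in this article as the xBABIP weights).
LEAGUE_AVG = {"gb": 0.235, "fb": 0.137, "ld": 0.716, "bunt": 0.388}


def expected_babip(shares, weights=LEAGUE_AVG):
    """Weight each batted-ball share by the league average for that type.

    Assumption: `shares` maps each type to its fraction of the player's
    balls in play, so the values sum to roughly 1.
    """
    return sum(shares[kind] * weights[kind] for kind in weights)


# Hypothetical hitter, not a real player's 2011 profile.
profile = {"gb": 0.45, "fb": 0.34, "ld": 0.20, "bunt": 0.01}
print(round(expected_babip(profile), 3))
```

Because line drives carry by far the largest weight, a small change in line-drive share moves the result more than the same change in any other category.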

First, let’s take a look at some of the top BABIPs from 2011 to see how the data might be misinterpreted. Remember that the league average BABIP tends to hover around .300.

Player              2011 BABIP
Matt Kemp           0.380
Adrian Gonzalez     0.380
Emilio Bonifacio    0.372
Michael Bourn       0.369
Michael Young       0.367
Alex Avila          0.366
Miguel Cabrera      0.365
Hunter Pence        0.361
Alex Gordon         0.358
Dexter Fowler       0.354
Jose Reyes          0.353
Ryan Braun          0.350
Joey Votto          0.349
Andre Ethier        0.348

Based on this data alone, it would be easy to conclude that all of the above players were very lucky in 2011 and due for significant regression toward the league average BABIP of about .300 in 2012. Now, let’s use Cockroft’s xBABIP formula to get a better idea of how lucky these players were.

Player              2011 BABIP   xBABIP   Difference
Matt Kemp           0.380        0.307    0.073
Adrian Gonzalez     0.380        0.307    0.073
Emilio Bonifacio    0.372        0.354    0.018
Michael Bourn       0.369        0.356    0.013
Michael Young       0.367        0.335    0.032
Alex Avila          0.366        0.302    0.064
Miguel Cabrera      0.365        0.308    0.057
Hunter Pence        0.361        0.291    0.070
Alex Gordon         0.358        0.304    0.054
Dexter Fowler       0.354        0.312    0.042
Jose Reyes          0.353        0.307    0.046
Ryan Braun          0.350        0.301    0.049
Joey Votto          0.349        0.335    0.014
Andre Ethier        0.348        0.326    0.022
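
The Difference column is simply actual BABIP minus xBABIP. As a small illustration, the sketch below recomputes it for a few rows copied from the table and sorts them by the size of the gap; the variable names are my own.

```python
# (player, 2011 BABIP, xBABIP) for a few rows of the table above.
rows = [
    ("Matt Kemp", 0.380, 0.307),
    ("Emilio Bonifacio", 0.372, 0.354),
    ("Michael Bourn", 0.369, 0.356),
    ("Hunter Pence", 0.361, 0.291),
]

# Gap between actual and expected BABIP, largest first.
gaps = sorted(
    ((name, round(babip - xbabip, 3)) for name, babip, xbabip in rows),
    key=lambda item: item[1],
    reverse=True,
)

for name, gap in gaps:
    print(f"{name:<18}{gap:.3f}")
```

Sorted this way, the two ground-ball-heavy speedsters end up at the bottom of the list, with the smallest gaps between actual and expected BABIP.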

We’re already starting to see that we should expect speedy hitters who hit a lot of ground balls, such as Emilio Bonifacio and Michael Bourn, to have a higher xBABIP than normal. But others, such as the speedy Matt Kemp (who only hit ground balls 36% of the time in 2011), still appear to have been extremely lucky in 2011 based on their xBABIP.

So I dug deeper. Knowing that the league average for batting average on ground balls in play was .235 in 2011, I hypothesized that fast hitters would have a much higher average on ground balls in play. I took every player who has stolen 25 or more bases in a season since 2008, along with every player who had at least 500 plate appearances and 0-1 stolen bases. For each of those two groups, I found the batting average on ground balls in play (we’ll call it BAGBIP). Here is what I found.

Year   BAGBIP, 25+ SB       BAGBIP, 0-1 SB and 500+ PA
2011   .292 (22 players)    .239 (21 players)
2010   .298 (25 players)    .219 (29 players)
2009   .290 (28 players)    .220 (26 players)
2008   .290 (23 players)    .224 (28 players)
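
For a quick summary of the table, the sketch below averages each group across the four seasons and computes the gap for each year. It uses a simple unweighted mean of the season figures, which is my assumption; weighting by the number of players in each season would give slightly different numbers.

```python
from statistics import mean

# Season: (BAGBIP for 25+ SB, BAGBIP for 0-1 SB with 500+ PA), from the table above.
BAGBIP = {
    2011: (0.292, 0.239),
    2010: (0.298, 0.219),
    2009: (0.290, 0.220),
    2008: (0.290, 0.224),
}

# Unweighted four-season average for each group.
fast_avg = mean(fast for fast, _ in BAGBIP.values())
slow_avg = mean(slow for _, slow in BAGBIP.values())

# Fast-minus-slow gap for each season.
gaps = {year: round(fast - slow, 3) for year, (fast, slow) in BAGBIP.items()}

print(f"fast {fast_avg:.4f}, slow {slow_avg:.4f}")
print(gaps)
```

The gap is at least 50 points of batting average in every season, so the pattern is not driven by a single year.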

As you can see, players with at least 25 stolen bases have a much higher batting average on ground balls in play than players with 0-1 stolen bases. From here, I tweaked Cockroft’s xBABIP formula, (GB * .235) + (FB * .137) + (LD * .716) + (BUNT * .388), to better reflect the impact of speed on a player’s xBABIP. In place of the .235 league average Cockroft used for ground balls, I used .293 for players who stole 25 or more bases in 2011 and .226 for players who had 0-1 stolen bases in 2011. Those two figures match the 2008-2011 averages in the table above (.2925 and .2255 before rounding), not the 2011 row alone.
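
As a rough sketch of that tweak, only the ground-ball term of the formula changes, so the new xBABIP is the old one plus the ground-ball share times the difference between the group weight and the league weight. The function names are mine, and treating GB as a share of balls in play is an assumption.

```python
LEAGUE_GB_AVG = 0.235  # league BAGBIP used in the original formula
FAST_GB_AVG = 0.293    # weight for the 25+ SB group
SLOW_GB_AVG = 0.226    # weight for the 0-1 SB, 500+ PA group


def ground_ball_weight(stolen_bases, plate_appearances):
    """Pick the BAGBIP weight for a player, or None if he fits neither group."""
    if stolen_bases >= 25:
        return FAST_GB_AVG
    if stolen_bases <= 1 and plate_appearances >= 500:
        return SLOW_GB_AVG
    return None


def speed_adjusted_xbabip(xbabip, gb_share, stolen_bases, plate_appearances):
    """Swap the league ground-ball weight for the group weight.

    Only the ground-ball term changes, so the adjustment is
    gb_share * (group weight - league weight).
    """
    weight = ground_ball_weight(stolen_bases, plate_appearances)
    if weight is None:
        return None  # corresponds to N/A: the player is in neither group
    return xbabip + gb_share * (weight - LEAGUE_GB_AVG)


# Matt Kemp: .307 xBABIP and a 36% ground-ball rate, per this article.
# The SB and PA inputs are illustrative; any 25+ SB total selects the fast group.
kemp = speed_adjusted_xbabip(0.307, 0.36, stolen_bases=40, plate_appearances=600)
print(round(kemp, 3))
```

For Kemp the adjustment is 0.36 * (.293 - .235), or about .021, which lifts his .307 xBABIP to roughly .328.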

Player              2011 BABIP   xBABIP   New xBABIP
Matt Kemp           0.380        0.307    0.328
Adrian Gonzalez     0.380        0.307    0.302
Emilio Bonifacio    0.372        0.354    0.384
Michael Bourn       0.369        0.356    0.385
Michael Young       0.367        0.335    N/A
Alex Avila          0.366        0.302    N/A
Miguel Cabrera      0.365        0.308    N/A
Hunter Pence        0.361        0.291    N/A
Alex Gordon         0.358        0.304    N/A
Dexter Fowler       0.354        0.312    N/A
Jose Reyes          0.353        0.307    0.331
Ryan Braun          0.350        0.301    0.324
Joey Votto          0.349        0.335    N/A
Andre Ethier        0.348        0.326    0.322

Players marked N/A had neither 25 or more SBs nor 0-1 SBs in 2011.

Just within this group, some numbers really jump out at you. At first glance, Emilio Bonifacio and Michael Bourn appeared extremely lucky in 2011 based on their BABIPs compared to the league average of about .300, and they looked like prime candidates to regress in 2012. However, after taking their speed into consideration, they may have actually been slightly unlucky in 2011, with BABIPs lower than their new xBABIPs.

Other players, such as Matt Kemp and Jose Reyes, appear to be headed for some BABIP regression in 2012, but to a lesser extent than first expected. The formula reinforced the fact that Adrian Gonzalez and Andre Ethier are headed for some BABIP regression.

As I stated at the beginning of this article, stats can be misleading. If you’re simply going to take a quick glance at BABIP and arrive at hasty conclusions regarding a player’s “luck” without considering why their BABIP may have been so high or so low, you might as well ignore BABIP altogether and be better off for it. Before there was BABIP, it was taken for granted that fast hitters beat out ground balls more than slow hitters. BABIP can be a very useful stat in fantasy baseball if you don’t allow it to steal the things you already know.

4 Responses to “Does BABIP Steal Our Common Sense?”

  1. MashinSpuds says:

    Thanks for the write up but … my head is spinning.

  2. IndyPostman says:

    Nice, fresh perspective. BABIP is obviously very useful to predict regression for pitchers and hitters alike, but as you’ve displayed, looking deeper can reveal still more. The rabbit hole goes very deep!

  3. ML610 says:

    Good job…

    I’m not sure how much of a difference it would have made, but using SB’s as the proxy for speed is potentially troublesome in that they’re not purely a function of speed, but to a degree managerial philosophy as well. Probably wouldn’t have changed the pool too much, but a speed score or something similar may have been a better input.

  4. CaronBoy says:


    I considered this, but decided to stick with SB’s as I didn’t want to overcomplicate what is already a complicated subject and potentially lose readers. There is a part two in the works that will take the formula a step further and substituting a speed score for SB’s is one of the potential changes.
