I debated doing this as an article but since I don't have a clue why this happens I figured I'd throw it out there as a discussion.

Aaron Harang is a pretty solid pitcher, but traditional analysis of HR/9, K/9 and BB/9 would indicate he should be even better. Taking his career numbers and plugging them into the formula I normally use to project hits in play, I came up with a projected hit total of 765. Harang, however, has given up 817 hits in his career. Taking out the 95 home runs he's allowed, that means he's given up 722 hits on balls in play versus the 670 we would have expected. That's 7.8% or so more hits on balls in play, which works out to about a .320 BABIP (can't find the exact number but that's a good ballpark figure).

I could almost see that out of a ground ball pitcher but Harang's pretty much neutral in that regard with a 0.97 G/F ratio over his career. If he played in a hit inflating environment like Coors (not to be confused with a home run inflating environment which GABP definitely is) then I could almost understand it. The only explanation that I can find is that the Reds defense definitely qualifies as bad...but is it THAT bad? And for that many years?

At this point I'll open things up to discussion. Why do you think Aaron Harang sees so many of the balls that are put in play fall in for hits against him?

Having Felipe Lopez/Royce Clayton at SS his whole career, Adam Dunn in LF, an aging Griffey Jr. in CF (who is slower than most realize), an array of young, error-prone hitting prospects at third, and off-and-on stints with Wily Mo Pena in RF - I'm going to guess his defense has had quite a bit to do with it.

I enjoy watching my Reds, I really do, but the defense over the last 3 or 4 years is some of the worst I have ever seen on any baseball team. Their only decent defenders of the last few years, Austin Kearns and Sean Casey, were traded. But things are finally starting to look brighter: Alex Gonzalez has been acquired to play SS, Brandon Phillips at 2B is definitely above average, EE is starting to figure out how to play 3B, Rich Aurilia is gone and Hatteberg can at least field his position, Ryan Freel is better in the OF than he was part-time at 2B, and Josh Hamilton looks extremely promising in the OF when he plays.

I have not looked closely at your formula yet but using the classic BABIP formulas there is nothing strange about Harang. He's pretty close to where you would expect him to be and he hasn't been far above or below the team BABIP over the last three seasons. (Note: These numbers are slightly off because Lahman doesn't have SF for pitchers.)
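
For anyone following along, the classic BABIP formula is usually written as (H - HR) / (AB - K - HR + SF). Here is a minimal Python sketch of it, with SF defaulting to zero to mirror the missing Lahman column. The function name and the sample totals are made up for illustration; they are not Harang's numbers.

```python
def classic_babip(hits, home_runs, at_bats, strikeouts, sac_flies=0):
    """Classic BABIP: (H - HR) / (AB - K - HR + SF).

    sac_flies defaults to 0, which is the situation when the data
    source has no SF column for pitchers.
    """
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    return (hits - home_runs) / balls_in_play

# Hypothetical totals, purely to show the effect of dropping SF.
with_sf = classic_babip(200, 25, 800, 150, sac_flies=5)
without_sf = classic_babip(200, 25, 800, 150)
print(f"with SF: {with_sf:.3f}, without SF: {without_sf:.3f}")
```

Leaving SF out shrinks the denominator a little, so the result comes out slightly high, which is why the numbers are only slightly off.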

I only looked at career numbers on Harang...do you have the career BABIP numbers for him?

Also, that .316/.304 looks way out of whack. I know it doesn't sound like much but when you consider how many plays end on a ball in play that's pretty significant. I'm actually projecting 2.82 outs from the pitcher per inning in my formula to account for outs on the basepaths and then assuming a .300 BABIP to project our expected number of hits in play. That projection is off by about 7.8% over the course of his career. That's pretty unusual over that type of time-frame. However, if those 2005 numbers you posted are any indication that may be par for the course when playing for the Cincinnati Reds. I'd love to see team totals for BABIP versus Harang's totals for BABIP over his entire career averaged out (ie one number for Harang, one number for the team) in order to give us a large sample size from which to draw conclusions.

do you have the career BABIP numbers for him?

His career BABIP is ~.305.

Also, that .316/.304 looks way out of whack. I know it doesn't sound like much but when you consider how many plays end on a ball in play that's pretty significant.

Well what it tells us I think is that he was slightly unlucky last year and it wasn't a product of poor defense (or at least no more so than every other CIN pitcher had to deal with). But I'm not sure how much significance we can apply to a single season. He was .011 above his norm which comes out to ~7 hits above the team norm.

I'd love to see team totals for BABIP versus Harang's totals for BABIP over his entire career averaged out (ie one number for Harang, one number for the team) in order to give us a large sample size from which to draw conclusions.

I think averaging out team BABIP over multiple seasons would add a lot of noise. Also, Harang had limited playing time prior to 2004 so the bulk of the useful data is the 04-06 seasons.

That's really strange...it seems like he has a lot more hits than a pitcher with that type of BABIP should have. Maybe it's just that most other teams have a better BABIP than what we're seeing here.

Just to go into a bit more detail on the formula I use to project hits, it's so straightforward that I don't think there'd be any weird noise there. Deriving the formula to solve for hits in play you get the following:

Hits in Play = (2.82 * IP - K) * BABIP / (1 - BABIP)

The formula's only questionable spot here is the 2.82 number. It's based upon the assumption that about 0.18 outs per inning are made on the basepaths against a pitcher. I've seen numbers either way of that figure by about 0.05 but I doubt that would cause much noise. Beyond that this isn't a theoretical formula...it's a matter of fact. The 2.82 * IP part is giving us our total outs not made on the basepaths. Subtracting the k's gives the total outs in play. Finally, BABIP / (1 - BABIP) gives us the ratio of hits in play to outs in play allowing us to solve for hits in play.
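
As a rough sketch, the formula can be written as a small Python function. The function and parameter names are just illustrative, the .300 BABIP and 2.82 constant are the assumptions described above, and the inputs are Harang's career innings and strikeouts as used in this thread (774.33 IP, 621 K).

```python
def projected_hits_in_play(innings, strikeouts, babip=0.300, outs_per_inning=2.82):
    """Project hits on balls in play from innings pitched and strikeouts.

    outs_per_inning is the assumed number of outs recorded at the plate
    per inning (3 minus roughly 0.18 outs made on the basepaths).
    """
    # Outs at the plate minus strikeouts leaves the outs on balls in play.
    outs_in_play = outs_per_inning * innings - strikeouts
    # BABIP / (1 - BABIP) is the ratio of hits in play to outs in play.
    return outs_in_play * babip / (1.0 - babip)

# Career totals quoted in this thread: 774.33 IP, 621 K.
expected = projected_hits_in_play(774.33, 621)
print(round(expected))  # 670
```

Adding the 95 home runs back onto that figure gives the 765 projected hits mentioned at the top of the thread.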

So assuming the 2.82 outs not on the basepaths per inning we can plug in Harang's career hits in play, innings and k's to solve for BABIP. Here it is:
722 = (2.82 * 774.33 - 621) * BABIP / (1 - BABIP)
0.462 = BABIP / (1 - BABIP)
0.462 - 0.462 * BABIP = BABIP
0.462 = 1.462 * BABIP
BABIP = 0.316

I'm honestly at a loss here as to what the explanation would be for the discrepancy between the formula I'm using here which shows him in the neighborhood of a 0.316 BABIP (which honestly seems about right for the high number of hits he gives up) and the more reasonable 0.305 that they have at the web site from which you're pulling the numbers. Not saying the BABIP you've found is wrong but again, the only place for error in the formula I'm using is the 2.82 constant...apart from that it's simply a way of stating a fact.
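
To get a feel for how much the 2.82 constant could matter, here is a quick Python sketch that recomputes the implied BABIP with the constant moved 0.05 either way, and also backs out what constant would be needed to land on .305. The function names are made up for illustration and the inputs are the career totals used above; treat it as a sanity check under those assumptions, not a definitive answer.

```python
HITS_IN_PLAY = 722   # 817 hits minus 95 home runs
INNINGS = 774.33
STRIKEOUTS = 621

def implied_babip(outs_per_inning):
    """BABIP implied by the hit total for a given outs-per-inning constant."""
    outs_in_play = outs_per_inning * INNINGS - STRIKEOUTS
    return HITS_IN_PLAY / (HITS_IN_PLAY + outs_in_play)

def constant_for_babip(target):
    """Outs-per-inning constant that would make the formula return target."""
    outs_in_play = HITS_IN_PLAY * (1.0 - target) / target
    return (outs_in_play + STRIKEOUTS) / INNINGS

for constant in (2.77, 2.82, 2.87):
    print(f"{constant:.2f} outs/inning -> BABIP {implied_babip(constant):.3f}")

print(f"constant needed for .305: {constant_for_babip(0.305):.2f}")
```

With these inputs the implied BABIP only moves between roughly .311 and .321 across that range, and the constant would have to be about 2.93 to reach .305, so a 0.05 swing in the constant does not close the gap by itself.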

Well the numbers I have are using the standard BABIP formula. What do you mean by outs made on the base paths?

What do you mean by outs made on the base paths?

Outs made by runners trying to advance (1st to 3rd, scoring from 2nd on a single) and caught stealing. Double plays apply as well, since they're one at bat but lead to multiple outs. Basically you have to deduct all of those from the total number of outs because they would skew the batting average numbers...they're not at bats, nor should they count as at bats. So the number of outs recorded at the plate is going to be somewhat less than 3 per inning...research has found that number to be in the neighborhood of 2.82 per inning. Subtract the strikeouts from that and you have the outs in play.

I don't understand how base running outs have relevance to the rate of hits on balls in play?

I don't understand how base running outs have relevance to the rate of hits on balls in play?

They don't directly. However, in the absence of the true number of outs in play (something not all sites provide), we can approximate it: take the total outs (3 * IP) less the outs on the basepaths (about 0.18 per inning) to get roughly 2.82 * IP outs determined at the plate. Subtract the strikeouts and that gives you the outs in play. From there it's easy to find BABIP: use total hits minus home runs for the hits in play, then BABIP = hits in play / (hits in play + outs in play).
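
Those steps can be sketched in Python straight from the basic counting stats. The function name is made up for illustration, and the 2.82 default is the same approximation discussed above rather than a measured value for any particular pitcher.

```python
def estimate_babip(hits, home_runs, innings, strikeouts, outs_per_inning=2.82):
    """Approximate BABIP when the true outs-in-play count is unavailable."""
    hits_in_play = hits - home_runs
    # Approximate outs at the plate, then remove strikeouts.
    outs_in_play = outs_per_inning * innings - strikeouts
    return hits_in_play / (hits_in_play + outs_in_play)

# Harang's career totals as quoted in this thread.
print(round(estimate_babip(817, 95, 774.33, 621), 3))  # 0.316
```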