For the past week I've been devouring the new Baseball Prospectus volume. I've never used it as a resource before, but it seems as well-researched and analytical as any available fantasy aid, which makes it ideal for projecting 2006 stats. I especially like the idea that its projections are compiled "scientifically", with many factors mathematically computed to reach the most accurate numbers. It certainly is interesting, and now that I understand some of the alternative stats, there's a lot of information to consider.
However, it seems that the projected stats are almost universally low; many, many players (even ones that have a very positive scouting report underneath) seem to have diminished production projected for the upcoming season.
For example, Dan Haren is hailed as "a breakout candidate." However, his ERA is projected to jump from 3.73 to 3.93, his WHIP from 1.22 to 1.26 and his wins to drop from 14 to 12. Sounds more like regression than breakout to me. John Lackey is also touted as "a sleeper Cy Young candidate", but his win, inning, strikeout, and ERA numbers are all projected to be worse than last year. This doesn't seem to make sense to me.
Other major head-scratchers that I've come across include Johnny Damon projected to score just 93 runs atop the Yankees lineup. It seems to me that he would have to score at least 100 with his eyes closed, considering the boppers behind him.
Hideki Matsui is projected to drive in just 92 runs after posting RBI numbers of 106, 108, and 116 in the past three years?
David Wright is projected to see regression in his runs scored, RBI, and batting average numbers. Same with Jhonny Peralta and Mark Teixeira (who is expected to lose 30 RBIs from last season). That seems odd, considering these guys are all young, getting better, and playing in terrific lineups.
Victor Martinez: Last season: 73 runs, 20 HRs, 80 RBIs, and .305 (despite a horrible start)
vs.
BP projections: 66 runs, 17 HRs, 79 RBIs, and .277
Chase Utley: Last season: 93 runs, 28 HRs, 105 RBIs, and .291 (despite not starting until mid-May)
vs.
BP projections: 82 runs, 24 HRs, 81 RBIs, and .280
Vladimir Guerrero: Last season: 95 runs, 32 HRs, 108 RBIs, and .317 (despite significant health problems)
vs.
BP projections: 84 runs, 29 HRs, 103 RBIs, and .314
I can't think of three players, all of whom had extended struggles for various reasons (slump, platoon status, and injuries respectively), who are more likely to improve on their stats from last year. Yet all of them are projected to go down--in some cases, by a lot. What reason is there to believe that VMart will hit a career low in RBIs this year? Why should Chase Utley lose 24 RBIs, even though--unlike last year--he will be the uncontested starter at 2B from day one?
BP projections are always low. It's much better to look at their breakout % or collapse %, or at what the player comments say. Some players just don't fit well into their system, which is why the system has always been down on VMart.
My apologies. I have a nephew named Anfernee, and I know how mad he gets when I call him Anthony. Almost as mad as I get when I think about the fact that my sister named him Anfernee.
This has been brought up before, but the reason why the projections (at least the counting stats) seem much lower than expectations is that the line that you see in the book is not based on their own subjective prediction for playing time, but likely rather on the playing time of comparable players from the past. The projections online (see the depth charts page) do make an adjustment for expected playing time (and possibly lineup), bumping those numbers up a little. Though generally, the projections do tend to be somewhat more conservative in categories like RBI, W, and L, for obvious reasons.
The Prospectus projections were not designed with fantasy baseball in mind, only being adapted for it after the fact. Still, they are as objective and robust as any you will find on the market currently. It's preferable to putting your faith in three-year averages and idle speculation (especially from amateurish web-sites and the shills that run them). ( I kid, I kid.)
Regarding the "breakout candidate" topic for Haren and Lackey: the "breakout" is based on the previous 3 seasons' worth of stats. So Lackey, for example, had a good year last year, but the comparison is to the average of his last 3 seasons, and PECOTA expects him to be better than that 3-year average. Even if he regresses from last year, his projected numbers are better than his 3-year average.
Also, in the book, their projected playing time (plate appearances/innings) is always lower than in the other projections I look at. The same goes for Runs and RBI. I did some work in Excel in the off-season, looking at correlations for assorted projection sources (Bill James Handbook, Rototimes, Forecaster, ZiPS, and PECOTA), and PECOTA had the lowest correlation when looking specifically at Runs/PA and RBI/PA, while it was 2nd in OBP and SLG (we don't use AVG in the league I'm in). For hitters, their rate stats seem to be useful, but the projected Runs and RBI need to be taken with a grain of salt.
I believe it's always a good idea to look at more than one source and if something just doesn't "feel" right, make your own adjustments.
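One way to make that kind of adjustment is to turn a counting stat into a per-PA rate and rescale it to your own playing-time estimate. This is only a rough sketch: the 81 RBI is the book's Utley line quoted earlier in the thread, but both plate-appearance figures below are made-up placeholders, not numbers from the book, and `rescale_counting_stat` is just a name I picked.

```python
def rescale_counting_stat(projected_total, projected_pa, expected_pa):
    """Rescale a counting stat to a different playing-time estimate.

    Assumes the per-PA rate stays the same when playing time changes.
    """
    rate_per_pa = projected_total / projected_pa
    return rate_per_pa * expected_pa


# 81 RBI is the book projection quoted in this thread.
# 520 and 650 PA are hypothetical values chosen for illustration only.
book_rbi = 81
book_pa = 520       # hypothetical: playing time implied by the book line
my_pa = 650         # hypothetical: your own full-season estimate

adjusted_rbi = rescale_counting_stat(book_rbi, book_pa, my_pa)
print(f"Adjusted RBI: {adjusted_rbi:.1f}")
```

This ignores lineup context entirely, so it only corrects for playing time, not for who is hitting around the player.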
Roger Angell: I was talking with Bob Gibson and I said: 'Are you always this competitive?' He said: 'Oh, I think so. I got a three-year old daughter, and I've played about 500 games of tic-tac-toe with her and she hasn't beat me yet.'
Just to add to this: the rate stats (BA, OBP, SLG) tend to be accurate in BP, but the counting stats tend to be low. If you read the introduction, you see that what they do is somewhat different from what others do. Other projections merely provide a point estimate: Player X will do this. The BP PECOTA system projects the player's distribution of performance. Player X has a 10% chance of doing this or better, a 20% chance of doing this or better, a 30% chance of doing this or better, and so on.
Given that approach, they have to make a choice of what exact estimate to publish in the book. Their choice is to use the median estimate: Player X has a 50% chance of doing this or better. That estimate is a good predictor for rate stats, but not for counting stats. The full details on their projections for a player are on the web site.
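To make the "X% chance of doing this or better" idea concrete, here is a toy sketch. The list of outcomes is entirely invented, and this is not how PECOTA actually builds its distribution; it only shows how a single median line falls out of a set of possible outcomes.

```python
import statistics

# Hypothetical RBI outcomes for one player (made-up numbers, not PECOTA output).
outcomes = [61, 70, 74, 79, 81, 84, 88, 93, 99, 110, 118]


def chance_or_better(outcomes, value):
    """Share of the outcomes that are at least `value`."""
    return sum(1 for x in outcomes if x >= value) / len(outcomes)


# The single line a book would print if it publishes the median outcome.
median_rbi = statistics.median(outcomes)

for value in sorted(outcomes, reverse=True):
    share = chance_or_better(outcomes, value)
    print(f"{share:.0%} chance of {value} RBI or better")

print("Median line:", median_rbi)
```

The point is that the published number is just one slice of the distribution; the higher outcomes are still in there, they just aren't the line you see in the book.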
If all the stats are low at more or less the same rate, then it's still a useful guide to how players will do compared with each other.
Frankly, I don't need a specific player to hit X HRs; I need my team of players to do better as a whole than the other teams of players. As long as I'm grabbing the better players all around, I should be fine.
swyck wrote: If all the stats are low at more or less the same rate, then it's still a useful guide to how players will do compared with each other.
Frankly, I don't need a specific player to hit X HRs; I need my team of players to do better as a whole than the other teams of players. As long as I'm grabbing the better players all around, I should be fine.
This is not true. Players that are outliers in the valuation curve are more valuable or less valuable based on how far they differentiate themselves from the mean. Grouping everyone closer to the mean takes away from this. Do it over 4 cumulative categories used in the standard 5x5 for hitting and you will be way off.
No, you're wrong here. If the counting stats are uniformly lowered by the same factor (say 7/8 to pick a number), a player would be the same number of standard deviations from the mean as he would have been before the scaling.
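A quick check of that claim, as a sketch: the RBI totals below are arbitrary illustrative numbers, and 7/8 is just the factor picked above. Multiplying every value by the same constant scales both the mean and the standard deviation by that constant, so each player's z-score is unchanged.

```python
import statistics


def z_scores(values):
    """Standard deviations from the mean for each value (population SD)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


# Arbitrary RBI totals for a small player pool (illustrative only).
rbi = [116, 108, 105, 92, 80, 73, 66]

# Deflate every projection by the same factor.
deflated = [v * 7 / 8 for v in rbi]

for before, after in zip(z_scores(rbi), z_scores(deflated)):
    print(f"{before:+.3f}  {after:+.3f}")
```

Note this only holds if the deflation really is the same factor for everyone. The raw gaps between players do shrink by 7/8, and if some players are lowered more than others (for example through different playing-time assumptions), the z-scores will move.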