all of this started because i was dissatisfied with the player rankings on yahoo (our preferred hosting site), which do not take our additional scoring categories into account. it's just common sense that this league would yield substantially different results from traditional 5x5 leagues.
so i made an excel spreadsheet and imported all batters with over 50 abs and all pitchers with over 20 IP for the 2010 season just to keep things relatively simple at the time.
i then weighted every scoring category so that each one carries the same value in a h2h format, since every category win = 1 point and every loss = 0 points come week's end. i feel i have accomplished this by dividing a particular player's total for a scoring category by the total number of occurrences of that category within this particular sample set.
for instance: there were 1198 saves, 1993 holds, and 2410 wins for the entire imported player pool. my reasoning tells me that due to the disparity between these categories, a save will be twice as effective at winning its respective category as a win, due to it being twice as scarce, and subsequently that a relief pitcher with 5 wins and 10 saves will be 39.9% more effective (valuable) than a relief pitcher with 5 wins and 10 holds.
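a minimal sketch of the weighting described above, using only the three totals quoted in this post. the function name and the two reliever stat lines are made up for illustration. note that when the 5 wins are included in both totals the closer's edge works out to roughly 47%, whereas 39.9% corresponds to 1 - 1198/1993, i.e. holds versus saves on their own.

```python
# Pool-wide totals quoted in the post (2010, batters > 50 AB, pitchers > 20 IP).
POOL_TOTALS = {"W": 2410, "SV": 1198, "HLD": 1993}


def scarcity_score(player_stats, pool_totals=POOL_TOTALS):
    """Sum each counting stat as a share of the pool-wide total for that category."""
    return sum(count / pool_totals[cat] for cat, count in player_stats.items())


closer = {"W": 5, "SV": 10}   # hypothetical reliever: 5 wins, 10 saves
setup = {"W": 5, "HLD": 10}   # hypothetical reliever: 5 wins, 10 holds

closer_score = scarcity_score(closer)
setup_score = scarcity_score(setup)
advantage = closer_score / setup_score - 1                   # edge with wins included
hold_discount = 1 - POOL_TOTALS["SV"] / POOL_TOTALS["HLD"]   # holds vs saves alone

print(f"save vs win scarcity: {POOL_TOTALS['W'] / POOL_TOTALS['SV']:.2f}x")
print(f"closer {closer_score:.5f}, setup {setup_score:.5f}, edge {advantage:.1%}")
print(f"a hold is worth {hold_discount:.1%} less than a save")
```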
here is a breakdown of all the scoring categories totals for players meeting the AB/IP minimum
2410 wins, 1198 saves, 33294 strikeouts, 1993 holds, 4.018 average ERA, 1.336 average WHIP, 2.219 average K/BB, 7.139 average K/9, **quality starts: 10076.92 (see note)
for all of the dynamic (rate) categories (ERA, WHIP, K/9, K/BB, AVG) i have taken an IP factor or an AB factor into account. obviously a pitcher with a 1.00 WHIP over 10 innings is not as valuable as a pitcher with a 1.05 WHIP over 200 innings, and this hopefully accounts for that. the factor is that player's innings pitched (or AB) divided by the total for the entire league.
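one possible way to apply such a factor, sketched under assumptions: the post does not say exactly how the IP/AB share is combined with the rate stat, so here the player's edge over the league-average rate is simply multiplied by his share of league volume. the league innings total is a placeholder, not a number from the spreadsheet.

```python
def weighted_rate_edge(player_rate, player_volume, league_rate, league_volume,
                       lower_is_better=True):
    """Scale a player's edge over the league-average rate by his share of league volume."""
    if lower_is_better:
        edge = league_rate - player_rate   # ERA, WHIP: lower is better
    else:
        edge = player_rate - league_rate   # K/9, K/BB, AVG: higher is better
    return edge * (player_volume / league_volume)


LEAGUE_WHIP = 1.336   # average WHIP quoted in the post
LEAGUE_IP = 40000.0   # hypothetical pool-wide innings total (assumption)

small_sample = weighted_rate_edge(1.00, 10, LEAGUE_WHIP, LEAGUE_IP)
workhorse = weighted_rate_edge(1.05, 200, LEAGUE_WHIP, LEAGUE_IP)

print(f"1.00 WHIP over 10 IP:  {small_sample:.6f}")
print(f"1.05 WHIP over 200 IP: {workhorse:.6f}")
```

with this form the 200-inning pitcher scores well ahead of the 10-inning one even though his WHIP is slightly worse, which is the behaviour the IP factor is meant to produce.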
**note on quality starts: because of the criteria-specific nature of this stat, i have converted it into a value representing the likelihood of a QS over a pitcher's season. players with a higher value will have a higher chance of earning a QS. i did this to also allow me to see any potential outliers or abnormally high/low QS totals versus probable expectancy. this value takes into account games started relative to games pitched (to take out the inflated numbers that spot starters would have), ER, and innings pitched. my reasoning: pitchers that have higher innings per start and lower runs per inning will theoretically have more quality starts.
example of arguably the top 5 pitchers for 2011 (in no particular order): felix hernandez computed a value of 134.48 (1st), adam wainwright: 122.47 (2nd), roy halladay: 121.42 (3rd), cliff lee 79.18 (38th), and tim lincecum: 86.41(29th)
number of quality starts, with rank in parentheses: felix hernandez: 30 (1st), adam wainwright: 25 (t-4th), roy halladay: 25 (t-4th), cliff lee: 18 (t-53rd), tim lincecum: 22 (t-18th)
this leads me to be confident in my quality start predictor. the actual purpose of the calculated number is to compare pitchers against one another in an attempt to see who will have a higher effect (adding to the scoring) for this one particular stat, thus making them more valuable.
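a small sketch for spotting the outliers mentioned above, using only the five predictor ranks and actual QS ranks listed in this post (ties are entered at the tied rank). it does not reproduce the predictor formula itself, which is not spelled out here; the function name is made up.

```python
# (pitcher, predictor rank, actual QS rank) as listed in the post.
RANKS = [
    ("felix hernandez", 1, 1),
    ("adam wainwright", 2, 4),
    ("roy halladay", 3, 4),
    ("cliff lee", 38, 53),
    ("tim lincecum", 29, 18),
]


def rank_gaps(rows):
    """Return {name: actual rank - predictor rank}; positive means fewer QS than predicted."""
    return {name: actual - predicted for name, predicted, actual in rows}


gaps = rank_gaps(RANKS)
# Largest absolute gaps first: these are the candidates for a closer look.
for name, gap in sorted(gaps.items(), key=lambda item: abs(item[1]), reverse=True):
    print(f"{name:16s} gap {gap:+d}")
```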
********************************************
interesting observations i have made:
one triple is as valuable as almost 5.35 home runs and one steal is as valuable as 1.55 home runs. one triple is as valuable as almost 23.22 RBI and one steal is as valuable as 6.75 RBI.
so with the same exact logic i have used to compare relievers i have concluded that a player with 10 triples and 34 stolen bases (shane victorino) would be more valuable than a player with 42 Home Runs and 118 RBI (albert pujols) in this simplified comparison.
10 triples = 232.2 RBI and 34 SB = 52.7 HR.
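the simplified comparison can be sketched by converting everything into home-run equivalents with the exchange rates quoted above. the RBI-per-HR rate is derived here from the triple rates (23.22 / 5.35) rather than quoted in the post, and the function name is made up.

```python
# Exchange rates quoted in the post: value of one event expressed in HR or RBI.
HR_PER_TRIPLE = 5.35
HR_PER_STEAL = 1.55
RBI_PER_TRIPLE = 23.22

# Derived, not quoted: how many RBI equal one HR under the same weights.
RBI_PER_HR = RBI_PER_TRIPLE / HR_PER_TRIPLE


def hr_equivalents(triples=0, steals=0, homers=0, rbi=0):
    """Express a simplified stat line in home-run equivalents."""
    return (triples * HR_PER_TRIPLE
            + steals * HR_PER_STEAL
            + homers
            + rbi / RBI_PER_HR)


victorino = hr_equivalents(triples=10, steals=34)
pujols = hr_equivalents(homers=42, rbi=118)

print(f"victorino: {victorino:.1f} HR-equivalents")
print(f"pujols:    {pujols:.1f} HR-equivalents")
```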
now i understand there is A LOT more that goes into calculating a players relative value versus another player, but taken across all 18 scoring categories, last years statistics show me that shane victorino (a total score of 0.04934) was, in fact, more valuable than albert pujols (0.04747), a difference of 3.8%
i understand that a total value calculation cannot be performed to the precision that i would like, because not all of these stats count in every league and some leagues reward certain box score stats more than others, but this should show a general trend.
i have run this for just the standard 5x5 scoring cats, and for last year the best h2h player a team could have had in terms of winning the most points is: heath bell (0.0512 weighted score). highest ranked positional player: juan pierre (but maybe there is validity in this statement, as i was struggling to make the playoffs last year in my league. needed help in stolen bases and walks. added juan pierre and daric barton (both 1 cat guys) and eventually won the league).
is my logic wrong, and if so, where? everything that i (thought i) knew about fantasy baseball goes completely against what i am finding through this trial algorithm.
yes i understand this is borderline perverse but i have a job that allows me ample time on the computer, so i took this up instead of playing games. any critiques or suggestions? attached is my file. i have mocked together some projections for players of interest to see how they compare in some other worksheets within the spreadsheet.
mr anderson... wrote: last years statistics show me that shane victorino (a total score of 0.04934) was, in fact, more valuable than albert pujols (0.04747), a difference of 3.8%
I knew it
That's one hell of an undertaking there, and Welcome to the Cafe
mr anderson... wrote:now i understand there is A LOT more that goes into calculating a players relative value versus another player, but taken across all 18 scoring categories, last years statistics show me that shane victorino (a total score of 0.04934) was, in fact, more valuable than albert pujols (0.04747), a difference of 3.8%
highest ranked positional player: juan pierre (but maybe there is validity in this statement as i was struggling to make the playoffs last year in my league. needed help in stolen bases and walks. added juan pierre and daric barton (both 1 cat guys) and eventually won the league).
These should immediately set off alarms that your system doesn't pass the sniff test and something is incorrect in your methodology.
mr anderson... wrote:so i made an excel spreadsheet and imported all batters with over 50 abs and all pitchers with over 20 IP for the 2010 season just to keep things relatively simple at the time.
for instance: there were 1198 saves, 1993 holds, and 2410 wins for the entire imported player pool.
This seems like the part where you went wrong. Your player pool is way too big for your league. You didn't mention how many teams and roster spots your league has, but unless your league is something crazy like 30 teams with 20 man rosters, you have far too many players to total each stat. The reason Victorino is so valuable in your methodology is because the bottom of your sample (guys with 50-250 AB's who should be ignored) is going to proportionally get a lot fewer 3B than other stats. Similarly, guys who rack up SV & QS will be drastically overvalued.
To use your method, you need to either total your league's actual stats last year or take the top X players at each position (position can even mean hitter vs pitcher) where X=the number of rostered players in your league.
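A rough sketch of the second option, assuming a simple "top X by some ranking" cut. The stat lines are invented purely for illustration, and ranking by AB is only a stand-in for whatever measure you would actually use to decide who gets rostered.

```python
def category_totals(players, categories, pool_size, rank_key):
    """Total each category over only the top `pool_size` players by `rank_key`."""
    pool = sorted(players, key=rank_key, reverse=True)[:pool_size]
    return {cat: sum(p[cat] for p in pool) for cat in categories}


# Hypothetical stat lines, not real players.
players = [
    {"name": "A", "AB": 600, "HR": 30, "3B": 2},
    {"name": "B", "AB": 550, "HR": 12, "3B": 9},
    {"name": "C", "AB": 120, "HR": 4, "3B": 0},
    {"name": "D", "AB": 60, "HR": 2, "3B": 0},
]

everyone = category_totals(players, ["HR", "3B"], len(players), lambda p: p["AB"])
rostered = category_totals(players, ["HR", "3B"], 2, lambda p: p["AB"])

# The part-timers add HR but no triples, so including them inflates HR per triple.
print("all players:", everyone, "HR per 3B:", round(everyone["HR"] / everyone["3B"], 2))
print("rostered:   ", rostered, "HR per 3B:", round(rostered["HR"] / rostered["3B"], 2))
```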
A good starting point is finding the theoretical "replacement player", which is a player whose stats can be had for free: the best player not on any team. Something like .270 average (weighted however you like for ABs), 15 HR, 70 R/RBI, 8 SBs, 5 triples, etc. Those values are completely made up; the real ones need to be calculated from the number of players rostered in your league and from whatever way you choose to determine replacement value. You'll find that this will help your method pass the sniff test. Value is only given to the stats produced above the replacement level.

Maybe assign a nominal value for each category (easy for auction leagues in which actual dollars can be used). Even still, adjustments need to be made. Maybe stolen bases are overvalued because they are available cheap, or saves are undervalued because people spend a lot to get them. So you assign category values. It's a very, very complicated process.

It's fun to do and learn yourself, but if you're simply interested in proper valuation and advice then I would highly recommend Mastersball.com, which not only provides excellent advice, articles, projections, and expert valuations, but also offers (or at least did last year) customized valuations based on your league settings, which are manually performed and presented in excel format. You don't have to use them verbatim, but they really help you understand concepts that aren't so apparent or intuitive. It's 50 bucks for the whole year and I've paid twice (last season and in 2004, both were well worth it). If you're just looking for free advice, I'm sure that google will provide useful information on the topic. But remember: replacement players are the key to the whole process and give the stats a reference point.
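As a rough illustration of the replacement-level idea, here is a minimal sketch. The replacement line reuses the made-up numbers above, the category weights are placeholders (every category set to 1.0), and clamping below-replacement categories to zero is just one reading of "value is only given to the stats produced above the replacement level".

```python
# Made-up replacement line from the post; real values must be calculated per league.
REPLACEMENT = {"HR": 15, "RBI": 70, "SB": 8, "3B": 5}

# Placeholder category weights (assumption): every category counts the same per unit.
WEIGHTS = {"HR": 1.0, "RBI": 1.0, "SB": 1.0, "3B": 1.0}


def value_above_replacement(player, replacement=REPLACEMENT, weights=WEIGHTS):
    """Sum weighted production above replacement; categories at or below it add nothing."""
    return sum(
        weights[cat] * max(player.get(cat, 0) - replacement[cat], 0)
        for cat in weights
    )


# Simplified stat lines from earlier in the thread (only the categories quoted there).
victorino = {"3B": 10, "SB": 34}
pujols = {"HR": 42, "RBI": 118}

print("victorino:", value_above_replacement(victorino))
print("pujols:   ", value_above_replacement(pujols))
```

The ordering here depends entirely on the weights chosen, which is why assigning category values is the hard part.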
mr anderson... wrote:I am in the process of developing a fantasy baseball algorithm specifically designed to aid in draft day selection for my leagues custom format.
strategy wise, I'd punt SB and starting pitching. Just load up on power, and top of the line relievers. you could even go so far as to draft all of your bats first, then grab relievers.
mblax10 wrote: These should immediately set off alarms that your system doesn't pass the sniff test and something is incorrect in your methodology.
my thoughts exactly. thats why i felt i needed to get some outside opinions since my results were just too far removed from traditional lines of thought, and in the thick of fantasy football theres not too many active resources readily available.
mblax10 wrote:This seems like the part where you went wrong. Your player pool is way too big for your league. You didn't mention how many teams and roster spots your league has, but unless your league is something crazy like 30 teams with 20 man rosters, you have far too many players to total each stat. The reason Victorino is so valuable in your methodology is because the bottom of your sample (guys with 50-250 AB's who should be ignored) is going to proportionally get a lot fewer 3B than other stats. Similarly, guys who rack up SV & QS will be drastically overvalued.
the number of teams varies per year but it is a fairly large player pool. last years lineup (it varies slightly year to year to keep things fresh, which is why i was aiming to have this calculator as streamlined as possible)
c 1b 2b 3b ss MI lf rf cf of util util
sp sp sp sp rp rp rp rp p p
8 bench, 2dl
so thats 32 total roster spots (counting dl), with 22 active per day. last year we only had 10 teams (normally around 12-14), so a player pool of 300-320 was being used. all mlb. thats why i wanted to cast a wide net
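the roster arithmetic above as a quick sketch, so the pool size can be recomputed when the number of teams changes. slot names follow the lineup listed above; the function name is made up.

```python
HITTER_SLOTS = ["C", "1B", "2B", "3B", "SS", "MI", "LF", "RF", "CF", "OF", "UTIL", "UTIL"]
PITCHER_SLOTS = ["SP"] * 4 + ["RP"] * 4 + ["P"] * 2
BENCH = 8
DL = 2

ACTIVE = len(HITTER_SLOTS) + len(PITCHER_SLOTS)   # active spots per day
ROSTER = ACTIVE + BENCH + DL                      # total spots per team, counting dl


def rostered_pool(teams):
    """Number of players rostered league-wide if every roster spot is filled."""
    return teams * ROSTER


for teams in (10, 12, 14):
    print(f"{teams} teams -> {rostered_pool(teams)} rostered players")
```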
mblax10 wrote:To use your method, you need to either total your league's actual stats last year or take the top X players at each position (position can even mean hitter vs pitcher) where X=the number of rostered players in your league.
this was where i had planned for my calculator to ultimately end up (input the number of players and positions scored and it would weight it that way based off projected statistics), i was just befuddled by my preliminary results. i raised my filtering criteria to 250 abs, and the results have shown to be even MORE unexpected in the 9x9 format
from the previous example, the new rankings are: albert pujols (0.05618, #6 ranked batter), shane victorino (0.05854, #3 ranked batter [4.0% improvement]). carl crawford dominates at 0.73011. in 5x5 scoring, albert pujols is the 5th ranked offensive player, with victorino being the 13th (juan pierre 1, crawford 2, carlos gonzalez 3, bautista 4).
i will spend the rest of the day (workload permitting) calculating values that would more closely resemble fantasy league totals versus major league totals. the only problem with that is my sample set is low (1 league). are there any resources for what the average yahoo public league accumulated stats are anywhere? i figured this would be the most widely used format for data available (if possible).
i will check out that website too, but i still want to continue my project just because, well, it's pretty fun actually.
mr anderson... wrote:I am in the process of developing a fantasy baseball algorithm specifically designed to aid in draft day selection for my leagues custom format.
strategy wise, I'd punt SB and starting pitching. Just load up on power, and top of the line relievers. you could even go so far as to draft all of your bats first, then grab relievers.
interesting. the conclusion that i was drawing was to focus on top of the line relievers and strong starting pitching, and going for slapstick speedsters and awesome ratio middle relievers in the later rounds. a draft something like this
rounds 1-4: SP; rounds 5-8: top closers; rounds 9-12: the juan pierre, chone figgins, brett gardner, elvis andrus type of players; rounds 13-?: walk heavy players, or something of the like. im far from working out the logistics
but, a lineup based off elite pitching and speed would give you something approaching weekly wins in:
R, 3b, sb, BB, W, Sv, Hld, ERA, WHIP, K/BB, K/9, QS (12 categories). the first place team record wise come playoff time averaged just under 10 wins per week. now theres always going to be waiver wire adds, and maybe i get lucky and get someone that sticks, but a team like this, i feel, would be a strategy worth exploring because it would go so strongly against the grain of traditional strategy that it might just work (but it wouldnt be pretty). im thinking that, partly because of the fan's love of the HR, it carries over into fantasy baseball that we put more stock in a guy like adam dunn (whom you could typically count on for 40 HR, 100 RBI, 80 walks, 80 runs, 2 triples, 30 doubles, .260 average) than a guy like denard span (90 runs, 20 2b, 10 3b, 5 hr, 60 rbi, 65 walks, .280 average). people always refer to position scarcity increasing a players value in fantasy formats (catchers), so why doesnt statistical scarcity likewise increase a players value (triples) in leagues that account for that statistic?