Hi, guys. I'm one of the "inputs" at http://www.scoutingbook.com. Omaha-person there dropped us an e-mail with the questions above (thanks). I'm one of the people who write a lot of the "soft" info, distilling the databases into a pithy (I hope) paragraph or two that's easier to remember.
We should probably explain it better somewhere on the home page, but the gist of the site is this: there are six of us, three working in real-world baseball jobs (hence the pseudonyms) and three others who are hardcore fans with tech/numbers addictions. The site came about because we realized we were all keeping our own lists of up-and-coming prospects and spent a lot of time comparing them and discussing the kids, either because it was part of our day job, or because we were busy messing with our own fantasy teams and leagues. Our get-togethers, usually in the form of well-lubricated all-night arguments, often boil down to the data of numbers-folks versus the opinions of scouting-guts: some of my fondest memories of the last decade. We have respect for each other's methods, and ScoutingBook.com is an expression of very different people trying to strike a balance.
Because of that last point, our rankings are half-weighted toward "useful for fantasy purposes", which is why the best middle reliever in the world, an innings-eating starter who's not a strikeout machine, or the best-fielding shortstop of the last 100 years won't be ranked as highly as they might be in the real world. We've entertained the idea of calculating and displaying both rankings side-by-side ("#1 LHP for fantasy, #7 for real-world") but for now they're blended, and if there's a huge difference, we mention it in the capsule.
Our Scouting Book "Combine" ranking is produced by a formula incorporating all our own "personal" rankings and inputs, as well as the ranks from other more-public prospect lists, all of these weighted differently. The rankings come about as a side-effect, in other words, of our other work(s). The weighting and mechanics of the formula are something we're always tinkering with (this is our laboratory, not our thesis), and we look back at past seasons to see how we're doing... a lot. But the theory at work is that by combining all the inputs in the right way, we'll end up with data that might be more useful to us. It's a sort of group notebook that we're always updating. Obviously, it gets more activity and updating in spring than in winter, but the rankings are recalculated automatically every day, even if there's no obvious change that shows.
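To make the "weighted inputs" idea concrete, here is a minimal sketch of blending several rank lists into one. It is an illustration only: the source names, the weights, the player names, and the choice of a plain weighted average are all assumptions made up for the example, not the actual Combine formula.

```python
from typing import Dict, List


def combine_rankings(
    sources: Dict[str, Dict[str, int]],
    weights: Dict[str, float],
) -> List[str]:
    """Blend several rank lists into one ordering (hypothetical method).

    sources maps a source name to {player: rank}, where rank 1 is best.
    weights maps the same source names to a relative weight.
    A player missing from a source is simply skipped for that source.
    """
    players = {p for ranks in sources.values() for p in ranks}
    scores: Dict[str, float] = {}
    for player in players:
        weighted_sum = 0.0
        weight_total = 0.0
        for name, ranks in sources.items():
            if player in ranks:
                weighted_sum += weights[name] * ranks[player]
                weight_total += weights[name]
        # Weighted average rank over the sources that listed this player.
        scores[player] = weighted_sum / weight_total
    # Lower average rank is better; ties are broken alphabetically.
    return sorted(scores, key=lambda p: (scores[p], p))


# Made-up inputs: two "personal" lists and one public list.
sources = {
    "scout_a": {"Smith": 1, "Jones": 2, "Lee": 3},
    "stats_b": {"Smith": 2, "Jones": 1, "Lee": 3},
    "public_list": {"Jones": 1, "Smith": 3},
}
weights = {"scout_a": 2.0, "stats_b": 2.0, "public_list": 1.0}

combined = combine_rankings(sources, weights)
```

In this toy version, changing a weight or updating one person's list is enough to reshuffle the combined order, which is the general behavior described above.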
The Matrix page was a bit of a last-minute addition (per reader request) that attempts to provide an "at-a-glance" view of how different prospect ranking publications viewed the same players. We're definitely not "afraid" of comparisons, since respecting others' ranks is part of our whole purpose and process.
Sadly (and I'm not a database person so forgive me if I mangle this), the current setup doesn't allow the accruing historical data to be summarized, which is too bad, because I'd love to see (for example) graphs of how a player has moved up and down over time, or their "acceleration" up the ladder. This is something we've talked about and hope to have implemented for next year, or more realistically the year after. The list of cool new things we want to build exceeds the time we have available.
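As a rough idea of what "movement" and "acceleration" could mean once a rank history exists, here is a small sketch using simple differences between consecutive snapshots. The function name, the snapshot list, and the definitions themselves are assumptions for illustration; nothing like this is on the site today.

```python
from typing import List, Tuple


def rank_changes(history: List[int]) -> Tuple[List[int], List[int]]:
    """Return (velocity, acceleration) for a chronological list of ranks.

    Rank 1 is best, so a positive velocity means the player moved up
    the list between two snapshots. Acceleration is the change in
    velocity from one step to the next.
    """
    velocity = [prev - cur for prev, cur in zip(history, history[1:])]
    acceleration = [b - a for a, b in zip(velocity, velocity[1:])]
    return velocity, acceleration


# Hypothetical snapshots of one player's rank over four dates.
history = [40, 35, 25, 10]
velocity, acceleration = rank_changes(history)
```

Here the player climbs 5, then 10, then 15 spots, so the climb is speeding up by 5 spots per snapshot.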
Our Combine rankings are updated nightly by a scripted formula, and the whole database is re-scored a few minutes later. If a player has been upgraded or downgraded enough, by enough of us, to register, they move up or down. The weights from other sources (SI's rankings, or BA's) are fixed inputs: they're not updated after publishing unless the originator republishes the information (BA recently did this with a "Midseason Top 50"). We count on heads-ups from the publishers or avid readers to let us know when new "comparative" data is available and we add it in as needed, but the comparison data shown in the Matrix is almost always the least up-to-date info on the site, since most of it comes from places that publish annually.
Dating the rankings wouldn't help much, since they'd always be dated "midnight last night", whether appropriate or not. They're never perfect or frozen, and a player we have ranked #1 on April 30 could be #4 on May 10th without obvious explanation, sorry. We're working on more ways to show the activity/process in the future. Right now, many steps in the chain are magical black boxes.
I hope this scrambled overview helps you guys understand what we are, and what we're not. Scouting Book is a pretty different type of undertaking than the usual book or website that tries to capture a single moment in time, and we're still wrestling with making the most of that beast without spewing confusion.
We'll have a lot of new tools and data available, along with new ways to view and use it, for next spring... some of these from our own wish lists and some from helpful reader suggestions. In the meantime, the machine keeps on grinding, and our numbers on-site change whenever enough pebbles of data have moved. Expect, for example, a lot of the 2008 draftees to move around in the next couple of months, as some return to school and others get tested in rookie / instructional ball.
I'll try to pop back here later to see if there are any other questions!