It's funny how statistical forecasting is perceived. When a baseball analyst designs an ingenious method to project how the season will play out, all people can talk about is how the system is selling their team short, or how it missed on their team three years ago, so it will again this year. When that same analyst starts blogging about an ingenious method to project election outcomes, all of a sudden he's the greatest thing since sliced bread, despite limited evidence that his work represents any substantial improvement over what we were doing before.
I hope I don't sound too flippant here; I know the vast majority of Americans (unlike me) cared more about who came out on top last November than who came out on top last October. Still, Nate Silver's image makes for an interesting case study. Why is so much praise heaped upon him for his election predictions, which were essentially the same as many other forecasters', while so much scorn meets his accurate but contrarian baseball picks?
Well, the contrarian part is a good place to start: people don't like it when computers buck popular opinion or recent trends. When computers recommended a less aggressive strategy for buying real estate and stocks, people instead looked at recent upward trends and gambled their savings away. When the BCS computers suggested a different national championship matchup than the human polls, the system was changed so that the computers would agree with the pollsters; now the BCS rankings, which were built to reflect a spectrum of opinions, instead simply reinforce the old way of doing things. If the computer says Obama will win by a larger margin than anticipated, that's basically a reinforcement of the status quo; but saying the White Sox will go from first to last is uncouth and should be ridiculed.
Furthermore, there are the results: Silver was "right" about the U.S. election, but his projected MLB standings constantly "miss"--by an average of 6.54 wins per team since 2003! (More on this later.) This might be a popular way of looking at things, but I retort that an election is simply much easier to forecast than a season of Major League Baseball. It's one game instead of 2,430, with far more data to draw on and far fewer sources of variance.
If you're reading this, you can probably name plenty of instances where one at-bat greatly influenced the outcome of a baseball season. Think Bill Mazeroski, Joe Carter, Bobby Thomson, etc. Outside of a bad Kevin Costner movie, can you name one major election that was similarly influenced by one voter? Compared to simulating an entire season of baseball, forecasting an election is more like picking the winner of the Super Bowl...on the day of the game.
Anyway, now that we're six paragraphs in and you haven't stopped reading, on to my thesis. How bad is it to miss by 6.54 games per team? (We're talking mean absolute error, because a typical PECOTA dissenter doesn't know what a standard deviation is.) If you're not familiar with probability and statistics, 6.54 games probably sounds like a lot, but it isn't.
Perhaps the best way of showing this is to look at an ideal league. Our team, the Average Means, is the very definition of league-average: during each plate appearance, each Means hitter winds up with a single 15.6% of the time, a walk 8.5% of the time, etc. Every pitcher gives up 4.32 earned runs and 0.37 unearned runs per game in front of a league-average defense. Furthermore, the Means play such a consistently average schedule that they have exactly a 50% chance of winning every game. If we had to predict the Means' record in the upcoming season, obviously we would tab them to go 81-81, but how often would we be right?
Perhaps surprisingly, the Means would win exactly 81 games just 6.26% of the time; 90% of the time, they would win between 71 and 91 games inclusive. On average, our 81-win forecast would miss by 5.07 games. That's not very far from 6.54--and remember, that's the absolute best we can do with perfect information. In the real world, we have to deal with injuries, trades, and Andruw Jones in Dodger blue. Under those circumstances, an average miss of 6.54 wins is damn good, and the kind of thing I'll gladly take to the bank every year.
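If you want to check my math rather than take my word for it, the Means' win total is just a Binomial(162, 0.5) random variable, and all three numbers above fall out of its exact probability mass function. Here's a quick sketch (my own illustration, not anything PECOTA actually runs):

```python
# The Average Means win each game with probability 0.5, so their season
# win total follows a Binomial(162, 0.5) distribution. Compute the exact
# pmf and read off the three quantities quoted in the article.
from math import comb

N_GAMES = 162
pmf = [comb(N_GAMES, k) / 2**N_GAMES for k in range(N_GAMES + 1)]

p_exactly_81 = pmf[81]                      # chance of going exactly 81-81
p_71_to_91 = sum(pmf[71:92])                # 71 to 91 wins, inclusive
mean_abs_error = sum(abs(k - 81) * p for k, p in enumerate(pmf))

print(f"P(exactly 81 wins)  = {p_exactly_81:.4f}")    # about 0.0626
print(f"P(71 to 91 wins)    = {p_71_to_91:.3f}")      # about 0.90
print(f"mean absolute error = {mean_abs_error:.2f}")  # about 5.07
```

In other words, even an omniscient forecaster predicting a perfect coin-flip team eats a five-win average miss, purely from the binomial noise in a 162-game season.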
Before I go, one more comment. Some White Sox backers are especially angry with PECOTA because it failed to see their 2005 World Series title coming--like every other intelligent analyst on the planet--and has consistently predicted disappointing finishes for them since. This reflects a common error in sports analysis: identifying a team by its uniform rather than its players.
Think back to the 2008 preseason: analysts were touting the Rays as the year's surprise team, but others couldn't get the images of Ryan Rupe and Esteban Yan out of their heads. Nobody was suggesting Rupe and Yan could suddenly turn things around for Tampa; they instead believed that Matt Garza, Carlos Pena and Evan Longoria were good baseball players and thus would make for a good baseball team. Similarly, PECOTA's 2009 White Sox forecast is not an attempt to take away their World Series trophy. Only six players remain from the '05 squad, and that generously counts Jose Contreras, who may not pitch at all this year. Even if Nate Silver completely whiffed on the '05 projections for Jon Garland and Dustin Hermanson, what difference does that make for this year's Pale Hose?
If you don't like the computer forecast for your favorite team, you don't have to believe the results, or even read them at all. Just don't declare that the computer is flat wrong, unless you want to put your money where your mouth is. If you do, great, I could use a new house.
Suggested reading: Randomness in Team Standings Predictions. Since 2003, PECOTA has a standard deviation of 8.67 games against the actual results, versus an ideal of about 6.3 games.
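For the curious, that "ideal" figure of about 6.3 games is just the standard deviation of the same Binomial(162, 0.5) coin-flip season, sqrt(n * p * (1 - p)). A one-line sanity check:

```python
# Standard deviation of wins for a Binomial(162, 0.5) season: the floor
# that even a perfect-information forecaster cannot beat.
from math import sqrt

sd = sqrt(162 * 0.5 * 0.5)
print(f"ideal standard deviation = {sd:.2f} wins")  # about 6.36
```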