The book is pretty non-technical, apart from a valiant attempt to explain Bayes' theorem using only high school algebra, but that kind of works, since one of Silver's main points is that the errors people make in prediction are mostly not mathematical errors like miscounting, miscalculating, or even undersampling. The errors are more likely to be overconfidence, failure to consider unfamiliar but possible scenarios, or ignoring inconvenient evidence. Not to mention the biggest category of problem: predictions where being right isn't even important to the person doing the predicting.
This last problem, predictors with the wrong motivation, is most evident in political pundits, who are rewarded for producing surprising or partisan predictions rather than correct ones. But I think a more interesting example from the book is his section on weather forecasting. First of all, the US government weather forecasters are doing a pretty good job, and getting better as computing power improves (though human forecasters improve the computer forecasts significantly, he points out, by correcting for known limitations). The probabilities they report are well calibrated: if they claim there's a 10% chance of rain, it rains about 10% of the time. (Incidentally, he doesn't talk much about the right way to check this sort of prediction, which is something I've been thinking about for a while.) But many people get their forecasts from local weather programs or from for-profit weather forecasting sites. These sources all have access to the US government forecasts, and yet their predictions are substantially less accurate. They will, for example, extend predictions further into the future than the computer models are reliable, just to have something to report, even when the result is no more accurate (and sometimes less) than a Farmer's Almanac. They also misreport their predictions: rounding near-50% predictions to 60% or 40% to look more knowledgeable, or artificially boosting small chances of rain, because they'd rather be wrong by predicting rain that never arrives than by missing rain that does.
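The calibration check I mean can be sketched in a few lines of Python (my own toy illustration, not anything from the book): bucket the forecasts by their stated probability and compare each bucket's claimed chance with how often the event actually happened.

```python
from collections import defaultdict

def calibration_table(forecasts, outcomes):
    """Bucket probability forecasts by stated chance (rounded to one
    decimal) and report the observed frequency in each bucket. A
    well-calibrated forecaster's buckets sit near the diagonal."""
    bins = defaultdict(lambda: [0, 0])  # rounded forecast -> [count, hits]
    for p, happened in zip(forecasts, outcomes):
        b = round(p, 1)
        bins[b][0] += 1
        bins[b][1] += happened
    return {b: hits / n for b, (n, hits) in sorted(bins.items())}

# Toy data: "10% chance of rain" forecasts that verify 10% of the time,
# and "90%" forecasts that verify 90% of the time.
forecasts = [0.1] * 10 + [0.9] * 10
outcomes  = [1] + [0] * 9 + [1] * 9 + [0]
print(calibration_table(forecasts, outcomes))  # {0.1: 0.1, 0.9: 0.9}
```

A forecaster who rounds 50% calls up to 60% would show up here immediately: the 0.6 bucket would verify only about half the time.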
This last kind of misreporting got me thinking, though. I don't think results should be misreported. But the way we react to predictions should depend not just on their likelihood but on the importance of their consequences. If the weather is predicted to have a 1% chance of rain, it's probably not worth cancelling your picnic. But if there's a 1% chance of a terrorist attack at a specific place and time, that's probably worth a lot more effort even though it's not very likely. Which brings me to another chapter I found particularly interesting in the book.
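The picnic-versus-attack comparison is just expected-loss arithmetic; here is a back-of-the-envelope sketch with entirely made-up costs (the function and numbers are mine, not the book's).

```python
def expected_loss(p_event, loss_if_event):
    """Expected loss from ignoring a warning: probability times cost."""
    return p_event * loss_if_event

# Made-up costs on a common scale. Same 1% probability, very
# different stakes, so very different expected losses.
picnic = expected_loss(0.01, 100)        # small: shrug it off
attack = expected_loss(0.01, 1_000_000)  # large: worth serious effort
print(picnic, attack)
```

Roughly speaking, a precaution is worth taking when its cost is below the expected loss it prevents, which is why the same 1% deserves such different reactions.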
Silver talks about the September 11th 2001 terrorist attacks. In fact he talks to Donald Rumsfeld, who popularized the term "unknown unknowns". Possibilities the forecaster simply didn't think of are always going to be a problem for predictions. But Nate Silver does some rough calculations and points out that something like the September 11th attacks should not have been a total surprise. Their scale - the number of people killed - was unprecedented. But the sizes of terrorist attacks, like the sizes of earthquakes, follow a power-law distribution, and the September 11th attacks fall pretty much right on the line. As with earthquakes (which he covers in another chapter), this doesn't predict any individual attack's time and place. But an intelligence agency that was watching the numbers should have had somebody keeping an eye out for attacks on this scale, because it could have expected one every decade or two. As for the details of the attacks, there have been other analyses of the ways they might have been found out ahead of time. The key fact, though, is that these attacks were part of a well-established distribution of attacks. They don't necessarily signal a new era of global terrorism or even a change in terrorist behaviour.
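That kind of back-of-the-envelope reasoning can be sketched too. Under a power law, plotting event size against exceedance rank on log-log axes gives roughly a straight line, and the historical rate of events above a threshold gives a crude return period. The data below is invented purely for illustration.

```python
import math

def log_log_points(sizes):
    """(log10 size, log10 rank) pairs: these fall roughly on a straight
    line if the sizes follow a power law, as earthquake magnitudes do."""
    ranked = sorted(sizes, reverse=True)
    return [(math.log10(s), math.log10(r)) for r, s in enumerate(ranked, 1)]

def return_period(sizes, threshold, years_observed):
    """Crude recurrence estimate: years of data divided by the number of
    events at or above the threshold (assumes the distribution is stable)."""
    n = sum(1 for s in sizes if s >= threshold)
    return math.inf if n == 0 else years_observed / n

# Invented death tolls from 40 years of hypothetical attack data.
tolls = [12, 30, 7, 250, 90, 3000, 45, 18, 600, 5]
print(return_period(tolls, 500, 40))  # two events >= 500 -> once per 20 years
```

The "every decade or two" claim is exactly this kind of estimate: not a prediction of any particular attack, just an expected rate for attacks above a given size.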
Nate Silver also argues strongly that good predictions depend on understanding the underlying mechanisms. Whether it's the weather, where chaos forces you to pour in tremendous extra effort for every extra day of lookahead, or chess, where the limit on your predictions is just computation, knowing the rules by which the process works helps. Earthquakes, for example, are basically not predictable individually: you can say there will probably be a big earthquake in California sometime soon, and you can even predict how often earthquakes of a given size will hit California, but there's no way to tell one day from another. Silver suggests this is because the mechanics - the stresses deep underground, the strength of the rocks - are not accessible to us.
On the other hand, Silver points out the hazard of overfitting - of building a highly elaborate model that fits what is essentially noise. Such a model will fit the existing data quite well, since it was built from that data, but it can be worse than a simpler model when used to predict new data.
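A minimal way to see overfitting in action (my own toy example, nothing from the book): compare a least-squares line with a "model" that simply memorizes the noisy training points. The memorizer is perfect on the data it was built from, and typically worse than the line on fresh data from the same process.

```python
import random

random.seed(0)
truth = lambda x: 2 * x + 1                      # the real underlying process
noisy = lambda x: truth(x) + random.gauss(0, 1)  # observations carry noise

train = [(x, noisy(x)) for x in range(10)]
test  = [(x + 0.5, noisy(x + 0.5)) for x in range(10)]

# Simple model: least-squares line fitted to the training data.
n = len(train)
sx = sum(x for x, _ in train); sy = sum(y for _, y in train)
sxx = sum(x * x for x, _ in train); sxy = sum(x * y for x, y in train)
slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
intercept = (sy - slope * sx) / n
line = lambda x: slope * x + intercept

# "Overfit" model: memorize training points, predict the nearest one's y.
def memorize(x):
    return min(train, key=lambda p: abs(p[0] - x))[1]

def mse(model, data):
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

print(mse(memorize, train), mse(line, train))  # memorizer scores a perfect 0
print(mse(memorize, test), mse(line, test))    # but usually loses on new data
```

The memorizer has fitted the noise itself; the line, by assuming a simple underlying mechanism, generalizes.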
Finally, what about the election? How did Silver do so well? His prediction system is based simply on combining many polls. In principle this should work rather well: people phone up voters, ask them how they're going to vote, and report the answers. Then on election day, people vote, mostly the way they said they would. How hard can it be? Unfortunately, there are myriad ways pollsters can introduce biases and problems, and usually all they report is an uncertainty based on sample size. So Silver's system estimates the reliability and bias of each poll, taking into account how far in advance of the election it was taken, and combines the polls accordingly. I'm not sure exactly how he builds his reliability estimates; from past elections, I would guess, and perhaps by cross-checking polls against each other. But in general, combining independent predictions tends to produce better results than any individual one, provided you can avoid being carried away by a bias they all share. And it seems Silver did that. The book is a decent guide to how.
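I don't know the details of Silver's weighting, but the general principle of pooling independent estimates can be sketched with inverse-variance weighting - a textbook method, not necessarily his. Polls with smaller margins of error get proportionally more weight; the poll data here is invented.

```python
def combine_polls(polls):
    """Pool independent estimates by inverse-variance weighting:
    weight each poll by 1/MoE^2, so tighter polls count for more."""
    weighted = [(share, 1.0 / moe ** 2) for share, moe in polls]
    total = sum(w for _, w in weighted)
    return sum(share * w for share, w in weighted) / total

# Hypothetical polls: (candidate share %, margin of error in points).
polls = [(52.0, 3.0), (49.0, 4.0), (51.0, 2.0)]
print(round(combine_polls(polls), 2))  # 50.97, pulled toward the tightest poll
```

Note what this sketch cannot do: if every pollster shares the same house bias, weighting them against each other just averages the bias in. That shared-bias problem is exactly the failure mode the combining approach has to guard against.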