The Analytics Gap

By Rado Kotorov | November 14, 2016

The 2016 U.S. presidential election was notable for many reasons, including the fact that the outcome upset the forecasts of all major polls. As the New York Times’ Steve Lohr and Natasha Singer put it in a recent article, “It was a rough night for number crunchers.”

Their piece examines the challenges of election forecasting and makes an important point that is rarely discussed when we talk about predictive analytics. The authors state: “This week’s failed election predictions suggest that the rush to exploit data may have outstripped the ability to recognize its limits.” This is a critical takeaway, and a risk we in the industry can’t caution against enough.

Predictive analytics takes historical data and extrapolates predictions from it. The premise is that if some pattern of voting has occurred in the past, it is likely to repeat. This was at work throughout the election, as analysts compared voting histories with what was actually happening. But the voting history did not reflect the shift under way, and predictive models do very poorly when there are unexpected events.

The accuracy of modeling depends critically on the assumptions made by the analysts, including the general assumption that the past is a good predictor of the future. It is this general assumption that trips up analysts most often. They are so immersed in the data that they fail to see signs of a significant shift for which we do not yet have data. Adding to this challenge, analysts are often biased toward the data they have, meaning they tend to trust collected data more than circumstantial evidence.

For example, in the run-up to the election, some journalists traveling with Hillary Clinton’s campaign pointed out that there weren’t many lawn signs in support of the candidate in rural America. Yet no one saw that as a signal that something was changing. Pollsters and others fell for the same fallacy, that the data at hand cannot be wrong, illustrating how this bias can affect outcomes.

Predictive analytics is an incredibly powerful tool but, as with all things, it has its limits. That’s certainly one thing this historic presidential race has taught us, and it will be interesting to see how political pollsters adapt their approach moving forward.