Published in The Monkey Cage. Co-written with Stefan Bauchowitz.
The 2012 presidential election in the United States saw poll aggregators grow in popularity, among them Drew Linzer at Votamatic, Nate Silver at FiveThirtyEight and Sam Wang at the Princeton Election Consortium. While their models differ significantly, they have one thing in common: their accuracy. Despite their different methodologies, Linzer’s, Silver’s and Wang’s forecasts were more accurate than those of any individual pollster. They showed that aggregating polls can be more useful than relying on a single poll to predict the result of an election.
This prompted us, at TresQuintos, to develop a model of our own. We tested it on the 2013 presidential election in Chile, which took place Nov. 17. While we followed the methods of the poll aggregators named above, our model is mainly informed by the statistics and political science literature (e.g. Andrew Gelman and Simon Jackman). Our forecast, too, was more accurate than the prediction of any individual pollster. It also showed that aggregating polls can be more useful than relying on a single poll to predict the result of an election.
The following table shows the result of the election and the final forecast (of a longer series of forecasts) published by TresQuintos, four days before the election. It also shows a selection of pollsters that published their prediction of vote intention for each of the nine presidential hopefuls. All of these polls were fielded at the national level, some with face-to-face interviews and others via telephone. To compare them to the result of the election, we use their latest prediction and consider their valid preferences as the total sample.
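The comparison described above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the helper names (`normalize_valid`, `total_and_average_error`) are hypothetical, error is taken to be the absolute difference in percentage points, and the three candidates and their numbers are invented for the example; they are not the 2013 figures.

```python
def normalize_valid(shares):
    """Rescale candidate shares so they sum to 100, i.e. treat valid
    preferences (excluding undecided and blank answers) as the total sample."""
    total = sum(shares.values())
    return {name: 100.0 * value / total for name, value in shares.items()}


def total_and_average_error(prediction, result):
    """Return the summed and the mean absolute error, in percentage points."""
    errors = [abs(prediction[name] - result[name]) for name in result]
    total = sum(errors)
    return total, total / len(errors)


# Made-up numbers for three hypothetical candidates (NOT real poll data).
# The raw poll leaves 28 percent undecided or blank, so shares sum to 72.
poll_raw = {"A": 36.0, "B": 27.0, "C": 9.0}
result = {"A": 50.0, "B": 35.0, "C": 15.0}

poll_valid = normalize_valid(poll_raw)  # {"A": 50.0, "B": 37.5, "C": 12.5}
total_error, average_error = total_and_average_error(poll_valid, result)
print(f"total error: {total_error:.2f}, average error: {average_error:.2f}")
# prints: total error: 5.00, average error: 1.67
```

Rescaling to valid preferences first matters: a poll that reports a large undecided share would otherwise look too low for every candidate when set against the official result.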
The table shows that TresQuintos had the lowest total and average error. Some critics argue that forecasts made by poll aggregators should not be compared to predictions made by pollsters. We disagree: poll aggregators and pollsters are essentially in competition, since both are trying to get the numbers right. In countries where the media is highly politicized, it is crucial to provide the public with as much unbiased information as possible. This is what Linzer, Silver and Wang did in the United States and what we did in Chile.
Even though the idea of poll aggregation worked for us, we had to deal with a number of issues. We found the US and Chilean political contexts and polling scenes to be significantly different. In the US two-party system, voting intention for each candidate tended to be stable over time, and there were literally thousands of national and state-level polls to confirm trends. In Chile’s multi-party system, independent third-party candidates dominated fluctuations in public opinion, and there were only a handful of polls to keep track of variations.
The stable political system in the US, together with the high frequency of polls, favored a model based on few assumptions. By contrast, Chile’s unstable political system, together with the low frequency of polls, forced us to build a model with additional assumptions. Linzer’s, Silver’s and Wang’s models were more parsimonious than ours. In retrospect, however, we do not see parsimony as an imperative. Instead, we simply understand the difference as a reflection of the advantages and limitations of the different political contexts and polling scenes.
A brief overview of how our model works will help clarify the similarities and differences with other poll aggregators.
At TresQuintos we use a two-stage process to aggregate polls. In the first stage we weight polls according to their accuracy (pollster ranking), their non-forced error (sample size) and their age (distance to the election). The first two factors allow us to calibrate poll predictions, and the third allows us to estimate a time-sensitive margin of error. In the second stage we (re)construct public opinion trends using Bayesian analysis: we assess the likelihood that a newly published poll is accurate given the information gathered from previous polls.
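To make the two stages concrete, here is a minimal sketch for a single candidate. It is not the TresQuintos model, whose exact formulas are not given here. Everything in it is our own assumption for illustration: the `Poll` fields, the accuracy score between 0 and 1, the 30-day half-life that inflates the variance of older polls, and the use of a simple normal prior updated with a normal likelihood. The poll numbers are invented.

```python
import math
from dataclasses import dataclass


@dataclass
class Poll:
    share: float           # candidate's share of valid preferences, in percent
    sample_size: int
    days_before: int       # days between fieldwork and election day
    pollster_score: float  # hypothetical accuracy score in (0, 1]


def poll_variance(poll, half_life_days=30.0):
    """Stage 1: turn accuracy, sample size and age into one variance.
    Larger samples shrink it; older polls and weaker pollsters inflate it."""
    p = poll.share / 100.0
    sampling_var = 100.0 ** 2 * p * (1.0 - p) / poll.sample_size
    age_factor = 2.0 ** (poll.days_before / half_life_days)
    return sampling_var * age_factor / poll.pollster_score


def update(prior_mean, prior_var, poll):
    """Stage 2: normal-normal Bayesian update of the current estimate."""
    obs_var = poll_variance(poll)
    post_var = 1.0 / (1.0 / prior_var + 1.0 / obs_var)
    post_mean = post_var * (prior_mean / prior_var + poll.share / obs_var)
    return post_mean, post_var


def aggregate(polls, prior_mean=50.0, prior_var=400.0):
    """Fold polls in from oldest to newest, starting from a vague prior.
    Returns the posterior mean and a 95 percent interval around it."""
    mean, var = prior_mean, prior_var
    for poll in sorted(polls, key=lambda p: p.days_before, reverse=True):
        mean, var = update(mean, var, poll)
    half_width = 1.96 * math.sqrt(var)
    return mean, (mean - half_width, mean + half_width)


# Invented polls for one hypothetical candidate (NOT real data).
polls = [
    Poll(share=40.0, sample_size=1000, days_before=60, pollster_score=0.8),
    Poll(share=44.0, sample_size=1200, days_before=10, pollster_score=0.9),
]
estimate, interval = aggregate(polls)
```

In this toy version the recent, larger poll from the better-ranked pollster pulls the estimate toward 44 percent. It treats opinion as fixed over the campaign, so the order in which polls arrive does not change the answer; a model that tracks trends over time would need an explicit dynamic component, which is left out here.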
Part of the complexity of our model is given by theoretical factors. Complex models work well in complex systems. We don’t think a model as simple as those that had been successful in the US would have worked well in Chile. The other part of the complexity is given by environmental factors. When pollsters are substantially biased, it is extremely hard for simple aggregation models to get the numbers right. A theoretically complex model combines well with complex environmental settings.
Our forecast was extremely accurate in the point estimate, but fuzzy in the credibility interval. While we got the important numbers right, we had a large margin of error (see the following graph). This was mainly because of the divergent information being fed into the model. For some candidates, polling data were consistent across pollsters. For other candidates, polling data varied significantly. The accurate polls helped identify common point estimates, but the biased polls increased the credibility interval considerably.
To our knowledge, TresQuintos is the first successful poll aggregator in Latin America. The accurate forecast of the 2013 Chilean presidential election offers a first look at a set of tools and a body of literature that remain largely unexplored in the region. The precedent set by Linzer, Silver and Wang is useful to frame poll aggregation, but we strongly believe that more complex models work better in developing democracies. The next step is to calibrate our model to forecast forthcoming elections in Brazil, Colombia and Uruguay.