If pollsters want us to trust their results they need to lose the secrecy

https://www.theguardian.com/commentisfree/2019/dec/05/pollsters-results-secrecy-statisticians-surveys

Poll analysis should be accessible, intelligible, usable and assessable, says statistician Teddy Groves

Election campaigns are always full of statistical discord, but this one has been particularly acrimonious. Misleading bar charts, selective publication, dubious tactical voting recommendations and outright fake research are all in the news, and the mantra “trust no poll” is resounding on social media. Questions are understandably being asked about the contributions of the media, political parties and electoral regulators to the poor level of public discussion of public opinion. But I think the polling industry also has questions to answer.

The best set of principles for judging whether statistics are being communicated in a trustworthy fashion comes from the philosopher Onora O’Neill. She says that statisticians should make their work accessible, intelligible, usable and assessable.

How does the polling industry measure up to these standards? YouGov’s recent MRP poll (the initials stand for multilevel regression and post-stratification) can serve as a high-water mark. As a piece of statistics communication it surpasses the rest of the polling published so far in this campaign: it comes with a polished website, accessibly formatted results and a detailed document explaining its methodology. Yet according to O’Neill’s criteria, even this analysis falls short.

YouGov may have released constituency-level results, but other, more important, parts of its analysis remain inaccessible. The company has not shared the data that went into its analysis, or the code that implements it, so it is impossible to check it for errors or compare YouGov’s approach against alternatives.

The explanatory material on the website is helpful, but it isn’t enough to make YouGov’s analysis acceptably intelligible. It doesn’t spell out exactly how the models use demographic traits to predict voting behaviour – the crucial multilevel regression step – or what assumptions are made about how many people have each combination of those traits in order to carry out the post-stratification.
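
To make that concrete, here is a minimal sketch of the post-stratification step, with invented numbers and categories rather than YouGov’s actual model or data: a model’s estimated support within each demographic “cell” is weighted by an assumed count of how many voters in a constituency belong to that cell.

```python
# Minimal post-stratification sketch. All numbers and categories are
# invented for illustration; this is not YouGov's model or data.

# Hypothetical model output: estimated support for a party within each
# demographic cell (here, age band x education).
cell_support = {
    ("18-34", "degree"): 0.55,
    ("18-34", "no degree"): 0.40,
    ("35+", "degree"): 0.45,
    ("35+", "no degree"): 0.30,
}

# Hypothetical counts of voters in each cell for one constituency --
# exactly the kind of assumption the methodology document leaves unstated.
cell_counts = {
    ("18-34", "degree"): 8_000,
    ("18-34", "no degree"): 12_000,
    ("35+", "degree"): 15_000,
    ("35+", "no degree"): 25_000,
}

total_voters = sum(cell_counts.values())

# Post-stratify: weight each cell's estimated support by its share of
# the constituency's electorate.
support = sum(
    cell_support[cell] * count / total_voters
    for cell, count in cell_counts.items()
)
print(f"Estimated constituency support: {support:.1%}")  # 39.1%
```

Change the assumed cell counts and the constituency estimate moves with them, which is why those assumptions need to be published.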

Absolute clarity is important here because, like other statistical techniques, MRP is highly sensitive to modelling choices for which there is no obvious right answer. Before the 2016 US election the New York Times asked four respected teams of pollsters to predict the Trump-Clinton spread in Florida based on the same opinion poll. The results ranged from Trump ahead by one point to Clinton ahead by four. None of the analyses was bad; the teams just made different judgment calls. So to understand any opinion poll analysis, we need to know exactly what choices the pollsters made at every stage.
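
To see how much room such calls leave, here is an invented illustration of one of them: how strongly to pull a small subgroup’s raw result towards the overall average, a core choice in multilevel regression. Three defensible settings give three different answers.

```python
# Invented illustration of one modelling choice: how far to shrink a
# small subgroup's estimate towards the overall average.

subgroup_n = 10        # only 10 respondents in this subgroup
subgroup_rate = 0.70   # 7 of 10 back the party
overall_rate = 0.45    # the full sample sits at 45%

# prior_strength plays the role of "pseudo-respondents" held at the
# overall rate; larger values pool the subgroup more strongly.
for prior_strength in (0, 20, 100):
    estimate = (subgroup_rate * subgroup_n + overall_rate * prior_strength) / (
        subgroup_n + prior_strength
    )
    print(f"prior strength {prior_strength:>3}: estimate {estimate:.0%}")

# prior strength   0: estimate 70%
# prior strength  20: estimate 53%
# prior strength 100: estimate 47%
```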

How might people use YouGov’s analysis? Well, many people have probably seen it as a way to predict constituency results in the election. Unfortunately, it can’t be used that way: the results are merely snapshots of the week of 20-27 November, when the polling took place, and make no attempt to capture how public opinion will change between then and election day. This is a big problem because there aren’t many sources of information about constituency-level public opinion, so people are bound to be tempted to misuse YouGov’s results. The company may caution against treating its results as forecasts, but that’s no excuse for putting such easily misinterpreted numbers into the public sphere in the first place.

As for O’Neill’s last requirement, YouGov’s analysis is impossible to assess. The natural way to test its model would be to use it to predict unseen survey responses, but with the data and code kept out of reach, no one outside the company can do this.
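
Were the data and code public, that check would be routine. Here is a minimal sketch of it, with hypothetical held-out responses and hypothetical model probabilities standing in for the real thing:

```python
import math

# Hypothetical held-out data: did each respondent back the party (1/0),
# and what probability did the model assign before seeing their answer?
held_out_votes = [1, 0, 1, 1, 0, 0, 1, 0]
predicted_probs = [0.8, 0.3, 0.6, 0.7, 0.2, 0.4, 0.9, 0.1]

# Mean log loss: lower means the model's probabilities matched the
# unseen responses better.
log_loss = -sum(
    vote * math.log(p) + (1 - vote) * math.log(1 - p)
    for vote, p in zip(held_out_votes, predicted_probs)
) / len(held_out_votes)
print(f"Held-out log loss: {log_loss:.3f}")  # roughly 0.299
```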

It’s clear that even this poll fails to meet reasonable standards of trustworthy statistics communication. So what could YouGov and the rest of the polling industry do to be more trustworthy?

The best thing would be for the industry to make all its analysis reproducible by sharing its data and code online, following the standard practices of open-source software development. That would enable any interested person with a computer to run the same analysis. It would then be possible to catch polling firms’ errors by picking through their code, and to assess their models against fresh data. Perhaps members of the public could even build on the polling industry’s work to produce forecasts that people can actually use, taking into account how public opinion changes over time. In short, opinion polling would become more trustworthy.
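
What might that look like? A hypothetical release – the file names below are illustrative, not anything YouGov has published – could be as simple as:

```text
polling-analysis/            # hypothetical repository layout
├── data/
│   ├── responses.csv        # anonymised survey responses
│   └── poststrat_frame.csv  # assumed counts per demographic cell
├── model/                   # the regression and post-stratification code
├── environment.txt          # pinned software versions
└── run_analysis.py          # one command to reproduce the published results
```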

Thanks to modern software, there is no technical barrier to making opinion poll analysis reproducible; only a minimal amount of extra work would be involved. The objection I anticipate from the polling industry is that its analyses depend on secret proprietary data that is simply too valuable to share. Valuable to whom? Mainly to the polling companies themselves: it helps them stand out from their competitors while also providing a handy excuse for ducking some scrutiny. But is that really worth sacrificing our trust in opinion polls?

It’s not surprising the public discussion about opinion polls is in such a bad state, when even the best polls fail to meet reasonable standards of trustworthy statistics communication. The polling industry benefits greatly from the exposure it gains during election campaigns, which amounts to free advertising. In return it should be required to provide reproducible analysis, even if that means sacrificing a little secrecy.

• Teddy Groves is a statistician at the Novo Nordisk Foundation Centre for Biosustainability