Abstract
Political commentators have offered evidence that the “polling misses” of 2016 were caused by a number of factors. This project focuses on one explanation: that likely-voter models, the tools preelection pollsters use to predict which survey respondents are most likely to make up the electorate and thus whose responses should be used to calculate election predictions, were flawed. Models employed by different pollsters vary widely, but they are difficult to study systematically because they are often treated as part of pollsters’ methodological black box. In this study, we use Cooperative Congressional Election Study (CCES) surveys conducted since 2008 to build a probabilistic likely-voter model that takes into account not only respondents’ stated intentions to vote, but also demographic variables that are consistently strong predictors of both turnout and overreporting. This model, which we term the Perry-Gallup and Demographics (PGaD) approach, shows that the bias and error created by likely-voter models can be reduced to negligible levels. Because the approach uses variables that pollsters already collect for weighting purposes, it should be relatively easy to implement in future elections.
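
To make the idea of a probabilistic likely-voter model concrete, the following is a minimal sketch rather than the specification used in this study. It assumes a logistic form for the turnout probability, and every coefficient, field name, and respondent record in it is hypothetical and chosen only for illustration. Each respondent's candidate preference is weighted by an estimated probability of voting instead of being kept or dropped by a hard cutoff.

```python
import math

# Hypothetical coefficients for illustration only; they are not estimated
# from the CCES or from any other survey.
COEFS = {
    "intercept": -1.0,
    "intends_to_vote": 2.0,      # stated intention to vote (1 = yes)
    "voted_last_election": 1.2,  # self-reported past turnout (1 = yes)
    "college_degree": 0.5,       # example demographic predictor (1 = yes)
}


def turnout_probability(respondent):
    """Return a logistic turnout probability for one respondent."""
    z = COEFS["intercept"]
    for name, coef in COEFS.items():
        if name != "intercept":
            # Missing fields are treated as 0 in this sketch.
            z += coef * respondent.get(name, 0)
    return 1.0 / (1.0 + math.exp(-z))


def weighted_share(respondents, candidate):
    """Return the candidate's share, weighting each respondent by turnout probability."""
    weights = [turnout_probability(r) for r in respondents]
    support = sum(w for w, r in zip(weights, respondents) if r["choice"] == candidate)
    return support / sum(weights)


# Hypothetical respondents, not real survey data.
sample = [
    {"intends_to_vote": 1, "voted_last_election": 1, "college_degree": 1, "choice": "A"},
    {"intends_to_vote": 1, "voted_last_election": 0, "college_degree": 0, "choice": "B"},
    {"intends_to_vote": 0, "voted_last_election": 0, "college_degree": 0, "choice": "B"},
]

print(round(weighted_share(sample, "A"), 3))
```

In this sketch, a respondent who is unlikely to vote still contributes to the estimate, but with a small weight, which is the main difference from a cutoff rule that discards such respondents entirely.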