There has been a rash of 50-state polls coming out recently: Survey Monkey (with the Washington Post), Ipsos (with Reuters), and Morning Consult all released polls within the week. For the sake of this article, think of polling as a two-step process: researchers collect responses from a sample of the population and then analyze that data. There are trade-offs and benefits to collecting samples across the country at one time, but the key reason to do a single 50-state poll is that data analysis has evolved to make it very accurate and cost-effective to analyze all 50 states at the same time.
The Democratic and Republican parties are selling products in all 50 states, so it is natural and necessary for them (and the media that follows the industry) to get regularly updated market intelligence in order to efficiently allocate their resources. Nate Silver, of FiveThirtyEight, posed the question of why researchers would attempt to determine state-by-state forecasts from a national sample, rather than run 50 separate samples. He concludes that the data collection is less accurate, which is true, but that ignores two key things. First, for the same overall number of respondents, the data collection for a big national sample is much, much, much cheaper than for 50 state-level polls. Second, with the right analytics, national samples may actually make more accurate forecasts for the 50 states than 50 state polls.
Using traditional analytics, this question of sample design is a simple cost-accuracy trade-off.
Cost: The cost depends heavily on the nature of the sample, but it is much cheaper to conduct a survey that is randomly (or stratified) representative of the nation than of each of the states. It can cost anywhere from 5 to 10 times as much money to interview the same number of people in 50 state polls as in one national poll.
Accuracy: A researcher conducting a national poll is considering the sample’s relationship to the national voting population, not the voting population of each of the 50 states, so for any given state the answer is further from the truth than if the sample had been created for that state. If the researcher is going to do raking, the traditional poll analytics of weighting the actual sample to match the marginal demographics of each state, then a more biased sample is going to make the raking do more work. For some of the states, the balance will be so bad that the weights on some respondents will become dangerously large. And the answer will be much less accurate than one from a sample stratified for each of the 50 states in the first place.
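To make the raking step concrete, here is a minimal sketch of iterative proportional fitting, the algorithm behind raking, on an invented two-way demographic table. All counts and targets are illustrative, not from any real poll:

```python
# A toy raking (iterative proportional fitting) example: adjust respondent
# weights until the weighted sample matches the population's marginal totals.
import numpy as np

# Toy sample: respondent counts by (age group x gender).
sample = np.array([[30., 10.],
                   [20., 40.]])

# Invented population marginal targets for one state.
row_targets = np.array([60., 40.])   # age groups
col_targets = np.array([55., 45.])   # genders

weights = np.ones_like(sample)
for _ in range(100):
    # Scale weights so weighted row sums match the age targets.
    row_sums = (weights * sample).sum(axis=1)
    weights *= (row_targets / row_sums)[:, None]
    # Scale weights so weighted column sums match the gender targets.
    col_sums = (weights * sample).sum(axis=0)
    weights *= (col_targets / col_sums)[None, :]

weighted = weights * sample
print(weighted.sum(axis=1))  # approaches [60, 40]
print(weighted.sum(axis=0))  # approaches [55, 45]
```

The failure mode described above shows up here directly: the further the sample's margins are from a state's targets, the more extreme the individual weights become.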
But, while ignoring the cost, Silver also overestimates the loss of accuracy. He computed the accuracy of the 50,000-respondent CCES poll in 2012. This poll is a national sample, and Silver wanted to show that a national sample, with weights raked towards each state, was not that accurate. He did the work himself and came up with a mean absolute error of 7.3 percentage points. I am not sure what he did, but the CCES itself reports a very respectable state-by-state root mean square error of 3.3 percentage points.
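Note that the two figures are also different metrics. A quick sketch of how each is computed over a set of hypothetical state-level forecast errors (the numbers are invented):

```python
# Mean absolute error vs. root mean square error over toy state-level errors.
import numpy as np

errors = np.array([1.5, -2.0, 4.0, -0.5, 3.0])  # forecast minus outcome, in points

mae = np.abs(errors).mean()                  # mean absolute error
rmse = np.sqrt((errors ** 2).mean())         # root mean square error

print(round(mae, 2), round(rmse, 2))         # RMSE is never below MAE
```

Because RMSE penalizes large misses more heavily, it is always at least as large as MAE on the same errors, which makes the 3.3-point RMSE versus 7.3-point MAE gap even starker.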
And that 3.3 percentage point error comes from a sub-optimal analytical method for getting state-by-state results out of a national sample; the more biased the sample, the more multilevel regression and post-stratification (MRP) outperforms raking. MRP makes a very elegant assumption: you can learn something about the sentiment of white men from Kentucky by looking at white men from West Virginia. Or, more generally, you can learn something about the sentiment of any person by considering, independently, all of the demographics that define that person: age, gender, race, education, party identification, and, of course, geography.
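The borrowing-strength idea can be sketched with a toy partial-pooling calculation. This is simple shrinkage toward a shared mean, not a full multilevel regression, and all numbers (including the prior strength) are invented:

```python
# Toy partial pooling: small per-state samples of one demographic group
# borrow strength from the group's overall mean across states.
import numpy as np

# Hypothetical responses (1 = supports candidate) from tiny state samples.
responses = {
    "KY": np.array([1, 1, 0, 1, 1]),           # 5 respondents
    "WV": np.array([1, 0, 1, 1, 1, 1, 0, 1]),  # 8 respondents
}

overall_mean = np.concatenate(list(responses.values())).mean()
prior_strength = 10  # pseudo-respondents; an assumed tuning choice

pooled = {}
for state, y in responses.items():
    # Shrink the noisy state mean toward the overall mean, more when n is small.
    pooled[state] = (y.sum() + prior_strength * overall_mean) / (len(y) + prior_strength)

print(pooled)
```

Each state's estimate lands between its own raw mean and the shared mean, with the smallest samples pulled hardest toward the shared mean, which is how a handful of Kentucky respondents can still yield a stable estimate.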
MRP is how I, along with Gelman, Goel, and Rivers, took opt-in polling data of about 15,000 respondents per day from Xbox and created forecasts for each state, each day, in 2012. MRP is how Morning Consult took only 18,000 respondents and created accurate state-by-state predictions, whereas Survey Monkey took 75,000 respondents to do the same thing with raking. Each voter in each state in the Morning Consult poll is estimated with all 18,000 respondents, while Survey Monkey isolates the results for each state to that state alone, dramatically reducing the effective sample size. MRP is basically like raking, but it weights all respondents on the full interaction of their demographics, rather than just their marginal demographics.
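The post-stratification half of MRP is just a population-weighted average of modelled estimates for full demographic cells. A minimal sketch, with invented cell names, support rates, and population counts:

```python
# Toy post-stratification: a state's estimate is the population-weighted
# average of modelled support in each full demographic cell.
cells = [
    # (demographic cell, modelled support, population count in the state)
    ("white men 18-29",   0.42, 120_000),
    ("white men 30-64",   0.55, 310_000),
    ("white women 18-29", 0.61,  90_000),
    ("white women 30-64", 0.35, 180_000),
]

total = sum(n for _, _, n in cells)
state_estimate = sum(p * n for _, p, n in cells) / total

print(state_estimate)
```

The cell-level support values would come from the pooled model fit on the whole national sample, while the population counts come from the Census or voter files, which is why every respondent in the poll contributes to every state's estimate.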
MRP further takes advantage of the relative stability of votes in any given year and region to model voter turnout with historical data, as well as the data from a given poll. Polls do two things simultaneously: determine the population of likely voters and the sentiment of that population. Raking starts with the Census definition of the average American adult as its population and determines the population of likely voters exclusively from the single poll, while MRP starts with historical definitions (from exit polls, voter files, etc.) of the average American voter and then lets the poll modify it from there.
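That contrast can be sketched with a toy turnout calculation, using invented numbers and a simple pseudo-count prior rather than any pollster's actual method:

```python
# Toy contrast: turnout estimated from the single poll alone vs. starting
# from a historical turnout rate and letting the new poll move it.
poll_n = 800
poll_likely = 520        # respondents who say they will vote
historical_rate = 0.58   # assumed turnout from similar past elections
prior_strength = 400     # pseudo-respondents worth of historical data

poll_only = poll_likely / poll_n  # raking-style: the poll is everything
blended = (poll_likely + prior_strength * historical_rate) / (poll_n + prior_strength)

print(round(poll_only, 3), round(blended, 3))
```

The blended estimate sits between the historical rate and the poll's own rate, so one noisy poll cannot swing the likely-voter definition nearly as far.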
Silver also makes one other odd error: online polls do not necessarily impute a respondent’s address from their IP address; like most other polls, they also ask for it directly!
In short, Silver’s column does not acknowledge cost as a major driver of simplified data collection, and he is unaware of the advances in statistical modeling that really drive this innovation.