# Prediction Market Inefficiency

On PredictIt, the price to sell a contract worth \$1.00 if the Democratic candidate wins Pennsylvania, and \$0.00 if s/he loses it, is \$0.79. The price to buy a contract worth \$1.00 if the Democratic candidate wins the election, and \$0.00 if s/he loses it, is \$0.76. In other words, you can sell "Democrat wins Pennsylvania" for \$0.03 more than it costs to buy "Democrat wins the election." This is surprising, because it is highly unlikely that the Democratic candidate loses the election after carrying Pennsylvania. The sell prices for Florida and Ohio are \$0.73 and \$0.71, just \$0.03 and \$0.05 below the buy price for "Democrat wins the election." Basically, the candidate just needs to win one of these three states to win the election.
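To make the arithmetic concrete, here is a minimal sketch of what happens if you sell Pennsylvania at \$0.79 and buy the election at \$0.76 at the same time. It ignores any fees, trading limits, and the mechanics of how a sale is actually executed, and the function name `paired_trade_profit` is just an illustrative label, not anything PredictIt provides.

```python
from decimal import Decimal
from itertools import product

# Prices quoted in the post; Decimal avoids floating-point noise in cents.
SELL_PENNSYLVANIA = Decimal("0.79")
BUY_ELECTION = Decimal("0.76")


def paired_trade_profit(wins_state, wins_election,
                        sell_price=SELL_PENNSYLVANIA, buy_price=BUY_ELECTION):
    """Profit per contract pair from selling the state and buying the election.

    Assumes each contract pays exactly $1.00 or $0.00 and that there are
    no fees (a simplification).
    """
    upfront = sell_price - buy_price          # cash collected when trading
    owed = Decimal(1) if wins_state else Decimal(0)         # payout on the sold contract
    received = Decimal(1) if wins_election else Decimal(0)  # payout on the bought contract
    return upfront - owed + received


# Enumerate all four combinations of outcomes.
for wins_state, wins_election in product([True, False], repeat=2):
    profit = paired_trade_profit(wins_state, wins_election)
    print(f"wins PA={wins_state!s:<5} wins election={wins_election!s:<5} profit={profit}")
```

Under these assumptions the pair loses money in only one scenario: the Democratic candidate carries Pennsylvania and still loses the election.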

PredictWise tries to reflect the most accurate and calibrated forecasts of upcoming events and generally relies on prediction market data. But I have also authored numerous academic papers and many blog posts on when and where markets have inefficiencies. Understanding the inefficiencies allows me to (1) model the current data streams to bring them as close as possible to what I believe are the most accurate and calibrated forecasts and (2) learn and improve on market designs to make better data streams in the future. Here is a new problem that my graduate student Sam Corbett-Davies and I have been tackling: can we determine the efficient relationship between the topline prediction market price for the Democratic candidate to win the election and the prices for the Democratic candidate to win any of the individual states? Well, in 2016, there is no reasonable way to aggregate the state-by-state prices into a probability of victory in the election that gives a value as low as the current topline price.

Beyond providing the state-by-state and topline predictions, each year I include a probability distribution of possible outcomes. That probability distribution is derived by taking the state-by-state predictions, creating a matrix of pairwise correlations between the states, and simulating the election. Two common heuristics for pairwise correlations are to assume 100% correlation and 0% correlation. Both are obviously wrong: we know the states are related, but they are not perfectly related. So we have a few hybrid models that provide a range of possible election predictions from the state-by-state predictions. I will talk about them at length in the coming weeks. All of them are now more than 5 percentage points above the topline prediction market-based forecast. (In this chart I am using the prediction market-based predictions, but it is pretty similar if I use the raw prices.)
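To illustrate how much the correlation assumption matters, here is a rough sketch of one simple way to simulate correlated state outcomes. It is not the PredictWise model: it swaps in a one-factor Gaussian setup where a single hypothetical parameter `rho` (the correlation between latent state-level draws, not between the win/lose outcomes themselves) replaces the full pairwise matrix. The toy inputs treat the three sell prices quoted above as probabilities and assume that winning any one state is enough, both of which are simplifications.

```python
import math
import random
from statistics import NormalDist


def simulate_win_probability(state_probs, electoral_votes, rho,
                             votes_to_win=270, n_sims=50_000, seed=0):
    """Estimate P(candidate reaches votes_to_win) by Monte Carlo.

    rho = 0 makes states independent; rho = 1 makes them move together.
    Each probability must be strictly between 0 and 1.
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must be between 0 and 1")
    rng = random.Random(seed)
    std_normal = NormalDist()
    # A state is won when its latent draw falls below inv_cdf(p),
    # which happens with probability p for a standard normal draw.
    thresholds = [std_normal.inv_cdf(p) for p in state_probs]
    w_common, w_local = math.sqrt(rho), math.sqrt(1.0 - rho)

    victories = 0
    for _ in range(n_sims):
        common = rng.gauss(0.0, 1.0)  # shock shared by every state
        total = 0
        for threshold, votes in zip(thresholds, electoral_votes):
            latent = w_common * common + w_local * rng.gauss(0.0, 1.0)
            if latent < threshold:
                total += votes
        if total >= votes_to_win:
            victories += 1
    return victories / n_sims


# Toy example (assumption): three states, one "vote" each, any one suffices.
toy_probs = [0.79, 0.73, 0.71]
toy_votes = [1, 1, 1]
for rho in (0.0, 0.5, 1.0):
    estimate = simulate_win_probability(toy_probs, toy_votes, rho, votes_to_win=1)
    print(f"rho={rho:.1f}  estimated win probability={estimate:.3f}")
```

In this toy setup, full correlation gives a win probability close to the single best state, while independence pushes it much higher, which is why the choice between the two extremes (or a hybrid) moves the aggregate forecast.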

The topline and state-by-state predictions are highly correlated and pretty close, but not the same. I believe the state-by-state predictions a little more than the topline prediction, because math is hard. It is a serious academic research agenda that allows me to begin to estimate how 51 Electoral College outcomes come together. But it is not an exact science, and my historically derived estimates of correlations between the states may not account for subtle 2016 idiosyncrasies that the market can capture when it prices the topline outcome.

I am going to continue to provide the same topline forecast as I have since 2014, derived from the topline prediction market prices, but I will also regularly derive an overall forecast from the state-by-state values. This is not an exact science, and it is not a huge deal that one prediction says 80% and the other 85%, but I will be following this closely all election cycle.