Over the summer I got together with my research assistant Deepak Pathak and my colleague Miro Dudik to take a look at four different types of Oscar data: fundamentals, polling, prediction markets, and experts. While there are certainly meaningful things to learn from all of these data sources, properly translated prediction market prices were by far the best data source for creating continuously updating and accurate forecasts across all 24 Oscar categories.
Where did the other data go wrong?
Forecasts created with fundamentals (i.e., box office receipts, number of screens, release dates, other awards shows, etc.) are simply not that accurate across 24 categories. While this data is enticing, because it does so well in sports and politics, when you think hard about it (like we did) you realize the huge problem: this data does not discriminate well between categories like song or makeup. If a movie does well at the box office, is it because people like the song or the makeup? Further, there needs to be a separate model for each of these categories. While the same variables are predictive for any football game or any of the Electoral College elections in a given year, the song and makeup categories each need different variables.
Polling has the potential to create both accurate and timely forecasts, but it requires incentives for frequent responses by high-information users to stay timely, and proper transformation of raw polls into forecasts to be accurate. By definition, real money prediction markets provide the incentives that polls generally lack.
Experts can create something similar to fundamental models, but it is not clear ex-ante which experts are going to perform well, as most of their methodology is opaque.
That leaves prediction market data: the prices at which investors are willing to buy and sell contracts that will be worth either $1 or $0, depending on whether the outcome occurs. The price is highly suggestive of a probability, but it is not one outright; theory, backed by empirical data, allows us to translate raw prediction market prices into well-calibrated probabilities.
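To make that translation concrete, here is a minimal sketch of one debiasing transformation from the prediction market literature, Pr = Φ(α · Φ⁻¹(price)) with α ≈ 1.64, which corrects the favorite-longshot bias by pushing favorites' prices up and longshots' prices down. This is an illustration of the general idea, not necessarily the exact method in our paper, and the value of α here is an assumption carried over from prior election-market work.

```python
from statistics import NormalDist

def price_to_probability(price: float, alpha: float = 1.64) -> float:
    """Translate a raw prediction market price into a probability.

    Applies Pr = Phi(alpha * Phi^-1(price)), where Phi is the standard
    normal CDF. With alpha > 1, favorites (price > 0.5) are pushed
    toward 1 and longshots (price < 0.5) toward 0, correcting the
    well-documented favorite-longshot bias in market prices.
    """
    nd = NormalDist()  # standard normal distribution
    return nd.cdf(alpha * nd.inv_cdf(price))

# A contract trading at 90 cents maps to a probability above 0.90,
# while a 50-cent contract stays at exactly 0.5.
print(price_to_probability(0.90))
print(price_to_probability(0.50))
```

Note that the transformation is symmetric around 0.5, so a true toss-up is left untouched; only the confident prices get sharpened.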
The paper, which proposes a method derived from our review of these four data types, is currently under academic review, but you can see how its method does in 2014 here (and here and here); I launched the predictions shortly after the nominations and they have been updating every few minutes since then. Here are the current predictions for eight big categories. They are scarily confident, with the favorites averaging 84%.
Best Picture: 12 Years a Slave is at 84%, which is a strong lead over Gravity at 14%.
Best Director: Gravity’s Alfonso Cuarón at 98% is dominating this race, with 12 Years a Slave’s Steve McQueen at just 2%.
Best Leading Actor: Dallas Buyers Club’s Matthew McConaughey at 90% is leading big over The Wolf of Wall Street’s Leonardo DiCaprio at 7%.
Best Leading Actress: Blue Jasmine’s Cate Blanchett at 98% is a massive favorite over American Hustle’s Amy Adams at 2%.
Best Supporting Actor: Dallas Buyers Club’s Jared Leto at 97% is also a strong favorite over Captain Phillips’s Barkhad Abdi at 2%.
Best Supporting Actress: In one of the tightest categories I have, 12 Years a Slave’s Lupita Nyong’o at 62% is leading American Hustle’s Jennifer Lawrence at 37%.
Best Adapted Screenplay: 12 Years a Slave at 88% is a strong favorite over Philomena at 7%.
Best Original Screenplay: In the only real toss-up of the top categories, I have Her at 53% to American Hustle at 42%.
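One housekeeping step behind numbers like these: the contracts within a category are mutually exclusive, but translated prices rarely sum to exactly 100%, so a final renormalization turns them into a proper probability distribution. A simple proportional rescaling (an assumption about the pipeline for illustration, not necessarily the paper's exact method) looks like this:

```python
def normalize(probs: dict[str, float]) -> dict[str, float]:
    """Rescale a category's probabilities so they sum to exactly 1."""
    total = sum(probs.values())
    return {name: p / total for name, p in probs.items()}

# Illustrative Best Picture numbers; "Other" pools the remaining nominees.
best_picture = {"12 Years a Slave": 0.84, "Gravity": 0.14, "Other": 0.05}
print(normalize(best_picture))
```

The rescaling preserves the ordering of the nominees; it only removes the small overround that market prices carry.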
As a disclaimer, I have only seen two of the movies mentioned in all of these categories, but I am not saying which ones!