Oscar Analysis: Can Social and Box Office Data Predict the Best Picture Winner?

By Dan Grady | February 27, 2014
in Business Analytics, Predictive Analytics, Query and Analysis, Sentiment Analysis, Social Media

Many of us will be watching the Oscars this weekend, rooting for our favorites – or whomever we picked in the office pool – to walk away with a gold statue. The anticipation of this event got me thinking about the social experience of the movies.

Going to the movies could be considered a very anti-social experience. Whether you go alone or with someone else, after the lights dim and the previews start, you are pretty much forbidden to speak with anyone. However, before the final credit has even rolled, most of us are leaning over to the person next to us, and asking, “What did you think?” The socializing has begun and the debate will continue until Sunday night (and probably after) – when the Academy tells everyone what he or she thought.

Social media has amplified these post-movie debates to a whole new level, so I thought it would be interesting to take a more analytical, data-driven approach to predicting who will win Best Picture this year.

For the data, I used Information Builders’ iWay data access capabilities to pull the Facebook data for each of the nominated films during the first four days of its full release. In most cases, that was Thursday to Sunday, except for a couple of movies that opened during a holiday period. I then enriched that data using WebFOCUS Sentiment Analysis capabilities to generate a numeric score for each post based on its overall tone. This gave me a lot of good data to work with, but social data alone was not going to be enough. Money talks, especially in Hollywood, so I also pulled the 2013 gross dollars for each of the nominated films. Lastly, I integrated the Metascore for each film from IMDb to give me three distinct sources of data.
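To make the combination step concrete, here is a minimal Python sketch of how per-post sentiment scores could be rolled up per film and joined with gross and Metascore. It is not the iWay or WebFOCUS implementation. The names `FilmRecord`, `summarize_sentiment`, and `combine_sources` are hypothetical, the films and numbers are placeholders rather than the real data, and it assumes a score above zero means a positive post and a score below zero a negative one.

```python
from dataclasses import dataclass


@dataclass
class FilmRecord:
    """One row per film, combining the three data sources (hypothetical layout)."""
    title: str
    positive_pct: float
    negative_pct: float
    gross_usd: float
    metascore: int


def summarize_sentiment(scores):
    """Return (% positive, % negative) for a list of per-post numeric scores.

    Assumption: score > 0 is positive, score < 0 is negative, 0 is neutral.
    """
    if not scores:
        return 0.0, 0.0
    total = len(scores)
    positive = sum(1 for s in scores if s > 0)
    negative = sum(1 for s in scores if s < 0)
    return 100.0 * positive / total, 100.0 * negative / total


def combine_sources(post_scores, gross_by_film, metascore_by_film):
    """Join sentiment, gross, and Metascore on film title.

    Films missing from any one source are skipped.
    """
    records = []
    for title, scores in post_scores.items():
        if title not in gross_by_film or title not in metascore_by_film:
            continue
        positive_pct, negative_pct = summarize_sentiment(scores)
        records.append(
            FilmRecord(
                title=title,
                positive_pct=positive_pct,
                negative_pct=negative_pct,
                gross_usd=gross_by_film[title],
                metascore=metascore_by_film[title],
            )
        )
    return records


# Placeholder inputs only; these are NOT the figures from the analysis.
post_scores = {"Film A": [1, 1, 0, -1], "Film B": [1, 0, 0, 0, -1]}
gross_by_film = {"Film A": 100.0, "Film B": 50.0}
metascore_by_film = {"Film A": 80, "Film B": 75}

records = combine_sources(post_scores, gross_by_film, metascore_by_film)
for record in records:
    print(record)
```

Keying every source on the film title keeps the join simple. A real pipeline would need a more robust identifier, since titles can be spelled differently across sources.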

Based on the data, Gravity is a force to be reckoned with:

  • Gravity generated the highest 2013 gross of the nominated films. It had a head start, since it was the first of the movies in general release and played in the most theatres (3,820), but the gap is so significant that I feel comfortable declaring it the winner for this analysis.
  • Gravity had the most activity on its Facebook page for the opening weekend with 2,155 posts and comments. Second place in the social activity race went to The Wolf of Wall Street, with 1,962 posts and comments.
  • More important than overall activity is what is actually being said. If we take a look at the sentiment on each of the pages we find that Gravity scored better than most there as well. Gravity had 50.4 percent of its posts score positively versus only 5.8 percent scoring negatively. 12 Years a Slave had a higher percentage of positive posts at 56.2 percent but a significantly higher negative percentage at 15.3 percent. Those percentages were the highest in both categories, which suggests people had strong feelings about the film.
  • If you look at the most frequently mentioned terms in the posts, you’ll see things like “amazing” and “awesome”, which aren’t overly insightful. But if you drill into the posts that contain the term “amazing”, you’ll find that the most popular terms are “Sandra” and “Bullock”, which bodes well for her personally and for Gravity.
  • Lastly, if we look at the bubble chart that plots Gross $, Facebook Fans, and the Metascore for each film from IMDb’s site, we find Gravity once again in a very strong position. This allows us to start correlating social media’s potential impact on sales. A film wants to be as far to the upper right as possible, indicating a high number of Facebook fans as well as a strong box office gross. (Most of the Metascores used to determine the size of the bubble were about the same, so you won’t see much difference there.)

o    Gravity scored well again here, positioning itself above the average fans-to-gross line.
o    The Wolf of Wall Street had by far the most fans on its Facebook page but was below average in the fans-to-gross analysis.
o    Oddly enough, American Hustle positioned itself right in the middle of both the fans-to-gross bubble chart and the Sentiment Analysis bubble chart.
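One way to read “above” or “below” the average fans-to-gross line is sketched below in Python. This is an assumption about how the line is defined, not a description of the WebFOCUS chart: the sketch treats the line as passing through the origin with a slope equal to total gross divided by total fans across all films, and labels a film “above” when its gross exceeds what that slope predicts for its fan count. The function names and the sample figures are hypothetical placeholders, not the real data.

```python
def average_line_slope(films):
    """Slope of the assumed average line: total gross per Facebook fan.

    `films` maps title -> (fans, gross).
    """
    total_fans = sum(fans for fans, _ in films.values())
    total_gross = sum(gross for _, gross in films.values())
    return total_gross / total_fans


def position_vs_line(films):
    """Label each film 'above', 'below', or 'on' the assumed average line."""
    slope = average_line_slope(films)
    positions = {}
    for title, (fans, gross) in films.items():
        expected_gross = slope * fans
        if gross > expected_gross:
            positions[title] = "above"
        elif gross < expected_gross:
            positions[title] = "below"
        else:
            positions[title] = "on"
    return positions


# Placeholder (fans, gross) pairs; these are NOT the figures from the analysis.
films = {
    "Film A": (1000, 300.0),  # few fans, strong gross
    "Film B": (4000, 400.0),  # many fans, weaker gross per fan
    "Film C": (1000, 100.0),
}

positions = position_vs_line(films)
print(positions)
```

Under this reading, a film with a very large fan count can still land below the line if its gross per fan trails the group, which is the pattern described for The Wolf of Wall Street.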

While this is an interesting exercise and each piece of the analysis has value, that last visualization matters most when we start applying these capabilities to our own organizations. Why? It involves integrating social data with other enterprise data to look for correlations and business impact. Bridging that gap between social data and business data is one of the many benefits Information Builders’ customers are getting from our platform-based approach.

Does all of this number crunching give us a clear-cut winner for Best Picture on Sunday night?

Maybe. Check back on Monday to see where the actual winner placed in our analysis.