I was curious about the actors nominated for ‘Best Actor in a Leading Role’ at the Oscars, the premier international awards for film: their measurable attributes (age, national origin, film credits), the commercial success of the films they starred in, and how these variables might or might not be correlated with ultimately winning the award.
The Oscars are nothing without the self-reverence and controversy they inspire. For example, my favorite performance from 2014, Jake Gyllenhaal in Nightcrawler, was likely snubbed not because of his exemplary depiction of a psychopath, but as a function of the odd voting regime and the film’s overall ‘momentum’. Even once an actor has been nominated, there is substantial cynicism about the underlying biases that seem to dictate winners. For example, I assumed that actors from commercially successful films were disadvantaged, with the Academy (the Oscars’ voting body) treating popularity as lowbrow and thus unworthy of recognition for acting excellence.
Who gets nominated?
I created a dataset of the Best Actor nominees from 2000-2014, using IMDb, that included the following variables: (1) the actor’s age at the time of the film, (2) the actor’s number of previous feature film acting credits, (3) the film’s most up-to-date gross box office revenue in the United States, adjusted for inflation to 2000 US dollars, (4) whether they won the Oscar, (5) whether they were American, and (6) whether they had previously won an Oscar for acting (supporting or lead). This added up to 75 observational units (five nominees in each of 15 years) and 450 data points.
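To make the shape of the dataset concrete, here is a minimal sketch of one row and of the inflation adjustment. The field names, the `to_2000_dollars` helper, and the price index values are all assumptions made for illustration; the post does not say which price index was used.

```python
from dataclasses import dataclass, fields


@dataclass
class Nominee:
    """One observational unit: the six variables listed above (names are hypothetical)."""
    age: int                   # actor's age at the time of the film
    prior_credits: int         # previous feature film acting credits
    real_gross_usd: float      # US box office, in 2000 dollars
    won: bool                  # won the Oscar for this nomination
    american: bool             # actor is American
    prior_acting_oscar: bool   # previously won an acting Oscar (lead or supporting)


def to_2000_dollars(nominal_usd, index_in_year, index_in_2000):
    """Deflate a nominal amount to 2000 dollars using a price index."""
    return nominal_usd * index_in_2000 / index_in_year


# Hypothetical index values, chosen only to keep the arithmetic easy to follow.
example_real = to_2000_dollars(125.0, index_in_year=125.0, index_in_2000=100.0)

n_nominees = 5 * 15                                  # five nominees per year, 2000-2014
n_data_points = n_nominees * len(fields(Nominee))    # six variables per nominee

print(example_real, n_nominees, n_data_points)
```

With six fields per row, 75 rows give the 450 data points mentioned above.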
The descriptive statistics were as follows:
The winners are shockingly representative of the overall pool of nominees. Immediately, it seemed clear that it would be incredibly unlikely for any of these variables to be a statistically valid indicator of the likelihood of winning the award. While the two groupings had very similar averages, the standard deviations (a measure of how widely observations are spread around the mean) were larger for all nominees than for the winners. For example, the standard deviation of the winners’ ages was 7.6 years versus 12.2 for all nominees, and the standard deviation of box office revenue was $50.5 million for the winners versus $58.8 million for all nominees.
Why does this matter? It might imply that while the Academy is open to nominating a wide range of actors representing a variety of films (a large standard deviation), the winners are more likely to fit an archetype (a small standard deviation). Here is a look at a chart of the inflation-adjusted box office revenue of the winners and those nominated:
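The comparison above boils down to computing a mean and a standard deviation for each group. A minimal sketch follows; the ages are made-up toy values (not the real dataset), and the use of the sample standard deviation is an assumption, since the post does not say which version was used.

```python
from statistics import mean, stdev

# Hypothetical toy ages, NOT the actual nominees.
all_nominee_ages = [26, 32, 38, 41, 45, 47, 52, 60, 68, 77]
winner_ages = [41, 45, 52, 60]


def summarize(values):
    """Return the mean and the sample standard deviation of a list of numbers."""
    return {"mean": mean(values), "sd": stdev(values)}


all_stats = summarize(all_nominee_ages)
winner_stats = summarize(winner_ages)

# Similar means with a smaller spread among winners is the pattern described above.
print("all nominees:", all_stats)
print("winners:     ", winner_stats)
```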
The three highest-grossing films were, in descending order of revenue: Pirates of the Caribbean (Depp, 2003), American Sniper (Cooper, 2014), and Cast Away (Hanks, 2000). The winner from the highest-grossing film was Russell Crowe for Gladiator (2000), which earned $187.7 million at the domestic box office. The lowest-grossing film to earn a Best Actor nomination was A Better Life (Bichir, 2011), which earned a paltry $1.3 million in the U.S. The trend line (the black dashed line cutting through the chart) shows that while there is variation from year to year, revenues have remained fairly constant in real terms. And, as I mentioned before, the winners are more closely clustered around the trend line.
Now for a look at the actors’ ages:
Here, the trend line indicates that the nominees have gotten slightly older over time, despite Eddie Redmayne winning for The Theory of Everything in 2014 at age 32. The oldest winner was Jeff Bridges (Crazy Heart, 2009) at age 60. The youngest nominees were Ryan Gosling (Half Nelson, 2006) and Heath Ledger (Brokeback Mountain, 2005), both at age 26. The oldest nominee was Bruce Dern (Nebraska, 2013) at age 77.
Intriguingly, there also appears to be a relationship between age and box office (I highlighted the three outliers, the highest-grossing films, in red to emphasize their impact on the trend line):
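For readers wondering how a trend line like this can be produced, here is a minimal sketch of an ordinary least squares fit of revenue on year. It assumes the trend line is a straight-line fit, which the post does not state, and the revenue figures are hypothetical toy values in millions of dollars.

```python
def fit_trend(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical toy data, NOT the actual box office figures.
years = [2000, 2001, 2002, 2003, 2004]
real_gross_musd = [60.0, 50.0, 70.0, 55.0, 65.0]

slope, intercept = fit_trend(years, real_gross_musd)
print(f"trend: {slope:.2f} million dollars per year")
```

A slope close to zero relative to the year-to-year swings is what "fairly constant in real terms" would look like.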
What can be said statistically?
As I assumed after examining the similarities between the two groups’ means, none of the variables, even after many attempted transformations, functional form specifications, and regression models, was a statistically significant indicator of winning Best Actor. The model below reflects many failed attempts to find a non-zero relationship, that is, to reject the null hypothesis that a change in the predictor variable is not associated with any change in the dependent variable.
Here the beta on the natural log of box office revenue (I used natural log to make the relationship linear, although you can ignore this for our purposes) is not statistically significant even at the 10% significance level; basically, there is no reason to interpret it. However, I did find some interesting correlations when running a log-linear regression of inflation-adjusted box office revenue on age, nationality, and winning the Oscar. Looking first at a model of log(Box Office) regressed on Age:
At the 10% level, we can reject the null hypothesis that there is no relationship (statistical speak): increasing the age of the lead actor by one year is, on average, associated with a 2.1% decrease in box office revenue. This result should be taken with a grain of salt, however, as the 95% confidence interval does include 0 (although just barely) and it likely suffers from severe omitted variable bias, which would invalidate these results. For example, older lead actors might garner a lower production budget from the studio, which would be correlated with both their age and the film’s box office revenue. Luckily, I have a couple more variables that I can include.
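To see how a coefficient can be significant at the 10% level while its 95% confidence interval still includes 0, here is a minimal sketch. The coefficient of -0.021 is inferred from the 2.1% figure above; the standard error is a hypothetical value chosen for illustration (the post does not report it), and the normal approximation is used in place of the t-distribution.

```python
from statistics import NormalDist


def confidence_interval(estimate, std_error, level):
    """Two-sided confidence interval using the normal approximation."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * std_error
    return estimate - half_width, estimate + half_width


def excludes_zero(interval):
    """True when the whole interval lies on one side of zero."""
    low, high = interval
    return low > 0 or high < 0


beta_age = -0.021   # implied by "a 2.1% decrease" per year of age
se_age = 0.0115     # HYPOTHETICAL standard error, not from the post

ci90 = confidence_interval(beta_age, se_age, 0.90)
ci95 = confidence_interval(beta_age, se_age, 0.95)

print("90% CI:", ci90, "excludes zero:", excludes_zero(ci90))
print("95% CI:", ci95, "excludes zero:", excludes_zero(ci95))
```

With these numbers the 90% interval sits entirely below zero while the 95% interval just barely crosses it, which is the situation described above.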
Could the lead actor being American also be influencing the film’s revenue? Could the relative quality of the performance (winning the Oscar) be influencing revenue?
Controlling for the relative acting performances (whether it won the Oscar) and a possible home country bias (whether the actor is American) does lead to a slightly more accurate model. Age is now statistically significant at the 5% level and its 95% confidence interval no longer includes 0. Not surprisingly, and as we saw from the difference in film revenues earlier, whether the film won for Best Actor is not statistically significant.
However, whether the lead actor is American could potentially be influencing box office revenues: it is statistically significant at the 10% level and suggests that an American lead actor is associated, on average, with a 50% increase in box office revenues (from the descriptive data, the average box office revenue for an American is $60 million versus $54.7 million for a non-American). Again though, I would urge caution, as there is likely serious omitted variable bias in this result. For example, whether the film is produced in America, as opposed to internationally, could be correlated with both whether the lead actor is American and box office revenue.
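A note on reading these percentages: in a log-linear model, a coefficient beta corresponds to an exact change of exp(beta) - 1, and reading beta directly as a percentage is only a good approximation when beta is small. The sketch below illustrates the gap. The post does not report the coefficients themselves, so the values used here are assumptions, and it is not stated whether the 50% figure is the raw coefficient or the converted effect.

```python
import math


def pct_effect(beta):
    """Exact percent change in the outcome implied by a log-linear coefficient."""
    return (math.exp(beta) - 1) * 100


def beta_for_pct(pct):
    """Log-linear coefficient that corresponds to a given exact percent change."""
    return math.log(1 + pct / 100)


# Small coefficient (age): the approximation and the exact figure nearly agree.
age_effect = pct_effect(-0.021)

# Large coefficient (a dummy such as "American"): the two readings diverge.
effect_if_beta_is_half = pct_effect(0.5)    # if 0.5 were the raw coefficient
beta_if_effect_is_50 = beta_for_pct(50)     # if 50% were the exact effect

print(f"age: {age_effect:.2f}% per year")
print(f"beta = 0.5 implies {effect_if_beta_is_half:.1f}%")
print(f"a 50% effect implies beta = {beta_if_effect_is_50:.3f}")
```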