Parking tickets are irritating. They are especially irritating in Los Angeles, where a failure to properly feed your meter can run you nearly $70, and even more if you forget to pay on time, as I regrettably learned this summer.
To be fair, parking fines play a necessary role in deterring scofflaws, easing mobility, and creating equitable opportunities for commerce in commercially vibrant urban centers. There is an economic cost for businesses and consumers if parking spaces are not optimally utilized (consumers may waste fuel searching for parking or forgo shopping altogether), and this must be safeguarded against. As we learned from Seinfeld, the scarcity of parking can be the source of considerable agony.
Comedy aside, there must be a point where the cost of a parking ticket crosses from a socially just deterrent to a punitive gouge, regressively taxing low-income residents who lack access to off-street parking and can least afford to pay.
A few weeks ago, I posted on how policymakers are increasingly using a "happiness" indicator to inform public policy. Ironically, while writing that piece, I was very unhappy and not as a result of any of the factors traditionally deemed significant. Rather, my unhappiness was singularly due to my neurotic and perpetual search for updates on a pending federal court case related to the eligibility of my favorite NFL quarterback--who, thankfully, has now been vindicated. Despite my own proclivities, sports and their related scandals were not deemed an important driver of happiness by the United Nations (unless being a rabid sports fan is a form of mental illness).
My own sporting biases aside, this got me thinking. How many others around the world are, far more often than not, made miserable by their favorite sports teams?
You can be forgiven for not knowing where Bhutan is. It is a small country (its first-ever census, in 2005, counted fewer than one million inhabitants) sandwiched between China and India and governed by a very poor monarchy. It exports little, has no formal diplomatic relations with the United States, and generally embraces reclusion.
Internationally, Bhutan is known for two things. First, it exiled more than 100,000 of its ethnic-Nepalese minority, whose religious and linguistic views made them unfit for citizenship, to live in refugee camps in Nepal, where they were also not wanted and from which they were subsequently resettled by third parties (more than 66,000 of them in the United States since the mid-2000s). Second, ironically, is the notion of measuring national success not by gross domestic product but by gross national happiness. Measuring wellbeing by contentment rather than income is certainly a noble goal in our gilded age, but how can it be appropriately measured?
Of the actors nominated for ‘Best Actor-leading role’ at the Oscars, the premier international awards for film, I was curious to learn more about their measurable attributes (age, national origin, film credits), the commercial success of the film they starred in, and how these variables might or might not be correlated with ultimately winning the award.
The Oscars are nothing without the self-reverence and controversy they inspire. For example, my favorite performance from 2014, Jake Gyllenhaal in Nightcrawler, was likely snubbed, not as a result of his exemplary psychopathic depiction, but as a function of the odd voting regime and the film’s overall ‘momentum’. Even after receiving a nomination, there is substantial cynicism surrounding the underlying biases that seem to dictate winners. For example, I assumed that actors from commercially successful films were disadvantaged, with the Academy (the Oscars’ voting body) deeming popularity the equivalent of lowbrow and thus unworthy of acting excellence.
Who gets nominated?
I created a dataset of the Best Actor nominees from 2000-2014, using IMDb, that included the following variables: (1) the actor’s age at the time of the film, (2) the actor’s number of previous feature film acting credits, (3) the film’s most up-to-date gross box office revenue in the United States, adjusted for inflation to 2000 US dollars, (4) whether the actor won the Oscar, (5) whether the actor was American, and (6) whether the actor had previously won an acting Oscar (supporting or lead). With five nominees in each of 15 years and six variables per nominee, this added up to 75 observational units and 450 data points.
The descriptive statistics were as follows:
The winners are shockingly representative of the overall pool of nominees. Immediately, it seemed clear that none of these variables was likely to be a statistically valid indicator of the likelihood of winning the award. While the two groups had very similar averages, the standard deviations (a measure of how widely observations are spread around the mean) were larger for all nominees than for the winners. For example, the standard deviation of age was 7.6 years for the winners versus 12.2 for all nominees, and the standard deviation of box office revenue was $50.5 million for the winners versus $58.8 million for all nominees.
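To make the comparison concrete, here is a minimal Python sketch of the calculation. The ages below are invented toy values, not the IMDb data, and the variable names are hypothetical; only the method (the sample standard deviation of each group) mirrors the text.

```python
from statistics import mean, stdev

# Toy ages, invented for illustration only (NOT the real nominee data).
nominee_ages = [26, 31, 38, 44, 45, 47, 52, 60, 71, 77]
# Hypothetical subset standing in for the winners.
winner_ages = [38, 44, 45, 47, 52]


def describe(values):
    """Return (mean, sample standard deviation) for a list of numbers."""
    return mean(values), stdev(values)


nominee_mean, nominee_sd = describe(nominee_ages)
winner_mean, winner_sd = describe(winner_ages)

# Similar means with a smaller spread among winners is the pattern
# described in the paragraph above.
print(f"nominees: mean={nominee_mean:.1f}, sd={nominee_sd:.1f}")
print(f"winners:  mean={winner_mean:.1f}, sd={winner_sd:.1f}")
```

A smaller standard deviation for the winners, with a similar mean, is what "fitting an archetype" would look like in the numbers.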
Why does this matter? It might imply that while the academy is open to nominating a wide range of actors representing a variety of films (a large standard deviation), the winners are more likely to fit an archetype (a small standard deviation). Here is a look at a chart of the inflation adjusted box office revenue of the winners and those nominated:
The three largest revenue films were, in order of highest revenue: Pirates of the Caribbean (Depp, 2003), American Sniper (Cooper, 2014), and Cast Away (Hanks, 2000). The actor from the highest revenue film to win was Russell Crowe for Gladiator (2000), which earned $187.7 million in domestic box office. The lowest grossing film to earn a nomination for best actor was A Better Life (Bichir, 2011), which earned a paltry $1.3 million in the U.S. The trend line—the black dash cutting through the chart— shows that while there is variation from year-to-year, the revenues have remained fairly constant in real terms. And of course, as I mentioned before, the winners are more closely clustered around the trend line.
Now for a look at the actor’s ages:
Here, the trend line indicates that the nominees have gotten slightly older over time, despite Eddie Redmayne winning for The Theory of Everything in 2014 at age 32. The oldest winner was Jeff Bridges (Crazy Heart, 2009) at age 60. The youngest nominees were Ryan Gosling (Half Nelson, 2006) and Heath Ledger (Brokeback Mountain, 2005), both at age 26. The oldest nominee was Bruce Dern (Nebraska, 2013) at age 77.
Intriguingly, there also appears to be a relationship between age and box office (I highlighted the three outliers, highest grossing films, in red to emphasize their impact on the trend line):
What can be said statistically?
As I assumed after examining the similarity of the two groups’ means, none of the variables, after many attempted transformations, functional form specifications, and regression models, was a statistically significant indicator of winning for best actor. The model below reflects many failed attempts to find a non-zero relationship, where non-zero implies that a change in the predictor variable is correlated with a change in the dependent variable.
Here the coefficient on the natural log of box office revenue (I used the natural log to make the relationship linear, although you can ignore this for our purposes) is not statistically significant even at the 10% level; basically, there is no reason to interpret it. However, I did find some interesting correlations when running a log-linear regression of inflation-adjusted box office revenue on age, nationality, and winning the Oscar. Looking first at a model of log(box office) on age:
Rejecting the null hypothesis that there is no relationship at the 10% level (statistical speak), increasing the age of the lead actor by one year is, on average, associated with a 2.1% decrease in box office revenue. This result should be taken with a grain of salt, however, as the 95% confidence interval does include 0 (although just barely) and it likely suffers from severe omitted variable bias—which would invalidate these results. For example, older lead actors might garner a lower production budget from the studio, which would be correlated with both their age and the film’s box office revenue. Luckily, I have a couple more variables that I can include.
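For readers wondering where the percentage reading comes from, here is a short sketch of the standard log-linear interpretation. The symbols \beta_0, \beta_1 and \varepsilon are my notation, and the coefficient value of roughly -0.021 is inferred from the 2.1% figure above rather than read off the regression output.

```latex
\begin{align*}
\ln(\text{BoxOffice}_i) &= \beta_0 + \beta_1\,\text{Age}_i + \varepsilon_i \\
\frac{\text{BoxOffice}(\text{Age}+1)}{\text{BoxOffice}(\text{Age})} &= e^{\beta_1} \\
\%\Delta\,\text{BoxOffice} &= 100\,(e^{\beta_1} - 1) \approx 100\,\beta_1
  \quad \text{for small } \beta_1 \\
\beta_1 \approx -0.021 \;&\Rightarrow\; 100\,(e^{-0.021} - 1) \approx -2.08\%
\end{align*}
```

For a coefficient this small, the exact figure and the "multiply by 100" shortcut agree to within a few hundredths of a percentage point.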
Could the lead actor being American also be influencing the film’s revenue? Could the relative quality of the performance (winning the Oscar) be influencing revenue?
Controlling for the relative acting performance (whether the actor won the Oscar) and a possible home country bias (whether the actor is American) does lead to a slightly more accurate model. Age is now statistically significant at the 5% level and its 95% confidence interval no longer includes 0. Not surprisingly, and as we saw from the difference in film revenues earlier, whether the film won for best actor is not statistically significant.
However, whether the lead actor is American could be influencing box office revenue: it is statistically significant at the 10% level and suggests that an American lead actor is associated, on average, with a 50% increase in box office revenue (from the descriptive data, the average box office revenue for an American is $60 million versus $54.7 million for a non-American). Again though, I would urge caution, as there is likely serious omitted variable bias in this result. For example, whether the film is produced in America, as opposed to internationally, could be correlated with both whether the lead actor is American and box office revenue.
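One caveat when reading a large coefficient on a yes/no variable in a log-linear model: the "multiply by 100" shortcut drifts away from the exact percentage change as the coefficient grows. The sketch below illustrates the gap. The coefficient values are hypothetical, since the raw coefficient on the American indicator is not reported here, and the function names are my own.

```python
import math


def pct_effect(coef: float) -> float:
    """Exact % change in the outcome implied by a coefficient in a
    regression whose dependent variable is a natural log."""
    return (math.exp(coef) - 1.0) * 100.0


def coef_for_pct(pct: float) -> float:
    """Inverse: the log-scale coefficient that implies a given % change."""
    return math.log(1.0 + pct / 100.0)


# Hypothetical coefficient values, chosen only to show the gap.
for b in (0.02, 0.40, 0.50):
    print(f"coef={b:.2f}: shortcut {100 * b:.0f}%, exact {pct_effect(b):.1f}%")

# A 50% increase corresponds to a coefficient of ln(1.5).
print(f"{coef_for_pct(50.0):.3f}")
```

Whether the 50% figure is the shortcut or the exact conversion cannot be determined from the text alone, which is one more reason to treat it as a rough magnitude.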
Out of the 115.6 million homes with televisions in 2014, approximately 7.8 million tuned in for The Bachelor season 19 premiere, or 7% of all TV viewership at 8pm EST that Monday night. Of these 7.8 million, it is likely that a sizable portion of loyal fans were gambling on the show’s outcome, as measured by a contestant’s longevity before being eliminated, based on the contestant’s biographical data (which on the ABC homepage includes a photo, age, occupation, and hometown). Whether the gambling was formal (a number of online sportsbooks allow betting on reality television, despite information that can leak before the show airs) or informal (pools and drafts that may or may not include a financial reward), the show has potentially produced enough data over its 19 seasons for a savvy gambler to gain a statistical edge.
Given the data available on the contestants before the season premiere, and accounting for attractiveness not being readily or objectively quantifiable, an analysis can be performed that looks at the week a contestant was eliminated versus their age, hometown, and occupation.
The data was compiled for seasons 11-19 and included 239 contestants. A contestant’s outcome was measured to be the week they were eliminated, with a higher number week being a better outcome than a lower one. Elimination weeks ranged from 1 to 11 (for seasons 15-19) and 1 to 9 (for seasons 11-15) with the winner being coded ‘week 11’ (or ‘week 9’) and the runner-up being coded ‘week 10’ (or ‘week 8’).
Age required no coding and was entered directly from the biographical data. The average age was 26.7 years with a high of 36 and a low of 21.
For occupation, I decided to create a binary variable that would be equal to 1 if the contestant’s job required a college degree and 0 if it did not. If it was not obvious that the listed position required a college degree (e.g. ‘account manager’), then it was coded as a 0. This required some judgment on my part and it is possible some contestants are incorrectly coded. Overall, 86 out of 238 (36%) contestants were coded as having a college degree.
For hometown, I decided to create a binary variable that would provide a proxy for ‘Southern’. I coded the hometown as 1 if the contestant was from a state that voted for Mitt Romney in the 2012 presidential election and as 0 if they were from a state that voted for Barack Obama; international contestants were coded as 0. Overall, 68 out of 238 (29%) contestants originated from a state that supported Romney in 2012.
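The coding rules above can be sketched in a few lines of Python. The set of states is the 24 carried by Romney in 2012; everything else (the function name, the short list of degree-requiring job titles, and the example contestants) is hypothetical and stands in for the judgment calls described above.

```python
# States carried by Mitt Romney in the 2012 presidential election.
ROMNEY_2012 = {
    "AL", "AK", "AZ", "AR", "GA", "ID", "IN", "KS", "KY", "LA", "MS", "MO",
    "MT", "NE", "NC", "ND", "OK", "SC", "SD", "TN", "TX", "UT", "WV", "WY",
}

# Hypothetical examples of job titles treated as requiring a degree.
# Anything not listed (e.g. "account manager") defaults to 0.
DEGREE_JOBS = {"nurse", "attorney", "teacher", "engineer"}


def code_contestant(age, occupation, home_state):
    """Turn one contestant's biography into the three regressors.

    home_state is a two-letter US state code, or None for
    international contestants (who are coded 0).
    """
    return {
        "age": age,
        "degree": int(occupation.lower() in DEGREE_JOBS),
        "romney_state": int(home_state in ROMNEY_2012),
    }


# Made-up contestants, for illustration only.
print(code_contestant(27, "Nurse", "TX"))
print(code_contestant(24, "Account Manager", "CA"))
print(code_contestant(29, "Teacher", None))
```

Defaulting ambiguous occupations to 0 matches the conservative rule described above, at the cost of undercounting degree holders.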
Next, I wanted to use regression analysis to see how these variables (age, hometown, and occupation) were associated with the number of weeks a contestant remained on the show. Of course, there are many other omitted variables that impact longevity and the choice of ‘bachelor’ impacts the desired attributes of a contestant. This analysis is merely designed to identify past correlations.
First, I ran a regression of elimination week on age:
While the coefficient on age is negative (meaning that as age increases, longevity, or week eliminated, decreases), it is not statistically significant, even at the 10% level. This means that if the true relationship between age and elimination week were 0 (i.e. they are not related), we would still find an effect at least as large as the one we found 19.2% of the time. We therefore fail to reject this null hypothesis and can’t conclude that age is associated with a contestant’s longevity on the show.
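The "how often would we see this by chance" reading of a p-value can be illustrated with a permutation test: shuffle the outcomes so that any real link to age is broken, then count how often the shuffled slope is at least as large as the observed one. To be clear, this is a swapped-in technique for intuition, not the t-test behind the regression output, and the data and function names below are invented.

```python
import random


def slope(x, y):
    """OLS slope of y on x for a one-variable regression."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def permutation_p_value(x, y, n_perm=5000, seed=0):
    """Share of shuffles whose |slope| is at least the observed |slope|."""
    rng = random.Random(seed)
    observed = abs(slope(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any real link between x and y
        if abs(slope(x, shuffled)) >= observed:
            hits += 1
    return hits / n_perm


# Toy data, invented for illustration (NOT the contestant data).
ages = [22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
weeks = [5, 2, 7, 3, 6, 1, 4, 8, 2, 3]

print("slope:", slope(ages, weeks))
print("permutation p-value:", permutation_p_value(ages, weeks))
```

A value near 0.19 from a procedure like this would carry the same message as the regression output: a slope that size turns up too often under pure chance to be treated as evidence.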
Second, I ran a regression of elimination week on occupation:
Again, while the coefficient on occupation is negative, it is not statistically significant at the 10% level; in fact, this time it is not even close. This means that while there is a very loose negative correlation between holding a college degree and elimination week, it can’t be distinguished from a purely random result.
Third, I ran a regression of elimination week on hometown:
In this regression, there is a statistically significant positive relationship at the 1% level. What does this mean? It means that if being from a state that voted for Mitt Romney had no real relationship with your elimination week, we would only find a result as large as what we found in 0.6% of samples. This means we can conclude the following: being from a state that voted for Mitt Romney in 2012 is associated on average with remaining on The Bachelor for an additional 1.085 weeks.
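A useful way to read a coefficient like this: when the only regressor is a 0/1 variable, the OLS slope is exactly the difference between the two group means. The sketch below checks that identity on invented numbers (not the contestant data); the names are hypothetical.

```python
from statistics import mean


def ols_slope(x, y):
    """OLS slope of y on x for a one-variable regression."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Toy data, invented for illustration only.
romney_state = [1, 1, 1, 0, 0, 0, 0, 0]
elim_week = [6, 4, 5, 3, 2, 5, 1, 4]

mean_romney = mean(w for r, w in zip(romney_state, elim_week) if r == 1)
mean_other = mean(w for r, w in zip(romney_state, elim_week) if r == 0)

coef = ols_slope(romney_state, elim_week)
diff = mean_romney - mean_other

# The two numbers agree: the coefficient is the gap in average weeks.
print(coef, diff)
```

Read this way, the 1.085 reported above is the gap in average elimination week between contestants from Romney states and everyone else.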
So while being older and educated may not have a true negative relationship with contestant longevity, it appears to have been a very good bet to choose contestants from politically conservative states over the past 9 seasons, and to continue choosing them in the future.