University of Illinois at Urbana-Champaign
Professor Ron Laschever, ECON 440, Spring 2010
Problem set #6
Some Thoughts and Answers to Extra Problems.
First, I know that for some of you this was the first hands-on experience with regression (beyond
a stats course). I will truly sleep better at night knowing you graduated having run at least one
regression during your stay at Illinois.
It is beyond the scope of our class and of one assignment to discuss all things related to
regression analysis, but I very much enjoyed the discussion we had in class about the results. As
you add more and more controls (more explanatory variables), your analysis would come to
closely resemble the studies we discussed in class.
As always, you want to keep in mind what Mark Twain attributed (perhaps wrongfully) to
Disraeli (the then British Prime Minister): “There are three kinds of lies: lies, damned lies, and
statistics.”
You of course want to make sure you run the regressions correctly (and most people did). The
more challenging part is to decide which variables to include (here, a model or theory is useful),
and how to interpret the results. The grading and evaluation of your answers are very lenient.
1)
Define a variable OVER42 which takes on the value 1 if someone is 42 or over, and 0
otherwise (41.99 and younger).
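As a minimal sketch of this step, the dummy can be built with a single comparison. The ages below are made up for illustration and are not the class data; the helper name over42 is also just a placeholder.

```python
# Hypothetical ages for illustration only (NOT the class data set).
ages = [29.5, 41.99, 42.0, 57.3]

def over42(age):
    """Return 1 if the person is 42 or over, 0 otherwise."""
    return 1 if age >= 42 else 0

OVER42 = [over42(a) for a in ages]
print(OVER42)  # [0, 0, 1, 1]
```

Note that 41.99 maps to 0 and exactly 42 maps to 1, matching the definition above.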
2)
Run a regression where Layoff (a 0,1 variable) is the dependent variable and OVER42 is
the independent variable.
See attached.
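For those who want to see what the software is doing, here is a rough sketch of a one-variable OLS regression written out by hand. The eight observations are invented toy data, not the class data set, so the numbers will not match the attached output. With a single 0/1 regressor, the slope is simply the difference in layoff rates between the two groups.

```python
# Made-up toy data, NOT the class data set.
over42 = [0, 0, 0, 0, 1, 1, 1, 1]
layoff = [0, 0, 0, 1, 0, 1, 1, 1]

def mean(values):
    return sum(values) / len(values)

def ols_simple(x, y):
    """OLS of y on a constant and one regressor x; returns (intercept, slope)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

intercept, slope = ols_simple(over42, layoff)

# Layoff rates within each age group, for comparison with the coefficients.
rate_under = mean([y for x, y in zip(over42, layoff) if x == 0])
rate_over = mean([y for x, y in zip(over42, layoff) if x == 1])

print(intercept, slope)                   # 0.25 0.5
print(rate_under, rate_over - rate_under) # 0.25 0.5
```

The intercept is the layoff rate for the under-42 group, and the slope is how many percentage points higher the rate is for the 42-and-over group.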
3)
Using only the results of the regression, were layoffs age-discriminatory? Why?
The size of the coefficient tells us the economic importance (those 42 or over were 28 percentage
points more likely to be laid off). It is also statistically significant at the 5% level. We reject
the null that layoffs are blind to age; in other words, the results are consistent with a
correlation between being over 42 and being laid off (remember we can never ACCEPT a null,
only reject it or fail to reject it).
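To make the significance test concrete, here is a sketch of how the t-statistic on the OVER42 coefficient is computed. It assumes the classical (homoskedastic) standard error formula and again uses invented toy data, not the class data set.

```python
import math

# Made-up toy data, NOT the class data set.
over42 = [0, 0, 0, 0, 1, 1, 1, 1]
layoff = [0, 0, 0, 1, 0, 1, 1, 1]

n = len(layoff)
x_bar = sum(over42) / n
y_bar = sum(layoff) / n

sxx = sum((x - x_bar) ** 2 for x in over42)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(over42, layoff))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residual variance uses n - 2 degrees of freedom (two estimated parameters).
residuals = [y - (intercept + slope * x) for x, y in zip(over42, layoff)]
sigma2 = sum(e ** 2 for e in residuals) / (n - 2)

# Classical standard error of the slope and the t-statistic for H0: slope = 0.
std_err = math.sqrt(sigma2 / sxx)
t_stat = slope / std_err

print(f"coefficient = {slope:.3f}, std. error = {std_err:.3f}, t = {t_stat:.3f}")
```

In this toy sample the t-statistic is about 1.41, below the two-sided 5% critical value of roughly 2.45 for 6 degrees of freedom, so with only eight observations we would fail to reject the null even though the coefficient is large. That is exactly the distinction between economic and statistical significance.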
4)
The company would try to argue there was cause for layoffs. Run a regression with
Layoff as the dependent variable and two independent variables OVER42 and RATING
(include both in the regression).
See attached.
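As a sketch of what adding a second regressor does, the code below solves the OLS normal equations by hand for a constant, OVER42, and RATING. The data are invented for illustration (not the class data set), and the helper names solve and ols are placeholders rather than any package's functions.

```python
# Made-up toy data, NOT the class data set.
over42 = [0, 0, 0, 0, 1, 1, 1, 1]
rating = [4, 5, 3, 2, 3, 2, 1, 2]
layoff = [0, 0, 0, 1, 0, 1, 1, 1]

def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ols(X, y):
    """Return OLS coefficients by solving the normal equations (X'X) b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

# Each row is [constant, OVER42, RATING].
X = [[1.0, o, r] for o, r in zip(over42, rating)]
const, b_over42, b_rating = ols(X, layoff)
print(const, b_over42, b_rating)
```

In this toy data the OVER42 coefficient falls from 0.5 in the one-variable regression to roughly -0.04 once RATING is included, while the RATING coefficient is about -0.36. The older workers in the made-up sample happen to have lower ratings, so leaving RATING out loads its effect onto OVER42.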
5)
Using the results of the regression, could we argue that layoffs were performance-based
rather than age discriminatory? Why?
Now the OVER42 coefficient is no longer statistically significant at any conventional level.
RATING has the expected sign (the higher the rating, the less likely to be laid off), and is
statistically significant at the p<0.01 level. We can no longer reject the null that being over 42
didn’t matter. We reject the null that ratings didn’t matter. This is certainly evidence in favor
of performance-based layoffs.
Some additional issues we mentioned in class are: economic significance (coefficient size) vs.
statistical significance (p-value); the role and importance of R-squared (it might be helpful for
thinking about which model fits better, but it doesn’t actually convey much info on which of the
variables matter); functional form (we used OLS, but perhaps logistic regression would be more
appropriate as we have a binary dependent variable); and whether the error term is independent
of our variables. If not, the results would be biased.
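On the functional form point, here is a rough sketch of a logistic regression fitted by Newton's method on invented toy data (not the class data set). The function names sigmoid and fit_logit are placeholders. With a single 0/1 regressor the logit reproduces the same group layoff rates as OLS; the two approaches start to differ once a variable like RATING enters.

```python
import math

# Made-up toy data, NOT the class data set.
over42 = [0, 0, 0, 0, 1, 1, 1, 1]
layoff = [0, 0, 0, 1, 0, 1, 1, 1]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit(x, y, steps=25):
    """Fit P(y=1) = sigmoid(a + b*x) by Newton's method; returns (a, b)."""
    a, b = 0.0, 0.0
    for _ in range(steps):
        p = [sigmoid(a + b * xi) for xi in x]
        # Gradient of the log-likelihood.
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Entries of X'WX with weights w = p(1 - p).
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton update: add (X'WX)^(-1) times the gradient.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

a, b = fit_logit(over42, layoff)
p_under = sigmoid(a)      # fitted layoff probability, under 42
p_over = sigmoid(a + b)   # fitted layoff probability, 42 or over
print(round(p_under, 4), round(p_over, 4))  # 0.25 0.75
```

The logit coefficient b is a log odds ratio, not a percentage-point difference, so it is not directly comparable to the OLS coefficient; the fitted probabilities are the easier thing to compare.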