Professor Green
Perils of Multiple Regression
Least-squares regression rests on several assumptions about the causal process by which the data were generated. Becoming an intelligent consumer of statistical information requires one to understand each assumption and how the results may be distorted if it is violated.
A. The independent variables are statistically independent of the disturbances. This assumption implies, among other things, that (1) no omitted predictors of Y are correlated with X and (2) causation flows in just one direction, from X to Y.
B. The independent variables are measured without error.
C. Each of the disturbances is drawn from the same underlying distribution. Thus, disturbances are expected to have the same variance from one observation to the next.
D. Each disturbance is statistically independent of all other disturbances.
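In symbols, the four assumptions can be sketched for the single-predictor case as follows. The notation is an assumption introduced here for illustration (the handout states the assumptions only in words): \alpha and \beta are the intercept and slope, u_i is the disturbance for observation i, and X_i^{*} stands for the true value of a predictor that may have been recorded with error.

```latex
\begin{align*}
\text{Model:}\quad & Y_i = \alpha + \beta X_i + u_i, \qquad i = 1, \dots, n \\
\text{(A)}\quad & X_i \text{ is statistically independent of } u_i \\
\text{(B)}\quad & X_i = X_i^{*} \quad \text{(the recorded value equals the true value)} \\
\text{(C)}\quad & \text{every } u_i \text{ has the same distribution, so } \mathrm{Var}(u_i) = \sigma^2 \text{ for all } i \\
\text{(D)}\quad & u_i \text{ is statistically independent of } u_j \text{ for all } i \neq j
\end{align*}
```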
Violations of (A) and (B) are potentially very serious: they can produce biased and possibly misleading regression slopes, standard errors, and regression diagnostics. Violations of (C) and (D) result in biased standard errors and therefore inaccurate confidence intervals and hypothesis tests.
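To see how violations of (A) and (B) distort a slope, a small simulation can help. The sketch below is not drawn from the handout's data: the true slope of 2.0, the omitted predictor z, the noise levels, and the names ols_slope and simulate are all hypothetical choices made for illustration, and the code assumes NumPy is available.

```python
import numpy as np


def ols_slope(x, y):
    """Slope from a one-predictor least-squares fit of y on x."""
    x_centered = x - x.mean()
    return float(x_centered @ (y - y.mean()) / (x_centered @ x_centered))


def simulate(n=100_000, seed=0):
    """Compare estimated slopes under three hypothetical data-generating processes."""
    rng = np.random.default_rng(seed)
    true_slope = 2.0  # hypothetical value, chosen only for illustration

    # Baseline: X is independent of the disturbance and measured without error.
    x = rng.normal(size=n)
    u = rng.normal(size=n)
    y = 1.0 + true_slope * x + u
    clean = ols_slope(x, y)

    # Violation of (A): an omitted predictor z affects Y and is correlated with X.
    z = rng.normal(size=n)
    x_confounded = 0.5 * z + rng.normal(size=n)
    y_confounded = 1.0 + true_slope * x_confounded + 1.5 * z + u
    omitted = ols_slope(x_confounded, y_confounded)

    # Violation of (B): we regress Y on a noisy recording of X instead of X itself.
    x_noisy = x + rng.normal(size=n)
    noisy = ols_slope(x_noisy, y)

    return {
        "clean": clean,
        "omitted_variable": omitted,
        "measurement_error": noisy,
    }


if __name__ == "__main__":
    for name, slope in simulate().items():
        print(f"{name}: {slope:.3f}")
```

Under these made-up settings, the baseline estimate lands near the true slope of 2.0, the omitted-variable estimate drifts upward toward roughly 2.6, and the measurement-error estimate shrinks toward roughly 1.0. The direction and size of each distortion depend entirely on the assumed parameters; a different correlation between z and X, or a different amount of measurement noise, would give different numbers.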
To fix ideas, let’s consider an example based loosely on the notorious Bell Curve. In that book, the authors argued that race affected incomes, even after controlling for education. The political implications of their argument were, and remain, quite explosive. The results were based on young people who had been surveyed in high school and tracked as they entered the labor market thereafter. For the sake of exposition, I have simplified the data analysis, while keeping the basic patterns of the data intact.
Here are the independent variables. First, race is coded 0 for whites and 1 for nonwhites.
Tally for Discrete Variables: RACE

RACE  Count  Percent
   0    367    73.40
   1    133    26.60
  N=    500
Education is measured by years of schooling. Clearly, this is a sloppy measure of educational attainment (warming a chair does not an education make). For the sake of the example, imagine that we have two education variables: one that measures true educational attainment and one that merely measures years of schooling.
This is the “true” measure of education.
This note was uploaded on 04/07/2008 for the course STAT 102 taught by Professor Jonathanreuningschererdonaldgreen during the Fall '05 term at Yale.