Lecture 10 – Estimating and Using a Regression Line
Regression analysis involves two steps:
1) First, formulate a theory, or specify a model. For example, you may have a theory that parents’ income
determines a student’s grade point average.
2) Second, test the theory or model. (We test whether the evidence is consistent with the
hypothesized theory, not whether the theory is proved true for all time.)
Regression analysis discovers whether there is a correlation between parents’ income and grade point
average; however, it does not tell you whether parents’ income causes the grade point average. The
underlying theory specifies causation, i.e., a change in parents’ income causes a change in grade point
average.
This implies that changes in income should be correlated with changes in grade point average.
This is all the regression analysis can discover. Note, however, that if the causation were reversed, i.e., a
change in grade point average causes a change in income, then we would still find a correlation between
changes in grade point average and changes in income.
So finding that two variables are correlated is not the same thing as proving that one causes the other.
Regression analysis can show that variables are correlated, but it cannot prove causation.
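The reversal argument can be checked numerically. The sketch below uses simulated data, not real observations: the income range, the coefficients 1.5 and 0.01, and the noise level are all made up for illustration, and the helper names (covariance, correlation, slope) are hypothetical. It computes the correlation and the fitted slope in both directions.

```python
import random

def mean(values):
    return sum(values) / len(values)

def covariance(x, y):
    # Sample covariance of two equally long lists
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def correlation(x, y):
    # Correlation coefficient: covariance scaled by both standard deviations
    return covariance(x, y) / (covariance(x, x) ** 0.5 * covariance(y, y) ** 0.5)

def slope(dependent, independent):
    # Slope of the simple regression line of `dependent` on `independent`
    return covariance(independent, dependent) / covariance(independent, independent)

random.seed(0)
# Simulated data (an assumption for illustration): income in thousands of
# dollars, and GPA constructed as a function of income plus random noise.
income = [random.uniform(20, 150) for _ in range(200)]
gpa = [1.5 + 0.01 * inc + random.gauss(0, 0.2) for inc in income]

r_income_gpa = correlation(income, gpa)
r_gpa_income = correlation(gpa, income)
slope_gpa_on_income = slope(gpa, income)     # theory: income -> GPA
slope_income_on_gpa = slope(income, gpa)     # reversed: GPA -> income

print("correlation, either order:", r_income_gpa, r_gpa_income)
print("slope of GPA on income:", slope_gpa_on_income)
print("slope of income on GPA:", slope_income_on_gpa)
```

The correlation is the same whichever variable is listed first, and the product of the two slopes equals the squared correlation. The fit therefore carries no information about which variable is the cause; that has to come from the theory.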
Regression analysis is just a type of statistical problem, as we will now show by demonstrating that a
regression problem can be put in the form of the population-random sample picture we have developed.
Specifying the Model
Step one in a regression problem is to specify the model.
As we have seen, the model specification is
determined by the economic theory.
Every theory has both dependent and independent variables. The
dependent variable depends on the independent variables, which are assumed to vary independently of the
dependent variable. The number of independent variables will vary from theory to theory.
In general terms,
if we let $Y$ stand for the dependent variable and $X_1, X_2, \ldots, X_k$ stand for the $k$ independent
variables, then we may write the following general (implicit) equation form:

$$Y = f(X_1, X_2, \ldots, X_k)$$
There is one requirement that must be met in order to use regression analysis. This is that the functional
form of the model specification must be linear. Notice that it is the equation that must be linear, not the
variables in the equation.
This distinction is illustrated as follows. The linear form of the above general
equation is given as follows:

$$Y = B_0 + B_1 X_1 + B_2 X_2 + \cdots + B_k X_k$$

In this equation, $B_0$ is the intercept, i.e., the value of $Y$ when each of the independent variables takes a
value equal to 0. Each $B_i$ (for $i = 1, \ldots, k$) tells us how much the dependent variable, $Y$, changes when
there is a one unit change in $X_i$, holding all other independent variables constant. To say that the equation
must be linear is to say that the $B$-coefficients must enter in the form they appear in the above equation.
That is, each $B$ is raised to the first power, and only to the first power.
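To make the distinction concrete, consider Y = B0 + B1*X + B2*X^2. The variable X is squared, but each B still enters to the first power, so the equation counts as linear for regression purposes. By contrast, Y = B0 + X^B1 would not qualify, because B1 appears as an exponent. The sketch below assumes ordinary least squares as the estimation method, which this section has not yet introduced. The data are constructed from made-up coefficients (1.0, 2.0, 0.5) with no noise, and the function names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b]
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest entry in this column
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x

def least_squares(columns, y):
    """Least-squares coefficients, one per regressor column, found by
    solving the normal equations (X'X) B = X'Y."""
    k = len(columns)
    xtx = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(p * q for p, q in zip(columns[i], y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Illustrative data built from assumed coefficients B0=1.0, B1=2.0, B2=0.5
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0 + 2.0 * v + 0.5 * v ** 2 for v in x]

# The regressors are 1 (for the intercept), X, and X squared.
# X**2 is a nonlinear function of X, yet it is just another column here.
columns = [[1.0] * len(x), x, [v ** 2 for v in x]]
b0, b1, b2 = least_squares(columns, y)
print("estimated B0, B1, B2:", b0, b1, b2)
```

Because the data contain no noise, the estimates match the coefficients used to build them, up to rounding. The squared term needed no special handling: it was computed once and then treated as an ordinary independent variable.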