Lecture 10 – Estimating and Using a Regression Line
Regression analysis involves two steps:
1) First, formulate a theory, or specify a model; for example, you have a theory that parents' income determines a student's grade point average.
2) Second, test the theory or model. (We test whether the evidence is consistent with the hypothesized theory, not whether the theory is proved true for all time.)
Regression analysis reveals whether there is a correlation between parents' income and grade point average; however, it does not tell you whether parents' income causes the grade point average. The underlying theory specifies causation, i.e., a change in parents' income causes a change in grade point average. This implies that changes in income should be correlated with changes in grade point average.
This is all that regression analysis can discover. Note, however, that if the causation were reversed, i.e., a change in grade point average causes a change in income, then we would still find a correlation between changes in grade point average and changes in income. So finding that two variables are correlated is not the same thing as proving that one causes the other. Regression analysis can show that variables are correlated, but it cannot prove causation.
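The symmetry of correlation can be seen in a small simulation. What follows is a minimal sketch in Python using NumPy, not part of the original lecture: the data are synthetic, and the variable names (`income`, `gpa`), the coefficients, the units, and the seed are all illustrative assumptions. The data are built so that income drives GPA, yet the correlation is identical whichever variable is listed first, and the two regression slopes multiply to the squared correlation.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Synthetic data (assumed for illustration): income is generated first and
# GPA is built from it, so causation runs from income to GPA by construction.
n = 500
income = rng.normal(60.0, 15.0, size=n)  # hypothetical units: thousands of dollars
gpa = 2.0 + 0.015 * income + rng.normal(0.0, 0.3, size=n)

# The correlation coefficient is symmetric in its two arguments.
r_xy = np.corrcoef(income, gpa)[0, 1]
r_yx = np.corrcoef(gpa, income)[0, 1]


def slope(dep, indep):
    """Slope of a simple regression of dep on indep: cov(dep, indep) / var(indep)."""
    return np.cov(dep, indep)[0, 1] / np.var(indep, ddof=1)


slope_gpa_on_income = slope(gpa, income)
slope_income_on_gpa = slope(income, gpa)

print(f"corr(income, gpa)     = {r_xy:.3f}")
print(f"corr(gpa, income)     = {r_yx:.3f}")
print(f"product of the slopes = {slope_gpa_on_income * slope_income_on_gpa:.3f}")
print(f"squared correlation   = {r_xy ** 2:.3f}")
```

Nothing in these numbers reveals which variable was generated from which; that information comes only from the theory behind the model.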
Regression analysis is just a type of statistical problem, as we will now show by demonstrating that a
regression problem can be put in the form of the population-random sample picture we have developed.
Specifying the Model
Step one in a regression problem is to specify the model.
As we have seen, the model specification is
determined by the economic theory.
Every theory has both dependent and independent variables. The
dependent variable depends on the independent variables, which are assumed to vary independently of the
dependent variable. The number of independent variables will vary from theory to theory.
In general terms,
if we let Y stand for the dependent variable and X stand for the independent variable, then we may write
the following general (implicit) equation form:
$$ Y = F(X_1, X_2, X_3, \ldots, X_n). $$
There is one requirement that must be met in order to use regression analysis: the functional form of the model specification must be linear. Notice that it is the equation that must be linear in its coefficients, not the variables in the equation.
This distinction is illustrated as follows. The linear form of the above general
equation is given as follows:
$$ Y_t = B_0 + B_1 X_{1t} + B_2 X_{2t} + \cdots + B_k X_{kt}. $$
In this equation, $B_0$ is the intercept, i.e., the value of $Y$ when each of the independent variables takes a value equal to 0. $B_i$ tells us how much the dependent variable, $Y$, changes when there is a one unit change in $X_i$, holding all other independent variables constant. To say that the equation must be linear is to say that the $B$-coefficients must enter in the form they appear in the above equation. That is, each $B$ is raised to the first power, and only to the first power.
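To make the "linear in the coefficients, not in the variables" point concrete, here is a minimal sketch in Python with NumPy. It is not from the lecture: the sample is synthetic, and the names `x1`, `x2`, `true_b`, and the chosen coefficient values are hypothetical. The second regressor enters as a square, so the model is nonlinear in that variable, yet each B still appears to the first power and ordinary least squares applies.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed so the sketch is reproducible

# Synthetic sample (assumed): Y = B0 + B1*X1 + B2*(X2 squared) + error.
# Nonlinear in the variable X2, but linear in the coefficients B0, B1, B2.
n = 200
x1 = rng.uniform(0.0, 10.0, size=n)
x2 = rng.uniform(0.0, 5.0, size=n)
true_b = np.array([1.0, 2.0, -0.1])  # hypothetical B0, B1, B2
y = true_b[0] + true_b[1] * x1 + true_b[2] * x2**2 + rng.normal(0.0, 0.5, size=n)

# Design matrix: a column of ones for the intercept, then each regressor.
X = np.column_stack([np.ones(n), x1, x2**2])

# Least-squares estimates of the coefficients.
b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print("estimated B0, B1, B2:", np.round(b_hat, 3))

# B1 as a partial effect: raise X1 by one unit while holding the other
# regressor fixed; the fitted value changes by exactly the estimate of B1.
row = np.array([1.0, 4.0, 3.0**2])  # intercept term, X1 = 4, X2 = 3
bumped = row.copy()
bumped[1] += 1.0                    # X1 goes from 4 to 5
change = bumped @ b_hat - row @ b_hat
print("change in fitted Y:", round(float(change), 3))
```

A specification in which a coefficient appears as an exponent or is squared would not fit this design-matrix form, which is what the linearity requirement rules out.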
- Spring '11
- YongJinPark
- Econometrics, Regression Analysis, Variance, Errors and residuals in statistics, Grade Point Average