Polynomial Regression Models  I
• Original model: $Y_i = \beta_0 + \beta_1 X_i + u_i$
• What if we create a new variable, $X_i^2$, i.e. the value of $X_i$ squared, and consider the multiple regression model: $Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + u_i$
• This model can be estimated by regressing $Y_i$ on $X_i$ and $X_i^2$.
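To make the estimation step concrete, here is a minimal sketch in plain Python: it builds the squared regressor and fits the quadratic model by ordinary least squares via the normal equations. The data are made up and noise-free, and the helper names (`solve_linear_system`, `fit_quadratic`) are hypothetical; a statistics package would normally do this step.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_quadratic(x_values, y_values):
    """OLS of y on a constant, x, and x squared (normal equations)."""
    # Each row holds the regressors: 1, X_i, and the new variable X_i^2.
    rows = [[1.0, x, x * x] for x in x_values]
    k = 3
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, y_values)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up, noise-free data generated from Y = 2 + 3 X - 0.5 X^2.
x_values = [float(v) for v in range(1, 11)]
y_values = [2.0 + 3.0 * x - 0.5 * x * x for x in x_values]

b0, b1, b2 = fit_quadratic(x_values, y_values)
print(b0, b1, b2)
```

Because the illustrative data contain no error term, the fitted coefficients should match the ones used to generate the data up to floating-point rounding.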
Polynomial Regression Models  II
• In this new model, $X_i$ and $Y_i$ have a nonlinear relationship (unless $\beta_2$ is zero!)
• Note that the interpretation of the coefficients is different than before. Specifically, $\beta_1$ does not measure the effect of a one-unit change in $X_i$ on $Y_i$. Why? When $X_i$ changes, $X_i^2$ will necessarily change. So the effect of a unit change in $X_i$ on $Y_i$ will depend on both $\beta_1$ and $\beta_2$.
• Assessing the meaning of the coefficients in this model is a little tougher than before. Let’s do it with an example.
$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{\beta}_2 X_i^2$
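The dependence on both coefficients can be worked out directly. As a short derivation, holding the error term fixed and comparing the model at $X_i + 1$ and at $X_i$ (the symbol $\Delta Y$ is notation introduced here for the resulting change):

```latex
\begin{align*}
\Delta Y &= \left[\beta_0 + \beta_1 (X_i + 1) + \beta_2 (X_i + 1)^2\right]
          - \left[\beta_0 + \beta_1 X_i + \beta_2 X_i^2\right] \\
         &= \beta_1 + \beta_2 (2 X_i + 1).
\end{align*}
% For an infinitesimal change the analogous quantity is the derivative:
\[
\frac{\partial Y_i}{\partial X_i} = \beta_1 + 2 \beta_2 X_i .
\]
```

Either way, the effect varies with the starting level of $X_i$, and it reduces to $\beta_1$ alone only when $\beta_2 = 0$.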
Polynomial Regression Models  III
$\text{Testscore}_i = \beta_0 + \beta_1 \text{avginc}_i + \beta_2 \text{avginc}_i^2 + u_i$
• This model allows a nonlinear relationship between test scores and average income.
• To estimate the model in Stata:
– 1) Open the dataset (relevant variables are testscr and avginc)
– 2) Run:
“gen avginc2 = avginc^2”
“regress testscr avginc avginc2”
Polynomial Regression Models  IV
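. regress testscr avginc avginc2

      Source |       SS       df       MS              Number of obs =     420
-------------+------------------------------           F(  2,   417) =  261.28
       Model |  84599.2786     2  42299.6393           Prob > F      =  0.0000
    Residual |  67510.3151   417   161.89524           R-squared     =  0.5562
-------------+------------------------------           Adj R-squared =  0.5540
       Total |  152109.594   419  363.030056           Root MSE      =  12.724

------------------------------------------------------------------------------
     testscr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      avginc |   3.850995   .3042617    12.66   0.000     3.252917    4.449073
     avginc2 |  -.0423085   .0062601    -6.76   0.000    -.0546137   -.0300033
       _cons |   607.3017   3.046219   199.36   0.000     601.3139    613.2896
------------------------------------------------------------------------------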

Polynomial Regression Models  V
• Note that in this estimated model, it is not even obvious whether avginc positively or negatively affects test scores. Why? The coefficient on avginc is positive and the coefficient on avginc2 is negative, so the sign of the overall effect depends on the level of avginc.
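One way to see how the two coefficients combine is to compute the predicted change in testscr for a one-unit increase in avginc at several starting levels. The sketch below uses the coefficient magnitudes from the regression output and assumes the avginc2 coefficient is the negative one (about -0.0423). The function names and the chosen income levels are illustrative, not part of the original analysis.

```python
# Coefficient estimates taken from the regression output.
# Assumption: the avginc2 coefficient carries the negative sign.
B_CONS = 607.3017
B_AVGINC = 3.850995
B_AVGINC2 = -0.0423085


def predicted_testscr(avginc):
    """Fitted value from the quadratic regression at a given avginc."""
    return B_CONS + B_AVGINC * avginc + B_AVGINC2 * avginc ** 2


def effect_of_unit_increase(avginc):
    """Predicted change in testscr when avginc rises by one unit."""
    return predicted_testscr(avginc + 1) - predicted_testscr(avginc)


# Level of avginc at which the fitted parabola peaks: -b1 / (2 * b2).
turning_point = -B_AVGINC / (2 * B_AVGINC2)

# Illustrative starting levels, chosen to fall on both sides of the peak.
for level in (10, 40, 50):
    print(level, round(effect_of_unit_increase(level), 3))
print("turning point:", round(turning_point, 2))
```

Under these estimates the predicted effect is positive at lower values of avginc, shrinks as avginc rises, and turns negative past the turning point, which is why neither coefficient alone gives the direction of the effect.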
