For every additional car placed in service, estimate how much annual revenue will change.
Fox Rent A Car has 11,000 cars in service. Use the estimated regression equation to predict annual revenue for Fox Rent A Car.
Answers: ŷ = −17.005 + 12.966x (x in thousands of cars, y in millions of dollars); annual revenue increases by $12,966 per additional car; predicted revenue: −17.005 + 12.966(11) ≈ 125.6, i.e., about $126 million.
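The arithmetic can be sketched in a few lines of Python. This is a minimal sketch; the negative intercept and the units (x in thousands of cars, y in millions of dollars) are inferred from the stated answer of about $126 million.

```python
# Estimated regression equation from the exercise: yhat = -17.005 + 12.966 x
# (assumed units: x in thousands of cars, y in millions of dollars).
b0, b1 = -17.005, 12.966   # estimated intercept and slope

def predict(x_thousands_of_cars):
    """Predicted annual revenue in millions of dollars."""
    return b0 + b1 * x_thousands_of_cars

revenue = predict(11)       # Fox Rent A Car: 11,000 cars in service
print(round(revenue, 1))    # about 125.6, i.e., roughly $126 million
```

Each one-unit increase in x (1,000 cars) raises ŷ by 12.966 ($ millions), which is $12,966 per car.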
Tymon Sloczyński
Linear Regression
Introduction
Least squares
Inference
Residuals
Definition (Residual)
The difference between the observed value and the predicted value of the dependent variable, yᵢ − ŷᵢ. It is the error in using ŷᵢ to estimate yᵢ.
Definition (Sum of squares due to error, SSE)
SSE = Σᵢ (yᵢ − ŷᵢ)²
This object is minimized by OLS.
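As a numerical sketch of this minimization property, with made-up data and the closed-form simple-OLS formulas, the OLS coefficients yield a smaller SSE than any nearby line:

```python
# A minimal sketch (made-up data) showing that the OLS coefficients
# minimize SSE = sum((y_i - yhat_i)^2).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

def sse(b0, b1):
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Closed-form OLS estimates for simple regression:
# b1 = sum((x - xbar)(y - ybar)) / sum((x - xbar)^2), b0 = ybar - b1 * xbar
n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n
b1 = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / \
     sum((x - xbar) ** 2 for x in xs)
b0 = ybar - b1 * xbar

# Perturbing the OLS line in any direction yields a larger SSE.
assert sse(b0, b1) <= sse(b0 + 0.1, b1)
assert sse(b0, b1) <= sse(b0, b1 - 0.1)
```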
Further definitions
Suppose you do not have any good explanatory variables for y. How would you make predictions in this situation? You would use ȳ.

Definition (Total sum of squares, SST)
SST = Σᵢ (yᵢ − ȳ)²

Definition (Sum of squares due to regression, SSR)
SSR = Σᵢ (ŷᵢ − ȳ)²

Theorem (Relationship between SSE, SST, and SSR)
SST = SSR + SSE
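The decomposition can be checked numerically. A sketch with made-up data (the identity holds for an OLS fit that includes an intercept, since the residuals then sum to zero):

```python
# Numerical check of SST = SSR + SSE for a simple OLS fit with an
# intercept, using made-up data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]

n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n
b1 = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / \
     sum((x - xbar) ** 2 for x in xs)
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * x for x in xs]

sst = sum((y - ybar) ** 2 for y in ys)
ssr = sum((yh - ybar) ** 2 for yh in yhat)
sse = sum((y - yh) ** 2 for y, yh in zip(ys, yhat))
assert abs(sst - (ssr + sse)) < 1e-9   # SST = SSR + SSE
```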
Caution
These objects, SST, SSR, and SSE, sometimes go by different names, and this can be quite confusing. For example, the econometrics textbook that is typically used at Brandeis, Introduction to Econometrics by James H. Stock and Mark W. Watson, uses the following terms:
- Total sum of squares (TSS); the term is the same but the abbreviation is different (TSS instead of SST)
- Explained sum of squares (ESS) instead of sum of squares due to regression (SSR)
- Sum of squared residuals (SSR) instead of sum of squares due to error (SSE)
You must be very careful: "SSR" refers to different objects in these two textbooks.
Deviations about ŷ and ȳ
[Figure: deviations of the observations about the fitted values ŷ and the sample mean ȳ]
Measuring goodness of fit
In many applications, we might be interested in evaluating whether our estimated regression equation fits the data well; that is, we wish to measure goodness of fit.
How can we approach this? Think about two extremes: perfect fit and worst fit.
Perfect fit: every value of yᵢ lies on the estimated regression line. Then, for each i, yᵢ − ŷᵢ = 0. Consequently, SSE = 0; hence SST = SSR; hence SSR/SST = 1.
Worst fit: larger values of SSE correspond to worse fit. Recall that SSE = SST − SSR, so SSE attains its maximum value when SSR = 0; hence SSR/SST = 0.
Thus, these two extremes correspond to SSR/SST = 1 and SSR/SST = 0.
Measuring goodness of fit — coefficient of determination
Definition (Coefficient of determination)
r² = SSR/SST
Interpretation: the percentage of the total sum of squares that can be explained by a given regression.
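As a sketch with made-up data, r² computed as SSR/SST agrees with the equivalent form 1 − SSE/SST, and lies between 0 and 1 for an OLS fit with an intercept:

```python
# Computing r^2 = SSR/SST for a simple OLS fit (made-up data) and
# checking it equals 1 - SSE/SST.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]

n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n
b1 = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / \
     sum((x - xbar) ** 2 for x in xs)
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * x for x in xs]

sst = sum((y - ybar) ** 2 for y in ys)
ssr = sum((yh - ybar) ** 2 for yh in yhat)
sse = sum((y - yh) ** 2 for y, yh in zip(ys, yhat))

r2 = ssr / sst
assert abs(r2 - (1 - sse / sst)) < 1e-9   # two equivalent forms of r^2
assert 0 <= r2 <= 1
```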
Winter '20, Regression Analysis, Tymon Słoczyński