Now if the regression is done on the observed X (i.e. the error-prone measurement), the regression equation reduces to:

Y_i = β_0 + β_1 X_i + (ε_i − β_1 δ_i)
Now this violates the independence assumption of ordinary least squares because the new "error" term is not independent of the X_i variable.
If an ordinary least squares model is fit, the estimated slope is biased (Draper and Smith, 1998, p. 90) with

E[β̂_1] = β_1 − β_1 · r(ρ + r) / (1 + 2ρr + r²)
where ρ is the correlation between ξ and δ; and r is the ratio of the variance of the error in X to the error in Y.
The bias is negative, i.e. the estimated slope is too small, in most practical cases (ρ + r > 0). This is known as attenuation of the estimate, and in general, pulls the estimate towards zero.
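Plugging illustrative values of ρ and r into this bias expression makes the attenuation concrete. The sketch below assumes the formula as printed; the numeric values are hypothetical:

```python
# Expected OLS slope under measurement error, from the bias expression
#   E[b1] = beta1 - beta1 * r*(rho + r) / (1 + 2*rho*r + r**2)
# (symbols as in the text; the chosen numeric values are hypothetical).
def expected_slope(beta1, rho, r):
    """Expected value of the fitted slope when X is error prone."""
    return beta1 * (1 - r * (rho + r) / (1 + 2 * rho * r + r ** 2))

# independent errors (rho = 0): the slope is attenuated towards zero
print(expected_slope(2.0, 0.0, 0.5))   # 1.6, smaller than the true slope 2.0

# rho + r = 0: the bias vanishes
print(expected_slope(2.0, -0.5, 0.5))  # 2.0
```

Note that the second call illustrates the special case, discussed below, in which ρ + r = 0 and the bias disappears.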
© 2015 Carl James Schwarz
CHAPTER 14. CORRELATION AND SIMPLE LINEAR REGRESSION
The bias will be small in the following cases:
• the error variance of X is small relative to the error variance in Y. This means that r is small (i.e. close to zero), and so the bias is also small. In the case where X is measured without error, then r = 0 and the bias vanishes as expected.

• if the X are fixed (the Berkson case) and actually used², then ρ + r = 0 and the bias also vanishes.
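The attenuation is also easy to see in a small simulation. The sketch below (hypothetical parameter values; numpy assumed available) regresses Y on an error-prone X and compares the fitted slope against the classical attenuation factor Var(ξ)/(Var(ξ) + Var(δ)) that applies when ξ, δ, and the error in Y are mutually independent:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000
beta0, beta1 = 1.0, 2.0   # hypothetical true intercept and slope

xi = rng.normal(0.0, 1.0, n)      # true, unobserved X
delta = rng.normal(0.0, 1.0, n)   # measurement error in X
eps = rng.normal(0.0, 1.0, n)     # error in Y

x_obs = xi + delta                # error-prone measurement actually recorded
y = beta0 + beta1 * xi + eps

# OLS slope of y on the observed x
b1 = np.cov(x_obs, y, ddof=1)[0, 1] / np.var(x_obs, ddof=1)

# classical attenuation factor Var(xi) / (Var(xi) + Var(delta)) = 0.5,
# so the fitted slope should be near beta1 * 0.5 = 1.0, not 2.0
print(round(b1, 2))
```

With equal variances for ξ and δ, the fitted slope lands near half the true slope, illustrating the pull towards zero.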
The proper analysis of the error-in-variables case is quite complex – see Draper and Smith (1998, p. 91) for more details.
14.4.5
Obtaining Estimates
To distinguish between population parameters and sample estimates, we denote the sample intercept by b_0 and the sample slope as b_1. The equation of a particular sample of points is expressed as:

Ŷ_i = b_0 + b_1 X_i

where b_0 is the estimated intercept, and b_1 is the estimated slope. The symbol Ŷ indicates that we are referring to the estimated line and not to a line in the entire population.
How is the best fitting line found when the points are scattered? We typically use the principle of least squares. The least-squares line is the line that makes the sum of the squares of the deviations of the data points from the line in the vertical direction as small as possible.
Mathematically, the least-squares line is the line that minimizes

(1/n) ∑ (Y_i − Ŷ_i)²

where Ŷ_i is the point on the line corresponding to each X value. This is also known as the predicted value of Y for a given value of X. This formal definition of least squares is not that important – the concept as expressed in the previous paragraph is more important – in particular, it is the SQUARED deviation in the VERTICAL direction that is used.
It is possible to write out a formula for the estimated intercept and slope, but who cares – let the computer do the dirty work.
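Letting the computer do the dirty work can be as short as a few lines. A minimal sketch of the closed-form least-squares estimates, using a small hypothetical data set:

```python
import numpy as np

# small hypothetical data set
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# closed-form least-squares estimates
xbar, ybar = x.mean(), y.mean()
b1 = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)  # slope
b0 = ybar - b1 * xbar                                           # intercept

yhat = b0 + b1 * x                  # predicted values on the fitted line
rss = np.sum((y - yhat) ** 2)       # the quantity least squares minimizes

print(b0, b1)
```

Any statistical package produces the same b_0 and b_1; the point of the sketch is only that the estimates come directly from the data via these formulas.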
The estimated intercept (b_0) is the estimated value of Y when X = 0. In some cases, it is meaningless to talk about values of Y when X = 0 because X = 0 is nonsensical. For example, in a plot of income vs. year, it seems kind of silly to investigate income in year 0. In these cases, there is no clear interpretation of the intercept, and it merely serves as a placeholder for the line.
The estimated slope (b_1) is the estimated change in Y per unit change in X. For every unit change in the horizontal direction, the fitted line increases by b_1.