Stat 206: Linear Models
Lecture 2
Sept. 30, 2019

Simple Linear Regression Model (Review)
$n$ cases (trials/subjects): $Y_i$ – the value of the response variable in the $i$th case; $X_i$ – the value of the predictor variable in the $i$th case.
• Model equation:
  $Y_i = \beta_0 + \beta_1 X_i + \epsilon_i$, $i = 1, \ldots, n$.  (1)
• Model assumptions:
  • The $\epsilon_i$s are uncorrelated, zero-mean, equal-variance random variables:
    $E(\epsilon_i) = 0$, $\operatorname{Var}(\epsilon_i) = \sigma^2$, $i = 1, \ldots, n$;
    $\operatorname{Cov}(\epsilon_i, \epsilon_j) = 0$, $1 \le i \ne j \le n$.
• Unknown parameters:
  • $\beta_0$ – regression intercept; $\beta_1$ – regression slope
  • $\sigma^2$ – error variance
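To make the model concrete, here is a minimal sketch that simulates responses from equation (1) for fixed predictor values. The function name `simulate_slr`, the parameter values, and the choice of Gaussian errors are assumptions for illustration only; the model itself requires only uncorrelated errors with zero mean and equal variance.

```python
import random


def simulate_slr(x_values, beta0, beta1, sigma, seed=0):
    """Draw Y_i = beta0 + beta1 * X_i + eps_i for fixed X_i.

    Assumption: errors are independent N(0, sigma^2) draws. Normality is
    NOT part of the model assumptions; it is just one convenient way to get
    zero-mean, equal-variance, uncorrelated errors.
    """
    rng = random.Random(seed)  # seeded so the simulation is reproducible
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in x_values]


# Hypothetical predictor values and parameters, chosen only for illustration.
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
y = simulate_slr(x, beta0=2.0, beta1=1.0, sigma=0.5)
for xi, yi in zip(x, y):
    print(f"X = {xi:.1f}, Y = {yi:.3f}")
```

Setting `sigma=0` removes the error term, so every simulated response falls exactly on the line $\beta_0 + \beta_1 X_i$.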

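Given the $X_i$s, the distributions of the responses $Y_i$ have the following properties:
• The response $Y_i$ is the sum of two terms:
  • $\beta_0 + \beta_1 X_i$, which is a constant (nonrandom) term;
  • $\epsilon_i$, which has mean zero, so that $E(Y_i) = \beta_0 + \beta_1 X_i$.
• The $\epsilon_i$s have constant variance ⇒ $\operatorname{Var}(Y_i) = \sigma^2$ for every $i$.
• The $\epsilon_i$s are uncorrelated ⇒ $Y_i$ and $Y_j$ are uncorrelated for $i \ne j$.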
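These properties follow from model equation (1) and the error assumptions by the usual rules for expectation, variance, and covariance, treating the $X_i$s as fixed constants. A short derivation, as a sketch:

```latex
\begin{align*}
E(Y_i) &= E(\beta_0 + \beta_1 X_i + \epsilon_i)
        = \beta_0 + \beta_1 X_i + E(\epsilon_i)
        = \beta_0 + \beta_1 X_i, \\
\operatorname{Var}(Y_i) &= \operatorname{Var}(\beta_0 + \beta_1 X_i + \epsilon_i)
        = \operatorname{Var}(\epsilon_i)
        = \sigma^2, \\
\operatorname{Cov}(Y_i, Y_j) &= \operatorname{Cov}(\epsilon_i, \epsilon_j) = 0,
        \quad i \neq j.
\end{align*}
```

Adding the constant $\beta_0 + \beta_1 X_i$ shifts the mean but leaves variances and covariances unchanged, which is why the second and third lines depend only on the errors.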

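In summary, the simple linear regression model says that the responses $Y_i$ are
• random variables
• whose means are $E(Y_i) = \beta_0 + \beta_1 X_i$
• whose variances are $\operatorname{Var}(Y_i) = \sigma^2$
• Moreover, two responses $Y_i$ and $Y_j$ ($i \ne j$) are uncorrelated.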

Regression Function
$y = \beta_0 + \beta_1 x$
• A straight line.
• $\beta_1$ is the slope of the regression line: the change in $E(Y)$ per unit change of $X$.
• $\beta_0$ is the intercept of the regression line: the value of $E(Y)$ when $X = 0$.
We will study how to model and fit the regression function from data.

Figure: Regression line: $y = \beta_0 + \beta_1 x$. [Plot of $y$ against $x$, with braces marking 1 unit of $x$, $\beta_1$, and $\beta_0$.]

Least Squares Principle
For a given line $y = b_0 + b_1 x$, the sum of squared vertical deviations of the observations $\{(X_i, Y_i)\}_{i=1}^{n}$ from the corresponding points on the line is:
  $Q(b_0, b_1) = \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i)^2$.
• $(X_i, b_0 + b_1 X_i)$ is the point on the line with the same $x$-coordinate as the $i$th observation point $(X_i, Y_i)$.
• The least squares (LS) principle is to fit the observed data by minimizing the sum of squared vertical deviations. The LS line has the smallest sum of squared vertical deviations among all straight lines.
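The sum of squared vertical deviations can be computed directly for any candidate line. The sketch below does so for two lines; the function name `sum_sq_vertical_dev` and the three data points are hypothetical, chosen for illustration, and are not the observations behind the lecture's figures.

```python
def sum_sq_vertical_dev(b0, b1, xs, ys):
    """Sum of squared vertical deviations of (xs, ys) from y = b0 + b1 * x."""
    total = 0.0
    for x, y in zip(xs, ys):
        point_on_line = b0 + b1 * x  # height of the line at the same x
        total += (y - point_on_line) ** 2
    return total


# Hypothetical observations (not the data used in the lecture).
xs = [1.0, 2.0, 3.0]
ys = [3.0, 4.0, 6.0]

# Compare two candidate lines on the same data.
for b0, b1 in [(3.0, 0.5), (2.0, 1.0)]:
    q = sum_sq_vertical_dev(b0, b1, xs, ys)
    print(f"line y = {b0} + {b1}x: sum of squared deviations = {q}")
```

On these points the line $y = 2 + x$ gives a smaller sum than $y = 3 + 0.5x$, so the LS principle prefers it between the two.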

Figure: Illustration of LS principle. [Four scatterplots of the observations, $y$ against $x$, each with a candidate line:]
• $y = 3 + 0.5x$: $Q(3, 0.5) = 5.539$
• $y = 2.5 + x$: $Q(2.5, 1) = 3.041$
• True regression line $y = 2 + x$: $Q(2, 1) = 2.984$
• LS line $y = 2.09 + 1.07x$: $Q(2.09, 1.07) = 2.659$
Which line has the smaller sum of squared vertical deviations, the
LS line (a.k.a. the fitted regression line) or the true regression line?

Least Squares Estimators
The LS estimators of $\beta_0, \beta_1$ are the pair of values $b_0, b_1$ that minimize the function $Q(\cdot, \cdot)$:
  $(\hat{\beta}_0, \hat{\beta}_1) = \arg\min_{b_0, b_1} Q(b_0, b_1)$.
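As a sketch of how this minimization can be carried out, the code below uses the standard closed-form solution for simple linear regression, $\hat{\beta}_1 = \sum_i (X_i - \bar{X})(Y_i - \bar{Y}) / \sum_i (X_i - \bar{X})^2$ and $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$. These formulas are not derived in this excerpt and are assumed here. The function names `sse` and `ls_estimates` and the data are hypothetical.

```python
def sse(b0, b1, xs, ys):
    """Q(b0, b1): sum of squared vertical deviations from y = b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


def ls_estimates(xs, ys):
    """Closed-form minimizer of Q for simple linear regression (assumed formula)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all X values are equal; the slope is not identifiable")
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical observations (not the data used in the lecture).
xs = [1.0, 2.0, 3.0]
ys = [3.0, 4.0, 6.0]
b0_hat, b1_hat = ls_estimates(xs, ys)
q_min = sse(b0_hat, b1_hat, xs, ys)
print(f"b0_hat = {b0_hat:.4f}, b1_hat = {b1_hat:.4f}, Q = {q_min:.4f}")

# Nudging either coefficient away from the LS values should not lower Q.
for db0, db1 in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]:
    assert sse(b0_hat + db0, b1_hat + db1, xs, ys) >= q_min
```

For these three points the sketch gives a slope of 1.5 and an intercept of 4/3. The final loop is only a spot check that nearby lines do not achieve a smaller $Q$, not a proof of minimality.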
