Chapter 11
Inferences for Regression Parameters
11.1 Simple Linear Regression (SLR) Model
This topic is covered in Chapter 2 (which we skipped). In these notes we cover sections 2.3 to 2.10 (which describe the relation between two variables and hence are usually treated early, as descriptive statistics) as well as the material in Chapter 11 that covers inference about the regression parameters.
A simple linear regression model is a mathematical relationship between two quantitative variables, one of which, Y, is the variable we want to predict, using information on the second variable, X, which is assumed to be non-random. The model is written as
$Y = \beta_0 + \beta_1 x + \varepsilon$.
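To make the model concrete, here is a minimal simulation sketch in Python (not from the text): the parameter values $\beta_0 = 2$, $\beta_1 = 0.5$, and $\sigma = 1$ are hypothetical, chosen only for illustration.

```python
import numpy as np

# Hypothetical (normally unknown) parameter values, chosen only for illustration
beta0, beta1, sigma = 2.0, 0.5, 1.0

rng = np.random.default_rng(42)            # seeded generator for reproducibility
x = np.linspace(0, 10, 25)                 # X is treated as non-random
eps = rng.normal(0.0, sigma, size=x.size)  # iid N(0, sigma^2) random errors
y = beta0 + beta1 * x + eps                # the model: Y = beta0 + beta1*x + eps
```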
In this model,
• Y is called the response variable or the dependent variable (because its value depends to some extent on the value of X).
• X is called the predictor (because it is used to predict Y) or explanatory variable (because it explains the variation or changes in Y). It is also called the independent variable (because its value does not depend on Y).
• The true regression line is denoted as $Y = \beta_0 + \beta_1 x$.
• The parameters of the true regression line are the constants $\beta_0$ and $\beta_1$.
• $\beta_0$ is the intercept of the true regression line.
• $\beta_1$ is the slope of the true regression line.
• The true values of the regression parameters, as well as the true regression line, are unknown. The true regression line shows the deterministic relationship between X and Y (since X is non-random and $\beta_0$ and $\beta_1$ are unknown constants).
• The random error, $\varepsilon$, incorporates the effects on Y of all variables (factors) other than X, in such a way that their net effect is zero on average.
• An observation on the i-th unit in the population, denoted by $y_i$, is $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$.
• Here, $\varepsilon_i$ is the difference between the observed value of Y and the value on the true regression line that corresponds to $X = x_i$.
• The $\varepsilon_i$ are independent of each other and they all have the same normal distribution, with mean zero and variance $\sigma^2$; that is, $\varepsilon_i \stackrel{iid}{\sim} N(0, \sigma^2)$.
• As a result of the above property, the $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ are random variables that have normal distributions, with a mean $\mu_{Y|x}$ that depends on the value of X and a variance $\sigma^2$ that is the same for all X values, i.e., $y_i \sim N(\mu_{Y|x_i} = \beta_0 + \beta_1 x_i,\ \sigma^2)$. (A short derivation of the mean and variance follows this bullet.)
• $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$ is called the prediction equation, and $\hat{y}$ is the predicted value of Y for $X = x$.
• Residual: $e_i = \hat{\varepsilon}_i = y_i - \hat{y}_i$ is the difference between the observed and predicted value of Y. (A small numeric sketch follows this bullet.)
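As a minimal numeric sketch, reusing x and y from the simulation above, with hypothetical estimates $\hat{\beta}_0 = 1.8$ and $\hat{\beta}_1 = 0.55$ made up for illustration:

```python
# Hypothetical coefficient estimates, made up for illustration
b0_hat, b1_hat = 1.8, 0.55

y_hat = b0_hat + b1_hat * x   # predicted values from the prediction equation
e = y - y_hat                 # residuals: observed minus predicted
```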
• The parameters of the true regression line are estimated by the method of least squares (LSE), in which the sum of the squared residuals, $\sum_{i=1}^{n} e_i^2$, is minimized; a consequence of the resulting normal equations is that $\sum_{i=1}^{n} e_i = 0$.
• The LSEs of $\beta_0$ and $\beta_1$ are $b_1 = \hat{\beta}_1 = r \times \dfrac{S_Y}{S_X}$ and $b_0 = \hat{\beta}_0 = \bar{Y} - b_1 \bar{X}$. (A computational sketch follows.)
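A minimal sketch of these formulas in Python, using the simulated x and y from the first sketch; np.polyfit serves only as an independent check. Here r is the sample correlation coefficient and $S_X$, $S_Y$ are the sample standard deviations (ddof=1).

```python
# Least-squares estimates via the b1 = r * Sy/Sx formulas
r = np.corrcoef(x, y)[0, 1]             # sample correlation coefficient
Sx, Sy = x.std(ddof=1), y.std(ddof=1)   # sample standard deviations

b1 = r * Sy / Sx                        # slope estimate: b1 = r * Sy/Sx
b0 = y.mean() - b1 * x.mean()           # intercept estimate: b0 = ybar - b1*xbar

e = y - (b0 + b1 * x)                   # residuals from the fitted line
print(b0, b1, e.sum())                  # e.sum() should be ~0 (up to rounding)

# Independent check with NumPy's degree-1 polynomial fit
b1_chk, b0_chk = np.polyfit(x, y, 1)    # returns [slope, intercept]
```

The residuals summing to (numerically) zero reflects the normal-equation property noted in the least-squares bullet above.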