STA 3024 Introduction to Statistics 2
Chapter 6: Multiple Linear Regression Analysis
As stated in Chapters 3 and 4, the table below summarizes the major material that we need to cover.
Table 1: Methods to Investigate the Association between Variables

    Chapter           Explanatory Variable(s)   Response Variable   Method
    Chapter 3         Categorical                Categorical         Contingency Tables
    Chapter 4         Categorical                Quantitative        Analysis of Variance (ANOVA)
    Chapters 5 and 6  Quantitative               Quantitative        Regression Analysis
    --                Quantitative               Categorical         (not discussed)
This chapter deals with cases where both the explanatory and response variables are quantitative; we will use regression analysis to study the association between the two variables. The regression methods that we study in this chapter are restricted to the linear regression family (as opposed to nonlinear regression analysis).
If there is only one quantitative explanatory variable, then we study simple linear regression. If there is more than one explanatory variable, then we introduce multiple linear regression.
This chapter corresponds to chapter 13 in our textbook.
PART I - BACKGROUND
1.1 Background and Remarks
Sometimes we want to take into account more than one factor to explain or predict an outcome. For example, no single factor determines NBA salaries; many do: average points per game, average rebounds, steals, etc. That is why simple linear regression needs to be generalized to multiple linear regression.
The multiple linear regression model relates the mean μ_Y of a quantitative response variable Y to a set of p independent explanatory variables X_1, X_2, ..., X_p.

The multiple linear regression equation for the population is

    μ_Y = α + β_1 X_1 + β_2 X_2 + ... + β_p X_p.
The sample prediction equation is

    Ŷ = a + b_1 X_1 + b_2 X_2 + ... + b_p X_p.
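The sample estimates a, b_1, ..., b_p are typically obtained by least squares. A minimal sketch with NumPy, using made-up illustrative numbers (the two explanatory variables loosely echo the NBA-salary example above; none of the data are real):

```python
import numpy as np

# Hypothetical data (illustrative numbers only).
# Columns of X: X1 = average points per game, X2 = average rebounds.
X = np.array([[25.0,  5.0],
              [18.0,  9.0],
              [12.0,  4.0],
              [30.0,  7.0],
              [22.0, 11.0]])
Y = np.array([14.0, 11.0, 6.0, 17.0, 14.5])  # response, e.g. a salary index

# Design matrix with a leading column of 1s, so the first fitted
# coefficient plays the role of the intercept a (estimating alpha).
design = np.column_stack([np.ones(len(Y)), X])

# Least squares gives the sample statistics a, b1, b2,
# the estimates of alpha, beta_1, beta_2.
coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
a, b1, b2 = coef

# Sample prediction equation: Y_hat = a + b1*X1 + b2*X2
Y_hat = design @ coef
print(a, b1, b2)
```

For any row of the design matrix, the corresponding entry of `Y_hat` is the estimate of μ_Y at that combination of X values.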
Here, α is a population parameter; it is the Y-intercept, i.e., the value of μ_Y when all of the explanatory variables are 0. Corresponding to the population parameter α is the sample statistic a, which is the estimate of α based on the sample data.

Furthermore, β_1, β_2, ..., β_p are the coefficients of the independent variables X_1, X_2, ..., X_p (they are population parameters). Corresponding to β_1, β_2, ..., β_p are the sample statistics b_1, b_2, ..., b_p, which are the estimates of β_1, β_2, ..., β_p based on the sample data.
For any particular combination of values of X_1, X_2, ..., X_p, the value of Ŷ is the estimate of μ_Y.

Note that we are dealing with linear regression. That is, the relationship between μ_Y and each of the explanatory variables X_1, X_2, ..., X_p is linear.
Example: Linear or Nonlinear?

  • μ_Y = α + β_1 X_1 + β_2 X_2 + ... + β_p X_p
  • μ_Y = e^α + log(β_1) log(X_1) + log(β_2) log(X_2) + ... + log(β_p) log(X_p)
  • μ_Y = e^α + β_1 X_1 + β_2 X_2 + ... + β_p X_p
  • μ_Y = cos(α) + sin(β_1) sin(X_1) + cos(β_2) cos(X_2) + ... + sin(β_p) sin(X_p)
  • μ_Y = cos(α) + sin(β_1 X_1) + cos(β_2 X_2) + ... + sin(β_p X_p)
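One way to check linearity in an explanatory variable is to ask whether a one-unit increase in that variable always changes μ_Y by the same amount. A small sketch with hypothetical coefficient values, contrasting the first (linear) form with an exponential mean function μ_Y = e^(α + β_1 X_1), used here purely as an illustrative nonlinear contrast:

```python
import math

# Hypothetical coefficients; one explanatory variable for simplicity.
alpha, beta1 = 2.0, 0.5

def mu_linear(x1):
    # mu_Y = alpha + beta1 * X1  -- linear in X1
    return alpha + beta1 * x1

def mu_exp(x1):
    # mu_Y = e^(alpha + beta1 * X1)  -- nonlinear in X1
    return math.exp(alpha + beta1 * x1)

# Effect of a one-unit increase in X1 at several starting values.
lin_steps = [mu_linear(x + 1) - mu_linear(x) for x in (0, 5, 10)]
exp_steps = [mu_exp(x + 1) - mu_exp(x) for x in (0, 5, 10)]
print(lin_steps)  # the same value (beta1) everywhere
print(exp_steps)  # a different, growing amount each time
```

In the linear model the change is always β_1 regardless of where X_1 starts; in the exponential model it depends on the starting value, so the relationship between μ_Y and X_1 is not linear.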
Remark:
In practice, we need to be careful when deciding how many explanatory vari-
ables should be included in a multiple regression model because additional explanatory
