Multiple Regression
In multiple linear regression, more than one explanatory variable is used to
explain or predict a single response variable. Many of the ideas of simple
linear regression (one explanatory variable, one response variable) carry over
to multiple linear regression.
Multiple Linear Regression Model

The statistical model for multiple linear regression is

    y = β0 + β1 x1 + β2 x2 + ... + βp xp + ε

where p is the number of explanatory variables in the model. The
deviations/errors, ε, are independent and normally distributed with mean
0 and standard deviation σ. The parameters of the model are
β0, β1, β2, ...., βp, and σ.
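To make the model concrete, here is a minimal sketch in Python (not part of the SPSS workflow described below) that simulates data from this model; the parameter values β0 = 2, β1 = 3, β2 = -1.5 and σ = 0.5 are assumptions chosen purely for illustration:

```python
import numpy as np

# Simulate y = beta0 + beta1*x1 + beta2*x2 + epsilon with p = 2
# explanatory variables; all numeric values are illustrative assumptions.
rng = np.random.default_rng(0)
n = 200                                  # number of observations
beta = np.array([2.0, 3.0, -1.5])        # beta0, beta1, beta2 (assumed)
sigma = 0.5                              # standard deviation of the errors

X = rng.normal(size=(n, 2))              # two explanatory variables x1, x2
eps = rng.normal(0.0, sigma, size=n)     # independent N(0, sigma) deviations
y = beta[0] + X @ beta[1:] + eps         # the response

print(y.shape)  # (200,)
```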
So what do we do when we have more than one “X” variable?
1. Look at the variables individually. Graph each variable (stem plot,
histogram) and determine means, standard deviations, minimums, and maximums.
Are there any outliers?
2. Look at the relationships between the variables using correlations and
scatterplots. Make a scatterplot and compute the correlation for each pair of
variables. To obtain all pairwise correlations, enter all the variables (the y
and all the x’s) into SPSS, then select Analyze>>Correlate>>Bivariate. The
higher the correlation between 2 variables and the lower the Sig. (2-tailed),
the stronger the relationship. This will help you determine which
relationships between the y and an x are strongest.
3. Do a regression to define the relationship among the variables. I start
with all the potential explanatory variables and the response variable; the
regression results will indicate/confirm which relationships are strong. This
is the most common procedure, but another one will be discussed as well.
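Step 2 can also be sketched outside SPSS. The snippet below, a hypothetical example with made-up data, computes each pairwise Pearson correlation together with its two-sided p-value, which is what SPSS labels Sig. (2-tailed) in the Bivariate output:

```python
import numpy as np
from scipy import stats

# Illustrative data: y is built to be strongly related to x1 but not x2.
rng = np.random.default_rng(1)
x1 = rng.normal(size=50)
x2 = rng.normal(size=50)
y = 2.0 * x1 + rng.normal(scale=0.5, size=50)

# Pearson correlation and two-sided p-value for each (y, x) pair,
# analogous to SPSS Analyze >> Correlate >> Bivariate.
for name, x in [("x1", x1), ("x2", x2)]:
    r, p = stats.pearsonr(y, x)
    print(f"corr(y, {name}) = {r:.3f}, Sig.(2-tailed) = {p:.4f}")
```

A high correlation paired with a small p-value, as for x1 here, flags a candidate explanatory variable worth keeping for step 3.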
We estimate the parameters β0, β1, β2, ...., βp, and σ from a sample of
n observations. Perform the multiple regression procedure on the data from the
n observations. Let b0, b1, b2, ...., bp denote the estimators of the
population parameters β0, β1, β2, ...., βp. Another notation is bj, the
jth estimator of βj, the jth population parameter, where j = 0, 1, 2, ...., p,
and p is the number of explanatory variables in the model.
For the ith observation, the predicted response is:

    ŷi = b0 + b1 xi1 + b2 xi2 + .... + bp xip
The ith residual, the difference between the observed and predicted response,
is:

    ei = observed response – predicted response = yi – ŷi
The method of least squares minimizes

    Σ ei² = Σ (yi – ŷi)²

where the sum runs from i = 1 to n.
The parameter σ² measures the variability of the response about the
regression equation. It is estimated by:

    s² = Σ ei² / (n – p – 1)

The quantity n – p – 1 is the degrees of freedom associated with s².
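As a numeric check of these formulas, the following sketch (simulated data; every constant is an assumption for illustration) fits the least-squares coefficients, forms the residuals ei = yi – ŷi, and verifies that s² = Σ ei² / (n – p – 1) lands near the true σ²:

```python
import numpy as np

# Simulate a model with p = 2 explanatory variables and sigma = 0.7,
# so the true sigma^2 is 0.49 (values assumed for illustration).
rng = np.random.default_rng(3)
n, p = 500, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])  # intercept column
beta = np.array([1.0, 2.0, -1.0])
sigma = 0.7
y = X @ beta + rng.normal(scale=sigma, size=n)

b, *_ = np.linalg.lstsq(X, y, rcond=None)  # b0, b1, ..., bp
e = y - X @ b                              # residuals e_i = y_i - yhat_i
s2 = np.sum(e**2) / (n - p - 1)            # n - p - 1 degrees of freedom
print(round(s2, 3))                        # should be close to sigma^2 = 0.49
```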
To determine/confirm which explanatory variables have strong relationships
with the response, look at the slope tests and the ANOVA.
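SPSS reports these in its Coefficients and ANOVA tables; as a hedged sketch of what they compute, the snippet below derives the slope t statistics tj = bj / SE(bj) and the overall ANOVA F statistic by hand on simulated data (all numbers are illustrative assumptions, with x2 deliberately unrelated to y):

```python
import numpy as np
from scipy import stats

# Simulated data: y depends on x1 but not x2 (assumed for illustration).
rng = np.random.default_rng(4)
n, p = 80, 2
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + rng.normal(scale=0.5, size=n)

X = np.column_stack([np.ones(n), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ b
s2 = e @ e / (n - p - 1)

# Slope tests: t_j = b_j / SE(b_j), with SE from the diagonal
# of the estimated covariance matrix s^2 (X'X)^-1.
se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
t = b / se
p_values = 2 * stats.t.sf(np.abs(t), df=n - p - 1)

# ANOVA F test: does the model as a whole explain the response?
ss_total = np.sum((y - y.mean()) ** 2)
ss_error = e @ e
F = ((ss_total - ss_error) / p) / (ss_error / (n - p - 1))
print(np.round(t, 2), np.round(p_values, 4), round(F, 1))
```

Here x1 gets a large t and a tiny p-value while x2 does not, which is exactly the pattern the slope tests are meant to reveal.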
Spring '08 · Staff