Regression
Regression: Statistical technique for determining the best-fitting straight line for a set of data.
• The best-fitting straight line is called the regression line.
• The regression line is a mathematical model of the relationship between the variables, and it can be used to predict the value of Y (DV) from a known value of X (IV).
1. Equation of the regression line
Ŷ = bX + a
Ŷ = predicted value of Y; predicted score on DV
b = slope of regression line; the change in Ŷ for each one-unit increase in X
X = known value of X; known score on IV
a = y-intercept; where the regression line crosses the Y axis; the value of Ŷ when X = 0
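To make the roles of b and a concrete, here is a minimal Python sketch of the regression equation. The function name predict and the example values (slope 2, y-intercept 5) are made up for illustration and do not come from any data set in this handout.

```python
def predict(x, b, a):
    """Return Y-hat, the predicted Y, for a known X, slope b, and y-intercept a."""
    return b * x + a

# Hypothetical line chosen only for illustration: slope 2, y-intercept 5.
b, a = 2.0, 5.0

# When X = 0, the prediction equals the y-intercept a.
print(predict(0, b, a))                     # prints 5.0

# Raising X by one unit changes the prediction by exactly the slope b.
print(predict(4, b, a) - predict(3, b, a))  # prints 2.0
```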
2. How is the best-fitting line determined?
The equations for the slope and y-intercept are derived from a calculus-based procedure called the least squares method. In this method, the goal is to minimize the total distance between the line and the actual data points, measured vertically as the difference between each actual Y and its predicted value. (The "squares" part of this term comes from squaring each distance before adding them up.)
Y – Ŷ = error or residual (the difference between actual and predicted value of Y)
$$SS_{error} = \sum (Y - \hat{Y})^2$$
$SS_{error}$ = sum of squared error
The best-fitting regression line is defined as the line that has the smallest possible $SS_{error}$.
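The idea of "smallest possible sum of squared error" can be checked numerically. The sketch below uses a small hypothetical data set (made up for illustration, not the class data) and compares two candidate lines. The first line, b = 0.6 and a = 2.2, is the least-squares line for these five points. The second line is an arbitrary alternative.

```python
def ss_error(xs, ys, b, a):
    """Sum of squared residuals (Y - Y-hat) for the line Y-hat = b*X + a."""
    return sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, made up for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Least-squares line for these five points: b = 0.6, a = 2.2.
best = ss_error(xs, ys, 0.6, 2.2)

# An arbitrary competing line: b = 1.0, a = 1.0.
other = ss_error(xs, ys, 1.0, 1.0)

print(round(best, 2))  # prints 2.4
print(other)           # prints 4.0
```

Any other choice of b and a for these points should give a sum of squared error larger than 2.4.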
3. Equations for slope and y-intercept
You are NOT required to memorize the following equations. I'm providing these equations here so you know that the values for b and a don't just come out of thin air.
$$b = \frac{\sum (X - M_X)(Y - M_Y)}{\sum (X - M_X)^2}$$
$$a = M_Y - bM_X$$
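For anyone curious how these two equations turn into numbers, here is a short Python sketch that applies them step by step. The function name least_squares_line and the five data points are hypothetical, made up for illustration. They are not the optimism and depression data from class.

```python
def least_squares_line(xs, ys):
    """Return (b, a) for the least-squares line, using the mean-deviation formulas."""
    m_x = sum(xs) / len(xs)  # M_X, mean of the X values
    m_y = sum(ys) / len(ys)  # M_Y, mean of the Y values

    # Numerator of b: sum of (X - M_X)(Y - M_Y)
    numerator = sum((x - m_x) * (y - m_y) for x, y in zip(xs, ys))
    # Denominator of b: sum of (X - M_X) squared
    denominator = sum((x - m_x) ** 2 for x in xs)

    b = numerator / denominator
    a = m_y - b * m_x
    return b, a

# Hypothetical data, made up for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b, a = least_squares_line(xs, ys)
print(round(b, 2), round(a, 2))  # prints 0.6 2.2
```

Here M_X = 3 and M_Y = 4, the numerator is 6, and the denominator is 10, so b = 6/10 = .6 and a = 4 − .6(3) = 2.2. Note that the formula for b is undefined when every X value is the same, because the denominator is then zero.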
4. Using the Regression Line to Make Predictions
Example 1: We can use the data set I provided at the beginning of the unit on correlation to predict depression score from optimism score. I calculated the slope and y-intercept, and they are b = −.44 and a = 16.79.
We can plug those values into the regression equation to predict depression:
Ŷ = −.44X + 16.79
Notation note: Ŷ is called "Y hat."
Notation note: $M_X$ is the mean of the X values, and $M_Y$ is the mean of the Y values.
Ŷ = predicted depression score
X = known optimism score
a) What if someone has an optimism score of 14? What do you predict for depression?
Ŷ = −.44X + 16.79 = −.44(14) + 16.79 = 10.63
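The same prediction can be checked with a few lines of Python. This sketch takes the slope as −.44, which is an assumption about the sign: a negative slope is the value that reproduces the worked answer of 10.63. The function name predict_depression is made up for illustration.

```python
# Slope and y-intercept from Example 1.
# Assumption: the slope is negative, since -0.44 reproduces the worked answer 10.63.
b = -0.44
a = 16.79

def predict_depression(optimism):
    """Predicted depression score (Y-hat) for a known optimism score (X)."""
    return b * optimism + a

print(round(predict_depression(14), 2))  # prints 10.63
```

With a negative slope, higher optimism scores go with lower predicted depression scores.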
Fall '08
Chow
Regression Analysis, regression line