Chapter 3 Supplement
Again, at the review, I deliberately did NOT cover some things for the sake of time. I needed to turn an 8-hour review into a 4-hour review. That being the case, I will elaborate on a few more things you should know from each chapter in my supplements.
r, the correlation coefficient
As I said at the review, r measures how strong the linear relationship is between x and y, i.e. between 2 quantitative variables. Shown below are various values of r.
r = –1: perfect (–) linear relationship
r = –.50: moderate (–) linear relationship
r = 0: no linear relationship
r = +.85: strong (+) linear relationship
r = +1: perfect (+) linear relationship
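To make these values a little more concrete, here is a minimal sketch of how r could be computed by hand from paired data. The function name `pearson_r` and the data sets are made up for illustration; they are not from the review, and on the exam you would be given r rather than asked to program it.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation coefficient r for paired data (illustrative helper)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5]
perfect_pos = [2 * v + 1 for v in x]   # points exactly on an upward line
perfect_neg = [10 - 3 * v for v in x]  # points exactly on a downward line
scattered = [2, 1, 4, 3, 5]            # upward trend, but not on a line

print(pearson_r(x, perfect_pos))  # points on an upward line give r = +1
print(pearson_r(x, perfect_neg))  # points on a downward line give r = -1
print(pearson_r(x, scattered))    # strictly between 0 and +1
```

Notice that the steepness of the line (2 versus –3) does not matter here; only how closely the points follow a line does.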
A few notes about r:
• r has no units and is bounded between –1 and +1. The closer the points are to a line, the closer r will be to –1 or +1.
• The closer r is to –1 or +1, the stronger the relationship is between x and y. [What’s considered strong depends on the field of study.]
• r must have the same sign as the slope. If the slope is positive, r must be positive. If the slope is negative, r must be negative.
• On the exam, you’ll either be given r or r². If you are given r², to find r take the square root of r² and give it the sign of the slope, i.e. $r = (\text{sign of the slope}) \cdot \sqrt{r^2}$.
• Correlation does NOT imply causation! Just because x and y are correlated (related), it does NOT imply that changes in x cause changes in y.

• r is NOT affected by the units of x and y. Suppose we wanted to predict a student’s weight (y, in lbs.) based on his/her height (x, in inches). If, for example, we converted lbs. to kilograms and inches to centimeters, the correlation would NOT change.
• Switching what we call x = the explanatory variable and y = the response will NOT change r.
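The note above about getting r from r² can be sketched in a few lines. This is only an illustration: the helper name `r_from_r_squared` and the numbers 0.64, –2.5 and 1.3 are made up, not taken from any exam.

```python
from math import copysign, sqrt

def r_from_r_squared(r_squared, slope):
    """Return r given r^2 and the slope of the regression line (illustrative helper)."""
    if not 0 <= r_squared <= 1:
        raise ValueError("r^2 must be between 0 and 1")
    # sqrt gives the size of r; copysign attaches the sign of the slope.
    return copysign(sqrt(r_squared), slope)

# Same r^2, opposite slopes: the size of r matches, the sign differs.
r_down = r_from_r_squared(0.64, slope=-2.5)  # negative, about -0.8
r_up = r_from_r_squared(0.64, slope=1.3)     # positive, about +0.8
print(r_down, r_up)
```

The size of the slope plays no role here; only its sign is used.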
Note: Changing the units of x and/or y, or changing what we call x and y, will change the slope and the y-intercept of the regression line [but not r].
On the exam, your professor may give you several scatter plots and have you determine the approximate values of r. You can get practice guessing r at the following site.
http://www.stat.uiuc.edu/courses/stat100//java/GCApplet/GCAppletFrame.html
Shown below is a screenshot from the website.
We can see that Plots A, C and D have downward slopes, so we know their correlations must be negative. Since Plot A’s points are the most spread out, it is the one with the weakest negative correlation, i.e. the one closest to 0. So, the correlation for Plot A is –0.44. We can see that Plot C is slightly tighter than Plot D, so its correlation will be slightly closer to –1. So, the correlation for Plot C is –0.82 and the correlation for Plot D is –0.81.
By process of elimination, the correlation for Plot B must be +0.24. This should make sense. There is a weak (the points do NOT fall close to a line) positive (b/c the points are upward sloping) relationship between x and y.