<Correlation coefficient>
Descriptive study of bivariate data
Univariate: score, height, number of heads,…
Bivariate: (height, weight), (age, income),…
Data: $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$
Purpose of collecting bivariate data
- Are the variables related?
- What form of relationship is indicated by the data (linear, quadratic,…)?
- Can we quantify the strength of their relationship (strong, weak)?
- Can we predict one variable from the other?
Scatter diagram
- provides a visual display of a relationship
Ex) (2,5) (1,3) (5,6) (0,2)
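A scatter diagram of these four points can be sketched in plain text. The helper name `ascii_scatter` is hypothetical (not from the book or notes), and the sketch assumes small non-negative integer coordinates so that each point maps to one grid cell.

```python
def ascii_scatter(points):
    """Return a text grid with '*' at each (x, y) point and '.' elsewhere.

    Assumes non-negative, single-digit integer coordinates (illustration only).
    """
    max_x = max(x for x, _ in points)
    max_y = max(y for _, y in points)
    rows = []
    for y in range(max_y, -1, -1):  # top row holds the largest y value
        cells = ["*" if (x, y) in points else "." for x in range(max_x + 1)]
        rows.append(f"{y} " + " ".join(cells))
    # x-axis labels, indented to line up under the grid columns
    rows.append("  " + " ".join(str(x) for x in range(max_x + 1)))
    return "\n".join(rows)


points = [(2, 5), (1, 3), (5, 6), (0, 2)]
plot = ascii_scatter(points)
print(plot)
```

The stars run roughly from lower left to upper right, which suggests that y tends to increase with x for this data.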
<Correlation: a measure of linear relationship>
Book: 11.7 (p.590-593), Note: p.12-13
(Sample) Correlation coefficient
r = measure of the strength of the linear relation between x and y based on data (= $\hat{\rho}$)
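As a rough illustration, r can be computed by hand for the scatter-diagram example points (2,5), (1,3), (5,6), (0,2). This sketch assumes r is the usual Pearson sample correlation, r = Sxy / sqrt(Sxx * Syy), where Sxy, Sxx and Syy are sums of products of deviations from the means. The notes do not state the formula here, and the function name `sample_correlation` is hypothetical.

```python
import math


def sample_correlation(xs, ys):
    """Pearson sample correlation r = Sxy / sqrt(Sxx * Syy) (assumed formula)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of squares and cross-products of deviations from the means
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


xs = [2, 1, 5, 0]
ys = [5, 3, 6, 2]
r = sample_correlation(xs, ys)
print(round(r, 2))  # 0.93
```

Here the means are 2 and 4, so Sxy = 11, Sxx = 14 and Syy = 10, giving r = 11 / sqrt(140), about 0.93: a strong positive linear relation.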
- Figure 11.19 (p.591)
- The value of r is always between -1 and +1
- r>0: the pattern of (x,y) values is a band that runs from lower left to upper right
- r<0: the pattern of (x,y) values is a band that runs from upper left to lower right
- r=+1: all (x,y) values lie exactly on a straight line with a positive slope (perfect positive linear