Correlation
Lecture 4

Bivariate Relationships
Relations between two variables
How is one variable related to another variable?

Univariate Distributions
Distribution of scores on one variable

Bivariate Distributions
Joint distribution of two variables in which the scores are paired

Correlation
The degree of association between two variables
Examples

Causation?
Bad weather is correlated with car accidents
Wearing sunblock is correlated with diagnosis of skin cancer
Correlation does not equal causation
But for a cause to exist, there must be a correlation

Example
My friend wanted to know about the dating habits of undergraduate students. Specifically, she wanted to know the degree to which people think men should pay on dates and whether this degree relates to successful dating.
All (x, y) pairs stay together
Degree men should pay (X)    Dating success (Y)
2                            2.10
2                            1.80
3                            3.10
4                            3.20
5                            2.60

1st step: Graph the Data
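One way to graph these five pairs without a plotting library is a rough text scatterplot. The sketch below is only an illustration: the names `pairs` and `text_scatter` and the grid size are assumptions, not part of the lecture, and a real plotting tool would give a better picture.

```python
# Paired scores from the lecture's table; each (x, y) pair stays together.
pairs = [(2, 2.10), (2, 1.80), (3, 3.10), (4, 3.20), (5, 2.60)]


def text_scatter(pairs, rows=8, cols=31):
    """Return a list of text lines with one '*' per (x, y) pair.

    Hypothetical helper: X runs left to right, Y runs bottom to top.
    """
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    if x_min == x_max or y_min == y_max:
        raise ValueError("need some spread on both variables to draw a plot")

    grid = [[" "] * cols for _ in range(rows)]
    for x, y in pairs:
        # Scale each score onto the grid; row 0 is the top of the plot.
        col = round((x - x_min) / (x_max - x_min) * (cols - 1))
        row = round((y_max - y) / (y_max - y_min) * (rows - 1))
        grid[row][col] = "*"
    return ["".join(line) for line in grid]


for line in text_scatter(pairs):
    print("|" + line)
```

Each star is one person's pair of scores, so the two people with X = 2 appear stacked in the leftmost column.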
[Figure: scatterplot of dating success (rating) against degree people think men should pay for dates]

Scatterplots
A graph of the bivariate frequency distribution
Each point represents a pair of scores
[Figure: sample scatterplot with axes labeled Variable 1 and Variable 2 and one point marked (X, Y)]

Step 2: How well are the two variables related?
Correlation indicates the degree to which two variables are related
The variables are related in a linear fashion
Drawing a line that represents the relation (line of best fit)
[Figure: scatterplot of dating success (rating) against degree people think men should pay for dates]

What is the relation between women paying and dating success?
[Figure: scatterplot of DATESUC (vertical axis) against WOMENPAY (horizontal axis)]

Is there a relationship between the number of letters in a person's name and their dating success?
[Figure: scatterplot of DATESUC (vertical axis) against LETTERS (horizontal axis)]

Direction
Positive correlation
  As one variable goes up, the other goes up
  As one variable goes down, the other goes down
Negative correlation
  As one goes up, the other goes down
  As one goes down, the other goes up
Zero correlation
  No relation between the two variables

Positive correlation
[Figure: example scatterplot]

Negative Correlation
[Figure: example scatterplot]

Zero Correlation
[Figure: example scatterplot]

Guess the direction of the correlation...
1. Time spent studying with scores on exams
2. Height with weight
3. Number of cigarettes smoked a day with life expectancy
4. Color you prefer with IQ

Solutions & Explanation
1. Time spent studying with scores on exams: positive; scores on exams increase as time spent studying increases.
2. Height and weight: positive; as height increases, weight increases.
3. Number of cigarettes smoked a day and life expectancy: negative; the more you smoke, the lower your life expectancy.
4. Color you prefer and IQ: none; these variables are not related.

Correlation Coefficient: r
Descriptive statistic that expresses the degree of relation between two variables
Example: what is the relation between conscientiousness and longevity?

Calculating r
3 methods listed in the book
We will only use the definitional method
Because we use z-scores, we use the population standard deviation formula (just N in the denominator)

Calculation Example

Properties of r
Ranges from -1 to 1
Comes from a bivariate distribution
There is no IV and DV per se, but we assign one variable as X and one as Y

Interpreting r
Direction & Magnitude
Is it positive or negative?
And to what degree?
r² (or a silly interpretation of r)
Strong, Medium, or Weak?

r²: the coefficient of determination
How much variance in Y is accounted for by X?
Rule of thumb: only use r² if you really know why you are using r²
Isn't very useful because people don't understand variance

Size of the effect
Look at the absolute value
Values close to 1.00 are a strong positive correlation
Values close to -1.00 are a strong negative correlation
Values close to zero indicate a weak or nonexistent correlation

Size of the Effect
Book gives some prescriptions
Here are mine...
|r| < .10 = weak
.10 < |r| < .30 = medium
.30 < |r| < .50 = strong
|r| > .50 = huge

Example 1
Time spent studying with scores on exams (r = .08)
Strength: small
Direction: positive
Explanation: There is a small positive correlation between time spent studying and exam scores such that as time spent studying increases, exam scores increase.

Example 2
Height & Weight (r = .75)
Strength: huge
Direction: positive
Explanation: There is a huge positive correlation between height and weight such that as height increases, weight increases.

Example 3
Number of cigarettes smoked per day and life expectancy (r = -.28)
Strength: medium
Direction: negative
Explanation: There is a medium negative relationship between smoking cigarettes and life expectancy such that as the number of cigarettes smoked per day increases, life expectancy decreases.

Related variables with small r's
Could be due to two things:
Truncated Range
Nonlinear relationships
Scatterplots are useful for both of these

Nonlinearity
[Figure: scatterplot illustrating a nonlinear relationship]

Truncated Range
Range of the sample is much smaller than the range of the population
[Figure: two scatterplots, one with the horizontal axis running from 3.5 to 6.5 and one with the horizontal axis running from 0 to 7]
...
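The definitional method described under Calculating r can be sketched in a few lines: convert each variable to z-scores with the population standard deviation (N in the denominator), multiply the paired z-scores, and average the products. The sketch below applies it to the lecture's dating data. The function names are hypothetical, the handling of values that fall exactly on a cutoff is an assumption, and the U-shaped data set is made up purely to illustrate the nonlinearity point.

```python
from math import sqrt


def z_scores(values):
    """Convert scores to z-scores using the population SD (N in the denominator)."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def pearson_r(xs, ys):
    """Definitional method: r is the mean of the products of paired z-scores."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)


def describe(r):
    """Label size and direction using the lecture's cutoffs.

    Assumption: a value exactly on a cutoff goes to the larger category.
    """
    size = abs(r)
    if size < 0.10:
        strength = "weak"
    elif size < 0.30:
        strength = "medium"
    elif size < 0.50:
        strength = "strong"
    else:
        strength = "huge"
    direction = "positive" if r > 0 else "negative" if r < 0 else "zero"
    return f"{strength} {direction}"


# Paired scores from the lecture's table.
should_pay = [2, 2, 3, 4, 5]
dating_success = [2.10, 1.80, 3.10, 3.20, 2.60]
r = pearson_r(should_pay, dating_success)
print(f"r = {r:.2f}, r^2 = {r * r:.2f}, {describe(r)}")

# Made-up U-shaped data: Y depends entirely on X, but not in a linear fashion.
u_x = [-2, -1, 0, 1, 2]
u_y = [x * x for x in u_x]
r_u = pearson_r(u_x, u_y)
print(f"U-shaped data: |r| = {abs(r_u):.2f}")
```

For the lecture data r works out to about .61, so r² is about .37. The U-shaped data give an r of essentially zero even though Y is completely determined by X, which is why a scatterplot is worth checking before reading a small r as "no relation". On Python 3.10 or later, `statistics.correlation` can serve as a cross-check on `pearson_r`.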
 Spring '10
 Ryne
