# 9. Linear Regression and Correlation

Data:

• y: a quantitative response variable
• x: a quantitative explanatory variable

(Recall that in Chapter 8 both variables were categorical; later chapters have multiple explanatory variables.)

For example (Wagner et al., Amer. J. Community Health, vol. 16, p. 189):

• y = mental health, measured with the Hopkins Symptom List (presence or absence of 57 psychological symptoms)
• x = stress level (a measure of negative events, weighted by the reported frequency and the subject’s subjective estimate of the impact of each event)

We consider:

• Is there an association? (test of independence)
• How strong is the association? (uses correlation)
• How can we describe the nature of the relationship, e.g., by using x to predict y? (regression equation, residuals)

## Linear Relationships

Linear function (straight-line relation): y = α + βx expresses y as a linear function of x with slope β and y-intercept α.

For each 1-unit increase in x, y changes by β units:

• β > 0: line slopes upward (positive relationship)
• β = 0: horizontal line (y does not depend on x)
• β < 0: line slopes downward (negative relationship)

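The slope interpretation can be checked numerically. The following is a minimal Python sketch; the function names `linear` and `describe_slope` and the coefficient values are illustrative assumptions, not taken from the text.

```python
def linear(x, alpha, beta):
    """Return y = alpha + beta * x for a straight-line relation."""
    return alpha + beta * x


def describe_slope(beta):
    """Classify the direction of the relationship from the sign of the slope."""
    if beta > 0:
        return "positive"
    if beta < 0:
        return "negative"
    return "none"


# Hypothetical coefficients, chosen only for illustration.
alpha, beta = 2.0, -0.5

# A 1-unit increase in x (from 3 to 4) changes y by exactly beta.
change = linear(4.0, alpha, beta) - linear(3.0, alpha, beta)
print(change)                 # -0.5
print(describe_slope(beta))   # negative
```

Because the difference between y at x + 1 and y at x is always β, the result does not depend on which starting value of x is used.
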
## Example: Economic Level and CO2 Emissions

OECD (Organisation for Economic Co-operation and Development, www.oecd.org): advanced industrialized nations “committed to democracy and the market economy.”

The oecd-data file (from 2004) is on p. 62 of the text and at the text website www.stat.ufl.edu/~aa/social/

Let y = carbon dioxide emissions (per capita, in metric tons)

• Ranges from 5.6 in Portugal to 22.0 in Luxembourg (U.S. = 19.8)
• mean = 10.4, standard deviation = 4.6

and x = gross domestic product (GDP, in thousands of dollars per capita)

• Ranges from 19.6 in Portugal to 70.0 in Luxembourg (U.S. = 39.7)
• mean = 32.1, standard deviation = 9.6

The relationship between x and y can be approximated by y = 0.42 + 0.31x.

• At x = 0, predicted CO2 level y = 0.42 + 0.31(0) = 0.42 (irrelevant, because no GDP values are near 0)
• At x = 39.7 (value for U.S.), predicted CO2 level y = 0.42 + 0.31(39.7) = 12.7 (actual = 19.8 for U.S.)
• For each increase of 1 thousand dollars in per capita GDP, CO2 emissions are predicted to increase by 0.31 metric tons per capita

But this linear equation is just an approximation. The correlation between x and y for these nations was 0.64, not 1.0. (It is even lower, 0.41, if we remove the outlier observation for Luxembourg.)
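The predictions above can be reproduced with a short Python sketch. The intercept, slope, and U.S. values come from the text; the function name `predict_co2` is an assumption, and the residual is computed with the usual convention (observed minus predicted), which the notes mention but do not define here.

```python
ALPHA = 0.42  # intercept of the approximating line (from the text)
BETA = 0.31   # slope: metric tons per capita per $1000 of per capita GDP


def predict_co2(gdp_thousands):
    """Predicted per capita CO2 emissions (metric tons) for a given GDP."""
    return ALPHA + BETA * gdp_thousands


us_gdp = 39.7     # U.S. GDP, thousands of dollars per capita
us_actual = 19.8  # observed U.S. CO2 emissions, metric tons per capita

us_pred = predict_co2(us_gdp)
residual = us_actual - us_pred  # observed minus predicted

print(round(us_pred, 1))   # 12.7
print(round(residual, 1))  # 7.1
```

The positive residual of about 7.1 metric tons means the U.S. emits considerably more than the line predicts for its GDP, consistent with the correlation being well below 1.0.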

## Effect of variable coding?

Slope and intercept depend on the units of measurement.

• If x = GDP measured in dollars (instead of thousands of dollars), then y = 0.42 + 0.00031x (instead of y = 0.42 + 0.31x), because a change of \$1 has only 1/1000 the impact of a change of \$1000 (so the slope is multiplied by 0.001).
• If y = CO2 output in kilograms instead of metric tons (1 metric ton = 1000 kilograms), with x in dollars, then y = 1000(0.42 + 0.00031x) = 420 + 0.31x.

Suppose x changes from U.S. dollars to British pounds and 1 pound = 2 dollars. What happens?

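The unit changes above follow one general rule: if x is replaced by kx·x and y by ky·y, the intercept becomes ky·α and the slope becomes ky·β/kx. The Python sketch below applies that rule; the helper name `rescale` and its parameters are assumptions made for illustration. The last call is one way to work the pounds question, using the 1 pound = 2 dollars rate stated in the text.

```python
def rescale(alpha, beta, kx=1.0, ky=1.0):
    """Return (intercept, slope) after the unit changes
    x_new = kx * x and y_new = ky * y applied to y = alpha + beta * x."""
    return ky * alpha, ky * beta / kx


alpha, beta = 0.42, 0.31  # x in thousands of dollars, y in metric tons

# x in dollars: x_new = 1000 * x, so the slope shrinks by a factor of 1000.
a_dollars, b_dollars = rescale(alpha, beta, kx=1000)

# x in dollars and y in kilograms: y_new = 1000 * y as well.
a_kg, b_kg = rescale(alpha, beta, kx=1000, ky=1000)

# x in pounds, y in metric tons: 1 pound = 2 dollars,
# so $1000 = 500 pounds and x_new = 500 * x.
a_pounds, b_pounds = rescale(alpha, beta, kx=500)

print(a_dollars, round(b_dollars, 5))
print(round(a_kg, 2), round(b_kg, 2))
print(a_pounds, round(b_pounds, 5))
```

Under these assumptions the slope per pound is twice the slope per dollar, since one pound represents twice as much GDP as one dollar, while the intercept is unchanged.
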
## Probabilistic Models

In practice, the relationship between y and x is not “perfect” because y is not completely determined by x.
