STAT101_Chap9

# STAT101_Chap9 - 9 Linear Regression and Correlation Data y...


## 9. Linear Regression and Correlation

Data:

- y: a quantitative response variable
- x: a quantitative explanatory variable

(Recall that in Chap. 8 both variables were categorical.)

For example (Wagner et al., Amer. J. Community Health, vol. 16, p. 189):

- y = mental health, measured with the Hopkins Symptom List (presence or absence of 57 psychological symptoms)
- x = stress level (a measure of negative events weighted by the reported frequency and the subject’s subjective estimate of the impact of each event)

We consider:

- Is there an association? (test of independence)
- How strong is the association? (uses correlation)
- How can we describe the nature of the relationship, e.g., by using x to predict y? (regression equation, residuals)

## Linear Relationships

Linear function (straight-line relation): y = α + βx expresses y as a linear function of x with slope β and y-intercept α.

For each 1-unit increase in x, y changes by β units:

- β > 0: line slopes upward
- β = 0: horizontal line
- β < 0: line slopes downward
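
The role of the slope can be checked numerically. The following is a minimal sketch; the function name `linear` and the coefficient values are hypothetical and chosen only for illustration.

```python
def linear(x, alpha, beta):
    """Return y = alpha + beta * x for a straight-line relation."""
    return alpha + beta * x


# Hypothetical coefficients, not taken from the text.
alpha, beta = 2.0, 0.5

# A 1-unit increase in x changes y by beta, wherever we start on the line.
for x in (0, 1, 10):
    step = linear(x + 1, alpha, beta) - linear(x, alpha, beta)
    print(f"x={x}: y={linear(x, alpha, beta)}, change for +1 in x = {step}")
```

At x = 0 the function returns the intercept α, and the printed change is the same at every starting point, which is what distinguishes a straight line from a curve.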
## Example: Economic Level and CO2 Emissions

OECD (Organisation for Economic Co-operation and Development, www.oecd.org): advanced industrialized nations “committed to democracy and the market economy.”

The oecd-data file (from 2004) is on p. 62 of the text and at the text website www.stat.ufl.edu/~aa/social/

Let

- y = carbon dioxide emissions (per capita, in metric tons): ranges from 5.6 in Portugal to 22.0 in Luxembourg; mean = 10.4, standard dev. = 4.6
- x = GDP (thousands of dollars, per capita): ranges from 19.6 in Portugal to 70.0 in Luxembourg; mean = 32.1, standard dev. = 9.6

The relationship between x and y can be approximated by y = 0.42 + 0.31x.

- At x = 0, predicted CO2 level y = 0.42
- At x = 39.7 (value for U.S.), predicted CO2 level y = 0.42 + 0.31(39.7) = 12.7 (actual = 19.8 for U.S.)
- For each increase of 1 thousand dollars in per capita GDP, CO2 use is predicted to increase by 0.31 metric tons per capita

But this linear equation is just an approximation: the correlation between x and y for the OECD nations was 0.64, not 1.0. Scatterplot on next page.
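
The predictions from the approximating line y = 0.42 + 0.31x can be reproduced with a few lines of Python. This is a sketch only: the names `predict_co2`, `us_pred`, and `us_residual` are hypothetical, and the residual is computed as actual minus predicted.

```python
ALPHA, BETA = 0.42, 0.31  # intercept and slope quoted in the example


def predict_co2(gdp_thousands):
    """Predicted per capita CO2 (metric tons) for GDP in thousands of dollars."""
    return ALPHA + BETA * gdp_thousands


us_pred = predict_co2(39.7)   # prediction at the U.S. GDP value
us_residual = 19.8 - us_pred  # actual minus predicted

print(round(predict_co2(0), 2))  # prediction at x = 0 is the intercept
print(round(us_pred, 2))         # about 12.73
print(round(us_residual, 2))     # about 7.07
```

The U.S. value lies roughly 7 metric tons above the line, which is consistent with the line being only an approximation to the data.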

## Effect of variable coding?

Slope and intercept depend on units of measurement.

- If x = GDP measured in dollars (instead of thousands of dollars), then y = 0.42 + 0.00031x, because a change of \$1 has only 1/1000 the impact of a change of \$1000 (so the slope is multiplied by 0.001).
- If y = CO2 output in kilograms instead of metric tons (1 metric ton = 1000 kilograms), with x in dollars, then y = 420 + 0.31x.
- Suppose x changes from U.S. dollars to British pounds and 1 pound = 2 dollars. What happens?
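
The effect of recoding can be sketched in code, starting from the line y = 0.42 + 0.31x with x in thousands of dollars and y in metric tons. The helper `rescale_line` and its parameters `x_factor` and `y_factor` are hypothetical names introduced here: they describe a recoding new_x = x_factor · x and new_y = y_factor · y.

```python
def rescale_line(alpha, beta, x_factor=1.0, y_factor=1.0):
    """Return (intercept, slope) after recoding new_x = x_factor * x, new_y = y_factor * y.

    Substituting x = new_x / x_factor into y = alpha + beta * x and multiplying
    through by y_factor gives the coefficients below.
    """
    return y_factor * alpha, y_factor * beta / x_factor


# Thousands of dollars -> dollars: new_x = 1000 * x.
a_dollars, b_dollars = rescale_line(0.42, 0.31, x_factor=1000)

# Metric tons -> kilograms, keeping x in dollars: new_y = 1000 * y.
a_kg, b_kg = rescale_line(a_dollars, b_dollars, y_factor=1000)

print(a_dollars, b_dollars)  # intercept unchanged, slope divided by 1000
print(a_kg, b_kg)            # both coefficients multiplied by 1000
```

Rescaling x changes only the slope, while rescaling y changes both the intercept and the slope. The predictions themselves describe the same line in different units.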
## Probabilistic Models

In practice, the relationship between y and x is not “perfect” because y is not completely determined by x. Other sources of variation exist.

We let α + βx represent the mean of the y-values, as a function of x. We replace the equation y = α + βx by

E(y) = α + βx (for the population)

(Recall that E(y) is the “expected value of y”, which is the mean of its probability distribution.)

e.g., if y = income and x = no. years of education, we regard E(y) = α + β(12) as the mean

A regression function is a mathematical function that describes how the mean of the response variable y changes according to the value of an explanatory variable x.
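
As a rough illustration of a regression function describing the mean of y rather than individual values, the simulation below draws many y-values at one fixed x and compares their average with α + βx. Everything specific here is an assumption made for the sketch: the normally distributed variation around the line, the values of `alpha`, `beta`, and `sigma`, and the function name `simulate_y` do not come from the text.

```python
import random
import statistics


def simulate_y(x, alpha, beta, sigma, rng):
    """Draw one y whose mean is alpha + beta * x (normal noise is an assumption)."""
    return rng.gauss(alpha + beta * x, sigma)


rng = random.Random(0)                # fixed seed so the run is repeatable
alpha, beta, sigma = -5.0, 3.0, 8.0   # hypothetical values
x = 12                                # e.g. 12 years of education

draws = [simulate_y(x, alpha, beta, sigma, rng) for _ in range(20000)]
sample_mean = statistics.fmean(draws)
expected = alpha + beta * x           # the regression function evaluated at x

print(f"E(y) at x={x}: {expected}, average of simulated y: {sample_mean:.2f}")
```

Individual draws scatter widely around the line, but their average settles close to α + βx, which is the sense in which the equation describes the mean of y at each x.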