15_corr_regres2

Correlation & Regression, II
9.07, 4/6/2004

Steps in regression analysis (so far)
- Plot a scatter plot
- Find the parameters of the best-fit regression line, y' = a + bx
- Plot the regression line on the scatter plot
- Plot the residuals, x_i vs. (y_i - y_i'), as a scatter plot, for diagnostic purposes

Residual Plots
- Plotting the residuals (y_i - y_i') against x_i can reveal how well the linear equation explains the data
- Can suggest that the relationship is significantly non-linear, or other oddities
- The best structure to see is no structure at all

What we like to see: no pattern
[Figure: residual plot; x-axis: height (inches), 60 to 75; y-axis: residual, -40 to 40]
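The steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the height and weight numbers are made up, and the helper names `fit_line` and `residuals` are hypothetical, not something taken from the slides. The fit uses the standard least-squares formulas for a and b.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares estimates (a, b) for the line y' = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

def residuals(xs, ys, a, b):
    """Residuals y_i - y_i' for the fitted line y' = a + b*x."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]

# Hypothetical data, made up for illustration only.
heights = [60, 62, 65, 68, 70, 72, 75]          # inches
weights = [115, 120, 140, 155, 160, 175, 190]   # pounds

a, b = fit_line(heights, weights)
res = residuals(heights, weights, a, b)

# The (x_i, residual) pairs are what a residual plot would show.
for x, r in zip(heights, res):
    print(f"x = {x:2d}  residual = {r:+.2f}")
```

Feeding the (x_i, residual) pairs to any plotting tool gives the residual plot. For a least-squares line the residuals always sum to zero and are uncorrelated with x, so what matters diagnostically is whether any other structure remains.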
If it looks like this, you did something wrong: there's still a linear component!
[Figure: residual plot; x-axis: height (inches), 60 to 75; y-axis: residual, -40 to 40]

If there's a pattern, it was inappropriate to fit a line (instead of some other function)
[Figure: residual plot; x-axis: height (inches), 60 to 75; y-axis: residual, -40 to 40]

What to do if a linear function isn't appropriate
- Often you can transform the data so that it is linear, and then fit the transformed data.
- This is equivalent to fitting the data with a model, y' = M(x), then plotting y vs. y' and fitting that with a linear model.
- There are other tricks people use. This is outside of the scope of this class.

Coming up next…
- Assumptions implicit in regression
- The regression fallacy
- Confidence intervals on the parameters of the regression line
- Confidence intervals on the predicted value y', given x
- Correlation
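As a rough illustration of the "transform the data, then fit a line" idea from the slide on non-linear relationships, the sketch below assumes the data follow an exponential curve and uses a log transform of y. Both the data and the choice of transform are assumptions made for this example; the slides do not name a specific transform. The helper `fit_line` is hypothetical.

```python
import math
from statistics import mean

def fit_line(xs, ys):
    """Least-squares estimates (a, b) for the line y' = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b

# Hypothetical noise-free data following y = 2 * exp(0.5 * x).
xs = list(range(9))
ys = [2.0 * math.exp(0.5 * x) for x in xs]

# Straight-line fit to the raw data. The residuals are positive at both
# ends and negative in the middle: a pattern, so a line was inappropriate.
a_raw, b_raw = fit_line(xs, ys)
res_raw = [y - (a_raw + b_raw * x) for x, y in zip(xs, ys)]

# Transform y -> log(y). The relationship is now linear in x.
log_ys = [math.log(y) for y in ys]
a_log, b_log = fit_line(xs, log_ys)
res_log = [ly - (a_log + b_log * x) for x, ly in zip(xs, log_ys)]

def model(x):
    """Back-transformed fit: M(x) = exp(a_log + b_log * x)."""
    return math.exp(a_log + b_log * x)

print("raw residuals:", [round(r, 2) for r in res_raw])
print("log residuals:", [round(r, 6) for r in res_log])
```

With real, noisy data the transformed residuals would not vanish, but they should lose the systematic curve seen in the raw residuals if the transform is a good choice.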
Assumption #1: your residual plot should not look like this:
[Figure: residual plot; x-axis: height (inches), 60 to 75; y-axis: residual, -40 to 40]

Heteroscedastic data
- Data for which the amount of scatter depends upon the x-value (vs. "homoscedastic", where it doesn't depend on x)
- Leads to residual plots like that on the previous slide
- Happens a lot in behavioral research because of Weber's law.
  - Ask people how much of an increment in sound volume they can just distinguish from a standard volume
  - How big a difference is required (and how much variability there is in the response) depends upon the standard volume
- Can often deal with this problem by transforming the data, or doing a modified, "weighted" regression
- (Again, outside of the scope of this class.)

Why we care about heteroscedasticity vs. homoscedasticity
- Along with the residual plots, we often want to look at the rms (root-mean-square) error for the regression: $\mathrm{rms} = \sqrt{\sum_i (y_i - y_i')^2 / N} = s_{y'}$
- This gives us a measure of the spread around the regression line
- For this measure to be meaningful and useful, we want the data to be homoscedastic, i.e. we want the data to be spread out to the same degree for every value of x_i.

[Figure: two residual plots side by side, titled "Homoscedastic" and "Heteroscedastic"; x-axis: height (inches), 60 to 75; y-axis: residual, -40 to 40]
- Homoscedastic: Here, rms error is a good measure of the amount of spread of the data about y_i', for any value of x_i.
- Heteroscedastic: Here, rms error is not such a good measure of the spread. For some x_i it will be an overestimate of spread, for some an underestimate.
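To make the point about rms error concrete, here is a small sketch that computes rms = sqrt(Σ (y_i - y_i')² / N) for a data set whose scatter grows with x, and then recomputes it separately for small and large x. The data are made up, and the split at x = 5 and the names `fit_line` and `rms_error` are assumptions for illustration, not part of the slides.

```python
import math
from statistics import mean

def fit_line(xs, ys):
    """Least-squares estimates (a, b) for the line y' = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b

def rms_error(res):
    """Root-mean-square of the residuals: sqrt(sum(r^2) / N)."""
    return math.sqrt(sum(r * r for r in res) / len(res))

# Hypothetical heteroscedastic data: two points at each x, placed
# symmetrically around 3*x, with a spread that grows in proportion to x.
xs, ys = [], []
for x in range(1, 11):
    for sign in (+1, -1):
        xs.append(x)
        ys.append(3.0 * x + sign * 0.5 * x)

a, b = fit_line(xs, ys)
res = [y - (a + b * x) for x, y in zip(xs, ys)]

rms_all = rms_error(res)
rms_low = rms_error([r for x, r in zip(xs, res) if x <= 5])
rms_high = rms_error([r for x, r in zip(xs, res) if x > 5])

# The single overall rms overstates the spread at small x and
# understates it at large x.
print(f"overall: {rms_all:.2f}  x <= 5: {rms_low:.2f}  x > 5: {rms_high:.2f}")
```

For homoscedastic data the three numbers would be roughly equal, which is what makes a single rms error a meaningful summary of the spread around the line.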
Another assumption for regression

This note was uploaded on 11/11/2011 for the course BIO 9.07 taught by Professor Ruth Rosenholtz during the Spring '04 term at MIT.
