The total variability (in the data) can be broken into two parts: the portion that we can explain using the predictors (SSregression), and the portion that we cannot explain (SSresidual).

Explained variability. Once we've worked out the total variability (SSY) and the portion explained by the predictors (SSregression), we can calculate their ratio. The ratio is the fraction of the total variability explained by the predictors, and it is our final measure of how useful the regression was:

(5)  R2 = SSregression / SSY

The fraction of explained variability is called R2 because it extends the idea of the squared correlation, r2. Recall that in a simple correlation (i.e., when there is only one predictor), r2 is the fraction of the variance in Y that can be explained by X. In other words, when there is only one predictor, R2 and r2 are equal; R2 is just the more general concept that works when there are multiple predictors. If R2 is close to 1, the regression explains most of the variability in Y, meaning that if we know a subject's values on the predictors, we can confidently predict that subject's outcome. If R2 is close to 0, the predictors don't give us much information about the outcome. (R2 is always between 0 and 1.)
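As a concrete illustration of equation (5), here is a minimal sketch in Python (the data values are made up for the example): we fit a one-predictor regression, split SSY into SSregression and SSresidual, and check that R2 equals the squared correlation r2, as the text states for the single-predictor case.

```python
import numpy as np

# Hypothetical toy data: one predictor X and an outcome Y.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Fit a simple least-squares regression Y ≈ b0 + b1*X.
b1, b0 = np.polyfit(X, Y, 1)   # returns [slope, intercept]
Y_hat = b0 + b1 * X            # predicted values

SS_Y = np.sum((Y - Y.mean()) ** 2)        # total variability
SS_residual = np.sum((Y - Y_hat) ** 2)    # unexplained portion
SS_regression = SS_Y - SS_residual        # explained portion

R2 = SS_regression / SS_Y                 # equation (5)

# With a single predictor, R2 equals the squared correlation r2.
r = np.corrcoef(X, Y)[0, 1]
print(R2, r ** 2)
```

Note that SSregression is computed here as SSY minus SSresidual, which is exactly the decomposition described above: the explained and unexplained portions add up to the total.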
Hypothesis testing: The effect of one predictor. Once we've run a regression to find the regression coefficients for a set of predictors, we can ask how reliable those coefficients are. The regression coefficients are statistics, meaning they're computed from a sample (for each subject in our sample, we have measured all of the Xi's and Y). We can use the regression coefficients as estimates of the population values, but as with all estimators, they are imperfect. If we gathered a new sample and ran the regression on the new data, we'd obtain somewhat different values for the regression coefficients.
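The sampling variability described above can be seen directly by simulation. The sketch below (with a made-up population model, Y = 1 + 2*X + noise) draws two independent samples from the same population and fits the regression to each; the two sets of estimated coefficients come out similar but not identical, which is exactly why we need hypothesis tests on the coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit(X, Y):
    # Least-squares estimates [b0, b1], using an intercept column.
    A = np.column_stack([np.ones_like(X), X])
    return np.linalg.lstsq(A, Y, rcond=None)[0]

def draw_sample(n=30):
    # Hypothetical population: Y = 1 + 2*X plus random noise.
    X = rng.uniform(0, 10, n)
    Y = 1 + 2 * X + rng.normal(0, 2, n)
    return X, Y

coef1 = fit(*draw_sample())
coef2 = fit(*draw_sample())
print(coef1)  # estimates from sample 1
print(coef2)  # estimates from sample 2: close, but not the same
```

Each fitted coefficient is a statistic, so it varies from sample to sample around the population value; quantifying that variability is the job of the standard error and the hypothesis test discussed next.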
Spring '08, MARTICHUSKI, Psychology