s; we just know that it needs to be in the regression equation so that the overall pattern of predictions can be shifted up or down as needed to match the true values of the outcome.

Partitioning sums of squares. One important question with any regression equation is how well it does at predicting the outcome. We answer this question in terms of how much variability in Y is explained by the regression, meaning how much uncertainty goes away when we use the regression equation to predict the outcome. Variability or uncertainty in this case is measured in sums of squares (SS), which are very similar to variance and mean squared error, except that we don't divide by degrees of freedom (yet). The total variability in Y is called SSY (sum of squares for Y) and is defined as

SSY = Σ(Y − M_Y)²   (2)

As usual, M_Y represents the mean of the sample Y. Notice that if we divided SSY by n − 1, we would have the sample variance of Y. So, the sum of squares for Y is just like the variance for Y except that we don't divide by n − 1 (i.e., it's a sum instead of an average). As with variance, we can think of the sum of squares as a measure of uncertainty, or how much error we would expect to make if we had to guess the value of Y blindly. If we have no information about a subject, our best guess of their Y score is the mean, M_Y. Therefore (Y − M_Y)² is our squared error, and SSY is the sum of the squared error over all subjects. If we do know something about a subject, then we can make a better prediction of their Y score than by blindly guessing the mean. This is what the regression equation does for us: it uses all the predictors, Xi, to come up with the best possible prediction, Ŷ. Once we have Ŷ, we can ask how well it does as an...
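The relationship between SSY and the sample variance can be checked directly. The sketch below uses a small made-up sample of Y scores (the values are illustrative, not from the text) to compute SSY as defined in equation (2) and then recover the sample variance by dividing by n − 1:

```python
# Hypothetical sample of Y scores (illustrative values only).
Y = [3.0, 5.0, 4.0, 8.0, 6.0]
n = len(Y)

M_Y = sum(Y) / n                       # sample mean of Y
SSY = sum((y - M_Y) ** 2 for y in Y)   # total sum of squares for Y, equation (2)
variance = SSY / (n - 1)               # dividing by n - 1 gives the sample variance

print(SSY)       # 14.8 for this sample
print(variance)  # 3.7 for this sample
```

The only difference between the two quantities is that final division: SSY is a sum of squared errors, while the variance is (roughly) their average.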
 Spring '08
 MARTICHUSKI
 Psychology
