Lecture 6, Part 2 - Correlation and Regression

SOCIOLOGY 005 Lecture 6 Correlation and Regression

By doing things the easy way, we were quickly able to determine whether or not the variation in Y is significantly related to X. We were also able to quickly calculate $R^2$ and Pearson’s r.

But the shortcuts we took are not without cost:
They only work with bivariate regression (converting r to $R^2$).
We don’t have a clear idea about what error really is.
Our measure of model fit told us nothing about the regression line itself in terms of inference and confidence intervals.

We must now look at error and the form of a regression equation in more detail.

If there were a strong linear law in which only education affected the number of crimes committed, our previous equation would reflect it:

$Y = a + bX$

Predicted Values and Error

Unfortunately, education is but one factor that might be related to crime, and as such we are likely to commit error when trying to predict Y using X. This being the case, the equation does not give the real value of Y. Instead, we note that the value it gives for Y is just that: a predicted value.

$\hat{Y} = a + bX$
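To make the predicted-value equation $\hat{Y} = a + bX$ concrete, here is a minimal Python sketch. The data are invented for illustration only (years of education and number of crimes for five hypothetical people), and the helper name `fit_line` is an assumption, not something from the lecture. It estimates a and b with the usual least-squares formulas and then computes a predicted value for each case.

```python
# Hypothetical data (not from the lecture): years of education (X)
# and number of crimes committed (Y) for five people.
education = [8, 10, 12, 14, 16]
crimes = [9, 8, 4, 5, 1]


def fit_line(x, y):
    """Return (a, b) for the least-squares line Y-hat = a + bX."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares of X, both about the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept
    return a, b


a, b = fit_line(education, crimes)

# Predicted values: what the line says Y should be at each observed X.
predicted = [a + b * xi for xi in education]

for xi, yi, y_hat in zip(education, crimes, predicted):
    print(f"X={xi:2d}  actual Y={yi}  predicted Y={y_hat:.2f}")
```

With these made-up numbers the predicted values generally differ from the actual ones, which is exactly the point of writing $\hat{Y}$ rather than Y on the left-hand side.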

Although we adjusted the equation to reflect that Y is a predicted value, we are still interested in the actual value of Y. We can compare the predicted value of Y to the actual value of Y in order to better understand the relationship between X and Y. This allows us to calculate the amount of error we made when predicting Y:

$e_i = Y_i - \hat{Y}_i$

Since we are now taking error into account, we can rewrite our formula so that it gives the actual value of Y:

$Y = a + bX + e$

This new equation is the formula for the linear regression model.
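The error term can be sketched in a few lines of Python. The data below are invented for illustration, and the intercept and slope (a = 16.8, b = -0.95) are assumed to come from a least-squares fit to that same made-up data. They are not values from the lecture. The sketch computes each error $e_i = Y_i - \hat{Y}_i$ and then checks that adding the error back to the prediction recovers the actual value, as in $Y = a + bX + e$.

```python
# Hypothetical data (not from the lecture): years of education (X)
# and number of crimes committed (Y) for five people.
education = [8, 10, 12, 14, 16]
crimes = [9, 8, 4, 5, 1]

# Assumed least-squares intercept and slope for the data above.
a, b = 16.8, -0.95

# Predicted value for each case: Y-hat_i = a + b * X_i
predicted = [a + b * x for x in education]

# Error (residual) for each case: e_i = Y_i - Y-hat_i
residuals = [y - y_hat for y, y_hat in zip(crimes, predicted)]

# Linear regression model: Y_i = a + b * X_i + e_i
reconstructed = [a + b * x + e for x, e in zip(education, residuals)]

for x, y, y_hat, e in zip(education, crimes, predicted, residuals):
    print(f"X={x:2d}  Y={y}  Y_hat={y_hat:.2f}  e={e:+.2f}")
```

A positive error means the line under-predicted that case, and a negative error means it over-predicted. For a least-squares line that includes an intercept, the errors sum to zero (up to floating-point rounding), so positive and negative errors cancel across the sample.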
