e actual data points (because your sample is not informative about what is going on far away from the sample data points). If $x_i = 0$ is not sufficiently near the actual data, the intercept does not have much meaning by itself; it only serves to locate the regression line.

• In this case, $\hat{\alpha} = -23.5$ implies that for a person with 0 years of schooling, the predicted hourly wage is −\$23.5. This does not make much sense. In our data, X lies between 12 and 16, which is far away from 0, so the data contain little information about what happens around X = 0.

Figure: Impact of the range of data on estimates

Goodness of fit

How do we measure how well the sample regression line fits the data?

$y_i = \hat{y}_i + \hat{u}_i$ (by definition, $\hat{u}_i = y_i - \hat{y}_i$), so each $y_i$ is divided into an explained part ($\hat{y}_i$) and an unexplained part ($\hat{u}_i$). We can then define the following:

• $\sum_{i=1}^{n} (y_i - \bar{y})^2$ is the total sum of squares (= total variation in $y_i$, i.e., the sample variance of Y multiplied by n−1)
• $\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$ is the explained sum of squares (SSE) (= total variation in $\hat{y}_i$, i.e., the sample variance of $\hat{y}_i$ multiplied by n−1; recall that $\bar{\hat{y}} = \bar{y}$)
...
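The split of each observation into an explained and an unexplained part can be checked numerically. The sketch below is only an illustration, not the data behind these notes. The schooling and wage values are synthetic, the coefficients used to generate them are made up, and the names `ols_fit` and `decompose` are hypothetical helpers. It also computes the sum of squared residuals, usually called the residual sum of squares, which is the standard third piece of the decomposition and is assumed here rather than taken from the text above.

```python
import random
import statistics


def ols_fit(x, y):
    """Return (alpha_hat, beta_hat) for a simple regression of y on x with an intercept."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    beta_hat = sxy / sxx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


def decompose(x, y):
    """Split y into fitted values and residuals and compute the sums of squares."""
    alpha_hat, beta_hat = ols_fit(x, y)
    y_hat = [alpha_hat + beta_hat * xi for xi in x]   # explained part
    u_hat = [yi - fi for yi, fi in zip(y, y_hat)]     # unexplained part
    y_bar = statistics.fmean(y)
    return {
        "alpha_hat": alpha_hat,
        "beta_hat": beta_hat,
        "y_hat": y_hat,
        "u_hat": u_hat,
        "total_ss": sum((yi - y_bar) ** 2 for yi in y),
        "explained_ss": sum((fi - y_bar) ** 2 for fi in y_hat),
        "residual_ss": sum(ui ** 2 for ui in u_hat),
    }


# Synthetic data (assumption): schooling between 12 and 16 years,
# wages generated from made-up coefficients plus noise.
random.seed(0)
schooling = [float(random.randint(12, 16)) for _ in range(200)]
wage = [-20.0 + 2.8 * s + random.gauss(0.0, 4.0) for s in schooling]

fit = decompose(schooling, wage)
print("intercept estimate:", fit["alpha_hat"])
print("slope estimate:", fit["beta_hat"])
print("total sum of squares:", fit["total_ss"])
print("explained + residual:", fit["explained_ss"] + fit["residual_ss"])
```

In this sketch the intercept is the predicted wage at 0 years of schooling, a value far outside the 12 to 16 range of the synthetic sample, so it mainly locates the line and need not be meaningful on its own. The last two printed numbers should agree up to floating-point error, and the mean of the fitted values should equal the mean of the wages.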
