# And it is large in the positive or negative direction

and it is large in the positive or negative direction, we should probably be concerned about its inclusion in the sample, or look for some way to fix or account for it. In the case of DC, both practical knowledge and the text indicate that there are pockets of great wealth and great poverty; in fact, I believe that some areas of DC are the richest in the USA. Including the DC observation may therefore bias the OLS estimators. Taking DC out may give a more representative estimate of infmort than including it, because DC has properties that you can’t necessarily observe.

(3) Consider the graphical representation of a simple regression (scatterplot and fitted line). Note that the problem with an outlier is NOT that it is “far away” from the other observations but that it “pulls on the line.” “Pulls on the line” is a visual way to imagine the bias in the slope coefficients that can be introduced by including an “outlier.”
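The distinction between being “far away” and “pulling on the line” can be sketched numerically. The example below is a minimal illustration using made-up numbers (not the assignment's data): it adds one distant point that lies on the original line, then one distant point that lies off it, and compares the simple-regression slopes. The function name `ols_slope` and all data values are assumptions introduced for illustration.

```python
import numpy as np

# Synthetic data, invented for illustration only (NOT the assignment's data set).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * x  # every point lies exactly on y = 2x


def ols_slope(x, y):
    """Slope of the simple OLS regression of y on x (with an intercept)."""
    x_dev = x - x.mean()
    return float(np.sum(x_dev * (y - y.mean())) / np.sum(x_dev ** 2))


slope_without = ols_slope(x, y)

# Case 1: a point far from the others, but on the same line (x = 20, y = 40).
slope_far_on_line = ols_slope(np.append(x, 20.0), np.append(y, 40.0))

# Case 2: a point equally far away in x, but well off the line (x = 20, y = 0).
slope_far_off_line = ols_slope(np.append(x, 20.0), np.append(y, 0.0))

# Percent change in the slope, using the estimate without the extra point as the base.
pct_change_on = (slope_far_on_line - slope_without) / slope_without * 100
pct_change_off = (slope_far_off_line - slope_without) / slope_without * 100

print(f"slope without extra point:   {slope_without:.3f}")
print(f"slope, far point on line:    {slope_far_on_line:.3f} ({pct_change_on:.1f}% change)")
print(f"slope, far point off line:   {slope_far_off_line:.3f} ({pct_change_off:.1f}% change)")
```

In this toy setup the distant point that sits on the line leaves the slope unchanged, while the distant point off the line flips the sign of the slope. That is the sense in which an outlier matters because it pulls on the line, not merely because it is far away.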

As you did in the assignment for Unit #2, compute the % change in the three estimated coefficients when DC is included in the model, i.e.:

$$\frac{\tilde{\beta}_i - \hat{\beta}_i}{\hat{\beta}_i} \times 100$$

where $\tilde{\beta}_i$ is the estimate with DC included and $\hat{\beta}_i$ is the estimate without.

| Variable | % Δ in the estimated coefficient |
| --- | --- |
| ln(pcinc) | 96.5 |
| ln(physic) | -365.87 |
| ln(popul) | -109.293 |

Based on the evidence from this table, does DC “pull on the line?” Explain.

Based on the evidence in the table, DC does pull on the line. This is shown by the very large percentage changes in every coefficient, positive or negative depending on the variable. For example, including DC changes the estimated coefficient on ln(physic) (physicians per 100k civilian population) by about -366%, which is no small percentage: a change of more than 100% in the negative direction means the estimate switches sign. The fitted line is therefore pulled substantially by the inclusion of DC.

Appendix:
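As a rough check on how to read the reported percent changes, the sketch below converts each one into the ratio of the with-DC estimate to the without-DC estimate. It assumes the percent change is defined as (with-DC estimate minus without-DC estimate) divided by the without-DC estimate, times 100, and that the without-DC estimate is nonzero. The helper name `ratio_from_pct_change` is hypothetical; the three percentages are the ones reported in the table.

```python
# Percent changes as reported in the table (with-DC vs. without-DC estimates).
pct_changes = {
    "ln(pcinc)": 96.5,
    "ln(physic)": -365.87,
    "ln(popul)": -109.293,
}


def ratio_from_pct_change(pct):
    """Ratio (with-DC estimate) / (without-DC estimate) implied by a percent change.

    Follows from pct = (with - without) / without * 100, assuming without != 0.
    """
    return 1.0 + pct / 100.0


for name, pct in pct_changes.items():
    ratio = ratio_from_pct_change(pct)
    # A negative ratio means the two estimates have opposite signs.
    note = "sign flips" if ratio < 0 else "same sign"
    print(f"{name}: with-DC estimate = {ratio:.3f} x without-DC estimate ({note})")
```

Under that definition, any percent change below -100 implies the coefficient changes sign once DC is included, which applies to both ln(physic) and ln(popul) here.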
