HW 1
1. Regress Y on X2 and X3 and a constant, and calculate the predictors and
residuals
^Yi(X2, X3) = b0 + b2Xi2 + b3Xi3
ei(Y|X2, X3) = Yi - ^Yi(X2, X3)
2. Regress X1 on X2 and X3 and a constant, and calculate the predictors and
residuals
^Xi1(X2, X3) =
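The two regressions in steps 1 and 2 can be sketched in Python as follows. This is a minimal illustration, not the course's SYSTAT procedure: the data are synthetic, and the helper name `ols_residuals` is hypothetical. The last lines rely on a standard result not stated above, namely that the slope of e(Y|X2, X3) on e(X1|X2, X3) equals the coefficient of X1 in the full regression.

```python
import numpy as np

def ols_residuals(y, X):
    """Residuals from least-squares regression of y on X plus a constant."""
    Z = np.column_stack([np.ones(len(y)), X])  # prepend the constant
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return y - Z @ coef

# Synthetic data, for illustration only (not the homework data)
rng = np.random.default_rng(0)
n = 50
X1, X2, X3 = rng.normal(size=(3, n))
Y = 1.0 + 2.0 * X1 - 1.0 * X2 + 0.5 * X3 + rng.normal(scale=0.3, size=n)

others = np.column_stack([X2, X3])
e_y = ols_residuals(Y, others)    # step 1: e(Y | X2, X3)
e_x1 = ols_residuals(X1, others)  # step 2: e(X1 | X2, X3)

# Slope of e_y on e_x1; both residual vectors have mean zero,
# so no constant is needed in this last regression.
slope = (e_x1 @ e_y) / (e_x1 @ e_x1)

# Coefficient of X1 in the full regression of Y on X1, X2, X3
Z_full = np.column_stack([np.ones(n), X1, X2, X3])
b_full, *_ = np.linalg.lstsq(Z_full, Y, rcond=None)
```

Plotting `e_y` against `e_x1` gives the added-variable plot for X1.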
Study Aid 2
Graduation Rate Example
SYSTAT does not automatically calculate DFFITS, so I computed DFFITS with the
statement
>let dffits = student*sqr(leverage/(1-leverage))
In the graduation rate data n=51 and p=6, so that 2(p/n)^(1/2) = 2(6/51)^(1/2) = .686.
Loo
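For readers without SYSTAT, the same DFFITS computation can be sketched in Python. This sketch assumes that SYSTAT's `student` is the externally studentized (deleted) residual and that `leverage` is the diagonal element hii of the hat matrix; the function names `dffits` and `dffits_cutoff` are hypothetical.

```python
import numpy as np

def dffits(X, y):
    """DFFITS for every case. X must already contain the constant column."""
    n, p = X.shape
    H = X @ np.linalg.inv(X.T @ X) @ X.T  # hat matrix
    h = np.diag(H)                        # leverages h_ii
    e = y - H @ y                         # raw residuals
    sse = e @ e
    # Externally studentized (deleted) residuals, assumed to match `student`
    t = e * np.sqrt((n - p - 1) / (sse * (1 - h) - e**2))
    return t * np.sqrt(h / (1 - h))

def dffits_cutoff(n, p):
    """Rule-of-thumb cutoff 2*(p/n)^(1/2) for flagging influential cases."""
    return 2 * np.sqrt(p / n)

cutoff = dffits_cutoff(51, 6)  # n=51, p=6 as in the graduation rate data
```

Cases with |DFFITS| above `cutoff` would then be flagged for a closer look.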
Study Aid 1
1. Standardized (aka Semi-Studentized) Residual ei*
The first improvement is the standardized residual, calculated by dividing the raw
residual by the estimated standard deviation of the residuals s = MSE^(1/2) as
ei* = ei/(MSE)^(1/2)
This is simply t
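The formula ei* = ei/(MSE)^(1/2) can be sketched in Python as below. The sketch assumes MSE = SSE/(n-p), with p counting the constant, and the function name `semistudentized_residuals` is hypothetical.

```python
import numpy as np

def semistudentized_residuals(X, y):
    """Return e_i / sqrt(MSE). X must already contain the constant column."""
    n, p = X.shape
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b            # raw residuals e_i
    mse = (e @ e) / (n - p)  # assumed definition: MSE = SSE / (n - p)
    return e / np.sqrt(mse)
```

Because every residual is divided by the same s, this scaling ignores the fact that the individual residuals have different variances.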
Lecture 4 Notes
2. hii Measures the Leverage of an Observation
A larger value of hii indicates that a case has greater leverage in determining its own fitted
value ^Yi . This can be seen in 2 ways:
in the formula ^y = Hy, hii represents the weight of obser
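A small numerical check of this reading of hii, on synthetic data: if Yi is moved up by one unit and nothing else changes, the fitted value ^Yi moves by exactly hii. The sketch uses the standard definition of the hat matrix, H = X(X'X)^(-1)X'; the variable names are illustrative.

```python
import numpy as np

# Synthetic data, for illustration only
rng = np.random.default_rng(2)
n = 12
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # constant + one predictor
y = rng.normal(size=n)

H = X @ np.linalg.inv(X.T @ X) @ X.T  # standard hat matrix
h = np.diag(H)                        # leverages h_ii

i = 3
y_bumped = y.copy()
y_bumped[i] += 1.0                    # move Y_i up by one unit

# Change in the i-th fitted value; by linearity of ^y = Hy this is h_ii
change = (H @ y_bumped)[i] - (H @ y)[i]
```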
Lecture 3 Notes
In multiple regression X-outlying observations are identified using the hat matrix H.
1. Review of the Hat Matrix
We saw in Module 4 that the n x 1 vector ^y of estimated (predicted, fitted) values of y is
obtained as
^y = Hy
where
H = X(X'X
Lecture 2 Notes
Example. In the graduation rates study, the units are U.S. states plus Washington, DC. The
dependent variable (GRAD) is the rate of graduation from high school. The following
model is estimated
GRAD = CONSTANT + INC + PBLA + PHIS + EDEXP + URB.
Lecture 1 Notes
ADDED-VARIABLE PLOTS FOR FUNCTIONAL FORM & OUTLYING
OBSERVATIONS
1. Uses of Added-Variable Plots
Added-variable plots are also called partial regression plots and adjusted variable plots.
A partial regression plot is a diagnostic tool th
HW 3
1. The bootstrap is a computer-intensive method developed by Bradley Efron
and others to derive standard errors of estimate from information in the sample
and do statistical inference (hypothesis tests and confidence intervals) even in
nonstandard est
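The resampling idea can be sketched in Python as follows. This is a minimal nonparametric bootstrap under stated assumptions, not the assignment's procedure: the data are synthetic, the function name `bootstrap_se` is hypothetical, and the interval shown is the simple percentile interval.

```python
import numpy as np

def bootstrap_se(data, statistic, n_boot=2000, seed=0):
    """Bootstrap standard error of `statistic`, plus the replicates themselves."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = len(data)
    reps = np.empty(n_boot)
    for b in range(n_boot):
        # Draw n cases with replacement from the observed sample
        sample = data[rng.integers(0, n, size=n)]
        reps[b] = statistic(sample)
    return reps.std(ddof=1), reps

# Synthetic sample, for illustration only
rng = np.random.default_rng(42)
x = rng.normal(loc=10.0, scale=2.0, size=100)

se_mean, reps = bootstrap_se(x, np.mean)
ci_low, ci_high = np.percentile(reps, [2.5, 97.5])  # percentile 95% interval
```

For the sample mean the bootstrap standard error should land close to the textbook value s/n^(1/2); the method earns its keep with statistics such as `np.median`, for which no simple formula is available.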
HW 2
1. Look at the Graduation Rates file GRAD.SYD used in a previous
assignment. The dependent variable is GRAD, the state rate of graduation from
high school. We estimate the model
GRAD = CONSTANT + INC + PBLA + PHIS + EDEXP + URB.
The data and the regr
Study Aid 3
1. Get Rid of Them Outliers?
Outlying and influential cases should not be discarded automatically.
There are several situations:
if the case is the result of recording error and such, then
o if possible, correct the observation
o if not, disca