SUMMARY OF WEEK 6 [STAT4610 Applied Regression Analysis]
FALL 10, DR. NEDRET BILLOR | Auburn University

Cont’d on CHAPTER 2
Cont’d on Inferences in Regression

2) Relationship between t-test and F-test
The t-test is equivalent to the F-test in simple linear regression. The t-test is more adaptable, since it can be used for one-sided as well as two-sided alternative hypotheses, whereas the F-test considers only the two-sided alternative.

3) General Linear Test Approach
To test H0: β1 = 0 vs. H1: β1 ≠ 0, write the hypotheses as
H0: Reduced Model (RM) is adequate.
H1: Full Model (FM) is adequate.
That is,
H0: Y = β0 + ε (Reduced model)
H1: Y = β0 + β1X1 + ε (Full model)

Test statistic:
$$F^* = \frac{[SSE(RM) - SSE(FM)] / (df_{RM} - df_{FM})}{SSE(FM) / df_{FM}} \sim F_{1,\,n-2}$$
where
$$SSE(FM) = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \quad df_{FM} = n - 2$$
$$SSE(RM) = \sum_{i=1}^{n} (y_i - \hat{\beta}_0)^2, \quad df_{RM} = n - 1$$
so that dfRM − dfFM = 1. Under the reduced model the least squares estimate is β̂0 = ȳ, so SSE(RM) = SSTO.

If F* > F(1-α; 1, n-2) or p-value < α, reject the null.
This is identical to the ANOVA test statistic!
$$t^{*2} = \left( \frac{\hat{\beta}_1}{\sqrt{s^2 / S_{xx}}} \right)^2 = F^* \left( = \frac{MSR}{MSE} \right)$$

4) Coefficient of Determination:
R2 = SSR / SSTO = 1 − SSE / SSTO gives the proportion of total variation in Y explained by the predictor X.
Since 0 ≤ SSE ≤ SSTO ⇒ 0 ≤ R2 ≤ 1.
The larger R2 is, the more the total variation of Y is reduced by introducing the predictor variable X.
Relationship between correlation coefficient and coefficient of determination: rxy = ±√R2, where the sign is that of the estimated slope β̂1.

CHAPTER 3
Diagnostics and Remedial Measures

1) Standard Regression Assumptions
a. On the form of the model (Linearity)
b. On the model errors (ε): ε ~ NID(0, σ2), σ2 common variance. This implies four assumptions:
• Normality assumption
• All εi, i = 1, 2, …, n have zero mean.
• Constant variance assumption (or homoscedasticity)
• Independent error assumption
c. On predictors
Three assumptions concerning the predictors:
• predictors are nonrandom (fixed or selected in advance),
• predictors are measured without error (No tools: these assumptions cannot be validated!),
• predictors are assumed to be linearly independent of each other.
d. On observations
All observations are equally reliable and have an approximately equal role in determining the regression results and in influencing conclusions.

2) Residual: the most important quantity for detecting model adequacy or deficiencies.
Two types of residuals:
Ordinary residual: $e_i = Y_i - \hat{Y}_i, \quad i = 1, 2, \ldots, n$
Standardized (semi-studentized) residual: $e_i^* = e_i / \sqrt{MSE}, \quad i = 1, 2, \ldots, n$,
where MSE is approximately the average variance of the residuals.
A large standardized residual is a candidate outlier (|ei*| > 3, say), so standardized residuals are helpful in identifying outliers.

3) Plots based on residuals are important tools for detecting violations of most of the model assumptions.
Informal diagnostic plots for checking assumptions, with the departures each helps detect:
• Residuals vs. predictor variable (for linearity, constant variance and outlier detection)
• Absolute or squared residuals vs. predictor variable (for linearity, constant variance and outlier detection)
• Residuals vs. fitted values (for linearity, constant variance and outlier detection)
• Residuals vs. time or any other sequence (for independence of errors)
• Residuals vs. omitted x
• Box plot of residuals
• Normal probability plot of residuals (for normality assumption)
[Figure: residual plot patterns: a) ideal, b) need another predictor, c) non-constant error variance (funnel shaped)]
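The Chapter 2 identities and the semi-studentized residuals summarized above can be checked numerically. The following is a minimal sketch in plain Python, assuming a small made-up data set (the `x` and `y` values are hypothetical and do not come from the course); it fits the full model by least squares, forms the general linear test statistic F*, and compares it with t*2, then computes R2, rxy and the semi-studentized residuals.

```python
import math

# Hypothetical illustrative data (an assumption, not from the notes).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least squares fit of the full model Y = b0 + b1*X + error.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Error sums of squares and degrees of freedom for both models.
sse_fm = sum(e ** 2 for e in residuals)        # full model
sse_rm = sum((yi - y_bar) ** 2 for yi in y)    # reduced model, equals SSTO
df_fm, df_rm = n - 2, n - 1

# General linear test statistic and the squared t statistic for the slope.
mse = sse_fm / df_fm
f_star = ((sse_rm - sse_fm) / (df_rm - df_fm)) / mse
t_star = b1 / math.sqrt(mse / s_xx)

# Coefficient of determination and the correlation with the slope's sign.
r2 = 1 - sse_fm / sse_rm
r_xy = math.copysign(math.sqrt(r2), b1)

# Semi-studentized residuals and candidate outliers (|e*| > 3).
semi_studentized = [e / math.sqrt(mse) for e in residuals]
outliers = [i for i, e in enumerate(semi_studentized) if abs(e) > 3]

print(f"F* = {f_star:.4f}, t*^2 = {t_star ** 2:.4f}, R^2 = {r2:.4f}")
print("candidate outliers (indices):", outliers)
```

The two printed statistics agree up to rounding, which is the F* = t*2 identity. In a real analysis, F* would then be compared with the F(1-α; 1, n-2) critical value from a table or a statistics package.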
• Sequence plot of residuals for checking the independence of model error terms
[Figure: residuals ei vs. time or observation number, two panels: ideal (i.e. errors are independent); errors are NOT independent]
• Normality plots of residuals
[Figure: normal scores vs. ordered e*i, five panels: ideal; light tailed; heavy tailed; positive skewed; negative skewed]
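The normality plots above pair the ordered standardized residuals with normal scores. The notes do not say how the normal scores are computed; the sketch below assumes Blom's plotting positions (k − 0.375)/(n + 0.25), a common choice, and uses hypothetical residual values. A roughly straight-line relationship between the two columns (correlation near 1) is what the ideal panel looks like.

```python
import math
from statistics import NormalDist


def normal_scores(n):
    """Approximate expected standard normal order statistics.

    Uses Blom's plotting positions (k - 0.375) / (n + 0.25); this formula
    is an assumption, since the notes do not specify one.
    """
    std_normal = NormalDist()
    return [std_normal.inv_cdf((k - 0.375) / (n + 0.25)) for k in range(1, n + 1)]


def pearson(u, v):
    """Sample correlation coefficient between two equal-length lists."""
    u_bar = sum(u) / len(u)
    v_bar = sum(v) / len(v)
    s_uv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    s_uu = sum((a - u_bar) ** 2 for a in u)
    s_vv = sum((b - v_bar) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)


# Hypothetical semi-studentized residuals (not from the notes).
e_star = [0.42, -1.10, 0.05, 1.31, -0.37, -0.64, 0.88]

ordered = sorted(e_star)
scores = normal_scores(len(ordered))

# Points of the normal probability plot: (ordered residual, normal score).
for e, z in zip(ordered, scores):
    print(f"{e:6.2f}  {z:7.3f}")

# Correlation between the two columns as a rough straightness measure.
r = pearson(ordered, scores)
print(f"correlation = {r:.3f}")
```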
This note was uploaded on 10/12/2010 for the course STAT 4630 taught by Professor Billor during the Spring '10 term at Auburn University.