STAT5044: Regression and ANOVA
The Solution of Homework #4
Inyoung Kim

Problem #1.

(a)

    Call:
    lm(formula = Y ~ X1 + X2 + X3, data = data2)

    Residuals:
         Min       1Q   Median       3Q      Max
     -264.05  -110.73   -22.52    79.29   295.75

    Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
    (Intercept)  4.150e+03  1.956e+02  21.220  < 2e-16 ***
    X1           7.871e-04  3.646e-04   2.159   0.0359 *
    X2          -1.317e+01  2.309e+01  -0.570   0.5712
    X3           6.236e+02  6.264e+01   9.954 2.94e-13 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 143.3 on 48 degrees of freedom
    Multiple R-squared:  0.6883,    Adjusted R-squared:  0.6689
    F-statistic: 35.34 on 3 and 48 DF,  p-value: 3.316e-12

b1 and b2 describe the linear relationship between X1 and Y and between X2 and Y, respectively, with the other predictors held fixed. b3, however, does NOT describe a linear relationship between X3 and Y, because X3 is an indicator variable. Since there is evidence that b3 is significantly different from 0, the mean of Y differs significantly between the two groups (X3 = 0 vs. X3 = 1).

(b) The boxplot of the residuals shows that their distribution is nearly symmetric, with no outliers.

(c) In the plot of residuals against fitted values, there are two distinct groups: one around a fitted value of 4300 and the other around 4900. In the residual plot against X1, the spread of the residuals narrows as X1 increases. In the residual plot against X3, there are two distinct groups because X3 is an indicator variable. In the residual plot against X1X2, there appears to be a polynomial pattern. The normal Q-Q plot suggests that the distribution of the residuals has a heavier tail than the normal distribution.

(d) The time plot of the residuals shows no systematic pattern, which is consistent with constant variance.

(e) The Brown-Forsythe test at significance level α = 0.01 indicates that the spread of the residuals does not differ between the two groups, which is consistent with constant variance of the residuals.
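The Brown-Forsythe procedure in (e) can be sketched in R as follows. This is only a minimal sketch under stated assumptions, not the code used for this solution: the real data2 is not reproduced here, so the data frame below is a synthetic stand-in with 52 rows, and splitting the residuals by X3 is an assumption, since the text does not say how the two groups were formed. The names multilm, d1, and d2 follow the labels that appear in the output and plots; e1 and e2 are hypothetical helper names.

```r
# Synthetic stand-in for data2 (ASSUMPTION: the real data are not available here).
set.seed(1)
n <- 52
data2 <- data.frame(
  X1 = runif(n, 250000, 450000),
  X2 = runif(n, 5, 9),
  X3 = rep(c(0, 1), length.out = n)   # indicator variable, 26 cases per group
)
data2$Y <- 4150 + 0.0008 * data2$X1 - 13 * data2$X2 +
  620 * data2$X3 + rnorm(n, sd = 140)

# Fit the multiple linear regression and extract the residuals.
multilm <- lm(Y ~ X1 + X2 + X3, data = data2)
e <- resid(multilm)

# ASSUMPTION: the two groups are defined by the indicator X3.
e1 <- e[data2$X3 == 0]
e2 <- e[data2$X3 == 1]

# Absolute deviations of the residuals from their group medians.
d1 <- abs(e1 - median(e1))
d2 <- abs(e2 - median(e2))

# Two-sample t test on the absolute deviations.
# t.test() uses the Welch (unequal-variance) version by default.
bf <- t.test(d1, d2, conf.level = 0.99)
print(bf)

# Decision at alpha = 0.01: TRUE would indicate non-constant variance.
reject <- bf$p.value < 0.01
```

A large p-value from this test means the mean absolute deviations of the two groups are similar, which is the sense in which the test supports constant error variance. Because the data above are simulated, the printed numbers will not match those reported for the homework data.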
The following is the result of the Brown-Forsythe test.

    Welch Two Sample t-test

    data:  d1 and d2
    t = -1.2698, df = 48.775, p-value = 0.2102
    alternative hypothesis: true difference in means is not equal to 0
    99 percent confidence interval:
     -90.57614  32.34565
    sample estimates:
    mean of x mean of y
      95.6889  124.8042

[Figure 1: boxplot for problem 2.2]

[Figure 2: Scatter plots and normal Q-Q plot for problem 2.3. Panels: resid against fitted(multilm), X1, X2, X3, and X23; normal Q-Q plot of sample quantiles against theoretical quantiles]

[Figure 3: time plot for problem 2.4. resid against c(1:52)]

(f) The F test provides statistical evidence to reject H0: β1 = β2 = β3 = 0 at significance level α = 0.05 (p-value = 3.316e-12). Based on the t tests in the summary of the multiple linear regression,...
 Fall '11
 Staff
 Normal Distribution, Regression Analysis, Variance, Errors and residuals in statistics, X1
