hw4fall11sol - STAT5044: Regression and ANOVA The Solution...

STAT5044: Regression and ANOVA
The Solution of Homework #4
Inyoung Kim

Problem# 1.

(a)

    Call:
    lm(formula = Y ~ X1 + X2 + X3, data = data2)

    Residuals:
        Min      1Q  Median      3Q     Max
    -264.05 -110.73  -22.52   79.29  295.75

    Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
    (Intercept)  4.150e+03  1.956e+02  21.220  < 2e-16 ***
    X1           7.871e-04  3.646e-04   2.159   0.0359 *
    X2          -1.317e+01  2.309e+01  -0.570   0.5712
    X3           6.236e+02  6.264e+01   9.954 2.94e-13 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 143.3 on 48 degrees of freedom
    Multiple R-Squared: 0.6883,     Adjusted R-squared: 0.6689
    F-statistic: 35.34 on 3 and 48 DF,  p-value: 3.316e-12

b1 and b2 give information about a linear relationship between X1 and Y and between X2 and Y, respectively. But b3 does NOT give any information about a linear relationship between X3 and Y, because X3 is an indicator variable. Since there is evidence that b3 is significantly different from 0, this result implies that the mean Y values of the two groups (X3 = 0 vs. X3 = 1) are significantly different.

(b) The boxplot of the residuals shows that the distribution of the residuals is almost symmetric, without outliers.

(c) In the scatter plot against the fitted values, there are two different groups: one around fitted values of 4300 and the other around 4900. In the residual plot against X1, the spread of the residuals gets narrower as X1 increases. In the residual plot against X3, there are two different groups because X3 is an indicator variable. In the residual plot against X1X2, there seems to be a polynomial pattern between the two. The normal Q-Q plot suggests that the distribution of the residuals has a heavier tail than the normal distribution.

(d) In this plot there is no particular pattern, which is consistent with constant variance.

(e) The Brown-Forsythe test at significance level α = 0.01 suggests that the spread of the residuals does not differ between the two groups, which implies constant variance of the residuals.
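The Brown-Forsythe procedure in part (e) can be sketched in base R as follows. This is only an illustration under stated assumptions: the homework data (data2) and the rule used to split the residuals into two groups are not given in this preview, so `resid_sim` and `group` below are simulated, hypothetical stand-ins. The only details taken from the solution are the sample size (52 observations), the names d1 and d2, the Welch form of the t-test, and the 99 percent confidence level.

```r
# Hypothetical stand-ins: the real residuals and grouping are not available here.
set.seed(1)
resid_sim <- rnorm(52, mean = 0, sd = 140)  # simulated residuals, n = 52
group     <- rep(c(0, 1), each = 26)        # assumed two-group split

# Brown-Forsythe idea: compare the absolute deviations of the residuals
# from their own group median across the two groups.
abs_dev <- function(e) abs(e - median(e))
d1 <- abs_dev(resid_sim[group == 0])
d2 <- abs_dev(resid_sim[group == 1])

# t.test() defaults to the Welch (unequal-variance) two-sample test;
# conf.level = 0.99 matches a test at significance level 0.01.
bf <- t.test(d1, d2, conf.level = 0.99)
print(bf)
```

A large p-value from this comparison means the typical size of the residuals is similar in the two groups, which is how the solution reads its result. Some textbook presentations of the test use the pooled-variance t statistic instead, which would correspond to adding `var.equal = TRUE`.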
The following is the result of the Brown-Forsythe test.

    Welch Two Sample t-test

    data:  d1 and d2
    t = -1.2698, df = 48.775, p-value = 0.2102
    alternative hypothesis: true difference in means is not equal to 0
    99 percent confidence interval:
     -90.57614  32.34565
    sample estimates:
    mean of x mean of y
      95.6889  124.8042

Figure 1: boxplot for problem 2.2 [boxplot of the residuals]

Figure 2: Scatter plots and normal Q-Q plot for problem 2.3 [panels: resid against fitted(multilm), X1, X2, X3, and X23; Normal Q-Q Plot of Sample Quantiles against Theoretical Quantiles]

Figure 3: time plot for problem 2.4 [resid against c(1:52)]

(f) The F test provides statistical evidence to reject H0: β1 = β2 = β3 = 0 at the significance level α = 0.05 (p-value = 3.316e-12). Based on the t tests in the summary of the multiple linear regression, ...
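The numbers quoted in parts (a), (e), and (f) can be cross-checked with a few lines of base R. This is a sketch that uses only the rounded values printed in the outputs above, so the recomputed statistics should agree with the reported ones only up to rounding. The variable names are chosen here for illustration.

```r
# Values copied from the lm() summary in part (a)
est <- c(X1 = 7.871e-04, X2 = -1.317e+01, X3 = 6.236e+02)
se  <- c(X1 = 3.646e-04, X2 = 2.309e+01, X3 = 6.264e+01)
df_resid <- 48   # residual degrees of freedom
p <- 3           # number of predictors

# Each t value is the estimate divided by its standard error;
# the two-sided p-value comes from the t distribution with 48 df.
t_val <- est / se
p_val <- 2 * pt(abs(t_val), df = df_resid, lower.tail = FALSE)

# Overall F statistic for H0: beta1 = beta2 = beta3 = 0, rebuilt from R-squared
r2 <- 0.6883
f_stat <- (r2 / p) / ((1 - r2) / df_resid)
f_p <- pf(f_stat, p, df_resid, lower.tail = FALSE)

# Brown-Forsythe output in part (e): the confidence interval should be
# centered on the difference of the two reported sample means.
ci <- c(-90.57614, 32.34565)
diff_means <- 95.6889 - 124.8042

print(round(t_val, 3))
print(signif(p_val, 3))
print(c(F = f_stat, p_value = f_p))
print(c(ci_midpoint = mean(ci), diff_means = diff_means))
```

The same arithmetic explains why the 99 percent interval in part (e) contains 0: a two-sided p-value of 0.2102 is larger than 0.01, so the difference in mean absolute deviations is not significant at that level.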