# Homework_2F08 - Copy: Homework 2, STAT 331, Fall 2008


Problem 1.

DESCRIPTIVE ABSTRACT: The data were taken from the advertising pages of the Sunday Times a few years ago and list Mercedes-Benz cars offered for sale in the UK (mainly in and around London). The asking prices (in pounds sterling) are classified according to type/model of car, age of car (in six-month units based on date of registration), recorded mileage, and vendor.

VARIABLE DESCRIPTIONS:

1. Case number
2. Asking price in pounds
3. Type/model of car: 0 = model 500, 1 = 450, 2 = 380, 3 = 280, 4 = 200
4. Age of car in six-month units, based on registration date
5. Recorded mileage (in thousands)
6. Vendor (0, 1, 2, 3 are different dealerships; 4 means "sale by owner")

Values are aligned and delimited by blanks.

YOUR TASK: Forecast the prices of (i) a 2-year-old Mercedes 500 with 60,000 miles sold by owner and (ii) a 1-year-old Mercedes 500 with 26,000 miles sold by a dealership or by owner. You should

a) construct a linear regression of price on the available regressors (you may add any interaction terms); omit any unimportant regressors and discuss your choice; report the summary output of the linear regression and discuss your findings, i.e. R-squared, F-statistic, significance of regressors;

    data <- read.table("Data.txt", header = TRUE)
    l <- lm(data$Price ~ data$Mod + data$Age + data$Mile + data$Vend)

Summary

    summary(l)

    Call:
    lm(formula = data$Price ~ data$Mod + data$Age + data$Mile + data$Vend)

    Residuals:
        Min      1Q  Median      3Q     Max
    -5307.4  -995.4  -266.1  1235.5  4324.0

    Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
    (Intercept) 27669.415    937.312  29.520  < 2e-16 ***
    data$Mod    -2819.718    244.836 -11.517 5.85e-14 ***
    data$Age    -1445.027    229.018  -6.310 2.14e-07 ***
    data$Mile      -7.631     40.293  -0.189    0.851
    data$Vend     291.241    238.171   1.223    0.229
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 2002 on 38 degrees of freedom
    Multiple R-squared:  0.8301,    Adjusted R-squared:  0.8122
    F-statistic:  46.4 on 4 and 38 DF,  p-value: 3.982e-14

The summary shows that the p-values for Mile (0.851) and Vend (0.229) are both above 0.1, so neither variable is statistically significant at the 10% level once Mod and Age are in the model. I therefore ran the regression again, omitting the two variables Mile and Vend.

Summary (Omitting Mile and Vendor)

    > l <- lm(data$Price ~ data$Mod + data$Age)
    > summary(l)

    Call:
    lm(formula = data$Price ~ data$Mod + data$Age)

    Residuals:
        Min      1Q  Median      3Q     Max
    -5174.2  -996.4  -186.5  1010.7  4040.1

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)  27847.7      920.0   30.27  < 2e-16 ***
    data$Mod     -2815.7      239.7  -11.75 1.52e-14 ***
    data$Age     -1392.8      161.2   -8.64 1.10e-10 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 1990 on 40 degrees of freedom
    Multiple R-squared:  0.8233,    Adjusted R-squared:  0.8145
    F-statistic: 93.22 on 2 and 40 DF,  p-value: 8.756e-16

The fitted model is

PRICE = 27847.7 - 2815.7*MOD - 1392.8*AGE + ε

F-statistic

    > qf(0.99, 2, 40)
    [1] 5.178508

Since the F-statistic reported in the summary, 93.22 on 2 and 40 degrees of freedom, exceeds the critical value 5.178508, at the 1% significance level we can reject H
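The two F comparisons in this write-up can be checked from the printed summaries alone, without access to Data.txt. The sketch below is not part of the original submission. It recovers the overall F-statistic of the reduced model from its R-squared, and it adds a partial F-test for dropping Mile and Vend together, which the write-up itself justifies only through the individual p-values. All inputs are the rounded values printed above, so the results are approximate; the variable names are illustrative.

```r
# Rounded values copied from the two summary() outputs above.
r2_full <- 0.8301   # Multiple R-squared, Price ~ Mod + Age + Mile + Vend
df_full <- 38       # residual degrees of freedom, full model
r2_red  <- 0.8233   # Multiple R-squared, Price ~ Mod + Age
df_red  <- 40       # residual degrees of freedom, reduced model
k_red   <- 2        # regressors in the reduced model (Mod, Age)
q       <- 2        # regressors dropped (Mile, Vend)

# Overall F of the reduced model: F = (R^2 / k) / ((1 - R^2) / df_resid).
f_overall <- (r2_red / k_red) / ((1 - r2_red) / df_red)
f_crit_overall <- qf(0.99, k_red, df_red)   # same critical value as in the text

# Partial F for H0: the coefficients on Mile and Vend are both zero.
f_partial <- ((r2_full - r2_red) / q) / ((1 - r2_full) / df_full)
f_crit_partial <- qf(0.95, q, df_full)      # 5% level, an assumed choice

print(c(overall = f_overall, critical = f_crit_overall))
print(c(partial = f_partial, critical = f_crit_partial))
```

The overall statistic comes out close to the 93.22 printed by `summary()`, with the small gap due to rounding of R-squared, and it is well above the 1% critical value. The partial statistic is below 1, so the joint test gives no evidence against dropping Mile and Vend, which agrees with the choice made above.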
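For the forecasting task, point forecasts follow directly from the fitted reduced model. This is a minimal sketch and not necessarily the forecast given in the rest of the document. It assumes the reduced model PRICE = 27847.7 - 2815.7*MOD - 1392.8*AGE is the one used, and that a car's age in years converts to two six-month units per year. The helper `forecast_price` is a hypothetical name introduced here for illustration.

```r
# Coefficients copied from the reduced-model summary (Price ~ Mod + Age).
b0    <- 27847.7    # intercept
b_mod <- -2815.7    # slope on Mod (0 = model 500, ..., 4 = model 200)
b_age <- -1392.8    # slope on Age, measured in six-month units

# Hypothetical helper: point forecast of the asking price in pounds.
# Mileage and vendor do not enter, because both were dropped from the model.
forecast_price <- function(mod, age_years) {
  age_units <- 2 * age_years          # assumption: 1 year = 2 six-month units
  b0 + b_mod * mod + b_age * age_units
}

# Mercedes 500 (Mod = 0): 2 years old, then 1 year old.
price_2yr <- forecast_price(mod = 0, age_years = 2)
price_1yr <- forecast_price(mod = 0, age_years = 1)
print(c(two_year_old = price_2yr, one_year_old = price_1yr))
```

Under these assumptions the forecasts are about 22,276.5 pounds for the 2-year-old car and 25,062.1 pounds for the 1-year-old car. Because Vend is not in the reduced model, the forecast is the same whether the car is sold by a dealership or by the owner. With the data available, refitting as `lm(Price ~ Mod + Age, data = data)` and calling `predict()` with `interval = "prediction"` would also give prediction intervals.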