STATISTICS 500 Fall 2009
Homework 10 - handed out Friday, 13 November 2009

Due on campus: Friday, 20 Nov 2009, in lecture (11 am) or by e-mail to Chuanlong, firstname.lastname@example.org, no later than noon.
Due off campus: Monday, 23 Nov 2009, by 4 pm to Nicole Rembert, email: email@example.com or FAX: 515-294-4040 (please include a cover page with Stat 500 / Nicole Rembert).

1. Model selection (practice): The modelsel1.txt data set contains four predictor variables and n = 50 observations.
(a) Find the best model using stepwise selection, using entry = remove = 0.2.
(b) Do you end up with the same model using entry = remove = 0.1?
(c) Use SAS to calculate Cp for all possible models. Identify the variables in the three models with the smallest Cp values.
(d) If you use AIC or BIC, do you select the same best model? The same best three models?

2. Model selection, Data analysis: The data in realestate.txt are a set of residential house characteristics and sales prices from a few years ago. The variables are: id, sales price ($), house size (sq ft), # bedrooms, # bathrooms, AC (1 = yes), garage size (# cars), pool (1 = yes), year built, quality (1 = high, 2 = medium, 3 = low), style (1 - 11, indicating architectural style), lot size (sq ft), highway (1 = adjacent). Further description of the data is in Appendix C of Kutner, but that information is not necessary to do the problem. (Remember, there is a copy of Kutner available in the Stat Dept. Office.)

Ignore style and treat quality as a continuous variable (i.e., not 3 groups). Please construct a reasonable model to predict sales price (or some transformation of sales price) from some combination of size, bedrooms, bathrooms, AC, garage size, pool, year built, quality, lot size, and highway.

Note: It is very difficult to know where to stop in a problem like this. If you work efficiently (and SAS is generally cooperative), you should spend no more than an hour on this problem.
I do want you to check for and fix obvious problems, but you don't have to fix every idiosyncrasy in the data. I am aware of at least one curious observation....
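
Problem 1(c) and (d) compare all-subsets criteria. The assignment asks for SAS; purely as an illustration of what those criteria measure, here is a minimal Python sketch that computes Cp, AIC and BIC for every subset of four predictors. It runs on synthetic data, not modelsel1.txt: the seed, the true coefficients, and the helper names solve, sse and all_subsets are all assumptions made for this example. The formulas used are the SSE-based forms Cp = SSE_p / MSE_full - (n - 2p), AIC = n ln(SSE_p / n) + 2p and BIC = n ln(SSE_p / n) + p ln(n), where p counts the intercept.

```python
import itertools
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def sse(X, y, cols):
    """Residual sum of squares for y regressed on an intercept plus the columns in cols."""
    rows = [[1.0] + [r[c] for c in cols] for r in X]
    p = len(cols) + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))


def all_subsets(X, y):
    """Return Cp, AIC and BIC for every subset of predictors (intercept always included)."""
    n, k = len(y), len(X[0])
    mse_full = sse(X, y, list(range(k))) / (n - k - 1)
    out = []
    for size in range(k + 1):
        for cols in itertools.combinations(range(k), size):
            p = size + 1  # number of parameters, including the intercept
            s = sse(X, y, cols)
            out.append({
                "vars": cols,
                "p": p,
                "cp": s / mse_full - (n - 2 * p),
                "aic": n * math.log(s / n) + 2 * p,
                "bic": n * math.log(s / n) + p * math.log(n),
            })
    return out


# Synthetic stand-in data (an assumption for illustration): only predictors 0 and 2 matter.
random.seed(500)
n = 50
X = [[random.gauss(0, 1) for _ in range(4)] for _ in range(n)]
y = [2 + 3 * r[0] - 2 * r[2] + random.gauss(0, 1) for r in X]

results = all_subsets(X, y)
for crit in ("cp", "aic", "bic"):
    best3 = sorted(results, key=lambda m: m[crit])[:3]
    print(crit, [m["vars"] for m in best3])
```

Each printed line lists the three subsets with the smallest value of one criterion, so the rankings can be compared side by side. Since BIC penalizes each extra parameter by ln(n) rather than 2, it tends to favor smaller models than AIC once n is larger than about 7.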