
STA 6208 – HW #5 – Due 10/29/10

LPGA 2008 – Regression Analysis

The dataset lpga1.dat contains statistics for the 2008 Ladies Professional Golf Association season, with the following variables:

Golfer
X1 = Number of rounds
X2 = Average distance for drives (yards)
X3 = Percent of fairways hit
X4 = Percent of time on green in regulation
X5 = Average number of putts per round
X6 = Average number of sand traps hit per round
X7 = Percent of time making par when in sand
Y = log(Prize winnings per round ($))

1) Download the dataset lpga1.dat.
2) Obtain the best models with p' = 2,…,8 in terms of R², Adj-R², Cp, AIC, and SBC (BIC in R).
3) Plot each of these versus p'.
4) Which model do you select?
5) Run the stepwise regression:
• If using SAS: with significance levels to stay and enter (sls=.15, sle=.15). What model is selected? Print out the results of this analysis.
• If using R: based on the minimum BIC criterion.
6) RPD: 7.1, 7.2, 7.3, 7.4, 7.13

Use your best model from the lpga1.dat dataset (part 4) on lpga2.dat to validate the model. Use the model set up in Example 7.9 to:
1. Obtain predicted values for the lpga2 dataset, based on the regression from the lpga1 dataset (be sure to use the natural logarithm of Prize Winnings).
2. Obtain δ = P − Y for each of the golfers, as well as the mean and sd of δ.
3. Conduct the t-test of H0: Bias is 0 at the α = 0.05 significance level.
4. Obtain the Mean Squared Error of Prediction (MSEP).
5. What proportion of MSEP is due to bias in the predicted values?
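The validation steps 2 to 5 can be sketched in a few lines of Python. This is only an illustration, not the setup from Example 7.9 itself: it assumes MSEP is defined as the average squared prediction error, sum(δ²)/n, which splits exactly into a variance part, (n − 1)·sd²/n, and a bias part, mean(δ)². The function name validation_summary and the five predicted/observed values are made up for the example; they are not taken from lpga2.dat.

```python
import math
import statistics


def validation_summary(predicted, observed):
    """Summarize prediction errors delta = P - Y on a validation set.

    Assumes MSEP = sum(delta^2) / n (an assumption of this sketch).
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    n = len(predicted)
    if n < 2:
        raise ValueError("need at least two observations")

    # Step 2: prediction errors, their mean (estimated bias) and sample sd
    delta = [p - y for p, y in zip(predicted, observed)]
    mean_delta = statistics.mean(delta)
    sd_delta = statistics.stdev(delta)  # divisor n - 1

    # Step 3: t statistic for H0: bias = 0, with n - 1 degrees of freedom
    if sd_delta > 0:
        t_stat = mean_delta / (sd_delta / math.sqrt(n))
    else:
        t_stat = float("nan")  # undefined when all errors are identical

    # Step 4: mean squared error of prediction
    msep = sum(d * d for d in delta) / n

    # Step 5: share of MSEP attributable to squared bias
    bias_share = mean_delta ** 2 / msep if msep > 0 else float("nan")

    return {
        "n": n,
        "df": n - 1,
        "mean_delta": mean_delta,
        "sd_delta": sd_delta,
        "t_stat": t_stat,
        "msep": msep,
        "bias_share": bias_share,
    }


# Made-up log-winnings values, for illustration only
predicted = [9.1, 8.4, 10.2, 7.9, 9.6]
observed = [8.8, 8.9, 10.0, 7.5, 9.9]

summary = validation_summary(predicted, observed)
for name, value in summary.items():
    print(f"{name}: {value}")
```

To finish step 3, compare |t_stat| with the two-sided critical value of the t distribution with df = n − 1 at α = 0.05, taken from a table or a statistics package; H0 is rejected when |t_stat| exceeds it.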