
STA 6208 – HW #5 – Due 10/30/09

LPGA 2008 – Regression Analysis

The dataset lpga1.dat contains statistics for the 2008 Ladies Professional Golf Association, containing the following variables:

Golfer
X1 = Number of Rounds
X2 = Average Distance for Drives (Yards)
X3 = Percent of Fairways hit
X4 = Percent of Time on green in regulation
X5 = Average number of putts per round
X6 = Average number of sand traps hit per round
X7 = Percent of time making par when in sand
Y = Prize Winnings per round ($)

1) Download the dataset lpga1.dat.
2) Obtain the best models with p' = 2,…,8 in terms of R^2, Adj-R^2, C_p, SBC (BIC in R).
3) Plot each of these versus p'.
4) Which model do you select?
5) Run the stepwise regression:

• If using SAS: use significance levels to stay and enter (sls=.15, sle=.15). What model is selected? Print out the results of this analysis.
• If using R: select the model based on the minimum BIC criterion.

6) RPD: 7.1, 7.2, 7.3, 7.4, 7.13

Use your best model from the lpga1.dat dataset (part 4) on lpga2.dat to validate the model. Use the model set up in Example 7.9 to:

1. Obtain predicted values for the lpga2 dataset, based on the regression from the lpga1 dataset.
2. Obtain δ = P - Y for each of the golfers, as well as the mean and sd of δ.
3. Conduct the t-test of H0: Bias is 0 at the α = 0.05 significance level.
4. Obtain the Mean Squared Error of Prediction (MSEP).
5. What proportion of MSEP is due to bias in the predicted values?...
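The arithmetic behind validation steps 2 to 5 can be sketched in a few lines. The Python sketch below is illustrative only: the assignment itself asks for SAS or R, the function name `validation_summary` is hypothetical, and the numbers are made up rather than taken from lpga2.dat. It assumes MSEP is the average squared prediction error, which splits exactly into a variance part, (n-1)/n times the sample variance of δ, and a bias part, the squared mean of δ.

```python
import math
import statistics


def validation_summary(predicted, observed):
    """Summarize prediction errors delta = P - Y on a validation dataset."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    n = len(predicted)
    delta = [p - y for p, y in zip(predicted, observed)]

    mean_delta = statistics.mean(delta)
    sd_delta = statistics.stdev(delta)  # sample sd, divisor n - 1

    # t statistic for H0: bias = 0, to be compared with a t critical
    # value on n - 1 degrees of freedom (not computed here).
    t_stat = mean_delta / (sd_delta / math.sqrt(n))

    # MSEP = sum(delta^2) / n = (n - 1) / n * sd^2 + mean^2
    msep = sum(d * d for d in delta) / n
    bias_share = mean_delta ** 2 / msep  # proportion of MSEP due to bias

    return {
        "n": n,
        "mean_delta": mean_delta,
        "sd_delta": sd_delta,
        "t_stat": t_stat,
        "msep": msep,
        "bias_share": bias_share,
    }


# Made-up values for illustration only (not from lpga2.dat).
predicted = [1200.0, 800.0, 1500.0, 950.0, 1100.0]
observed = [1000.0, 900.0, 1300.0, 1000.0, 1050.0]
result = validation_summary(predicted, observed)
print(result)
```

With real data, `predicted` would hold the fitted values obtained by applying the lpga1 regression coefficients to the lpga2 predictors, and `observed` the lpga2 winnings per round.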
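For part 2, the four selection criteria can all be computed from a candidate model's residual sum of squares. The Python sketch below uses the common textbook definitions and is an assumption about what the assignment intends, not a statement of it: p' is taken to count the intercept (so p' = 8 would be all seven predictors plus the intercept), SBC is taken as n ln(SSE/n) + p' ln(n), and C_p uses the residual mean square of the full model. The function name `selection_criteria` and the input values are hypothetical.

```python
import math


def selection_criteria(n, p_prime, sse, sst, mse_full):
    """Return R^2, adjusted R^2, Mallows' C_p and SBC for one candidate model.

    n        : number of observations
    p_prime  : number of parameters, including the intercept (assumed)
    sse      : residual sum of squares of the candidate model
    sst      : corrected total sum of squares
    mse_full : residual mean square of the full model
    """
    r2 = 1.0 - sse / sst
    adj_r2 = 1.0 - (n - 1) / (n - p_prime) * (sse / sst)
    cp = sse / mse_full - (n - 2 * p_prime)
    sbc = n * math.log(sse / n) + p_prime * math.log(n)
    return {"r2": r2, "adj_r2": adj_r2, "cp": cp, "sbc": sbc}


# Made-up sums of squares for illustration only (not from lpga1.dat).
crit = selection_criteria(n=100, p_prime=3, sse=400.0, sst=1000.0, mse_full=4.0)
print(crit)
```

Larger R^2 and Adj-R^2 are better, smaller SBC is better, and C_p is usually judged by how close it is to p'. Evaluating the function for the best model at each p' gives the values to plot in part 3.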

This note was uploaded on 01/15/2012 for the course STA 6208 taught by Professor Park during the Spring '08 term at University of Florida.
