HW4.Solutions - Biological Statistics II Biometry 3020 /...

Biological Statistics II
Biometry 3020 / Natural Resources 4130 / Statistical Science 3200
Homework 4
Due Thursday March 11, 2010
Worth 20 points

SENIC Data

The primary objective of the Study on the Efficacy of Nosocomial Infection Control (SENIC Project) was to determine whether infection surveillance and control programs have reduced the rates of nosocomial (hospital acquired) infection in United States hospitals. This data set consists of a random sample of 113 hospitals selected from the original 338 hospitals surveyed. Each line of the data set has an identification number and provides information on 11 other variables for a single hospital. The data presented here are for the 1975-1976 study period.

The data can be found on the course website as APPENC01.txt but may also be found on the data CD provided with the Kutner et al. text. Details on the study may also be found in Appendix C.1 on page 1348 of Kutner et al.

You are being asked to examine these data to explore the effects that different factors have on the rates of infection (infection risk) for this population of hospitals.

1. (5 pts) Start off by regressing infection risk against length of stay, age, routine culture ratio, routine chest X-ray ratio, number of beds, medical school affiliation, and region. (No interactions at this point, no higher order effects.) Use the summary from the regression on the full model to outline what factors you think might be influential and which you do not. Provide a rationale for your choices. Provide plots of the response relative to each factor you’ve decided to include and in a table show the estimated coefficients associated with each significant factor and their associated standard error, t-values and p-values.

senic.lm = lm(risk ~ stay + age + cult.ratio + xray.ratio + beds + medschl + region, data = APPENC01)
summary(senic.lm)
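The coefficient table the question asks for can be pulled straight from the fitted model object. The following is a minimal sketch, assuming the column names used in the lm() call above. APPENC01.txt is not reproduced here, so the data frame is a simulated stand-in with made-up values (not the SENIC data), and the numbers it prints carry no meaning; only the workflow is illustrated. The object names coef.tab and sig are introduced here for illustration.

```r
# Simulated stand-in for APPENC01 (NOT the real SENIC data). Column names
# follow the lm() call in the text; all values are invented for illustration.
set.seed(1)
n <- 113
APPENC01 <- data.frame(
  stay       = rnorm(n, mean = 9.6, sd = 1.9),
  age        = rnorm(n, mean = 53, sd = 4.5),
  cult.ratio = runif(n, 1, 60),
  xray.ratio = runif(n, 40, 130),
  beds       = round(runif(n, 30, 800)),
  medschl    = sample(1:2, n, replace = TRUE),  # numeric code, as in the text's call
  region     = sample(1:4, n, replace = TRUE)   # numeric code, as in the text's call
)
APPENC01$risk <- with(APPENC01,
  0.2 * stay + 0.05 * cult.ratio + 0.01 * xray.ratio + rnorm(n))

# Full model: main effects only, no interactions or higher order terms
senic.lm <- lm(risk ~ stay + age + cult.ratio + xray.ratio + beds +
                 medschl + region, data = APPENC01)

# Coefficient table: estimate, standard error, t value, p-value
coef.tab <- coef(summary(senic.lm))
print(round(coef.tab, 4))

# Keep only the terms significant at the 5% level (intercept excluded)
sig <- coef.tab[-1, , drop = FALSE]
sig <- sig[sig[, "Pr(>|t|)"] < 0.05, , drop = FALSE]
print(round(sig, 4))
```

With the real data read into APPENC01, the same two print() calls would give the full table and the reduced table of significant terms; the 5% cutoff is a conventional choice, not one stated in the assignment.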
Coefficients:
                Value  Std. Error  t value  Pr(>|t|)
(Intercept)   -2.1841      1.4068  -1.5526    0.1235
       stay    0.2328      0.0656   3.5504    0.0006
        age    0.0120      0.0220   0.5481    0.5848
 cult.ratio    0.0552      0.0105   5.2363    0.0000
 xray.ratio    0.0137      0.0054   2.5377    0.0126
       beds    0.0016      0.0006   2.6703    0.0088
    medschl    0.3770      0.3182   1.1850    0.2387
     region    0.2324      0.1049   2.2150    0.0289

Residual standard error: 0.9492 on 105 degrees of freedom
Multiple R-Squared: 0.5302
F-statistic: 16.93 on 7 and 105 degrees of freedom, the p-value is 8.327e-015

Judging from this table alone, age and medschl are not contributing much to the model: their p-values (0.5848 and 0.2387) are well above 0.05, while every other predictor is significant at the 5% level. A stepwise regression, library(MASS); stepAIC(senic.lm), would show the same thing. This suggests a simpler model, 
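The table's t values, p-values, and overall F statistic can be checked by hand from the printed estimates, standard errors, and R-squared. A minimal sketch of that arithmetic follows, using only the rounded numbers copied from the output above; the variable names are introduced here for illustration. Because the inputs are rounded to four decimals, the recomputed values should match the printed ones only approximately.

```r
# Estimates and standard errors copied from the coefficient table in the text
est <- c(stay = 0.2328, age = 0.0120, cult.ratio = 0.0552,
         xray.ratio = 0.0137, beds = 0.0016, medschl = 0.3770,
         region = 0.2324)
se  <- c(0.0656, 0.0220, 0.0105, 0.0054, 0.0006, 0.3182, 0.1049)
df.resid <- 105  # residual degrees of freedom reported in the output

# t value = estimate / standard error; two-sided p-value from the t distribution
t.val <- est / se
p.val <- 2 * pt(abs(t.val), df = df.resid, lower.tail = FALSE)
print(round(cbind(est, se, t.val, p.val), 4))

# Overall F statistic recovered from R-squared with p = 7 predictors:
# F = (R^2 / p) / ((1 - R^2) / df.resid)
r2 <- 0.5302
p  <- 7
f.stat <- (r2 / p) / ((1 - r2) / df.resid)
print(round(f.stat, 2))
```

Only age and medschl come out with p-values above 0.05 in this recomputation, and the F statistic agrees with the reported 16.93 up to rounding.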

This note was uploaded on 04/17/2010 for the course STSCI 3200 taught by Professor Sullivan during the Spring '10 term at Cornell University (Engineering School).
