# hw5_150_sol_f10.r
# solutions for HW 5
# R code
# math 150
# Fall 2010
# Jo Hardin

sepsis <- read.table("sepsis.csv", header=T, sep=",")
attach(sepsis)

# 1
sepsis.bu <- sepsis[(race==1)&(treat==0),]
sepsis.bu.log <- glm(fate ~ apache, family="binomial", data=sepsis.bu)
summary(sepsis.bu.log)$coef
#                Estimate Std. Error   z value  Pr(>|z|)
# (Intercept) -0.80765133  0.6658397 -1.212982 0.2251368
# apache       0.06147948  0.0351579  1.748668 0.0803485

# The probability of death for a given APACHE score (x) in black untreated patients
# is: p(x) = exp(-0.807 + 0.061 x) / (1 + exp(-0.807 + 0.061 x))

plot(apache, fate, pch=19, xlim=c(1,45), ylab="probability of dying", xlab="APACHE score")
lines(c(1:45), exp(-0.807 + 0.061*c(1:45)) / (1 + exp(-0.807 + 0.061*c(1:45))))

# 2
# The odds ratio associated with a unit rise in APACHE score in untreated black
# patients is e^beta for the appropriate model (above: sepsis.bu.log).
# beta = 0.0614; e^beta = 1.06341
# A 95% CI for beta is: 0.06148 +/- 1.96 * 0.035158 = (-0.00742968, 0.1303897)
# A 95% CI for e^beta is: (e^-0.00742968, e^0.1303897) = (0.9925979, 1.139272)
# We are 95% confident that the odds of death change by a factor of between
# 0.9925979 and 1.139272 for a one unit increase in APACHE score. Note that the
# number 1 is in the interval, so I am unable to claim whether the odds of death go up, down, or
# remain the same for a one unit increase in APACHE score. Also note, however, that a
# smaller level of confidence would have given me a significant result, so I
# interpret these results to be suggestive if not significant.

# 3
# Yes, it is possible to predict the probability of death for a treated black patient
# with an APACHE score of 50:
# pi(50) = exp(-2.43 + 0.121*50) / (1 + exp(-2.43 + 0.121*50)) = 0.974
# A black treated patient with an APACHE score of 50 has a probability of dying equal
# to 0.974. Notice, however, that 50 is outside the range of observations used to fit
# the model. Typically we do not like to apply our models to values outside the range
# (this is called extrapolation). However, we may sometimes apply it to x values
# that are only slightly larger than what we observed.
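
# The arithmetic in # 2 and # 3 can be checked directly from the reported numbers.
# This is only a sketch: the estimates are typed in by hand rather than pulled from
# the fitted objects, and the names b1.hat, se.b1, or.hat, ci.beta, ci.or and p50 are
# illustrative assumptions, not part of the original script.

```r
# Estimates copied by hand from the coefficient table in # 1 (assumed, not refit)
b1.hat <- 0.06147948   # slope for apache, untreated black patients
se.b1  <- 0.0351579    # standard error of that slope

# 2: build the 95% Wald interval on the log-odds scale, then exponentiate
or.hat  <- exp(b1.hat)
ci.beta <- b1.hat + c(-1, 1) * 1.96 * se.b1
ci.or   <- exp(ci.beta)

# 3: plogis() is the inverse logit, exp(z) / (1 + exp(z));
# -2.43 and 0.121 are the rounded treated-patient coefficients quoted in # 3
p50 <- plogis(-2.43 + 0.121 * 50)

print(round(c(odds.ratio = or.hat, lower = ci.or[1], upper = ci.or[2]), 4))
print(round(p50, 3))
```

# Exponentiating the endpoints of the interval for beta, rather than building an
# interval around e^beta directly, keeps the interval for the odds ratio positive.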

# 5
# The APACHE score which gives median survival is the x value that gives e^0 / (1 +
# e^0) = 1/2 as a probability of survival. This happens when b0 + b1 x = 0, i.e.:
# x = -b0/b1 = 0.80765133 / 0.06147948 = 13.137
# A patient with an APACHE score of 13.137 has an equal probability of survival and
# death. Note, you can calculate the APACHE score for any probability of survival. For
# example, let's say we wanted the 0.9 survival value.
# 0.9 = e^(b0 + b1 x) / [1 + e^(b0 + b1 x)]
# 0.9 * [1 + e^(b0 + b1 x)] = e^(b0 + b1 x)
# 0.9 = 0.1 * e^(b0 + b1 x)
# 9 = e^(b0 + b1 x)
# ln(9) = b0 + b1 x
# x = (ln(9) - b0) / b1

# 6
# Our model is that E[Y] = exp(alpha + beta * x) / (1 + exp(alpha + beta * x)).
# E[Y] is the average Y value in the population (expected value just means that "the
# average in the population"). Because the response variable is binary, we can think
# of an average of 0s and 1s as a proportion of 1s (or proportion of successes).
# Because a proportion is just an average, and our particular proportion is a function
# of the explanatory variables (the inverse logit of the linear function), logistic
# regression is a very appropriate description.
# In linear regression we are modeling the expected value (i.e., long run _average_)
# for a response variable at every possible X. Our model is reasonably sophisticated
# because we aren't simply finding Y-bar at each X and connecting the dots. Instead,
# we believe that the true mean values fall on the line, so we use the entire set of
# data to model the relationship as a linear function.
# For logistic regression, we are modeling the expected proportion / probability
# (which is an average!) of success at every possible X. Again, we don't find p-hat
# at every value and connect the dots. Instead, we believe that the true proportions lie on a
# line related by the logit function of the true value to a linear function of the
# explanatory variable.

# 7
# (a) The response variable (y) is never normally distributed around the line.
# (b) The errors will not have constant variance across all values of x.
# (c) Often, the best line will go outside the bounds of (0,1), which isn't meaningful
#     if we think of the predicted response as probability of success.
# (d) The idea of the line being the average response at a particular value of the
#     explanatory variable (x) is NOT necessarily violated (see # 6 above).
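
# Two of the points above can be illustrated numerically: the inversion in # 5 and the
# non-constant variance in # 7(b). This is a sketch under stated assumptions: b0 and b1
# are the estimates from # 1 typed in by hand, score.at() is a hypothetical helper that
# is not in the original script, and the x values 1 and 45 are just the plotting limits
# used in # 1.

```r
# Estimates copied by hand from the coefficient table in # 1 (assumed, not refit)
b0 <- -0.80765133
b1 <-  0.06147948

# Hypothetical helper: the x at which the fitted probability equals p.
# qlogis(p) is log(p / (1 - p)), so this is x = (logit(p) - b0) / b1.
score.at <- function(p, b0, b1) (qlogis(p) - b0) / b1

x.half <- score.at(0.5, b0, b1)   # logit(0.5) = 0, so this is -b0/b1
x.09   <- score.at(0.9, b0, b1)   # logit(0.9) = ln(9), so this is (ln(9) - b0)/b1

# 7(b): a 0/1 response with success probability p has variance p * (1 - p),
# which changes with x because p changes with x
x <- c(1, x.half, 45)
p <- plogis(b0 + b1 * x)
v <- p * (1 - p)

print(round(c(x.half = x.half, x.09 = x.09), 3))
print(round(cbind(x, p, v), 3))
```

# The variance is largest where the fitted probability is 1/2 and shrinks toward either
# end, which is why a constant-variance straight-line fit is a poor match for a binary
# response.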
This note was uploaded on 02/17/2012 for the course MATH 151 taught by Professor Jo.h during the Fall '10 term at Pomona College.