Lecture 16

1. Risk and odds
2. Smoothing binary data
3. Logistic regression with binary response
4. Proc Logistic surprises

Logistic Regression Using the SAS System: Theory and Application, by Paul Allison, 2001, Wiley-SAS
Categorical Data Analysis Using the SAS System, 2nd ed., by Stokes, Davis, and Koch, 2009, SAS Institute
"Proc Logistic: Traps for the Unwary," by P.L. Flom (on course website)

2 × 2 Tables: Relative Risk, Odds Ratio

group     event
Frequency|
Row Pct  |       0|       1|  Total
---------+--------+--------+
        1|    141 |     33 |    174
         |  81.03 |  18.97 |
---------+--------+--------+
        2|    187 |     84 |    271
         |  69.00 |  31.00 |
---------+--------+--------+
Total         328      117      445

row percent = rate of events = risk of events

Comparisons:
• risk difference
• risk ratio (relative risk)
• odds ratio
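The three comparisons listed above can be computed directly from the four cell counts. The following is a minimal Python sketch, not part of the original slides; the dictionary and function names are illustrative, and only the counts come from the table.

```python
# Cell counts from the 2 x 2 table: keys are groups, columns are event = 0 / 1.
no_event = {1: 141, 2: 187}
event = {1: 33, 2: 84}


def risk(group):
    """Proportion of the group with the event (event row percent / 100)."""
    return event[group] / (event[group] + no_event[group])


def odds(group):
    """Number with the event divided by number without the event."""
    return event[group] / no_event[group]


# All three comparisons are taken as group 1 versus group 2.
risk_difference = risk(1) - risk(2)
risk_ratio = risk(1) / risk(2)
odds_ratio = odds(1) / odds(2)

print(f"risk, group 1:   {risk(1):.4f}")
print(f"risk, group 2:   {risk(2):.4f}")
print(f"risk difference: {risk_difference:.4f}")
print(f"risk ratio:      {risk_ratio:.4f}")
print(f"odds ratio:      {odds_ratio:.4f}")
```

The direction of comparison is a choice. Swapping the groups, or comparing the odds of no event instead of the odds of an event, gives the reciprocal of the odds ratio, which describes the same association.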

Two different null hypotheses for risks

1. $H_0: p_a - p_c = 0$, the risk difference is zero. The Z-test, based on $(\hat{p}_a - \hat{p}_c)$, is equivalent to the chi-square test.
   Alternative test of risk differences when some cells have small counts: Fisher's exact test.
2. Risk ratio, or relative risk, equals 1: $H_0: p_a / p_c = 1$.

Null value for differences is 0. Null value for ratios is 1.

Odds

odds = (number of events) / (number without event) in sample

           Event   No Event   odds    risk
Group 1      A        B       A/B     A/(A+B)
Group 2      C        D       C/D     C/(C+D)

Relating odds to risk:

$$\text{odds} = \frac{\text{number with event}}{\text{number without event}} = \frac{\text{number with event}/n}{\text{number without event}/n} = \frac{\hat{p}}{1 - \hat{p}}$$

For rare events, $\hat{p} \approx 0$, so the denominator is almost 1 and odds $\approx$ risk.
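Two claims on these slides are easy to check numerically: that the Z-test for a risk difference agrees with the chi-square test, and that odds approach risk for rare events. The sketch below is an illustration under assumptions, not the lecture's own code. It uses the pooled-variance form of the two-proportion Z statistic, and it takes the counts (33 events of 174, 84 events of 271) from the 2 × 2 table earlier in the lecture; the function names are made up for this example.

```python
import math


def two_proportion_z(events_a, n_a, events_c, n_c):
    """Z statistic for H0: p_a - p_c = 0, using the pooled event rate."""
    p_a = events_a / n_a
    p_c = events_c / n_c
    pooled = (events_a + events_c) / (n_a + n_c)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_c))
    return (p_a - p_c) / se


def odds_from_risk(p):
    """Odds corresponding to a risk p, i.e. p / (1 - p)."""
    return p / (1 - p)


z = two_proportion_z(33, 174, 84, 271)
print(f"z = {z:.4f}, z squared = {z * z:.4f}")

# As the risk shrinks, the odds get closer and closer to the risk.
for p in (0.5, 0.1, 0.01):
    print(f"risk = {p:<5} odds = {odds_from_risk(p):.4f}")
```

The squared Z statistic comes out at about 7.91, which is the Pearson chi-square value reported in the Proc Freq output later in the lecture.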
           Event   No Event   odds    risk
Group 1      A        B       A/B     A/(A+B)
Group 2      C        D       C/D     C/(C+D)

Odds are compared only by ratio, never by difference.

The odds ratio is the odds in the top row divided by the odds in the bottom row, (A/B)/(C/D), which simplifies to AD/BC.

Test whether the population odds ratio is one, $H_0: OR = 1$, by checking whether the 95% confidence interval covers 1.

Comparing risks and odds in Proc Freq

group     event
Frequency|
Row Pct  |       0|       1|  Total
---------+--------+--------+
        1|    141 |     33 |    174
         |  81.03 |  18.97 |
---------+--------+--------+
        2|    187 |     84 |    271
         |  69.00 |  31.00 |
---------+--------+--------+
Total         328      117      445

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     1      7.9142    0.0049

Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F)       141
Left-sided Pr <= F          0.9985
Right-sided Pr >= F         0.0031

Table Probability (P)       0.0016
Two-sided Pr <= P           0.0057

Estimates of the Relative Risk (Row1/Row2)

Type of Study                   Value       95% Confidence Limits
-----------------------------------------------------------------
Case-Control (Odds Ratio)      1.9193        1.2138        3.0348
Cohort (Col1 Risk)             1.1743        1.0548        1.3075
Cohort (Col2 Risk)             0.6119        0.4291        0.8725

Sample Size = 445
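The "Estimates of the Relative Risk" block can be recomputed by hand as a check on how to read it. The sketch below assumes the usual large-sample intervals built on the log scale (estimate times exp(±1.96 × SE of the log estimate)); the slides do not say which method Proc Freq uses, so treat that as an assumption. The helper names are illustrative, and only the four cell counts come from the output.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard Normal distribution

# Cell counts in the table's layout: rows are groups, columns are event = 0, 1.
a, b = 141, 33   # group 1
c, d = 187, 84   # group 2


def log_scale_interval(estimate, se_log):
    """95% interval that is symmetric on the log scale."""
    half_width = Z_975 * se_log
    return estimate * math.exp(-half_width), estimate * math.exp(half_width)


def risk_ratio(x1, n1, x2, n2):
    """Row 1 / row 2 ratio of column proportions and the SE of its log."""
    estimate = (x1 / n1) / (x2 / n2)
    se_log = math.sqrt(1 / x1 - 1 / n1 + 1 / x2 - 1 / n2)
    return estimate, se_log


# Odds ratio AD/BC, with SE of its log from the four reciprocal counts.
odds_ratio = (a * d) / (b * c)
or_limits = log_scale_interval(odds_ratio, math.sqrt(1 / a + 1 / b + 1 / c + 1 / d))

# Column 1 is event = 0 and column 2 is event = 1.
col1_rr, col1_se = risk_ratio(a, a + b, c, c + d)
col2_rr, col2_se = risk_ratio(b, a + b, d, c + d)
col1_limits = log_scale_interval(col1_rr, col1_se)
col2_limits = log_scale_interval(col2_rr, col2_se)

for label, value, limits in [
    ("Odds ratio", odds_ratio, or_limits),
    ("Col1 risk ratio", col1_rr, col1_limits),
    ("Col2 risk ratio", col2_rr, col2_limits),
]:
    print(f"{label:<16} {value:.4f}  ({limits[0]:.4f}, {limits[1]:.4f})")
```

None of the three intervals covers 1, which agrees with the small chi-square p-value. The Col2 row is the risk ratio for the event itself (group 1 versus group 2), and the Col1 row is the same comparison for the no-event column.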
Binary responses

Event/no-event, or 0/1, responses are binary responses.

Set-up: one trial in which Y takes values 1 = event, 0 = no event.

Chance of event = $P[Y = 1] = \pi$ = population event rate
Chance of no event = $P[Y = 0] = 1 - \pi$

Y has a Bernoulli distribution (after Jakob Bernoulli, 1654–1705).

mean of Y = $\pi$, standard deviation of Y = $\sqrt{\pi(1 - \pi)}$.

The SD is a function of the mean, unlike in the Normal distribution.

Binary data comes as individual responses (0 or 1) and also as grouped responses.

Grouped binary responses: If many trials are done with the same chance of event, such as the number of heads in n coin flips, then the results are grouped: X = k events in n trials.
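The link between the mean and the SD, and the step from individual to grouped responses, can be illustrated with a few lines of Python. This is a sketch only: the ten 0/1 responses are made up for the example and are not data from the lecture.

```python
import math


def bernoulli_sd(pi):
    """Standard deviation of a single 0/1 response with event chance pi."""
    return math.sqrt(pi * (1 - pi))


# The SD is determined entirely by the mean and is largest at pi = 0.5.
for pi in (0.05, 0.25, 0.5, 0.75, 0.95):
    print(f"pi = {pi:<5} SD = {bernoulli_sd(pi):.4f}")

# Hypothetical individual responses (illustrative values, not course data).
responses = [1, 0, 0, 1, 1, 0, 1, 0, 0, 0]

# Grouping: the individual 0/1 values collapse to k events in n trials.
k = sum(responses)
n = len(responses)
p_hat = k / n

# SD of the 0/1 values (divisor n) matches the Bernoulli formula at p_hat.
sample_sd = math.sqrt(sum((y - p_hat) ** 2 for y in responses) / n)

print(f"grouped form: k = {k} events in n = {n} trials, p_hat = {p_hat:.2f}")
print(f"SD of the 0/1 values: {sample_sd:.4f}")
print(f"sqrt(p_hat * (1 - p_hat)): {bernoulli_sd(p_hat):.4f}")
```

When every trial has the same chance of event, the pair (k, n) carries the same information about that chance as the full list of 0/1 values.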

