Handout23 - Lecture 23 1. Survival time 2. Censored...


Lecture 23

1. Survival time
2. Censored observations
3. Proc Lifetest: Kaplan-Meier estimate of the survival distribution
4. Comparing survival distributions

References:
Collett (2003) Modelling Survival Data in Medical Research, 2nd ed.
Allison (1995) Survival Analysis Using the SAS System.
Cantor (2003) SAS Survival Analysis Techniques for Medical Research.
Der and Everitt, Chapter 12.

Time-to-event or survival data

In many situations, time until an event occurs is important:
• New treatment for brain cancer: do patients survive longer than after standard treatment?
• In the AHC, are men awarded tenure earlier and more often than women?
• Time to graduation in MPH programs, compared between SPH divisions.

Each individual has their own time T_i to the event. Unlike earlier analyses, the aim is not a point estimate (mean, slope, odds ratio) but the whole distribution of these times {T_i}. Much more to ask for, and harder to compare.

Outline: two main analyses for survival data

1. Estimate survivor function, compare survivor functions between groups. Proc Lifetest gives the nonparametric product-limit (Kaplan-Meier) or lifetable estimate, draws graphs, tests for differences. Nice pictures, but no adjustments, only strata. Proc LifeReg gives regression adjustment but must specify a parametric formula for the survivor function; rarely used in health sciences.
2. Estimate ratio of hazard functions between groups, compare ratio to 1. Proc PHreg does proportional hazards regression to estimate the ratio. No pictures (almost) but regression adjustment for fixed and time-varying predictors.

What are survivor function and hazard?

Probability theory defines a distribution by:
• histogram of lifetimes, called density f(t)
• cumulative distribution function = cumulative area under histogram, starting from left:

$F(t) = \int_{-\infty}^{t} f(u)\,du$

Survivor function S(t) = 1 − F(t). Percent without the event (still alive) at time t.
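A minimal sketch of these two definitions for a discretized histogram. The density values are hypothetical (not from the handout) and Python is used only for illustration: F(t) is the running sum of the histogram from the left, and S(t) is what remains.

```python
from itertools import accumulate

# Hypothetical density: share of all events falling in each one-year interval.
density = [0.10, 0.20, 0.30, 0.25, 0.15]

# F(t): cumulative area under the histogram, starting from the left.
cdf = list(accumulate(density))

# S(t) = 1 - F(t): share still without the event at the end of each interval.
survivor = [1.0 - F for F in cdf]

for year, (F, S) in enumerate(zip(cdf, survivor), start=1):
    print(f"end of year {year}: F = {F:.2f}, S = {S:.2f}")
```

Because the density sums to 1, F ends at 1 and S ends at 0: everyone eventually has the event.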
Hazard function

$h(t) = \dfrac{f(t)}{S(t)} = \dfrac{\text{chance of event at time } t}{\text{percent at risk at time } t}$

Hazard h(t) gives the chance of event during a short interval after time t, for those who are at risk (alive) at time t.

Example: US Census Bureau synthetic cohort for 2002

Histogram (density) of times to death for 2002 US population, truncated at 101.

[Figure: Percent of Deaths by age, 2002. x-axis: US Population, Age in Years in 2002 (0 to 100); y-axis: 0.0 to 3.5 percent.]

E. Arias (2004) United States Life Tables, 2002 (National vital statistics reports; vol 53 no 6. Hyattsville, Maryland: National Center for Health Statistics.)

Hazard function h(t) = age-specific death rate

[Figure: Age-Specific Death Rate per 100,000. x-axis: US Population, Age in Years in 2002 (0 to 100); y-axis: 0.00 to 0.20.]

Survivor function S(t) = chance of surviving to age t

[Figure: Percent Surviving. x-axis: US Population, Age in Years in 2002 (0 to 100); y-axis: 0 to 100 percent.]

[Figure 2. Percent surviving by age, race, and sex: United States, 2002. Source: National Vital Statistics Reports, Vol. 53, No. 6, November 10, 2004.]
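The hazard definition h(t) = f(t)/S(t) can be illustrated with a small discrete life table. This is only a sketch: the cohort size and death counts are hypothetical, and S is taken at the start of each interval so that it equals the share still at risk.

```python
# Hypothetical cohort of 1000 people; deaths observed in each one-year interval.
cohort_size = 1000
deaths = [100, 200, 300, 250, 150]

at_risk = cohort_size
rows = []
for year, d in enumerate(deaths, start=1):
    f = d / cohort_size        # density: share of the whole cohort dying in this interval
    s = at_risk / cohort_size  # survivor function at the start of the interval
    h = f / s                  # hazard: chance of dying among those still at risk
    rows.append((year, f, s, h))
    at_risk -= d               # those who died leave the risk set

for year, f, s, h in rows:
    print(f"interval {year}: f = {f:.2f}, S = {s:.2f}, h = {h:.3f}")
```

In this toy table the density falls over the last three intervals while the hazard keeps rising, because the shrinking number of deaths is divided by an even faster shrinking risk set.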
Another survivor function example

[Figure 3. Percent surviving by age: Death-registration States, 1900–1902, and United States, 1949–51 and 2002.]

Censored observation times

Common problem in survival data is that we don’t observe all event times:
• we stop the study and analyze the data before everyone has had the event
• a person leaves the study and we cannot find out whether they had the event

In these cases, all we have is the final time t_0 at which the subject was known to be alive; we know only that T > t_0. The final time t_0 is called a censored observation, and it’s a lower bound for the unknown event time T.

Clinical study example: eligible participants were enrolled as soon as they volunteered, and recruitment lasted 2.5 years. The study ended on 1/1/2008. Subjects died (open circle), dropped out (triangle), or were still alive at study end (gray dot).

[Figure: one follow-up line per subject. x-axis: Calendar Time, from start (1/1/2005) to end (1/1/2008).]

Analysis of clinical study example: each subject’s time is aligned to start at “study time” = 0.

[Figure: the same subjects re-aligned. x-axis: Time from Enrollment (years), 0.0 to 3.0.]

* marks study enrollment, horizontal line indicates time participant was alive, deaths are indicated by an open circle, censoring by a gray dot.

No histogram of survival times with censored data

We can draw a histogram of all the times t_i. If there are censored times, we know only that t_i < actual survival time for those subjects. No correct place in histogram for censored observations, because they are lower bounds, not observed event times. However, excluding them gives a biased histogram.

Kaplan and Meier (1958) proposed a breakthrough method to estimate the survivor function S(t) from partially censored data.

Stomach cancer example

Survival times after treatments A or B for 89 patients with stomach cancer (source: Chapter 12, Der and Everitt).
• 45 received treatment A: 38 died, 7 were censored
• 44 received treatment B: 41 died, 3 were censored

    Obs   censor   days   trt   years
      1     0        17    A    0.04654
      2     0       185    A    0.50650
      3     0       542    A    1.48392
    ...
     87     1      1736    A    4.75291
     88     0       380    B    1.04038
     89     0       748    B    2.04791

days, years give times t patients were last known alive.
censor = 0 if an event happened at t.
censor = 1 if censored (no event yet).

Proc Lifetest: Kaplan-Meier (Product-Limit) estimate of survivor function

    ODS graphics on;
    Proc Lifetest data = stomach_cancer
         plots=(survival(atrisk=0 to 4 by 1)) censoredsymbol="|" ;
      TIME   years * censor(1);
      STRATA trt ;
    run;
    ODS graphics off;

TIME statement is like a model statement, specifies the response:

    TIME length-of-time * event-status ( censored-value ) ;

STRATA names the variable identifying treatment groups to be compared by test.

    plots=(survival(atrisk=0 to 4 by 1)) censoredsymbol="|" ;

[Figure: survival plot produced with these options.]

Sample sizes given at bottom. Need at least 10–15 in each group.
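What the product-limit estimate computes can be sketched by hand for the first five deaths under treatment A, using the times that the LIFETEST output for this example lists for stratum A, with 45 patients at risk at time 0. This Python sketch is not the procedure's own code. The function name `product_limit` is hypothetical, and the standard error is assumed to be Greenwood's formula, which these first rows of the output are consistent with.

```python
import math

def product_limit(steps, n_at_risk):
    """Kaplan-Meier estimate from (time, deaths, censored) tuples in time order.

    Returns a list of (time, survival, standard_error), one entry per death time.
    The standard error uses Greenwood's formula (an assumption of this sketch).
    """
    survival = 1.0
    greenwood_sum = 0.0
    estimates = []
    for time, deaths, censored in steps:
        if deaths > 0:
            # Multiply by the fraction of the risk set that survives this time.
            survival *= (n_at_risk - deaths) / n_at_risk
            if n_at_risk > deaths:
                greenwood_sum += deaths / (n_at_risk * (n_at_risk - deaths))
            se = survival * math.sqrt(greenwood_sum)
            estimates.append((time, survival, se))
        # Deaths and censored subjects both leave the risk set.
        n_at_risk -= deaths + censored
    return estimates

# First five death times for treatment A (one death each, no censoring yet).
steps_a = [(0.04654, 1, 0), (0.11499, 1, 0), (0.12047, 1, 0),
           (0.13142, 1, 0), (0.16427, 1, 0)]
estimates = product_limit(steps_a, n_at_risk=45)

for time, s, se in estimates:
    lower, upper = s - 1.96 * se, s + 1.96 * se
    print(f"t = {time:.5f}: S = {s:.4f}, SE = {se:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

A censored time contributes no factor to the product; it only shrinks the risk set used at later death times, which is how censored subjects are used without being treated as events.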
    plots=( CL survival(atrisk=0 to 4 by 1)) censoredsymbol="|" ;

[Figure: survival plot produced with the CL option added.]

The LIFETEST Procedure
Stratum 1: trt = A

                                       Survival
                                       Standard    Number    Number
     years      Survival    Failure     Error      Failed     Left
    0.00000      1.0000      0           0            0        45
    0.04654      0.9778      0.0222      0.0220       1        44
    0.11499      0.9556      0.0444      0.0307       2        43
    0.12047      0.9333      0.0667      0.0372       3        42
    0.13142      0.9111      0.0889      0.0424       4        41
    0.16427      0.8889      0.1111      0.0468       5        40
    ....
    3.32375      0.1750      0.8250      0.0572      37         7
    3.37303*      .           .           .          37         6
    3.73990      0.1458      0.8542      0.0546      38         5
    3.98357*      .           .           .          38         4
    4.33949*      .           .           .          38         3
    4.44079*      .           .           .          38         2
    4.45175*      .           .           .          38         1
    4.75291*      .           .           .          38         0

NOTE: The marked survival times are censored observations.

years: time t when survivor function starts a new value
Survival: Kaplan-Meier (product-limit) estimate of the survivor function S(t) for times to the right of time t
Failure: Kaplan-Meier estimate of cumulative mortality, [1 − S(t)] = F(t)
Survival Standard Error: the pointwise standard error of the estimate $\hat{S}(t)$
Number Failed: the total number of events
Number Left: the number still under observation and at risk for the event

95% confidence interval for the estimated survivor function from the usual formula with a standard error (from output):

$\hat{S}(t) \pm 1.96 \times \mathrm{SE}\{\hat{S}(t)\}$

Stratum 1: trt = A

Quartile Estimates

                  Point       95% Confidence Interval
    Percent     Estimate      [Lower        Upper)
       75        1.58795      1.27036        .
       50        0.69541      0.52841      1.32512
       25        0.39425      0.20260      0.53388

      Mean      Standard Error
    1.34660        0.19441

Median survival time is the time t when $\hat{S}(t) = 0.5$, the survivor function equals 50%. If $\hat{S}(t) = 0.5$ over an interval, the median is the midpoint of the interval.

Mean survival time is the area under the Kaplan-Meier survival curve. If the largest observed time in the data is censored, then this area is unspecified. Don’t report mean survival time if there is any censoring.

Summary of censoring in each group.
Summary of the Number of Censored and Uncensored Values

                                                     Percent
    Stratum   group    Total   Failed   Censored    Censored
       1        A        45      38         7         15.56
       2        B        44      41         3          6.82
    ---------------------------------------------------------
    Total                89      79        10         11.24

Precision of estimates depends on the number of events (“Failed”), not the number of observations.

Tests to compare population survivor functions

Lifetest compares population survivor functions S(t) between groups listed in the STRATA statement. Null hypothesis: all groups have the same population survivor function; here, S_A(t) = S_B(t).
• Log rank
• Wilcoxon
• Likelihood ratio test

Ignore the likelihood ratio test: it depends on a strong assumption (exponential density) that is usually wrong.

Rank Statistics

    trt    Log-Rank    Wilcoxon
     A      3.3043      502.00
     B     -3.3043     -502.00

Test of Equality over Strata

    Test         Chi-Square    DF    Pr > Chi-Square
    Log-Rank       0.5654       1        0.4521
    Wilcoxon       4.3162       1        0.0378
    -2Log(LR)      0.3574       1        0.5500

Two usable tests disagree here.

All three tests are based on H0: S_0(t) = S_1(t):
• combine all groups to get a common event rate on each time interval
• for each group in each interval, multiply event rate by sample size to get expected numbers of events

e_jk = expected number of events in group j during time period k
d_jk = observed number of events in group j at time k

Log-rank test. Test statistic is the cumulative difference between observed and expected:

$d_L = \sum_k \left( d_{1k} - e_{1k} \right)$

Test statistic for A was +3.3043, indicating more deaths than expected. Test statistic for B was −3.3043, indicating fewer deaths than expected.

Usually the more sensitive test. Best test when the estimated survivor functions do not cross each other. Often the basis for sample size calculations.

Wilcoxon test. Sample-size weighted sum of differences between observed and expected events:

$d_W = \sum_k n_k \left( d_{1k} - e_{1k} \right)$
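The two rank statistics can be sketched in a few lines of Python. This is an illustration under stated assumptions, not what Proc Lifetest runs: the data are a hypothetical toy example, the function name `rank_statistics` is invented here, and n_k is taken to be the total number at risk in both groups at event time k.

```python
def rank_statistics(group1, group2):
    """Return (d_L, d_W) for group 1.

    Each group is a list of (time, censor) pairs, with censor = 1 if censored.
    d_L sums observed minus expected events; d_W weights each term by n_k.
    """
    pooled = group1 + group2
    event_times = sorted({t for t, c in pooled if c == 0})
    d_logrank = 0.0
    d_wilcoxon = 0.0
    for t in event_times:
        n1 = sum(1 for time, _ in group1 if time >= t)  # at risk in group 1
        n = sum(1 for time, _ in pooled if time >= t)   # at risk overall (n_k)
        d1 = sum(1 for time, c in group1 if time == t and c == 0)
        d = sum(1 for time, c in pooled if time == t and c == 0)
        e1 = n1 * d / n  # common event rate d/n times group 1 sample size
        d_logrank += d1 - e1
        d_wilcoxon += n * (d1 - e1)
    return d_logrank, d_wilcoxon

# Hypothetical toy data: group A has early deaths, then one censored time.
group_a = [(1, 0), (2, 0), (5, 1)]
group_b = [(3, 0), (4, 0), (6, 0)]

d_l, d_w = rank_statistics(group_a, group_b)
print(f"log-rank statistic for A: {d_l:.4f}")
print(f"Wilcoxon statistic for A: {d_w:.4f}")
```

Positive values mean group A had more deaths than expected under the null hypothesis. Because n_k is largest at the earliest event times, early differences between observed and expected counts carry the most weight in the Wilcoxon sum.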
Wilcoxon test gives more weight to the early part of the estimated survivor functions, where there is more information. Wilcoxon is less sensitive to late differences in survivor functions. Use Wilcoxon when estimated survivor functions cross each other.

Which test should we report?

Here the log-rank test gives chi-square 0.5654 (p = 0.4521) and the Wilcoxon test gives chi-square 4.3162 (p = 0.0378).

Think about sample size as well as whether survivor curves cross.

    ODS graphics on;
    Proc Lifetest data=two_years maxtime=2.0
         plots=survival(atrisk=0 to 2 by .5) censoredsymbol="|";
      time   years * censor(1);
      strata trt ;
    run;
    ODS graphics off;

Proc Lifetest TEST statement

Proc Lifetest also compares groups identified in the TEST statement. This is intended to test the effect of a continuous explanatory variable. When used with a categorical variable, such as treatment, results are not the same as from STRATA. Use STRATA, not TEST.

...

This note was uploaded on 11/21/2011 for the course PUBH 6470 taught by Professor Williamthomas during the Fall '11 term at University of Florida.
