Eco 572: Research methods in Demography

The Lee-Carter Model

In this unit we will examine some features of the Lee-Carter approach to forecasting U.S. mortality.

The Mortality Surface

The basic model seeks to summarize an age-period surface of log-rates in terms of vectors a and b along the age dimension and k along the time dimension so that

$\log m_{xt} = a_x + b_x k_t + e_{xt}$

with restrictions such that the b's are normalized to sum to unity and the k's sum to zero, so the a's are average log-rates.

The vector a can be interpreted as an average age-profile, the vector k tracks changes over time, and the vector b determines how much each age group changes when k changes. When k is linear in time each age group changes at its own exponential rate, but this is not a requirement of the model. The error term reflects age-period effects not captured by the model.

Lee and Carter estimated the a's, b's and k's with U.S. mortality data for 1933 to 1987 using a singular value decomposition that will be illustrated below. In a second step they re-estimated k so that it predicts the correct total number of deaths each year. (This essentially changes the weight of each age group.) Estimates of k are obtained for each year from 1900 to 1989 (and then forecast into the future).

The basic data used in the original paper consisted of rates up to ages 80-84 and 85+. Because a large fraction of the population survives to age 80 (30% in 1987) they extended the model to older ages on the basis of work by Coale and Kisker and Coale and Guo showing that after age 80 mortality increases at a linearly declining rate, rather than a constant rate as implied in a Gompertz model.

The Singular Value Decomposition

The Human Mortality Database (HMD) has U.S. mortality rates for five-year age groups (0, 1-4, 5-9, ..., 105-109, 110+) and single calendar years for 1933 to 2001. We will use these data to examine the model. The file usm3301.dat has rates for both sexes combined for each year.
Note that age is coded using the starting value to simplify numerical manipulation. We will reshape this file to a year by age matrix.

    . infile year age m using ///
    >   http://data.princeton.edu/eco572/datasets/usm3301.dat, clear
    (1656 observations read)

    . reshape wide m, i(year) j(age)
    (note: j = 0 1 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100 105 110)

    Data                               long   ->   wide
    Number of obs.                     1656   ->   69
    Number of variables                   3   ->   25
    j variable (24 values)              age   ->   (dropped)
    xij variables:
                                          m   ->   m0 m1 ... m110

The next step is to move the data to Mata, compute logs, and get the mean of each rate over the period 1933-1987....
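To make the decomposition concrete, here is a minimal Python sketch of the first-stage fit on made-up numbers. It is not the Stata/Mata code used in these notes. It takes logs of a year by age table of rates, averages each age group's log-rate to get a, extracts the leading singular vectors of the centered surface by power iteration (standing in for a full singular value decomposition), and rescales so the b's sum to one. The function name lee_carter_fit, the toy values a_true, b_true and k_true, and the use of power iteration are all assumptions made for illustration. The second-stage re-estimation of k against total deaths is not included.

```python
import math

def lee_carter_fit(rates, n_iter=100):
    """Fit log m[t][x] ~ a[x] + b[x] * k[t] to a year-by-age table of rates.

    Hypothetical helper for illustration only. Assumes all rates are positive
    and that the leading age pattern does not sum to zero.
    """
    n_years, n_ages = len(rates), len(rates[0])
    logm = [[math.log(m) for m in row] for row in rates]

    # a[x]: mean log-rate of age group x over the fitting period.
    a = [sum(row[x] for row in logm) / n_years for x in range(n_ages)]

    # Centered surface: each age column now sums to zero over time.
    Y = [[row[x] - a[x] for x in range(n_ages)] for row in logm]

    # Power iteration for the leading right singular vector of Y (age pattern).
    v = [1.0 / math.sqrt(n_ages)] * n_ages
    for _ in range(n_iter):
        u = [sum(yt[x] * v[x] for x in range(n_ages)) for yt in Y]  # Y v
        w = [sum(Y[t][x] * u[t] for t in range(n_years))
             for x in range(n_ages)]                                # Y' u
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:  # no variation left after centering
            break
        v = [c / norm for c in w]

    # Time scores for the final v, then rescale so that sum(b) == 1.
    u = [sum(yt[x] * v[x] for x in range(n_ages)) for yt in Y]
    s = sum(v)
    b = [c / s for c in v]
    k = [c * s for c in u]  # sums to zero because the columns of Y do
    return a, b, k

# Toy surface built exactly from the model (no error term): 5 years, 3 ages.
a_true = [-4.0, -6.0, -3.0]
b_true = [0.5, 0.3, 0.2]             # sums to one
k_true = [2.0, 1.0, 0.0, -1.0, -2.0]  # sums to zero, linear in time
rates = [[math.exp(a_true[x] + b_true[x] * kt) for x in range(3)]
         for kt in k_true]

a, b, k = lee_carter_fit(rates)
print("a:", [round(c, 3) for c in a])
print("b:", [round(c, 3) for c in b])
print("k:", [round(c, 3) for c in k])
```

Because the toy surface has no error term and its k's already sum to zero, the fit should return the generating a, b and k up to rounding. With real rates the rank-one term captures only part of the centered surface and the remainder is the error term e.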
 Spring '06
 Rodriguez
