Eco 572: Research methods in Demography

Rates and Standardization (Revised)

We will work through the example in Preston et al., Sections 2.2 and 2.3. (This revised version of the handout de-emphasizes Stata programming and focuses on canned procedures.)

Sample Data

I copied the counts of midyear population and deaths by age for Sweden and Kazakhstan from Table 2.1 into a text file, which is available on the course website. The file is in "long" format and can be read into Stata using

    . infile str10 country str5 ageg pop deaths ///
    >   using http://data.princeton.edu/eco572/datasets/preston21long.dat
    (38 observations read)

The first thing we do is calculate the age-specific rates, dividing deaths by population and multiplying by 1000:

    . gen rates = 1000 * deaths / pop

Crude death rates are just a weighted average of age-specific rates, using the population in each age group as the weight. We can easily compute them in Stata using the tabstat command:

    . tabstat rates [fw=pop], by(country)

    Summary for variables: rates
         by categories of: country

       country |      mean
    -----------+----------
    Kazakhstan |  7.423042
        Sweden |  10.54756
    -----------+----------
         Total |  8.470285
    ----------------------

The interesting result here is that, judged by the crude rate, mortality appears to be lower in Kazakhstan (7.4 per thousand) than in Sweden (10.5 per thousand).

Standardized Rates

Following Preston et al., we will standardize the rates using the unweighted average of the two population compositions as the standard. To do this we first compute the percent distribution for each country using egen, and then compute the average percent in each age group:

    . egen pcpop = pc(pop), by(country)
    . egen avgcomp = mean(pcpop), by(ageg)

You may want to list the data to verify that avgcomp has the same values for the two countries. Now we can compute the standardized rate in one line:

    . tabstat rates [aw=avgcomp], by(country)

    Summary for variables: rates
         by categories of: country

       country |      mean
    -----------+----------
    Kazakhstan |    11.882
        Sweden |  7.374094
    -----------+----------
         Total |  9.628045
    ----------------------

The only difference is that I specified aw, an "analytic" weight, instead of fw, a "frequency" weight, so Stata wouldn't complain about non-integer weights. Both compute means the same way: multiply each observation by the weight, sum, and divide by the sum of the weights.

Indirect Standardization

Frequently we don't have age-specific rates but can easily obtain the age distribution. We can still do a form of standardization by applying the rates of one country (or any other standard) to the two age distributions. Let us first create a variable that has the Swedish rates for both countries. We do this by sorting by age and then country; within each age group we then pick the rate of the second country, which is Sweden because it sorts after Kazakhstan:

    . bysort ageg (country): gen swrates = rates[2]

The by command is a very powerful feature of Stata that can repeat a command for subgroups. The data must be sorted, but if you specify...
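As a check on the two tabstat results above, the weighted means can also be computed by hand, following the rule stated earlier: multiply each rate by its weight, sum, and divide by the sum of the weights. The lines below are only a sketch. They assume the variables rates, pop, avgcomp and country already exist as created in the session above. The names num, den, cdr, snum, sden and asdr are hypothetical and are not part of the original handout.

```stata
* Sketch only: assumes rates, pop, avgcomp and country exist from the session above.
* num, den, cdr, snum, sden and asdr are illustrative names, not from the handout.

* Crude death rate by hand: sum(rate * pop) / sum(pop) within each country
egen double num = total(rates * pop), by(country)
egen double den = total(pop), by(country)
generate double cdr = num / den

* Directly standardized rate: the same formula with avgcomp as the weight
egen double snum = total(rates * avgcomp), by(country)
egen double sden = total(avgcomp), by(country)
generate double asdr = snum / sden

* cdr and asdr are constant within country, so the mean shows one value each
tabstat cdr asdr, by(country) statistics(mean) nototal
```

If the data match the handout, cdr should agree with the fw=pop means and asdr with the aw=avgcomp means. Note that avgcomp is a percent distribution, so sden should equal 100 in both countries.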
This note was uploaded on 02/12/2008 for the course ECON 572 taught by Professor Rodriguez during the Spring '06 term at Princeton.