Eco 572: Research methods in Demography

Rates and Standardization (Revised)

We will work through the example in Preston et al., sections 2.2 and 2.3. (This revised version of the handout de-emphasizes programming in Stata and focuses on canned procedures.)

Sample Data

I copied the counts of mid-year population and deaths by age for Sweden and Kazakhstan from Table 2.1 into a text file, which is available on the course website. The file is in "long" format and can be read into Stata using

    . infile str10 country str5 ageg pop deaths ///
    >   using http://data.princeton.edu/eco572/datasets/preston21long.dat
    (38 observations read)

The first thing we do is calculate the age-specific rates, dividing deaths by population and multiplying by 1000:

    . gen rates = 1000 * deaths / pop

Crude death rates are just a weighted average of the age-specific rates, using the population in each age group as the weight. We can easily compute them in Stata using the tabstat command:

    . tabstat rates [fw=pop], by(country)

    Summary for variables: rates
         by categories of: country

       country |      mean
    -----------+----------
    Kazakhstan |  7.423042
        Sweden |  10.54756
    -----------+----------
         Total |  8.470285
    ----------------------

The interesting result here is that mortality appears to be lower in Kazakhstan than in Sweden: the crude death rate is about 7.4 per 1000 in Kazakhstan against 10.5 per 1000 in Sweden.

Standardized Rates

Following Preston et al., we will standardize the rates using the unweighted average of the two population compositions as the standard. To do this we first compute the percent distribution for each country using egen, and then compute the average percent in each age group:

    . egen pcpop = pc(pop), by(country)
    . egen avgcomp = mean(pcpop), by(ageg)

You may want to list the data to verify that avgcomp has the same values for the two countries. Now we can compute the standardized rates in one line:

    . tabstat rates [aw=avgcomp], by(country)

    Summary for variables: rates
         by categories of: country

       country |      mean
    -----------+----------
    Kazakhstan |    11.882
        Sweden |  7.374094
    -----------+----------
         Total |  9.628045
    ----------------------

The only difference from the earlier command is that I specified aw, an "analytic" weight, instead of fw, a "frequency" weight, so Stata wouldn't complain about non-integer weights. Both compute means the same way: multiply each observation by the weight, sum, and divide by the sum of the weights. Once both countries are given the same age composition the ranking reverses: the standardized rate is higher in Kazakhstan (11.9) than in Sweden (7.4).

Indirect Standardization

Frequently we don't have age-specific rates but can easily obtain the age distribution. We can still do a form of standardization by applying the rates of one country (or any other standard) to the two age distributions.

Let us first create a variable that has the Swedish rates for both countries. We do this by sorting by age and then country, and for each age we pick the rate of the second country (Sweden, which sorts after Kazakhstan):

    . bysort ageg (country): gen swrates = rates[2]

The by command is a very powerful feature of Stata that can repeat a command for subgroups. The data must be sorted, but if you specify...
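The weighted-mean rule used above (multiply each rate by its weight, sum, and divide by the sum of the weights) can be checked by hand in a few lines. The following is a minimal Python sketch, not part of the handout: the two populations "A" and "B", their three age groups, and all counts and rates are made-up illustrative numbers rather than the Preston et al. data, and the helper names weighted_mean and percent_distribution are hypothetical.

```python
def weighted_mean(values, weights):
    """Multiply each value by its weight, sum, and divide by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def percent_distribution(pop):
    """Percent of the total population falling in each age group."""
    total = sum(pop)
    return [100.0 * p / total for p in pop]


# Hypothetical toy data (NOT from Table 2.1): three age groups, young to old.
rates_a = [2.0, 5.0, 40.0]   # deaths per 1000 in population A
rates_b = [1.0, 3.0, 30.0]   # deaths per 1000 in population B (lower at every age)
pop_a = [500, 400, 100]      # A is a young population
pop_b = [200, 400, 400]      # B is an old population

# Crude rates: age-specific rates weighted by each population's own age counts.
crude_a = weighted_mean(rates_a, pop_a)
crude_b = weighted_mean(rates_b, pop_b)

# Direct standardization: weight both sets of rates by the same standard,
# here the unweighted average of the two percent distributions.
standard = [(a + b) / 2 for a, b in
            zip(percent_distribution(pop_a), percent_distribution(pop_b))]
std_a = weighted_mean(rates_a, standard)
std_b = weighted_mean(rates_b, standard)

# Indirect standardization (one common form): apply B's rates to A's age
# distribution to get the rate A would have under the standard rates.
expected_a = weighted_mean(rates_b, pop_a)

print(f"crude:        A={crude_a:.2f}  B={crude_b:.2f}")
print(f"standardized: A={std_a:.2f}  B={std_b:.2f}")
print(f"A under B's rates: {expected_a:.2f}")
```

In this toy example B has the higher crude rate (13.4 against 7.0) even though its rates are lower at every age, and the comparison reverses once both are weighted by the common standard (9.05 against 12.7), which is the same pattern the handout finds for Sweden and Kazakhstan.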
This note was uploaded on 02/12/2008 for the course ECON 572 taught by Professor Rodriguez during the Spring '06 term at Princeton.