2/18/11

PADP 8120: Data Analysis and Statistical Modeling
Comparing Groups
Spring 2011
Angela Fertig, Ph.D.

Plan
- Last time: Significance tests using one sample
  - E.g. one sample of seniors surveyed after health care reform
- This time: Significance tests comparing groups
  - E.g. comparing a sample of women to a sample of men

Differences between 2 samples
- Often we want to know if 2 groups are significantly different from each other w.r.t. some variable
  - Comparing means
  - Comparing proportions

Comparing means example
- Let's go back to our prescription sample.
- Let's assume that we are interested in whether men and women have different numbers of prescriptions.
- We have 2 samples: 45 men and 55 women.
  - The men have a mean number of prescriptions of 7, with a standard deviation of 15.
  - The women have a mean number of prescriptions of 10, with a standard deviation of 15.

Do women have more prescriptions than men?
- To answer this, we estimate the difference between the populations (the parameter) using the difference between the sample means (the statistic).
- We run a significance test on this statistic, and work out whether our samples are likely to represent real differences between the populations of men and women.
- The null hypothesis is that there is no difference in the mean number of prescriptions between men and women.

Calculate the Z-score

Standard error of the difference between two estimates:

$$ se = \sqrt{(se_1)^2 + (se_2)^2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} $$

Test statistic:

$$ z = \frac{\text{Estimate of parameter} - \text{null hypothesis value}}{\text{Standard error of estimator}} = \frac{(\bar{X}_{women} - \bar{X}_{men}) - 0}{SE(\bar{X}_{women} - \bar{X}_{men})} = \frac{\bar{X}_{women} - \bar{X}_{men}}{\sqrt{\frac{s_{women}^2}{n_{women}} + \frac{s_{men}^2}{n_{men}}}} $$

$$ z = \frac{10 - 7}{\sqrt{\frac{15^2}{55} + \frac{15^2}{45}}} = \frac{3}{\sqrt{4.09 + 5}} \approx 1.00, \qquad SE \approx 3 $$

Graphically
[Figure: sampling distribution of the difference under H0, centered at mean = 0 with SE ≈ 3. 68% of the distribution lies within one SE of 0. The observed difference between sample means is 3, i.e. the women's sample mean is 3 greater than the men's.]

Interpretation
- The p-value for a 2-sided test is 0.32 (1 - 68%).
- This value is high, much higher than our 5% cut-off value.
- So, we fail to reject H0 that men and women do not differ in their prescription use.
- Put another way: prescription use among men and women is not statistically significantly different at the 5% level.

Comparing proportions example
- Say we want to compare the rate of tonsillectomies in 2 similar small towns.
- We sample 500 adolescents from each town and ask their parents whether the adolescents have had their tonsils out.
  - In town 1, the proportion is 0.40
  - In town 2, the proportion is 0.25

Do the towns have different rates of tonsillectomies?
- To answer this, we estimate the difference between the populations (the parameter) using the difference between the sample proportions (the statistic).
- We run a significance test on this statistic, and work out whether our samples are likely to represent real differences between the populations of the 2 towns.
- The null hypothesis is that there is no difference in the tonsillectomy rates of the 2 towns.

Calculate the Z-score

Standard error of the difference between two proportions:

$$ se = \sqrt{(se_1)^2 + (se_2)^2} = \sqrt{\frac{\hat{\pi}_1 (1 - \hat{\pi}_1)}{n_1} + \frac{\hat{\pi}_2 (1 - \hat{\pi}_2)}{n_2}} $$

Standard error of the difference if $H_0: \pi_1 = \pi_2$ is true:

$$ se_0 = \sqrt{\hat{\pi} (1 - \hat{\pi}) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} \quad \text{where} \quad \hat{\pi} = \frac{n_1 \hat{\pi}_1 + n_2 \hat{\pi}_2}{n_1 + n_2} $$

Note: Under H0 the SE is calculated with the pooled proportion $\hat{\pi}$, not with the two separate sample proportions.

$$ z = \frac{(\hat{\pi}_2 - \hat{\pi}_1) - 0}{se_0} = \frac{\hat{\pi}_2 - \hat{\pi}_1}{\sqrt{\hat{\pi} (1 - \hat{\pi}) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} $$

With $\hat{\pi} = \frac{500(0.40) + 500(0.25)}{1000} = 0.325$:

$$ z = \frac{0.25 - 0.40}{\sqrt{0.325 \times 0.675 \times \frac{2}{500}}} = \frac{-0.15}{0.0296} = -5.06 $$

Interpretation
- The p-value for a 2-sided test is < 0.001.
- This value is very small, much lower than our 5% cut-off value.
- So, we reject H0 that the 2 towns do not differ in their tonsillectomy rates.
- Put another way: the tonsillectomy rate in town 1 is statistically significantly different from that in town 2 at the 5% level.

A note about effect size
- Even if something is statistically significantly different from no effect,
  - The magnitude may be so small as to make it not economically significant
  - We may want to compare several differences
  - We may not want to deal with units (pounds, dollars, etc.)
- To deal with these issues, it is sometimes helpful to interpret the results relative to standard deviations; we call this the effect size.

$$ \text{Effect size} = \frac{\bar{y}_1 - \bar{y}_2}{s} $$

- If you got an effect size of 2, then the difference between the means is about twice the standard deviation, which is large.
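The two worked examples and the effect-size formula can be checked numerically. The sketch below is not part of the original slides: the function names are made up for illustration, it uses only Python's standard library, and it gets the two-sided p-value from the normal distribution via `math.erfc`. The effect size for the prescription example is an added illustration that assumes s = 15, the standard deviation shared by both groups.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic z."""
    return math.erfc(abs(z) / math.sqrt(2))


def z_two_means(mean1, sd1, n1, mean2, sd2, n2):
    """z for H0: no difference in means, with se = sqrt(s1^2/n1 + s2^2/n2)."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se, se


def z_two_proportions(p1, n1, p2, n2):
    """z for H0: equal proportions, using the pooled-proportion se0."""
    pooled = (n1 * p1 + n2 * p2) / (n1 + n2)
    se0 = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p2 - p1) / se0, se0


def effect_size(mean1, mean2, sd):
    """Difference in means expressed in standard-deviation units."""
    return (mean1 - mean2) / sd


# Prescription example: 55 women (mean 10, sd 15) vs. 45 men (mean 7, sd 15).
z_means, se_means = z_two_means(10, 15, 55, 7, 15, 45)
p_means = two_sided_p(z_means)

# Tonsillectomy example: town 1 proportion 0.40, town 2 proportion 0.25, n = 500 each.
z_props, se0_props = z_two_proportions(0.40, 500, 0.25, 500)
p_props = two_sided_p(z_props)

# Illustration only (assumption: s = 15, the sd common to both groups).
es_prescriptions = effect_size(10, 7, 15)

print(f"means:       z = {z_means:.2f}, se = {se_means:.2f}, p = {p_means:.2f}")
print(f"proportions: z = {z_props:.2f}, se0 = {se0_props:.4f}, p = {p_props:.2g}")
print(f"effect size (prescriptions): {es_prescriptions:.2f}")
```

Under these assumptions the prescription difference of 3 corresponds to an effect size of 0.2, i.e. one fifth of a standard deviation, which is consistent with the test failing to reject H0.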