Lecture08

Course: SOC 549, Fall 2009
School: Ohio State

Lecture 8: Confidence intervals for the mean
Sociology 549, Paul von Hippel

Overview
- Review of central limit theorem (CLT)
- Turning the CLT around to get confidence intervals
- E.g., confidence interval for a mean: We are 95% confident (how certain are we?) that new sociology BAs (population) have an average starting salary (parameter) between \$27,369 and \$30,299 (interval)

Central limit theorem for the mean: Review
The sample mean is usually within a couple standard errors of the population mean.
- In 95% of all samples, $\bar{Y}$ is in $\mu_Y \pm 1.96\,\sigma_{\bar{Y}}$, where $\sigma_{\bar{Y}} = \sigma_Y / \sqrt{N}$
- In 99% of all samples, $\bar{Y}$ is in $\mu_Y \pm 2.58\,\sigma_{\bar{Y}}$, where $\sigma_{\bar{Y}} = \sigma_Y / \sqrt{N}$

Central limit theorem for the mean: Using abbreviated normal table
The sample mean is usually within a couple standard errors of the population mean.
- In Confidence% of all samples, $\bar{Y}$ is in $\mu_Y \pm Z\,\sigma_{\bar{Y}}$, where $\sigma_{\bar{Y}} = \sigma_Y / \sqrt{N}$
- Z comes from the "z (standard normal)...table"
- Confidence is the area under the standard normal curve from -Z to +Z

Confidence   z
...          ...
95%          1.96
...          ...

Central limit theorem for the mean: So what?
- If you know the population mean, the central limit theorem helps you guess the sample mean.
- But in practice you know the sample mean, and want to guess the population mean.

Turning around the central limit theorem: Normal confidence interval
- Central limit theorem: The sample mean is usually within a couple standard errors of the population mean. In Confidence% of all samples, $\bar{Y}$ is in $\mu_Y \pm Z\,\sigma_{\bar{Y}}$, where $\sigma_{\bar{Y}} = \sigma_Y / \sqrt{N}$
- Confidence interval: The population mean is usually within a couple standard errors of the sample mean. We are Confidence% confident that $\mu_Y$ is in $\bar{Y} \pm Z\,s_{\bar{Y}}$, where $s_{\bar{Y}} = s_Y / \sqrt{N}$
- $\sigma_{\bar{Y}}$ and $s_{\bar{Y}}$ are hopefully close.

Normal confidence interval: Example #1 data
- Source: National Association of Colleges and Employers
- Sample of N = 92 sociology BAs, graduating 2000-01
- Not a random sample, but we'll pretend it is
- Variable: starting salary (Y) in thousands of dollars
- Cases: 38.0, 28.0, 28.0, 24.6, ...
- $\bar{Y} = 28.834$, $s_Y = 7.095$

Normal confidence interval: Example #1 calculation
If we want 95% confidence, then Z = 1.96 (from the table: 95% confidence, z = 1.96).
$\mu_Y$ is in $\bar{Y} \pm Z\,s_{\bar{Y}}$, where $s_{\bar{Y}} = s_Y / \sqrt{N}$
$\mu_Y$ is in $28.834 \pm 1.96\,s_{\bar{Y}}$, where $s_{\bar{Y}} = 7.095 / \sqrt{92} = .7397$
$\mu_Y$ is in $28.834 \pm 1.96(.7397)$
$\mu_Y$ is in $28.834 \pm 1.450$
$\mu_Y$ is between 27.384 and 30.284

Normal confidence interval: Example #1 interpretation
- We are 95% sure that the average salary in the population of new soc BAs, 2000-01, is between \$27.384K and \$30.284K.
- Just an average. Doesn't mean 95% of individual salaries are in the interval.

Definitions
Point estimate
- best single-number guess at a parameter
- rarely right, often close
- \$28.834K is a point estimate for the average salary of new sociology BAs
- $\bar{Y}$ is a point estimate of $\mu_Y$
Confidence interval
- range (interval) of guesses at a parameter
- width reflects uncertainty: vague but usually right
- (\$27.384K, \$30.284K) is a 95% confidence interval for the average salary of new sociology BAs
- $\bar{Y} \pm Z\,s_{\bar{Y}}$ is a confidence interval for $\mu_Y$

What is confidence?
- The confidence interval covers the population mean unless you've got a weird sample.
- "95% confidence" = In 95% of all samples, the confidence interval includes the population mean. In 5% of all samples, the confidence interval fails to cover the mean.
- Do you know if you've got a weird sample? No! So 5% of the time you're wrong.
[Figure: typical samples (95%), with weird samples on either side]
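
The Example #1 arithmetic and the "in 95% of all samples" reading of confidence can be checked with a short Python sketch. The function names `normal_ci` and `coverage` are hypothetical, and the simulated population (normal, with mean and standard deviation set equal to the Example #1 sample values) is an assumption made only for illustration; the lecture does not specify a population.

```python
import math
import random
import statistics


def normal_ci(mean, sd, n, z=1.96):
    """Normal confidence interval: mean +/- z * (sd / sqrt(n))."""
    se = sd / math.sqrt(n)  # estimated standard error of the mean
    margin = z * se
    return mean - margin, mean + margin


# Example #1 from the lecture: N = 92, sample mean 28.834, sample SD 7.095
# (salaries in thousands of dollars).
low, high = normal_ci(28.834, 7.095, 92)
print(f"95% normal CI: ({low:.3f}, {high:.3f})")  # (27.384, 30.284)


def coverage(pop_mean, pop_sd, n, z=1.96, reps=2000, seed=0):
    """Fraction of simulated samples whose interval covers pop_mean.

    ASSUMPTION: the population is normal; this is for illustration only.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        lo, hi = normal_ci(statistics.fmean(sample), statistics.stdev(sample), n, z)
        hits += lo <= pop_mean <= hi  # True counts as 1
    return hits / reps


# Share of simulated samples in which the interval covers the true mean.
print(f"simulated coverage: {coverage(28.834, 7.095, 92):.3f}")
```

The simulated share should land near 95%, though the exact figure depends on the seed and the number of repetitions. Each simulated interval is computed from its own sample, so the "weird" samples are simply the ones whose interval misses the population mean.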
Is a 99% confidence interval narrower or wider than a 95% confidence interval?

Problem with the normal confidence interval
- Central limit theorem: The sample mean is usually within a couple standard errors of the population mean. In Confidence% of all samples, $\bar{Y}$ is in $\mu_Y \pm Z\,\sigma_{\bar{Y}}$, where $\sigma_{\bar{Y}} = \sigma_Y / \sqrt{N}$
- Confidence interval: The population mean is usually within a couple standard errors of the sample mean. We are Confidence% confident that $\mu_Y$ is in $\bar{Y} \pm Z\,s_{\bar{Y}}$, where $s_{\bar{Y}} = s_Y / \sqrt{N}$
- $\sigma_{\bar{Y}}$ and $s_{\bar{Y}}$ are not identical.

Sampling error in $s_Y$
- The normal confidence interval assumes $s_Y = \sigma_Y$
- But $s_Y$ is just an estimate of $\sigma_Y$
- Uncertainty (sampling error) in this estimate is not reflected in the normal confidence interval
- The normal confidence interval is too certain, too confident, too narrow
- We say it's right in 95% of all samples
- But it's actually right in < 95%

The t distribution
- Normal distribution: How many true (population) standard errors $\sigma_{\bar{Y}} = \sigma_Y / \sqrt{N}$ separate $\bar{Y}$ from $\mu_Y$?
- t distribution: How many estimated (sample) standard errors $s_{\bar{Y}} = s_Y / \sqrt{N}$ separate $\bar{Y}$ from $\mu_Y$?

t distribution
- The t distribution reflects uncertainty about $s_Y$ as an estimate of $\sigma_Y$
- If N is large, then $s_Y$ is close to $\sigma_Y$, so the t distribution is close to the standard normal distribution (black)
- Instead of N, though, we use degrees of freedom (df): df = N - 1

t confidence intervals for the mean
We are Confidence% confident that $\mu_Y$ is in $\bar{Y} \pm t\,s_{\bar{Y}}$, where $s_{\bar{Y}} = s_Y / \sqrt{N}$
t comes from your "z (standard normal) and t table", using degrees of freedom df = N - 1.

Confidence    90%   91%   92%   93%   94%   95%   96%   97%   98%   99%   99.9%
z             1.64  1.70  1.75  1.81  1.88  1.96  2.05  2.17  2.33  2.58  3.29
t, df = 100   1.66  1.71  1.77  1.83  1.90  1.98  2.08  2.20  2.36  2.63  3.39
t, df = 50    1.68  1.73  1.79  1.85  1.92  2.01  2.11  2.23  2.40  2.68  3.50
t, df = 40    1.68  1.74  1.80  1.86  1.94  2.02  2.12  2.25  2.42  2.70  3.55
t, df = 30    1.70  1.75  1.81  1.88  1.95  2.04  2.15  2.28  2.46  2.75  3.65
t, df = 25    1.71  1.76  1.82  1.89  1.97  2.06  2.17  2.30  2.49  2.79  3.73

Normal vs. t confidence intervals for the mean: Comparing the formulas
- Normal confidence intervals: $\mu_Y$ is in $\bar{Y} \pm Z\,\sigma_{\bar{Y}}$, where $\sigma_{\bar{Y}} = \sigma_Y / \sqrt{N}$
- t confidence intervals: $\mu_Y$ is in $\bar{Y} \pm t\,s_{\bar{Y}}$, where $s_{\bar{Y}} = s_Y / \sqrt{N}$
Differences:
1. Z vs. t (Appendix B vs. Appendix C)
2. $\sigma$ vs. s (population vs. sample)
3. t uses degrees of freedom df = N - 1. Z doesn't.
t intervals are wider. Confidence level is more accurate.

Confidence interval for the mean: Example #2 data
- Source: National Association of Colleges and Employers
- Sample of N = 92 sociology BAs, graduating 2000-01
- Variable: starting salary in thousands of dollars
- Cases: 38.0, 28.0, 28.0, 24.6, ...
- $\bar{Y} = 28.834$, $s_Y = 7.095$

Confidence interval for the mean: Example #2 calculation
If we want 95% confidence, then since df = N - 1 = 91 (closest to 100), t = 1.98.

Confidence   z      t, df = 100   df = 50   df = 40   df = 30   df = 25
95%          1.96   1.98          2.01      2.02      2.04      2.06

$\mu_Y$ is in $\bar{Y} \pm t\,s_{\bar{Y}}$, where $s_{\bar{Y}} = s_Y / \sqrt{N}$
$\mu_Y$ is in $28.834 \pm 1.98\,s_{\bar{Y}}$, where $s_{\bar{Y}} = 7.095 / \sqrt{92} = .740$
$\mu_Y$ is in $28.834 \pm 1.98(.740)$
$\mu_Y$ is in $28.834 \pm 1.465$
$\mu_Y$ is between 27.369 and 30.299

Confidence interval for the mean: Example #2 interpretation
- We are 95% sure that the average salary in the population of new soc BAs, 2000-01, is between \$27,369 and \$30,299.
- Just an average. Doesn't mean 95% of individual salaries are in the interval.

Normal vs. t confidence intervals for the mean: Comparing the results, Example #2
- 95% normal confidence interval: (\$27,384, \$30,284)
- 95% t confidence interval: (\$27,369, \$30,299)
- The t interval is wider. But just \$30 wider.
- df is large, so t is close to Z. Here df = N - 1 = 91. In student GSS, df = N - 1 = 1427, so t is even closer to Z.

Confidence interval for the mean: Example #3 data
- Sample of N = 15 US adults from student GSS
- Variable: CHILDS ("How many children have you ever had?")
- Cases: 3, 6, 2, 2, ...
- Small sample. Maybe t will make a difference.
- $\bar{Y} = 2.38$, $s_Y = 1.502$
...
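
The normal vs. t comparison from Example #2 can be reproduced with a short Python sketch. It copies the 95% critical values from the lecture's table rather than computing them, and it picks the tabulated df closest to the actual df, as the lecture does for df = 91. The names `T_95`, `nearest_tabled_df`, and `interval` are hypothetical helpers introduced here for illustration.

```python
import math

# 95% critical values copied from the lecture's table (keys are df).
T_95 = {100: 1.98, 50: 2.01, 40: 2.02, 30: 2.04, 25: 2.06}
Z_95 = 1.96


def nearest_tabled_df(df):
    """Return the tabulated df closest to the actual df."""
    return min(T_95, key=lambda tabled: abs(tabled - df))


def interval(mean, sd, n, critical):
    """Confidence interval: mean +/- critical * (sd / sqrt(n))."""
    se = sd / math.sqrt(n)  # estimated standard error of the mean
    return mean - critical * se, mean + critical * se


# Example #2 data: N = 92, sample mean 28.834, sample SD 7.095 (thousands).
n, mean, sd = 92, 28.834, 7.095
t_crit = T_95[nearest_tabled_df(n - 1)]  # df = 91 is closest to 100

z_low, z_high = interval(mean, sd, n, Z_95)
t_low, t_high = interval(mean, sd, n, t_crit)
extra_width = (t_high - t_low) - (z_high - z_low)

print(f"normal: ({z_low:.3f}, {z_high:.3f})")
print(f"t:      ({t_low:.3f}, {t_high:.3f})")
print(f"t interval is wider by about {extra_width * 1000:.0f} dollars")
```

Because the standard error is the same in both intervals, the only difference in width comes from the critical value (1.98 vs. 1.96), which is why the gap is so small at df = 91.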

