02/23/11 Soc 210 summer 2010

Sociology 210
Lecture 13: Confidence Intervals and Chi-Square Statistic

Estimation

Point estimate: a value of a statistic used to estimate an unknown population parameter.

A confidence interval is a range of values, computed from the sample data, that has a specified probability of containing the true value of the parameter being estimated.

The probability that the confidence interval contains the parameter is called the confidence level (usually a number close to 1, e.g., 0.99, 0.95, 0.90).

A 95% confidence interval has a 0.95 probability of containing the parameter.

If the parameter being estimated were μ, the 95% confidence interval might look like the following:

12.5 ≤ μ ≤ 30.2

What this means is that the interval between 12.5 and 30.2 has a 0.95 probability of containing μ.

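One way to get a feel for the confidence level is to simulate it. The sketch below is only an illustration under assumed values: the population mean (50), the population SD (10), the number of repetitions, and the function name `coverage` are all hypothetical and do not come from the lecture. It draws many samples of size 20 from a normal population, builds a 95% interval from each sample (2.093 is the two-tailed 95% critical value of the t distribution with 19 degrees of freedom), and reports the fraction of intervals that contain the true mean.

```python
import random
import statistics


def coverage(n_samples=2000, n=20, mu=50.0, sigma=10.0, t_star=2.093, seed=0):
    """Return the fraction of simulated intervals that contain the true mean mu.

    All default values are illustrative assumptions, not lecture data.
    t_star = 2.093 is the two-tailed 95% critical value for df = n - 1 = 19.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        # Draw one sample of size n from a normal population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = statistics.mean(sample)
        # Standard error of the mean, using the sample SD (n - 1 denominator).
        se = statistics.stdev(sample) / n ** 0.5
        margin = t_star * se
        # Count the interval as a hit if it contains the true mean.
        if mean - margin <= mu <= mean + margin:
            hits += 1
    return hits / n_samples


if __name__ == "__main__":
    print(f"Fraction of intervals containing mu: {coverage():.3f}")
```

The reported fraction should land close to 0.95, which is what the confidence level describes: how often intervals built this way capture the parameter.
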
Example to Get Intuition

Suppose we are advising a movie producer who has discovered a script that Hitchcock wrote but never made into a movie. The producer wants to figure out whether to make the movie, but needs some estimate as to how much money the movie might bring in at the box office.

We have data from a sample of past Hitchcock movies. Let’s assume that this potential movie’s earnings would be drawn from the same population of movies as our sample, so we can use the sample to give the producer some advice.

Our best point estimate would be the mean from this sample, but we also want to provide a range of values and our confidence level.

The Data (n = 20)

Hitchcock movies and their box office earnings

Movie                            Box office earnings
Family Plot                      \$13,200,000
Frenzy                           \$12,600,000
Topaz                             \$6,000,000
Torn Curtain                     \$13,000,000
Marnie                            \$7,000,000
The Birds                        \$11,403,529
Psycho (1960)                    \$32,000,000
North by Northwest               \$13,275,000
Vertigo                           \$3,200,000
The Wrong Man                     \$2,000,000
The Man Who Knew Too Much        \$10,250,000
The Trouble With Harry            \$7,000,000
To Catch a Thief                  \$8,750,000
Rear Window                      \$27,559,601
Dial M for Murder (1954)          \$6,000,000
Strangers on a Train              \$7,000,000
Notorious                         \$8,000,000
Spellbound                        \$7,000,000
Suspicion                         \$4,500,000
Rebecca                           \$6,000,000

Mean = \$10,286,907
SD = \$7,467,486

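As a quick check on the summary statistics, the sketch below recomputes the mean and SD from the 20 earnings figures on this slide. It assumes the reported SD is the sample standard deviation (n − 1 in the denominator); the variable names are illustrative.

```python
import statistics

# Box office earnings (in dollars) for the 20 Hitchcock movies on the slide.
earnings = [
    13_200_000, 10_250_000, 12_600_000, 7_000_000, 6_000_000,
    8_750_000, 13_000_000, 27_559_601, 7_000_000, 6_000_000,
    11_403_529, 7_000_000, 32_000_000, 8_000_000, 13_275_000,
    7_000_000, 3_200_000, 4_500_000, 2_000_000, 6_000_000,
]

n = len(earnings)
mean = statistics.mean(earnings)
# statistics.stdev uses the n - 1 denominator (sample SD), which is an
# assumption about how the slide's SD was calculated.
sd = statistics.stdev(earnings)

print(f"n = {n}")
print(f"Mean = {mean:,.2f}")
print(f"SD = {sd:,.2f}")
```

The printed values should agree with the slide's mean and SD up to rounding.
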
Example Continued

In order to create a confidence interval, we’ll start with the mean (our best guess), put a margin of error around it to create an interval (a range of numbers), and then attach a probability to that range (called the confidence level, or γ).

Where do we get the margin of error and confidence level from?

If we assume a normal distribution of movie earnings, then the standardized sample means will follow a t distribution rather than a normal distribution, because we don’t know σ in the population and must estimate it with s.

We can pick a confidence level and use the t distribution to get the margin of error.

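[Figure: the distribution of sample means centered on X̄, with the interval from X̄ − E to X̄ + E marked on the horizontal axis. On the corresponding t scale, centered on 0, the area between −t* and t* has probability γ.]

Choose E such that the confidence interval contains the middle 100γ% of means in the distribution of sample means centered on X̄:

$E = t^{*} s_{\bar{X}}$
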
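The step from the critical value t* to the interval around the sample mean can be written out as a short derivation. This is a sketch of the standard argument, with $s_{\bar{X}}$ denoting the standard error of the mean and $E$ the margin of error; it assumes the standardized sample mean follows a t distribution.

```latex
\begin{align*}
P\left(-t^{*} \le \frac{\bar{X} - \mu}{s_{\bar{X}}} \le t^{*}\right) &= \gamma \\
P\left(-t^{*} s_{\bar{X}} \le \bar{X} - \mu \le t^{*} s_{\bar{X}}\right) &= \gamma \\
P\left(\bar{X} - E \le \mu \le \bar{X} + E\right) &= \gamma,
\qquad E = t^{*} s_{\bar{X}}
\end{align*}
```

The last line follows by rearranging the inequalities to isolate μ, which is why the interval is centered on the sample mean and extends E in each direction.
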
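Example Continued

If we want to be 90% confident (γ = 0.90), then we choose t* based on 5 percent (a probability of 0.05) in each tail, using the t distribution with df = n − 1 = 20 − 1 = 19:

$t^{*} = 1.729$

We also need to calculate the SE for this distribution:

$s_{\bar{X}} = \frac{s}{\sqrt{n}} = \frac{\$7{,}467{,}486}{\sqrt{20}} \approx \$1{,}669{,}781$

So our margin of error (E) is:

$E = t^{*} s_{\bar{X}} = 1.729 \times \$1{,}669{,}781 \approx \$2{,}887{,}051$

And our 90% confidence interval is:

$\$10{,}286{,}907 \pm \$2{,}887{,}051$, that is, $\$7{,}399{,}856 \le \mu \le \$13{,}173{,}958$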
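The calculation on this slide can be sketched in a few lines of Python. The helper name `t_confidence_interval` is hypothetical; the inputs are the mean, SD, and n from the data slide, and t* = 1.729 is taken from the lecture rather than computed.

```python
import math


def t_confidence_interval(mean, sd, n, t_star):
    """Return (standard error, margin of error, (lower, upper)).

    t_star is the critical value looked up from a t table for df = n - 1.
    """
    se = sd / math.sqrt(n)      # standard error of the mean
    margin = t_star * se        # margin of error E
    return se, margin, (mean - margin, mean + margin)


# Summary statistics from the Hitchcock sample; t* for 90% confidence, df = 19.
se, margin, (low, high) = t_confidence_interval(
    mean=10_286_907, sd=7_467_486, n=20, t_star=1.729
)

print(f"SE = {se:,.0f}")
print(f"E  = {margin:,.0f}")
print(f"90% CI: {low:,.0f} to {high:,.0f}")
```

Changing `t_star` to the critical value for a different confidence level (for example, 2.093 for 95% with df = 19) widens or narrows the interval accordingly.
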

