Week eight session two: Confidence Intervals (Chapter 19)
Professor Esfandiari

The objective of this lecture is to:
• Revisit and clarify the relationship between the Central Limit Theorem, the normal model, and the construction of confidence intervals.
• Show you how to construct a confidence interval for the population proportion and how to interpret it.
• Show you how to construct a confidence interval for the population mean and how to interpret it.
• Clarify the meaning and calculation of margin of error.
• Elaborate the effect of the size of the sample on the margin of error.
• Calculate the sample size needed for a specific level of confidence.

Calculation and interpretation of the confidence interval for
population proportion

As elaborated in the previous lecture, based on the CLT, the distribution of sample proportions for independent random samples of size 30 or larger tends to follow the normal distribution with mean equal to $p$ (the population proportion) and standard deviation equal to $\sqrt{pq/n}$, where $q = 1 - p$.

Since we do not have $p$, we cannot find the true standard deviation of the distribution of sample proportions. Instead we use $\hat{p}$ to calculate the standard error of the proportion:

$SE(\hat{p}) = \sqrt{\hat{p}\hat{q}/n}$

Example: Suppose we want to find out what proportion of UCLA students endorse online teaching. We choose a random sample of 256 UCLA undergraduates and ask them if they endorse online teaching. Let us say that one hundred answer "Yes". Construct and interpret the 95% confidence interval.

$\hat{p} = f/n = 100/256 = 0.39$ and $\hat{q} = 1 - \hat{p} = 1 - 0.39 = 0.61$

In the normal distribution 95% of the area is within $z = \pm 1.96$, so the 95% confidence interval for $p$ is $\hat{p} \pm 1.96 \times SE(\hat{p})$.

$SE(\hat{p}) = \sqrt{\hat{p}\hat{q}/n} = \sqrt{0.39 \times 0.61/256} = 0.03$

95% CI $= 0.39 \pm 1.96 \times 0.03 = 0.39 \pm 0.06 = (0.33, 0.45)$

Interpretation: We are 95% confident that between 33% and 45% of UCLA undergraduates endorse online teaching.

Formally, we mean that 95% of samples of this size will produce confidence intervals that capture the true population proportion. This is correct, but not as clear to a non-statistical audience. What we imply is that "we are 95% confident that the true population proportion lies in this interval."

The concept of margin of error

The formula for the 95% confidence interval consists of two parts: $\hat{p}$, the sample estimate, and $\pm z^* \times SE(\hat{p})$.

If we just use $\hat{p}$ as an estimate of $p$, we are making a point estimate. A point estimate on its own gives no indication of how far it may be from $p$, which is why we compute a confidence interval and call it an interval estimate.

The second part of the formula for the confidence interval is referred to as the margin of error. Thus, when you come across a statement such as:

The proportion of the American public who are worried about the economy is 65% ± 3%

65% is $\hat{p}$, the proportion in the sample, and 3% is the margin of error, making the confidence interval (usually 95%) from 62% to 68%.

Thus, the margin of error for the confidence interval for a population proportion is equal to:

$ME = \pm z^* \times SE(\hat{p}) = \pm z^* \sqrt{\hat{p}\hat{q}/n}$

The $z^*$ values change depending upon the confidence level that one intends to use (see Z table):
For 90% confidence interval $z^* = \pm 1.645$ (also rounded to 1.64 or 1.65)
For 95% confidence interval $z^* = \pm 1.96$, usually rounded to $\pm 2$
For 99% confidence interval $z^* = \pm 2.57$ or $2.58$

These values are referred to as CRITICAL VALUES.

Assumptions for constructing the confidence interval:

Independence: The choice of participants in the sample should be independent; choosing one participant should not depend upon choosing somebody else (for example, a friend, a sister, etc.).
Randomization condition: Confidence intervals can only be computed for random samples and not for samples of convenience or voluntary samples.
Success/Failure condition: $n\hat{p}$ and $n\hat{q}$ should both be at least 10 to justify the use of the normal model.

Calculation of sample size using the margin of error

Suppose that you want to conduct a study at UCLA to estimate the proportion of undergraduates who plan to go to graduate school. You would like to have a margin of error equal to 3% and you want a 95% confidence interval. The problem here is that we do not have a $\hat{p}$. We are going to let $\hat{p}$ equal 0.50 because this is the value that makes $\hat{p}\hat{q}$, and thus $n$, the largest, so we will not be underestimating the value of $n$. The margin of error is:

$ME = z^* \sqrt{\hat{p}\hat{q}/n}$
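The choice of 0.50 for the sample proportion in the paragraph above can be checked numerically. The following is a minimal sketch, not part of the lecture: it evaluates the product of a proportion and its complement over a grid of candidate values and reports where the product peaks. The names `pq_product` and `grid`, and the grid spacing of 0.01, are illustrative assumptions.

```python
# Illustrative check (not from the lecture): where is p * (1 - p) largest?

def pq_product(p: float) -> float:
    """Return p * q, where q = 1 - p."""
    return p * (1.0 - p)

# Candidate proportions 0.00, 0.01, ..., 1.00 (the spacing is an arbitrary choice).
grid = [i / 100 for i in range(101)]

# The proportion on the grid with the largest product p * q.
best_p = max(grid, key=pq_product)

print(best_p, pq_product(best_p))  # 0.5 0.25
```

Because the required sample size is proportional to this product, any other guess for the proportion would give a smaller sample size than 0.50 does.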
To get $n$ out of the square root, we square both sides of the equation:

$ME^2 = \dfrac{(z^*)^2\,\hat{p}\hat{q}}{n}$

We now multiply both sides of the equation by $n$:

$ME^2 \times n = \dfrac{(z^*)^2\,\hat{p}\hat{q}}{n} \times n$

$n$ cancels on the right side:

$ME^2 \times n = (z^*)^2\,\hat{p}\hat{q}$

$n = \dfrac{(z^*)^2\,\hat{p}\hat{q}}{ME^2}$

$n = \dfrac{(1.96)^2 (0.50)(0.50)}{(0.03)^2} \approx 1067.1$

Since a fraction of a person cannot be sampled, we round up to $n = 1068$.

Calculation and interpretation of the confidence interval for
population mean or $\mu$

As elaborated in the previous lecture, based on the CLT, the distribution of sample means for independent random samples of size 30 or larger tends to follow the normal distribution with mean equal to $\mu$ (the population mean) and standard deviation equal to $\sigma/\sqrt{n}$.

Since most of the time we do not have $\sigma$ (the SD in the population), we cannot find the true standard deviation of the distribution of sample means. Instead we use $s$ (the SD in the sample) to calculate the standard error of the mean:

$SE(\bar{x}) = s/\sqrt{n}$

Example: Suppose we want to estimate the average annual salary of female physicians who specialize in family medicine. We choose a random sample of 81 female physicians who were hired in 2010. Let us assume that the average annual salary and standard deviation for this sample were equal to 130K and 10K respectively. Construct and interpret the 99% confidence interval for the population mean.

In the normal distribution 99% of the area is within $z = \pm 2.57$, so the 99% confidence interval for $\mu$ is $\bar{x} \pm 2.57 \times SE(\bar{x})$.

$SE(\bar{x}) = 10/\sqrt{81} = 10/9 = 1.11$

99% CI $= 130 \pm 2.57 \times 1.11 = 130 \pm 2.85 = (127.15, 132.85)$, or \$127,150 to \$132,850.

Interpretation: We are 99% confident that the average annual salary for female physicians who specialize in family medicine was between \$127,150 and \$132,850 in 2010.

Formally, we mean that if we chose 100 random samples of size 81 from the population of female physicians who specialize in family medicine and were hired in 2010, about 99 of the resulting confidence intervals would capture the true value of the population mean $\mu$. However, such an interpretation might be too technical for a non-statistical audience.

Solution to the problems from the book from chapter 19

2. Margin of error.
He believes the true percentage of children who are exposed to lead-base paint is within 3% of his estimate, with some degree of confidence, perhaps 95% confidence.

5. Conclusions.
a) Not correct. This statement implies certainty. There is no level of confidence in the statement.
b) Not correct. Different samples will give different results. Many fewer than 95% of samples are expected to have exactly 88% on-time orders.
c) Not correct. A confidence interval should say something about the unknown population proportion, not the sample proportion in different samples.
d) Not correct. We know that 88% of the orders arrived on time. There is no need to make an interval for the sample proportion.
e) Not correct. The interval should be about the proportion of on-time orders, not the days.

7. Confidence intervals.
a) False. For a given sample size, higher confidence means a larger margin of error.
b) True. Larger samples lead to smaller standard errors, which lead to smaller margins of error.
c) True. Larger samples are less variable, which makes us more confident that a given confidence interval succeeds in catching the population proportion.
d) False. The margin of error decreases as the square root of the sample size increases. Halving the margin of error requires a sample four times as large as the original.

15. Contributions please.
a) Randomization condition: Letters were sent to a random sample of 100,000 potential donors.
10% condition: We assume that the potential donor list has more than 1,000,000 names.
Success/Failure condition: $n\hat{p} = 4{,}781$ and $n\hat{q} = 95{,}219$ are both much greater than 10, so the sample is large enough.

$\hat{p} \pm z^* \sqrt{\hat{p}\hat{q}/n} = \dfrac{4781}{100000} \pm 1.96 \sqrt{\dfrac{(4781/100000)(95219/100000)}{100000}} = (0.0465, 0.0491)$

We are 95% confident that between 4.65% and 4.91% of potential donors would donate.
b) The confidence interval gives the set of plausible values with 95% confidence. Since 5% is outside the interval, it seems to be a bit optimistic.

26. Pregnancy.
a) Independence assumption: There is no reason to believe that one woman's ability to conceive would affect others.
Randomization condition: These women are not chosen at random. Assume that they are representative of all women under 40 that had previously been unable to conceive.
10% condition: 207 women is less than 10% of all such women.
Success/Failure condition: $n\hat{p} = 49$ and $n\hat{q} = 158$ are both greater than 10, so the sample is large enough.
Since the conditions are met, we can use a one-proportion z interval to estimate the proportion of births to women at the clinic.

$\hat{p} \pm z^* \sqrt{\hat{p}\hat{q}/n} = \dfrac{49}{207} \pm 1.645 \sqrt{\dfrac{(49/207)(158/207)}{207}} = (0.188, 0.285)$

b) We are 90% confident that between 18.8% and 28.5% of women under 40 who are treated at this clinic will give birth.
c) About 90% of random samples of size 207 will produce confidence intervals that contain the true proportion of women under 40 who are treated at this clinic that will give birth.
d) These data do not refute the clinic's claim of a 25% success rate, since 25% is in the interval.

35. Graduation.
a)
$ME = z^* \sqrt{\hat{p}\hat{q}/n}$
$0.06 = 1.645 \sqrt{(0.25)(0.75)/n}$
$n = \dfrac{(1.645)^2 (0.25)(0.75)}{(0.06)^2}$
$n \approx 141$ people

In order to estimate the proportion of non-graduates in the 25-to-30-year-old age group to within 6% with 90% confidence, we would need a sample of at least 141 people. All decimals in the final answer must be rounded up, to the next person. (For a more cautious answer, let $\hat{p} = \hat{q} = 0.5$. This method results in a required sample of 188 people.)

b)
$ME = z^* \sqrt{\hat{p}\hat{q}/n}$
$0.04 = 1.645 \sqrt{(0.25)(0.75)/n}$
$n = \dfrac{(1.645)^2 (0.25)(0.75)}{(0.04)^2}$
$n \approx 318$ people

In order to estimate the proportion of non-graduates in the 25-to-30-year-old age group to within 4% with 90% confidence, we would need a sample of at least 318 people. All decimals in the final answer must be rounded up, to the next person. (For a more cautious answer, let $\hat{p} = \hat{q} = 0.5$. This method results in a required sample of 423 people.)
Alternatively, the margin of error is now 2/3 of the original, so the sample size must be increased by a factor of 9/4. $141(9/4) \approx 318$ people.

c)
$ME = z^* \sqrt{\hat{p}\hat{q}/n}$
$0.03 = 1.645 \sqrt{(0.25)(0.75)/n}$
$n = \dfrac{(1.645)^2 (0.25)(0.75)}{(0.03)^2}$
$n \approx 564$ people

In order to estimate the proportion of non-graduates in the 25-to-30-year-old age group to within 3% with 90% confidence, we would need a sample of at least 564 people. All decimals in the final answer must be rounded up, to the next person. (For a more cautious answer, let $\hat{p} = \hat{q} = 0.5$. This method results in a required sample of 752 people.)
Alternatively, the margin of error is now half that of the original, so the sample size must be increased by a factor of 4. $141(4) = 564$ people.

2. Margin of error. A medical researcher estimates the percentage of children who are exposed to lead-base paint, adding that he believes his estimate has a margin of error of about 3%. Explain what the margin of error means.

5. Conclusions. A catalog sales company promises to deliver orders placed on the Internet within 3 days. Follow-up calls to a few randomly selected customers show that a 95% confidence interval for the proportion of all orders that arrive on time is 88% ± 6%. What does this mean? Are these conclusions correct? Explain.
a) Between 82% and 94% of all orders arrive on time.
b) 95% of all random samples of customers will show that 88% of orders arrive on time.
c) 95% of all random samples of customers will show that 82% to 94% of orders arrive on time.
d) We are 95% sure that between 82% and 94% of the orders placed by the customers in this sample arrived on time.
e) On 95% of the days, between 82% and 94% of the orders will arrive on time.

Take the offer. First USA, a major credit card company, is planning a new offer for their current cardholders. The offer will give double airline miles on purchases for the next 6 months if the cardholder goes online and registers for the offer. To test the effectiveness of the campaign, First USA recently sent out offers to a random sample of … cardholders. Of those, 1184 registered.
a) Give a 95% confidence interval for the true proportion of those cardholders who will register for the offer.
b) If the acceptance rate is only 2% or less, the campaign won't be worth the expense. Given the confidence interval you found, what would you say?

26. Pregnancy. In 1998 a San Diego reproductive clinic reported 49 live births to 207 women under the age of 40 who had previously been unable to conceive.
a) Find a 90% confidence interval for the success rate at this clinic.
b) Interpret your interval in this context.
c) Explain what "90% confidence" means.
d) Do these data refute the clinic's claim of a 25% success rate? Explain.

… to cut the margin of error to 4%. What's the necessary sample size? ...
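Answers to exercises like the ones above can be checked with a few lines of code. The following is a minimal sketch under stated assumptions, not the textbook's or the lecturer's method of record: the function names `proportion_ci` and `required_sample_size` are hypothetical, the critical value 1.645 is the 90% value from the lecture, and the planning value of 0.25 for the sample proportion is taken from the solution to exercise 35.

```python
import math


def proportion_ci(successes: int, n: int, z_star: float) -> tuple[float, float]:
    """One-proportion z interval: p_hat +/- z* * sqrt(p_hat * q_hat / n)."""
    p_hat = successes / n
    standard_error = math.sqrt(p_hat * (1.0 - p_hat) / n)
    margin = z_star * standard_error
    return p_hat - margin, p_hat + margin


def required_sample_size(me: float, z_star: float, p_hat: float = 0.5) -> int:
    """Sample size from n = z*^2 * p_hat * q_hat / ME^2, rounded up to a whole person.

    The default p_hat = 0.5 is the cautious choice that gives the largest n.
    """
    return math.ceil(z_star ** 2 * p_hat * (1.0 - p_hat) / me ** 2)


# Exercise 26 (Pregnancy): 49 live births among 207 women, 90% confidence.
low, high = proportion_ci(49, 207, z_star=1.645)
print(f"({low:.3f}, {high:.3f})")  # (0.188, 0.285)

# Exercise 35 (Graduation): 4% margin of error at 90% confidence, p_hat = 0.25.
print(required_sample_size(0.04, z_star=1.645, p_hat=0.25))  # 318
```

Rounding up with `math.ceil` follows the rule stated in the solutions that all decimals in the final answer are rounded up to the next person.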
This note was uploaded on 12/03/2011 for the course STATISTICS 10 taught by Professor Gould during the Fall '11 term at UCLA.