Interval Estimation
Utku Suleymanoglu
UMich

Introduction
Point estimators give us a single number as a guess for an unknown population parameter.
But we know that estimators are random variables: they can take different values with different samples.
Maybe it is better to create a range estimate for the true population parameter.
Not a single number but an interval: a lower limit and an upper limit.
Intervals are constructed to provide range estimates for true population parameters.
They have the general form: Point Estimate ± Margin of Error
Again we will focus on X̄ and P̄.

Interval Estimation of Population Mean
We will start with the population mean. We know x̄ is "the" estimator for it. It is a good idea to base our interval on x̄.
We will create intervals centered at x̄ by adding and subtracting the margin of error from it. But how do we choose the margin of error?
Notice how the margin of error determines the width of the interval. How big should the interval be?
We will use a probabilistic way of doing this.
Confidence Interval General Form
Suppose an unknown µ (or any population parameter) is of interest, and we have figured out two values, L and U, such that P(L < µ < U) = 1 − α, where α is in (0, 1). Then the interval (L, U) is called the 100(1 − α)% confidence interval.
Is this a good idea? We will see...
But for now, let's try to do what this asks for.

Case 1: Normally distributed population, σ known
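We learned that if X ∼ N(µ, σ²) then X̄ ∼ N(µ, σ²/n), so that

\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \]

has a standard normal distribution. Then we know that there is a value z_{α/2} such that

\[
\begin{aligned}
1 - \alpha &= P\left(-z_{\alpha/2} < Z < z_{\alpha/2}\right) \\
&= P\left(-z_{\alpha/2} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right) \\
&= P\left(-z_{\alpha/2}\,\sigma/\sqrt{n} < \bar{X} - \mu < z_{\alpha/2}\,\sigma/\sqrt{n}\right) \\
&= P\left(-\bar{X} - z_{\alpha/2}\,\sigma/\sqrt{n} < -\mu < -\bar{X} + z_{\alpha/2}\,\sigma/\sqrt{n}\right) \\
&= P\left(\bar{X} - z_{\alpha/2}\,\sigma/\sqrt{n} < \mu < \bar{X} + z_{\alpha/2}\,\sigma/\sqrt{n}\right) \\
&= P\left(L < \mu < U\right)
\end{aligned}
\]

where L = X̄ − z_{α/2} σ/√n and U = X̄ + z_{α/2} σ/√n.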

z_{α/2} Definition
z_{α/2} is the z value with a right-tail probability of α/2. This means that the cumulative probability to the left of z_{α/2} is 1 − α/2. So we are looking for a z such that F(z_{α/2}) = 1 − α/2.
[Figure: standard normal density with central area 1 − α between −z_{α/2} and z_{α/2} and area α/2 in each tail, so that P(−z_{α/2} < Z < z_{α/2}) = 1 − α.]
Example: If α = 0.05, α/2 = 0.025. So we are interested in z_{α/2} = z_{0.025}. It is a value such that F(z_{0.025}) = 1 − 0.025 = 0.975. Do an inverse lookup on the z table to calculate F⁻¹(0.975) = 1.96. So z_{0.025} = 1.96.
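
The inverse table lookup can also be done numerically. The following is a minimal sketch, assuming Python's standard `statistics` module is an acceptable stand-in for the z table; the helper name `z_critical` is made up for illustration and is not part of the slides.

```python
from statistics import NormalDist


def z_critical(alpha: float) -> float:
    """Return z_{alpha/2}: the standard normal value with upper-tail probability alpha/2."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be in (0, 1)")
    # F(z_{alpha/2}) = 1 - alpha/2, so invert the standard normal CDF at that point.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


z_95 = z_critical(0.05)  # 95% confidence level
z_90 = z_critical(0.10)  # 90% confidence level
print(round(z_95, 2))   # 1.96
print(round(z_90, 3))   # 1.645
```

The printed values agree with the table values used in these slides (1.96 and 1.645) up to rounding.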

We figured out the lower and upper bounds of the 100(1 − α)% confidence interval. Notice that they are in the format (as promised) x̄ ± ME.
The margin of error in this case is

\[ ME = z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]

Let's summarize this result before we discuss what it means:
CI for µ, Case 1: Normal Population, σ known
If the population has a normal distribution and σ is known, a 100(1 − α)% confidence interval for µ can be constructed via:

\[ \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]

where z_{α/2} is the z value such that F(z_{α/2}) = 1 − α/2, or simply the z value with upper-tail probability α/2.
Now let's do an example.

Example
Suppose you have a normally distributed population with variance σ² = 4. You have a sample of 400 observations, whose sample average is x̄ = 3. Let's build a 90% confidence interval for µ.
If (1 − α) = 0.90, then α = 0.10 and α/2 = 0.05. What is z_{α/2}? Look up the table for an upper-tail probability of 0.05: F(1.645) = 0.95, so z_{α/2} = 1.645.
We have one of the components of the margin of error down. What is the other? The standard error of the mean: σ/√n = 2/20 = 0.1.
Then the margin of error is 1.645 × 0.1 = 0.1645, and our confidence interval estimate is
(3 − 0.1645, 3 + 0.1645) = (2.8355, 3.1645)
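
The arithmetic of this example can be checked with a few lines of code. This is only a sketch: the function name `z_interval` is hypothetical, and the critical value is computed with Python's `statistics.NormalDist` rather than read from the table, so it differs from 1.645 in the later decimals.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence):
    """Case 1 interval: xbar ± z_{alpha/2} * sigma / sqrt(n), with sigma known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    margin = z * sigma / sqrt(n)             # margin of error
    return xbar - margin, xbar + margin


# Numbers from the example: sigma^2 = 4, n = 400, sample mean 3, 90% confidence.
lower, upper = z_interval(xbar=3.0, sigma=2.0, n=400, confidence=0.90)
print(round(lower, 4), round(upper, 4))  # 2.8355 3.1645
```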
Question: what does the confidence interval mean? "The probability that µ is in this interval is 90%"? NO!

Interpretation of Confidence Interval Estimates
The population parameter, µ, is unchanging. It is not probabilistic.
Once you build an interval with the single x̄ at hand, µ is either inside or outside of it. You don't know which is the case, but there is no probability to it.
So you don't say "the probability that µ is in my CI (2.8355, 3.1645) is 90%".
The correct interpretation of CIs is related to sampling distributions. Notice that L and U are functions of X̄, hence also random themselves.
We would build different CIs with different x̄'s if we had different samples.
If we kept doing this (repeated sampling), 90% of the CIs we build would have the correct µ in them.
That is why 100(1 − α)% is called the confidence level, NOT because we have a 90% probability that µ is inside the confidence interval we just built.
But we are 90% confident that our interval contains µ, because 90% of CIs built this way will contain it.
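
The repeated-sampling interpretation can be illustrated with a small simulation. The sketch below assumes a made-up "true" population (µ = 3, σ = 2), a sample size of 25, and 5000 repetitions; none of these values come from the slides. It counts how often the Case 1 interval contains µ.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, confidence, repetitions, seed=0):
    """Share of simulated Case 1 intervals (sigma known) that contain the true mu."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)  # the same for every sample because sigma is known
    hits = 0
    for _ in range(repetitions):
        # Draw a new sample and recompute the interval center.
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - margin < mu < xbar + margin:
            hits += 1
    return hits / repetitions


share = coverage(mu=3.0, sigma=2.0, n=25, confidence=0.90, repetitions=5000)
print(share)  # should land near 0.90
```

Each individual interval either contains µ or it does not; the 90% describes the long-run share of intervals that do.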
At this point, I should draw a graph for you explaining the random sampling of CIs.

Case 2: Normally distributed population, σ unknown
We keep talking about unknown population parameters: µ is unknown, and we are building an interval estimate for it.
How realistic is it that σ is known? Not very: we almost never know what σ is. So what do we do?
We can use s, the sample standard deviation, instead. We can do this, but we need a modification to make it work.
It turns out that the standardized sample mean is no longer normally distributed if you use s instead of σ.
Sampling Distribution of x̄ with unknown σ
If x̄ is the sample mean of a sample drawn from a population with N(µ, σ²) distribution, then

\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} \]

has a Student's t distribution with n − 1 degrees of freedom.
Here s_{x̄}² = s²/n is the estimate for the sampling variance, so s_{x̄} = s/√n is the estimated standard error of the mean.

Student's t distribution is another continuous probability distribution.
It looks a lot like the standard normal distribution: it is symmetric with its mean at zero.
It has a single parameter, called its degrees of freedom (df). As df increases, it converges to N(0, 1).
We will rely on another table to calculate probabilities for it.

CI for µ, Case 2: Normal Population, σ unknown
If the population has a normal distribution and σ is unknown, but the sample standard deviation s is available, a 100(1 − α)% confidence interval for µ can be constructed via:

\[ \bar{x} \pm t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}} \]

where t_{α/2, n−1} is the t value with n − 1 degrees of freedom and α/2 upper-tail probability.
Notice that the interval is now a function of both x̄ and s: the margin of error itself depends on the sample through s.

Example
Similar example: Suppose you have a normally distributed population with variance σ² unknown. You have a sample of 14 observations, whose sample average is x̄ = 7. The sample standard deviation is measured to be 1.21. Let's build a 95% confidence interval for µ.
We need to get the margin of error. Let's calculate t_{α/2, n−1} first. α = 0.05, so α/2 = 0.025. n = 14, so n − 1 = 13. Let's look up the table: we have t_{α/2, n−1} = 2.160. So the margin of error is

\[ t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}} = 2.160 \times \frac{1.21}{\sqrt{14}} = 0.6985 \]

So the confidence interval is:
(7 − 0.6985, 7 + 0.6985) = (6.3015, 7.6985)
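
The same calculation as a short sketch. Python's standard library has no t quantile function, so the critical value 2.160 is passed in as read from the t table; the function name `t_interval` is hypothetical.

```python
from math import sqrt


def t_interval(xbar, s, n, t_crit):
    """Case 2 interval: xbar ± t_crit * s / sqrt(n).

    t_crit must be the table value t_{alpha/2, n-1} for the chosen alpha.
    """
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin


# 2.160 is t_{0.025, 13}, the table value used in the example above.
lower, upper = t_interval(xbar=7.0, s=1.21, n=14, t_crit=2.160)
print(round(lower, 4), round(upper, 4))  # 6.3015 7.6985
```

If SciPy happens to be installed, `scipy.stats.t.ppf(0.975, 13)` should return the same critical value without a table lookup.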
Food for thought: suppose the 1.21 was σ, not s. What would happen to the width of our interval estimate? Why?

Case 3: Population Not Normally Distributed, σ unknown
The above results relied on the population values having a normal distribution. If this is violated, the sampling distribution of x̄ is no longer exactly normal, and the t statistic no longer has an exact t distribution. So the confidence intervals cannot be constructed as described.
Luckily, the CLT comes to the rescue. For large enough n, x̄ is approximately normally distributed, so we can build approximate confidence intervals.
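
To see why "large enough n" matters, here is a rough simulation sketch. It assumes a strongly skewed population (exponential with mean 1, chosen only for illustration) and checks how often an interval of the form x̄ ± z_{α/2} s/√n contains the true mean for a small and a large sample. The sample sizes, repetition count, and function name are all made up for this sketch.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean, stdev


def large_sample_coverage(n, confidence=0.95, repetitions=2000, seed=1):
    """Share of z-based intervals (using s) that contain the mean of a skewed population."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    true_mean = 1.0  # mean of an exponential population with rate 1 (an assumption)
    hits = 0
    for _ in range(repetitions):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        xbar = fmean(sample)
        margin = z * stdev(sample) / sqrt(n)  # s replaces the unknown sigma
        if xbar - margin < true_mean < xbar + margin:
            hits += 1
    return hits / repetitions


small = large_sample_coverage(n=5)    # far too small for the CLT to help
large = large_sample_coverage(n=100)  # comfortably above 30
print(small, large)
```

With the small sample the share of intervals containing the mean should fall noticeably short of 95%, while with the large sample it should be close to it, which is the sense in which the interval is approximate.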
CI for µ, Case 3: Any Population, σ unknown
If the population has an unknown distribution and the sample standard deviation is s, an approximate 100(1 − α)% confidence interval for µ can be constructed via:

\[ \bar{x} \pm z_{\alpha/2} \frac{s}{\sqrt{n}} \]

only if n > 30.

Sample Size and Desired Margin of Error
As we have seen, the margin of error depends on the sample size in all the cases.
For a more precise estimate, a small margin of error is desired.
Suppose you get to design your study and you want to keep the margin of error low, say at 0.5.
You can do that if you can collect enough data.
Think about the case where σ is known.

\[ ME = z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \quad\Rightarrow\quad n = \frac{z_{\alpha/2}^2 \sigma^2}{ME^2} \]

So for example, if σ = 2 and α = 0.05, so that z_{α/2} = 1.96, to ensure ME = 0.5 we need at least

\[ n = \frac{1.96^2 \times 2^2}{0.5^2} = 61.47 \]

that is, 62 observations (just to be on the safe side: with 62, ME will be slightly smaller than 0.5).
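
The sample-size calculation can be sketched as a small helper that always rounds up. The function name `required_sample_size` is hypothetical, and the critical value comes from `statistics.NormalDist` rather than the table value 1.96.

```python
from math import ceil, sqrt
from statistics import NormalDist


def required_sample_size(sigma, margin_of_error, confidence):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= margin_of_error (sigma known)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = (z * sigma / ME)^2, rounded up because n must be a whole number.
    return ceil((z * sigma / margin_of_error) ** 2)


n_needed = required_sample_size(sigma=2.0, margin_of_error=0.5, confidence=0.95)
print(n_needed)  # 62

# Check: the margin of error actually achieved with that many observations.
z = NormalDist().inv_cdf(0.975)
achieved = z * 2.0 / sqrt(n_needed)
print(achieved < 0.5)  # True
```

Rounding down to 61 would leave the margin of error slightly above 0.5, which is why the result is rounded up.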

Interval Estimation for Proportions
Confidence Intervals for p
Just like µ, you can build confidence intervals for any population parameter, as long as you know the sampling distribution of its estimator.
In the previous chapter, we saw that we can estimate proportions and that the estimator has an approximately normal distribution.
CI for p
If p̄ is the estimate of the population proportion p, we can build a confidence interval estimate for p as:

\[ \bar{p} \pm z_{\alpha/2} \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} \]

Same spirit as before: example in the section.

Additional Note for The Entire Chapter
For practice:
Think about what it means to have a narrower or a wider CI using the correct interpretation of CIs.
Think about what happens to CIs when one of these increases while keeping the others constant: do they get narrower or wider? Think about the intuition, too.
α
n
σ or s
p̄(1 − p̄)
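
One way to check your answers to the practice questions is to compute the margin of error while changing one input at a time. The sketch below uses the Case 1 and proportion formulas from this chapter; the function names and all numeric inputs are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def mean_margin(sigma, n, alpha):
    """Margin of error z_{alpha/2} * sigma / sqrt(n) for the mean (sigma known)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * sigma / sqrt(n)


def proportion_margin(p_bar, n, alpha):
    """Margin of error z_{alpha/2} * sqrt(p_bar * (1 - p_bar) / n) for a proportion."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * sqrt(p_bar * (1 - p_bar) / n)


baseline = mean_margin(sigma=2.0, n=100, alpha=0.05)
larger_alpha = mean_margin(sigma=2.0, n=100, alpha=0.10)  # lower confidence level
larger_n = mean_margin(sigma=2.0, n=400, alpha=0.05)      # four times the data
larger_sigma = mean_margin(sigma=4.0, n=100, alpha=0.05)  # noisier population

print(larger_alpha < baseline)   # True: a larger alpha gives a narrower interval
print(larger_n < baseline)       # True: quadrupling n halves the margin of error
print(larger_sigma > baseline)   # True: a larger sigma gives a wider interval

# p_bar * (1 - p_bar) is largest at p_bar = 0.5, so that interval is the widest.
print(proportion_margin(0.5, 100, 0.05) > proportion_margin(0.2, 100, 0.05))  # True
```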