833) distribution.
Before we get too far into the work, consider what must be done in terms of
selecting the number of bins. Let's choose the bins to be 0-100, 101-200, 201-300, and
above 300. Is this sufficient? The answer is no. If we look at the last bin, there is
only one realization in it. This is not enough for us to capture the variability of the
bin. The following are the rules we impose:
1. Each bin must have an expected count of at least 5.
2. You must have enough bins so that there is at least one degree of freedom.
3. The data should be discrete (there is a continuous version of this test that
we will ignore for this course).
4. The bins may have different widths.
The formula for the degrees of freedom is (# of bins) - (# of estimated parameters) - 1.
With one estimated parameter, we need at least three bins in this example to
satisfy our degrees-of-freedom condition: 3 - 1 - 1 = 1. Since rule 1 also limits how
many bins 15 observations can support, we need exactly three. So we get the bins
Bin 1: 25, 47, 47
Bin 2: 48, 59, 61, 66, 79, 91, 120
Bin 3: 128, 199, 204, 217, 408

Now our hypothesis is that the data collected is governed by a Geometric
Distribution with parameter 0.00833. Our alternative hypothesis is that the data
is not governed by a Geometric Distribution with parameter 0.00833.
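As a quick check on the binning, the observed counts and the degrees of freedom can be tabulated in R. This is only a minimal sketch: the variable names are illustrative, the cut points 47 and 124 are taken from the bin ranges used in this example, and the single estimated parameter is an assumption consistent with the 1 degree of freedom used in this example.

```r
# The 15 observations listed in the three bins.
x <- c(25, 47, 47, 48, 59, 61, 66, 79, 91, 120, 128, 199, 204, 217, 408)

# Bins: 0-47, 48-124, 125 and above.
# cut() uses right-closed intervals by default: (-Inf, 47], (47, 124], (124, Inf).
bins <- cut(x, breaks = c(-Inf, 47, 124, Inf),
            labels = c("Bin 1", "Bin 2", "Bin 3"))

# Observed count in each bin.
observed <- table(bins)
print(observed)   # Bin 1: 3, Bin 2: 7, Bin 3: 5

# Degrees of freedom = # of bins - # of estimated parameters - 1.
n_estimated <- 1   # assumption: one parameter of the geometric was estimated
df <- length(observed) - n_estimated - 1
print(df)          # 1
```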
The test statistic is the same as for the other version of the test, so we need
to calculate the expected count of each bin, which is given by E[Bin] = n P(Bin).
This is because the counts across the bins follow a multinomial distribution.
So for the geometric we have
\[
\begin{aligned}
P(\text{Bin 1}) &= P(0 \le x \le 47) \approx 0.3307 \\
P(\text{Bin 2}) &= P(48 \le x \le 124) \approx 0.3178 \\
P(\text{Bin 3}) &= P(125 \le x < \infty) \approx 0.3515
\end{aligned}
\]

These results were obtained with R.
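A minimal sketch of how the three bin probabilities and the expected counts E[Bin] = n P(Bin) might be computed in R is given here. It is not the original listing. It assumes `prob = 0.00833`, which is the value that reproduces the three probabilities quoted above, and it relies on R's `pgeom`, which counts the number of failures before the first success (support 0, 1, 2, ...). The variable names are illustrative.

```r
p <- 0.00833   # assumed geometric parameter (reproduces the quoted probabilities)
n <- 15        # number of observations

# pgeom(q, prob) returns P(X <= q).
p_bin1 <- pgeom(47, p)                          # P(0 <= X <= 47)
p_bin2 <- pgeom(124, p) - pgeom(47, p)          # P(48 <= X <= 124)
p_bin3 <- pgeom(124, p, lower.tail = FALSE)     # P(X >= 125)

probs <- c(p_bin1, p_bin2, p_bin3)
print(round(probs, 4))   # 0.3307 0.3178 0.3515

# Expected count in each bin under the hypothesized distribution.
expected <- n * probs
print(round(expected, 3))
```

Because the three bins cover the whole support, the probabilities sum to 1, which is a useful sanity check before computing the test statistic.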
From there, we can use the formula to compute the test statistic:
\[
T = \sum_{i=1}^{3} \frac{(o_i - e_i)^2}{e_i}
  = \frac{(3 - 15 \cdot 0.3307)^2}{15 \cdot 0.3307}
  + \frac{(7 - 15 \cdot 0.3178)^2}{15 \cdot 0.3178}
  + \frac{(5 - 15 \cdot 0.3515)^2}{15 \cdot 0.3515}
\]
Now we have enough information to finish this example! We have a test statistic
of 1.8349 with 1 degree of freedom, so going back to the original form of the test, we need
\[
P(T > t) = 1 - P(T \le 1.803).
\]
We end up with two possible solutions depending on what we do from here.
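Before comparing the two routes, the arithmetic behind the test statistic can be checked with a few lines of R. This is a sketch using the observed counts 3, 7, 5 and the rounded bin probabilities quoted in this example; the variable names are illustrative.

```r
observed <- c(3, 7, 5)                   # observed counts in Bins 1-3
probs    <- c(0.3307, 0.3178, 0.3515)    # rounded bin probabilities from the text
n        <- sum(observed)                # 15 observations

expected <- n * probs                    # e_i = n * P(Bin i)

# T = sum over bins of (o_i - e_i)^2 / e_i
T_stat <- sum((observed - expected)^2 / expected)
print(round(T_stat, 4))                  # 1.8349
```

R's built-in `chisq.test(observed, p = probs)` reports the same statistic, but it uses (# of bins) - 1 = 2 degrees of freedom because it does not know that a parameter was estimated, so its p-value is not the one wanted here.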
We can either use a simple R command to get a more exact answer or we can refer
to the χ² tables for a coarser approximation. Let's compare both methods.
Using the R command > 1-pchisq(1.803,1) we obtain the value 0.1793502.
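The same tail probability can also be requested directly with `lower.tail = FALSE`, and the table entries that bracket the statistic can be recovered with `qchisq`. The sketch below assumes 1 degree of freedom as in this example; it also evaluates the tail at 1.8349, the test statistic quoted earlier, which leads to the same conclusion. The variable names are illustrative.

```r
df <- 1

# Exact upper-tail probability at the value used in the text.
p_exact <- pchisq(1.803, df = df, lower.tail = FALSE)
print(p_exact)    # same quantity as 1 - pchisq(1.803, 1)

# Upper-tail probability at the test statistic quoted earlier (1.8349).
p_stat <- pchisq(1.8349, df = df, lower.tail = FALSE)
print(p_stat)

# Table values: the 50th and 90th percentiles of the chi-square
# distribution with 1 degree of freedom bracket 1.803.
q <- qchisq(c(0.5, 0.9), df = df)
print(round(q, 3))   # 0.455 2.706
```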
Now looking at the χ² table with 1 degree of freedom, we have that 0.5 <
P(T ≤ 1.803) < 0.9, which implies 1 - 0.5 > 1 - P(T ≤ 1.803) > 1 - 0.9, which
simplifies to the range (0.10, 0.50). The latter is a fairly large spread; however,
and more importantly, it shows that our test statistic will pass a hypothesis
test with a 10% cutoff without too much fuss over how close it was. The former
value is nice since it will give us a better indication of the exact magnitude
if we w...
- Winter '12