s340text-KStestOnly

# That the data follows a Geo(0.00833) distribution


...833) distribution. Before we get too far into the work, consider what must be done in terms of selecting the number of bins. Let's choose the bins to be 0-100, 101-200, 201-300, and 300+. Is this sufficient? The answer is no. If we look at the last bin, there is only one realization in it, which is not enough for us to capture the variability of the bin. The following are the rules we impose:

1. Each bin must have an expected value of at least 5.
2. You must have enough bins so that there is at least one degree of freedom.
3. The data should be discrete (there is a continuous version of this test that we will ignore for this course).
4. The bins can have varying sizes.

The formula for degrees of freedom is

$$\text{df} = (\#\text{ of Bins}) - (\#\text{ of Estimated Parameters}) - 1.$$

With one estimated parameter, three bins give $3 - 1 - 1 = 1$ degree of freedom, so we need at least three bins in this example to satisfy our degrees of freedom condition. Also, we need exactly three to meet the expected-count criterion for each bin. So we get the bins

| Bin | Observations |
|-------|--------------------------------|
| Bin 1 | 25, 47, 47 |
| Bin 2 | 48, 59, 61, 66, 79, 91, 120 |
| Bin 3 | 128, 199, 204, 217, 408 |

Now our hypothesis is that the data collected are governed by a Geometric distribution with parameter 0.00833. Our alternative hypothesis is that the data are not governed by a Geometric distribution with parameter 0.00833. The test statistic is the same as for the other version of the test, so we need to calculate the expected value of each bin, which is done by $E[\text{Bin}] = n\,P(\text{Bin})$. This is because being in any particular bin follows a multinomial distribution.
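As a quick check of the binning, the observed counts and the degrees of freedom can be tabulated in R. This is only a minimal sketch: the vector `x` is typed in from the bins listed above, the cut points 47 and 124 are read off those bins, and `n_estimated <- 1` is an assumption (one estimated parameter, the geometric `p`) rather than something the notes state in this passage.

```r
# Observations, typed in from the three bins listed above
x <- c(25, 47, 47, 48, 59, 61, 66, 79, 91, 120, 128, 199, 204, 217, 408)

# Right-closed intervals: (-Inf, 47], (47, 124], (124, Inf)
breaks <- c(-Inf, 47, 124, Inf)
observed <- as.vector(table(cut(x, breaks = breaks)))
observed  # observed count in each bin

# Degrees of freedom = bins - estimated parameters - 1
n_bins <- length(observed)
n_estimated <- 1  # assumption: only the geometric parameter p is estimated
df <- n_bins - n_estimated - 1
df
```

Using `cut()` with explicit breaks keeps the bin boundaries in one place, so trying a different binning only means changing `breaks`.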
So for the geometric we have

$$
\begin{aligned}
P(\text{Bin 1}) &= P(0 \le x \le 47) \approx 0.3307 \\
P(\text{Bin 2}) &= P(48 \le x \le 124) \approx 0.3178 \\
P(\text{Bin 3}) &= P(125 \le x < \infty) \approx 0.3515
\end{aligned}
$$

The R code used to get these results is as follows:

    > pgeom(47,0.00833)
    [1] 0.3306945
    > pgeom(124,0.00833)-pgeom(47,0.00833)
    [1] 0.3178285
    > 1-pgeom(124,0.00833)
    [1] 0.351477

From there, we can use the formula to get a test statistic:

$$
T = \sum_{i=1}^{3} \frac{(o_i - e_i)^2}{e_i}
= \frac{(3 - 15 \cdot 0.3307)^2}{15 \cdot 0.3307}
+ \frac{(7 - 15 \cdot 0.3178)^2}{15 \cdot 0.3178}
+ \frac{(5 - 15 \cdot 0.3515)^2}{15 \cdot 0.3515}
\approx 1.8349
$$

Now we have enough information to finish this example! We have a test statistic of 1.8349 with 1 degree of freedom, so going back to the very original form of the question:

$$P(T > t) = 1 - P(T \le t) = 1 - P(T \le 1.803)$$

We end up with two possible solutions depending on what we do from here. We can either use simple R code to get a more exact answer, or we can refer to the $\chi^2$ tables for a wider approximation. Let's compare both methods. Using the R command

    > 1-pchisq(1.803,1)

we yield the value 0.1793502. Now looking at the $\chi^2$ table with 1 degree of freedom we have that $0.5 < P(T \le 1.803) < 0.9$, which implies

$$1 - 0.5 > 1 - P(T \le 1.803) > 1 - 0.9,$$

which simplifies to the range $(0.10, 0.50)$. The latter is a fairly large spread; however, and more importantly, it shows that our test statistic will pass a hypothesis test with a 10% cutoff without too much fuss over how close it was. The former value is nice since it will give us a better indication of the exact magnitude if we w...
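The whole calculation can also be reproduced in one short R script. The following is a sketch under stated assumptions, not the notes' own code: the observed counts 3, 7, 5 and `n <- 15` are read off the bins, the variable names are illustrative, and the p-value is computed from the unrounded statistic, so its digits may differ slightly from the hand calculation.

```r
p <- 0.00833             # geometric parameter used in the pgeom calls
n <- 15                  # total number of observations (assumed from the bins)
observed <- c(3, 7, 5)   # counts in Bin 1, Bin 2, Bin 3

# Bin probabilities from the geometric CDF
probs <- c(pgeom(47, p),
           pgeom(124, p) - pgeom(47, p),
           1 - pgeom(124, p))

# Expected counts: E[Bin] = n * P(Bin)
expected <- n * probs

# Chi-square statistic: sum of (o_i - e_i)^2 / e_i
stat <- sum((observed - expected)^2 / expected)

# Upper-tail probability P(T > stat) with 1 degree of freedom
p_value <- pchisq(stat, df = 1, lower.tail = FALSE)

stat
p_value
```

Passing `lower.tail = FALSE` to `pchisq` is equivalent to the `1 - pchisq(...)` form used above and avoids the explicit subtraction.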