Psych 100A Week 9 Discussion Notes
Part 1: The Chi-Square Test
November 27, 2009
Dawn Chen

The chi-square ($\chi^2$) test is a statistical hypothesis-testing procedure used for nominal data (i.e., when our data are the frequencies of each of several categories). The null hypothesis being tested is that the observed frequencies of the categories are equal to their expected frequencies (which come from the research question). The $\chi^2$ test statistic is calculated according to the following general formula:

$$\chi^2_{obs} = \sum \frac{(O - E)^2}{E}$$

In this formula, $O$ is the observed frequency for each category and $E$ is the expected frequency for each category, and the sum runs over all categories. Just like the t test statistic, the $\chi^2$ test statistic has its own sampling distribution that depends on the df. From the above formula, we can see that the more the observed frequencies differ from the expected frequencies, the larger the value of $\chi^2_{obs}$. If $\chi^2_{obs}$ is large enough, or more specifically, if it is greater than $\chi^2_{crit}$, we can reject $H_0$. Just as in a t-test, the value of $\chi^2_{crit}$ is determined by $\alpha$ and df.

The Chi-Square Test for Goodness of Fit

The first kind of $\chi^2$ test introduced in lecture was the $\chi^2$ test for goodness of fit. In this case, we have only one nominal variable, and we want to see how well our model of how frequently that variable should take on each of its possible values fits the actual data. In this test, the formula for $\chi^2_{obs}$ is sometimes rewritten as the following, in which $k$ is the number of categories, $O_i$ is the observed frequency for the $i$th category, and $E_i$ is the expected frequency for the $i$th category:

$$\chi^2_{obs} = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

Let's look at an example in which the $\chi^2$ test for goodness of fit can be applied. Suppose that to investigate whether birth rates are higher in some months of the year than others, we record the birth months of 120 students in the class. The following chart shows the number of students born in each month:

Month     Jan.  Feb.  Mar.  Apr.  May   June  July  Aug.  Sep.  Oct.  Nov.  Dec.
Students  7     6     8     6     9     9     16    21    13    10    7     8

Since we want to find out whether some months have higher birth rates than others, the null hypothesis is that the number of people born in each month is the same across all months. Therefore, the expected frequency for each month is simply 120 (the total number of people) divided by 12 (the number of months), or 10. Applying the formula given above, we have:

$$\chi^2_{obs} = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} = \frac{(7 - 10)^2}{10} + \frac{(6 - 10)^2}{10} + \cdots + \frac{(8 - 10)^2}{10} = 22.6$$

Note that when there are more categories, we can expect $\chi^2_{obs}$ to be larger even if the differences between the observed and expected frequencies are just due to random chance.
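The arithmetic in the birth-month example can be checked with a short Python sketch. It uses only the observed counts and the equal-frequency null hypothesis from the example. The function name `chi_square_gof`, the choice of $\alpha = 0.05$, the degrees of freedom $df = k - 1$ (the usual rule for a goodness-of-fit test), and the tabled critical value 19.675 are assumptions added for illustration; they are not taken from the notes above.

```python
# Observed birth-month counts from the example, January through December.
observed = [7, 6, 8, 6, 9, 9, 16, 21, 13, 10, 7, 8]


def chi_square_gof(observed, expected):
    """Sum (O_i - E_i)^2 / E_i over all categories (hypothetical helper name)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


n = sum(observed)        # total number of students
k = len(observed)        # number of categories (months)
expected = [n / k] * k   # equal expected frequency under the null hypothesis

chi2_obs = chi_square_gof(observed, expected)

# Assumptions for illustration only: alpha = 0.05 and df = k - 1.
# 19.675 is the standard tabled critical value for df = 11 at alpha = 0.05.
df = k - 1
chi2_crit = 19.675

print(f"expected per month = {expected[0]:.0f}")
print(f"chi2_obs = {chi2_obs:.1f}, df = {df}")
print("reject H0" if chi2_obs > chi2_crit else "fail to reject H0")
```

Under these assumed settings the observed statistic exceeds the critical value, so the equal-birth-rate hypothesis would be rejected. A different $\alpha$ would require a different tabled value.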
