Chi Square (χ²)
PSY 2801: Summer 2010
Jeff Jones
University of Minnesota

Count Data
So, we can perform tests on Interval/Ratio Scale measures, and we can pretend that Ordinal Scale measures are Interval Scaled (they look numberey).
What happens if, instead of having things in logical orders, we just have Nominal Scales?
Count Data: 1, 2, 3 in Group 1. 1, 2, 3, 4, 5 in Group 2.
Can we test anything about that?

Chi-Square Test (χ²)
Vocabulary in order to perform a χ² test:
Observed Frequency: This is how many people, things, mushrooms, cacti, baby belugas you observed in each cell.
Expected Frequency: This is how many people, things, mushrooms, cacti, baby belugas you expect (or think) will fall in each cell if life were just like the Null Hypothesis.
Table: A grid that contains the conditions (L) and the observed or expected counts (C):

L1    L2    L3    L4
C1    C2    C3    C4

Two Types of Tests
There are two types of χ² tests students learn about in Introductory Psychology:
Goodness of Fit: A distribution is specified (normal, uniform, etc.), and a researcher wants to test whether the data actually fit the distribution.
Let’s say we gathered 1000 IQ scores, and put them into z score form:

z score     < −3    −3 → −2    −2 → −1    −1 → 0    0 → 1    1 → 2    2 → 3    3 <
Observed    2       4          151        335       354      141      13       0

If the data really came from a Standard Normal Distribution, how many would we expect to find in each cell?

Goodness of Fit
We know that if the data came from a Standard Normal Distribution, then these would be the probabilities of landing into the cells:

z score        < −3     −3 → −2    −2 → −1    −1 → 0    0 → 1    1 → 2    2 → 3    3 <
Probability    .0013    .0214      .1359      .3413     .3413    .1359    .0214    .0013

And, to find the Expected Frequency in Each Cell, we multiply the Probability of Landing Into Each Cell by the Total Number of People (for the first cell, .0013 × 1000 = 1.3).

z score     < −3    −3 → −2    −2 → −1    −1 → 0    0 → 1    1 → 2    2 → 3    3 <
Expected    1.3     21.4       135.9      341.3     341.3    135.9    21.4     1.3

Two Types of Tests
Test of Independence: A contingency table (where people are assigned to nominal groups based on something) is specified, and a researcher wants to test whether the variables are Independent of each other.
Let’s say we gathered 115 random people, asked them their Preferred Ice Cream Flavor (Variable 1) and asked them their Gender (Variable 2):

          Chocolate    Vanilla    Total
Female    50           10         60
Male      15           40         55
Total     65           50         115

If Preferred Ice Cream Flavor were Independent from Gender, how many would we expect to find in each cell?

Tests of Independence
Based on our chart, we know the probability of a randomly selected person being Female:
P(Female) = 60/115 ≈ .5217
We know the probability of a randomly selected person being Male:
P(Male) = 55/115 ≈ .4783
We know the probability of a randomly selected person liking Chocolate Ice Cream:
P(Chocolate) = 65/115 ≈ .5652
And we know the probability of a randomly selected person liking Vanilla Ice Cream:
P(Vanilla) = 50/115 ≈ .4348
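The marginal probabilities above can be recomputed directly from the observed table. The following is a minimal sketch in Python; the variable names (`observed`, `p_gender`, `p_flavor`) are illustrative choices and not part of the lecture.

```python
# Observed counts from the ice cream table (rows: gender, columns: flavor).
observed = {
    "Female": {"Chocolate": 50, "Vanilla": 10},
    "Male": {"Chocolate": 15, "Vanilla": 40},
}

# Total number of people in the table.
n = sum(sum(row.values()) for row in observed.values())

# Row margins: P(Female), P(Male).
p_gender = {g: sum(row.values()) / n for g, row in observed.items()}

# Column margins: P(Chocolate), P(Vanilla).
flavors = ["Chocolate", "Vanilla"]
p_flavor = {f: sum(observed[g][f] for g in observed) / n for f in flavors}

for name, p in {**p_gender, **p_flavor}.items():
    print(f"P({name}) = {p:.4f}")
```

Each margin is just a row or column total divided by N, which is why the two gender probabilities (and the two flavor probabilities) sum to 1.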
Multiplicative Rule
Let’s repeat a slide from the Probability Lecture:
If A and B are independent events then:
P(A ∩ B) = P(A) × P(B | A) = P(A) × P(B)
Or rather, the probability of two events happening is the product of their probabilities if they are independent.
This works because of our independence definition:
P(A | B) = P(A)
Thus, for our candy, with replacement, example:
P(blue2 ∩ blue1) = P(blue2) × P(blue1) ≈ .545 × .545 ≈ .297
Test of Independence
How can we use the multiplicative rule to find out expected counts?
Well, if the variables were Independent then the probability of two events happening would be the product of the probability of each event happening.
So, the Expected Probabilities would look something like this:

          Chocolate                Vanilla                  Total
Female    .5217 × .5652 ≈ .2949    .5217 × .4348 ≈ .2268    .5217
Male      .4783 × .5652 ≈ .2703    .4783 × .4348 ≈ .2079    .4783
Total     .5652                    .4348                    1

Test of Independence
And, to find the Expected Frequency in Each Cell, we multiply the Probability of Landing Into Each Cell by the Total Number of People (N = 115).

          Chocolate                Vanilla                  Total
Female    .2949 × 115 ≈ 33.92      .2268 × 115 ≈ 26.08      60
Male      .2703 × 115 ≈ 31.08      .2079 × 115 ≈ 23.92      55
Total     65                       50                       115

Compare this with our original table:

          Chocolate    Vanilla    Total
Female    50           10         60
Male      15           40         55
Total     65           50         115

Old MacDonald’s χ² Statistic
Here’s the formula for the Chi-Square (χ²) Statistic:

$$\chi^2(df) = \sum_{i=1}^{C} \frac{(O_i - E_i)^2}{E_i}$$

df is your degrees of freedom,
O_i is the observed frequency in cell i,
E_i is the expected frequency in cell i,
C is your number of cells.

Only 1 Row (Goodness of Fit): df = Columns − 1
More than 1 Row (Independence): df = (Columns − 1) × (Rows − 1)
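As a rough illustration of the formula and the two degrees-of-freedom rules, here is a small Python sketch. The function names and the toy counts are hypothetical and chosen only for this example.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all cells, given as flat lists."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need one entry per cell")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def degrees_of_freedom(rows, columns):
    """df = Columns - 1 for one row; (Columns - 1) * (Rows - 1) otherwise."""
    if rows == 1:
        return columns - 1
    return (columns - 1) * (rows - 1)


# Hypothetical toy data: 60 observations over 3 cells, uniform expectation.
toy_observed = [25, 15, 20]
toy_expected = [20, 20, 20]

# Cell contributions are 25/20, 25/20 and 0/20.
print(chi_square_statistic(toy_observed, toy_expected))
print(degrees_of_freedom(rows=1, columns=3))
```

For a table with more than one row, the cells can be flattened into a single list in any consistent order before calling `chi_square_statistic`, since the sum runs over all C cells.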
The χ² Distribution
The χ² Distribution is another Family of Distributions (just like the t distribution and the F distribution), depending on the degrees of freedom.
The χ² distribution is just like a Sum of Squared (Independent) z scores. The degrees of freedom let you know exactly how many squared z-scores you are summing.
Furthermore, just like the F test, the χ² test is going to be a one-tailed test. It will be similar to a two-sided z-test, but all of the critical area will be in its upper tail.

The χ² Distribution
Thus, χ²(1) (a Chi-Square Distribution with 1 degree of freedom) is just like squaring scores from the Standard Normal Distribution:

[Figure: two histograms, "1000 Scores from A Chi-Square Distribution" (x-axis: ChiSq) and "1000 Squared Scores from A Standard Normal Distribution" (x-axis: Norm Squared); y-axis: Frequency]

Based on this logic, for df = 1, what would be your χ²_crit, where α = .05?
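One way to reason about this question is to square the two-sided z cutoff and then check the result by simulation. The sketch below assumes only the Python standard library; the seed and the number of simulated scores are arbitrary choices.

```python
import random
from statistics import NormalDist

alpha = 0.05

# A two-sided z-test at alpha = .05 rejects when |z| exceeds this cutoff.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Squaring folds both tails into the upper tail, so the chi-square(1)
# cutoff is the squared z cutoff.
chi2_crit_df1 = z_crit ** 2
print(f"z_crit = {z_crit:.2f}, chi-square crit (df = 1) = {chi2_crit_df1:.2f}")

# Simulation check: the share of squared standard normal scores above the
# cutoff should be close to alpha.
rng = random.Random(2801)
squared = [rng.gauss(0, 1) ** 2 for _ in range(100_000)]
share_above = sum(s > chi2_crit_df1 for s in squared) / len(squared)
print(f"share above cutoff = {share_above:.3f}")
```

The squared cutoff agrees with the df = 1 critical value of 3.84 used later in the test of independence.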
The χ² Distribution
The χ² Distribution starts very positively skewed, but as the degrees of freedom increase, the distribution starts to appear Normally Distributed Itself (another instance of the CLT following us everywhere).
The shape becomes more symmetric as the degrees of freedom increase.
Unlike the F distribution, the mean of the χ² distribution shifts dramatically as the degrees of freedom increase: the mean of a χ² distribution equals its degrees of freedom, while the F distribution mean is always around 1.

The χ² Distribution
[Figure: four histograms of "1000 Scores from A Chi-Square Distribution" for df = 1, df = 9, df = 20, and df = 40; x-axis: ChiSq, y-axis: Frequency]

Steps for χ² Hypothesis Tests
Here are the basic steps for a Chi-Square Test:
1 Form null and alternative hypothesis (as always).
2 Choose α level (as always; we’ll keep α = .05).
3 Gather Observed Counts
4 Calculate Expected Counts
5 Calculate χ² Statistic
6 Check the probability of the test statistic occurring (given what?)
7 Choose to reject or not reject H0.

Step 1: Setting Hypotheses
For the Goodness of Fit test, we are Checking to see if the data come from a Normally Distributed Population:
H0: Data come from Normal Population
H1: Data do not come from Normal Population
For the Test of Independence:
H0: Ice Cream Preference is Independent from Gender
H1: Ice Cream Preference is not Independent from Gender
Notice, H0 is that the data fit the assumed model. H0 always states that the data fit the assumed model; however, in a χ² test you often want your data to fit the assumed model, so Rejecting H0 is usually bad.
Steps 3 & 4: Gather Counts - Goodness of Fit
Here are our observed counts (given before):

z score     < −3    −3 → −2    −2 → −1    −1 → 0    0 → 1    1 → 2    2 → 3    3 <
Observed    2       4          151        335       354      141      13       0

Here are our expected counts (found before):

z score     < −3    −3 → −2    −2 → −1    −1 → 0    0 → 1    1 → 2    2 → 3    3 <
Expected    1.3     21.4       135.9      341.3     341.3    135.9    21.4     1.3

Step 6: Calculate χ² - Goodness of Fit
Well, the equation for the χ² statistic is just like the song says:

$$\chi^2(df) = \sum_{i=1}^{C} \frac{(O_i - E_i)^2}{E_i}$$

So, in this case:

$$\begin{aligned}
\chi^2(8 - 1) &= \frac{(2 - 1.3)^2}{1.3} + \frac{(4 - 21.4)^2}{21.4} + \frac{(151 - 135.9)^2}{135.9} \\
&\quad + \frac{(335 - 341.3)^2}{341.3} + \frac{(354 - 341.3)^2}{341.3} + \frac{(141 - 135.9)^2}{135.9} \\
&\quad + \frac{(13 - 21.4)^2}{21.4} + \frac{(0 - 1.3)^2}{1.3} \\
&= 0.38 + 14.15 + 1.68 + 0.12 + 0.47 + 0.19 + 3.30 + 1.30 \\
\chi^2(7) &= 21.59
\end{aligned}$$
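The same arithmetic can be checked with a few lines of Python. This is a sketch that takes the observed and expected counts straight from the slides; the variable names and bin labels are illustrative.

```python
# Z-score bins with the observed and expected counts from the slides.
bins = ["< -3", "-3 to -2", "-2 to -1", "-1 to 0",
        "0 to 1", "1 to 2", "2 to 3", "> 3"]
observed = [2, 4, 151, 335, 354, 141, 13, 0]
expected = [1.3, 21.4, 135.9, 341.3, 341.3, 135.9, 21.4, 1.3]

# One (O - E)^2 / E term per cell, then the sum over all cells.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi2_stat = sum(contributions)
df = len(observed) - 1  # one row, so df = Columns - 1

for label, c in zip(bins, contributions):
    print(f"{label:>9}: {c:6.2f}")
print(f"chi-square({df}) = {chi2_stat:.2f}")
```

Because the code sums unrounded terms, its total can differ in the last decimal place from a total built by adding the rounded terms by hand. Printing the per-cell contributions also shows that the −3 → −2 bin accounts for most of the statistic.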
Step 7: Decision/Conclusion - Goodness of Fit
If we check the χ² Distribution with 7 degrees of freedom, we will find that the α = .05 critical value is:
χ²_crit = 14.07
Furthermore, since the χ² test is a one-sided test, we will only reject if our test statistic exceeds our critical value.
Thus: Since χ²_crit = 14.07 < 21.59 = χ², p < α, we reject H0 and conclude that the data Do Not come from a Normal Distribution.
One of the problems with the χ² test is that it is very sensitive to departures from the distribution, so even though our data are sort of Normally Distributed, it’s not enough for the test.
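Instead of looking the critical value up in a table, the upper-tail area of a χ² distribution with whole-number degrees of freedom can be computed with the standard library. The sketch below is one possible approach, not the lecture's method: `chi2_upper_tail` and `chi2_critical` are hypothetical helper names, and the tail area is built from the df = 1 or df = 2 case by a standard recurrence that adds two degrees of freedom per step.

```python
import math


def chi2_upper_tail(x, df):
    """P(chi-square with integer df > x), via the incomplete-gamma recurrence."""
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)              # df = 2 starting point
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))   # df = 1 starting point
    while a < df / 2.0:                       # each pass adds 2 degrees of freedom
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def chi2_critical(alpha, df, lo=0.0, hi=200.0):
    """Bisection for the x whose upper-tail area equals alpha."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_upper_tail(mid, df) > alpha:
            lo = mid   # tail area still too large, move right
        else:
            hi = mid
    return (lo + hi) / 2.0


# Statistic and df from the goodness-of-fit example.
chi2_stat, df, alpha = 21.59, 7, 0.05
crit = chi2_critical(alpha, df)
p_value = chi2_upper_tail(chi2_stat, df)
print(f"critical value = {crit:.2f}, p = {p_value:.4f}")
print("reject H0" if chi2_stat > crit else "do not reject H0")
```

Comparing the statistic to the critical value and comparing the p-value to α are two views of the same decision, so both lead to rejecting H0 here.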
Steps 3 & 4: Gather Counts - Independence
Here are our observed counts (given before):

          Chocolate    Vanilla    Total
Female    50           10         60
Male      15           40         55
Total     65           50         115

Here are our expected counts (found before):

          Chocolate    Vanilla    Total
Female    33.92        26.08      60
Male      31.08        23.92      55
Total     65           50         115

Step 6: Calculate χ² - Independence
Well, the equation for the χ² statistic is just like the song says:

$$\chi^2(df) = \sum_{i=1}^{C} \frac{(O_i - E_i)^2}{E_i}$$

So, in this case:

$$\begin{aligned}
\chi^2((2 - 1) \times (2 - 1)) &= \frac{(50 - 33.92)^2}{33.92} + \frac{(10 - 26.08)^2}{26.08} \\
&\quad + \frac{(15 - 31.08)^2}{31.08} + \frac{(40 - 23.92)^2}{23.92} \\
\chi^2(1 \times 1) &= 7.62 + 9.91 + 8.32 + 10.81 \\
\chi^2(1) &= 36.66
\end{aligned}$$

Step 7: Decision/Conclusion - Independence
If we check the χ² Distribution with 1 degree of freedom, we will find that the α = .05 critical value is:
χ²_crit = 3.84
Furthermore, since the χ² test is a one-sided test, we will only reject if our test statistic exceeds our critical value.
Thus: Since χ²_crit = 3.84 < 36.66 = χ², p < α, we reject H0 and conclude that the variables are Not Independent from each other.
This case looks more obvious, that there is some dependence between whether you are a male or female and the kind of ice cream you like. However, the test is still pretty sensitive.
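The whole test of independence can be sketched end to end in Python, starting from the observed table. This is an illustrative sketch with hypothetical variable names. It computes expected counts as row total × column total / N, which is the multiplicative rule applied to the margins, and it uses the fact that a χ² with df = 1 behaves like a squared z-score to get the p-value.

```python
import math

# Observed counts: rows are Female, Male; columns are Chocolate, Vanilla.
observed = [[50, 10],
            [15, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Under independence: E = P(row) * P(column) * N = row total * column total / N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2_stat = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# With df = 1, the upper-tail area equals the two-sided normal tail area
# beyond sqrt(chi2_stat); erfc gives that area directly.
p_value = math.erfc(math.sqrt(chi2_stat / 2.0))

print([[round(e, 2) for e in row] for row in expected])
print(f"chi-square({df}) = {chi2_stat:.2f}, p = {p_value:.2e}")
```

Since the expected counts are not rounded before they enter the sum, the statistic from this sketch can differ slightly from the hand calculation; the decision to reject H0 is unaffected.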
χ² Assumptions
Here are some χ² assumptions:
1 No small expected frequencies.
  The total number of subjects should be at least 20.
  The expected cell frequencies should be at least 5.
  This is because the statistic we have is only approximately χ² distributed, so in order for the test to work well, we need a lot of people.
2 Independence of Observations
  This just means that each individual is only in one cell of the table.
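The first assumption is easy to check mechanically once the expected counts are known. The helper below is a hypothetical sketch (the name `check_small_counts` and its defaults are assumptions based on the thresholds listed above), applied to the expected counts from the two worked examples.

```python
def check_small_counts(expected, min_expected=5, min_total=20):
    """Return (total_ok, indices of cells whose expected count is too small)."""
    total_ok = sum(expected) >= min_total
    small_cells = [i for i, e in enumerate(expected) if e < min_expected]
    return total_ok, small_cells


# Expected counts from the two worked examples (independence table flattened).
gof_expected = [1.3, 21.4, 135.9, 341.3, 341.3, 135.9, 21.4, 1.3]
indep_expected = [33.92, 26.08, 31.08, 23.92]

print(check_small_counts(gof_expected))
print(check_small_counts(indep_expected))
```

Under these thresholds, the two outermost bins of the goodness-of-fit example (expected count 1.3 each) fall below 5, while every cell of the ice cream table passes.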