chisquare - Chi Square (χ2 ) PSY 2801: Summer 2010 Chi...

Chi Square (χ²)
PSY 2801: Summer 2010
Jeff Jones
University of Minnesota

Count Data

So, we can perform tests on Interval/Ratio Scale measures, and we can pretend that Ordinal Scale measures are Interval Scaled (they look number-ey).

What happens if, instead of having things in logical orders, we just have Nominal Scales?

Count Data: 1, 2, 3 in Group 1. 1, 2, 3, 4, 5 in Group 2.

Can we test anything about that?

Chi-Square Test (χ²)

Vocabulary in order to perform a χ² test:

Observed Frequency: This is how many people, things, mushrooms, cacti, baby belugas you observed in each cell.

Expected Frequency: This is how many people, things, mushrooms, cacti, baby belugas you expect (or think) will fall in each cell if life were just like the Null Hypothesis.

Table: A grid that contains the conditions (L) and the observed or expected counts (C):

L1    L2    L3    L4
C1    C2    C3    C4

Two Types of Tests

There are two types of χ² tests students learn about in Introductory Psychology:

Goodness of Fit: A distribution is specified (normal, uniform, etc.), and a researcher wants to test whether the data actually fit the distribution.

Let’s say we gathered 1000 IQ scores, and put them into z-score form:

z interval    Observed
< −3                 2
−3 → −2              4
−2 → −1            151
−1 → 0             335
0 → 1              354
1 → 2              141
2 → 3               13
3 <                  0

If the data really came from a Standard Normal Distribution, how many would we expect to find in each cell?

Goodness of Fit

We know that if the data came from a Standard Normal Distribution, then these would be the probabilities of landing into the cells:

z interval    Probability
< −3          .0013
−3 → −2       .0214
−2 → −1       .1359
−1 → 0        .3413
0 → 1         .3413
1 → 2         .1359
2 → 3         .0214
3 <           .0013

And, to find the Expected Frequency in Each Cell, we multiply the Probability of Landing Into Each Cell by the Total Number of People.
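As a rough check on that multiplication, here is a minimal Python sketch (not from the original slides) that recomputes the cell probabilities and expected frequencies from the Standard Normal Distribution. It assumes only the standard-library `statistics.NormalDist` class; the variable names are made up for illustration. Because it does not round the probabilities to four decimals first, its expected frequencies can differ slightly in the second decimal place from hand-computed values.

```python
from statistics import NormalDist

# Interior bin edges in z-score units; the two outer bins are open-ended tails.
edges = [-3, -2, -1, 0, 1, 2, 3]
n_total = 1000  # total number of IQ scores in the example

std_normal = NormalDist(mu=0.0, sigma=1.0)

# Cumulative probability at each edge, padded with 0 and 1 for the open tails.
cdf_values = [0.0] + [std_normal.cdf(z) for z in edges] + [1.0]

# Probability of a bin = difference between neighbouring cumulative probabilities.
probabilities = [hi - lo for lo, hi in zip(cdf_values, cdf_values[1:])]

# Expected frequency = probability of the bin x total number of people.
expected = [p * n_total for p in probabilities]

for p, e in zip(probabilities, expected):
    print(f"probability = {p:.4f}   expected frequency = {e:.1f}")
```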
z interval    Expected
< −3             1.3
−3 → −2         21.4
−2 → −1        135.9
−1 → 0         341.3
0 → 1          341.3
1 → 2          135.9
2 → 3           21.4
3 <              1.3

Two Types of Tests

Test of Independence: A contingency table (where people are assigned to nominal groups based on something) is specified, and a researcher wants to test whether the variables are Independent of each other.

Let’s say we gathered 115 random people, asked them their Preferred Ice Cream Flavor (Variable 1) and asked them their Gender (Variable 2):

          Chocolate   Vanilla   Total
Female           50        10      60
Male             15        40      55
Total            65        50     115

If Preferred Ice Cream Flavor were Independent from Gender, how many would we expect to find in each cell?

Tests of Independence

Based on our chart, we know the probability of a randomly selected person being Female:

$P(\text{Female}) = \frac{60}{115} \approx .5217$

We know the probability of a randomly selected person being Male:

$P(\text{Male}) = \frac{55}{115} \approx .4783$

We know the probability of a randomly selected person liking Chocolate Ice Cream:

$P(\text{Chocolate}) = \frac{65}{115} \approx .5652$

And we know the probability of a randomly selected person liking Vanilla Ice Cream:

$P(\text{Vanilla}) = \frac{50}{115} \approx .4348$

Multiplicative Rule

Let’s repeat a slide from the Probability Lecture:

If A and B are independent events then:

$P(A \cap B) = P(A) \times P(B \mid A) = P(A) \times P(B)$

Or rather, the probability of two events happening is the product of their probabilities if they are independent.

This works because of our independence definition:

$P(A \mid B) = P(A)$

Thus, for our candy, with replacement, example:

$P(\text{blue}_2 \cap \text{blue}_1) = P(\text{blue}_2) \times P(\text{blue}_1) \approx .545 \times .545 \approx .297$

Test of Independence

How can we use the multiplicative rule to find out expected counts?

Well, if the variables were Independent then the probability of two events happening would be the product of the probability of each event happening.
So, the Expected Probabilities would look something like this:

          Chocolate                 Vanilla                   Total
Female    .5217 × .5652 ≈ .2949     .5217 × .4348 ≈ .2268     .5217
Male      .4783 × .5652 ≈ .2703     .4783 × .4348 ≈ .2079     .4783
Total     .5652                     .4348                     1

Test of Independence

And, to find the Expected Frequency in Each Cell, we multiply the Probability of Landing Into Each Cell by the Total Number of People (N = 115).

          Chocolate                Vanilla                  Total
Female    .2949 × 115 ≈ 33.92      .2268 × 115 ≈ 26.08         60
Male      .2703 × 115 ≈ 31.08      .2079 × 115 ≈ 23.92         55
Total     65                       50                         115

Compare this with our original table:

          Chocolate   Vanilla   Total
Female           50        10      60
Male             15        40      55
Total            65        50     115

Old MacDonald’s χ² Statistic

Here’s the formula for the Chi-Square (χ²) Statistic:

$$\chi^2(df) = \sum_{i=1}^{C} \frac{(O_i - E_i)^2}{E_i}$$

df is your degrees of freedom, $O_i$ is the observed frequency in cell i, $E_i$ is the expected frequency in cell i, and C is your number of cells.

Only 1 Row (Goodness of Fit): df = Columns − 1

More than 1 Row (Independence): df = (Columns − 1) × (Rows − 1)

The χ² Distribution

The χ² Distribution is another Family of Distributions (just like the t-distribution and the F-distribution), depending on the degrees of freedom.

The χ² distribution is just like a Sum of Squared (Independent) z-scores. The degrees of freedom let you know exactly how many squared z-scores you are summing.

Furthermore, just like the F-test, the χ² test is going to be a one-tailed test. It will be similar to a two-sided z-test, but all of the critical area will be in its upper tail.
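The "sum of squared z-scores" idea can be tried out with a small simulation. The sketch below is an illustration only, not part of the lecture: the function name `simulate_chi_square`, the seed, and the number of draws are arbitrary choices. It relies on the standard fact that a χ² distribution has a mean equal to its degrees of freedom, so the simulated means should land close to df, and every simulated value is non-negative, which is why all of the critical area sits in the upper tail.

```python
import random
from statistics import mean


def simulate_chi_square(df, n_draws, rng):
    """Return n_draws values, each the sum of df squared standard-normal scores."""
    return [
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
        for _ in range(n_draws)
    ]


rng = random.Random(2801)  # fixed, arbitrary seed so the sketch is repeatable

for df in (1, 9, 20, 40):
    draws = simulate_chi_square(df, 20_000, rng)
    # The simulated mean should be close to df; the minimum can never be negative.
    print(f"df = {df:2d}: mean = {mean(draws):.2f}, min = {min(draws):.4f}")
```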
The χ² Distribution

Thus, χ²(1) (a Chi-Square Distribution with 1 degree of freedom) is just like squaring scores from the Standard Normal Distribution:

[Figure: two histograms, "1000 Scores from A Chi-Square Distribution" (x-axis: Chi-Sq) and "1000 Squared Scores from A Standard Normal Distribution" (x-axis: Norm Squared); y-axis: Frequency.]

Based on this logic, for df = 1, what would be your $\chi^2_{crit}$, where α = .05?

The χ² Distribution

The χ² Distribution starts very positively skewed, but as the degrees of freedom increase, the distribution starts to appear Normally Distributed Itself (another instance of the CLT following us everywhere).

The shape becomes more symmetric as the degrees of freedom increase.

Unlike the F-distribution, the mean of the χ² distribution shifts dramatically as the degrees of freedom increase (the F-distribution mean is always around 1).

The χ² Distribution

[Figure: four histograms of "1000 Scores from A Chi-Square Distribution" for df = 1, df = 9, df = 20, and df = 40; x-axis: Chi-Sq; y-axis: Frequency.]

Steps for χ² Hypothesis Tests

Here are the basic steps for a Chi-Square Test:

1. Form null and alternative hypothesis (as always).
2. Choose α level (as always; we’ll keep α = .05).
3. Gather Observed Counts.
4. Calculate Expected Counts.
5. Calculate χ² Statistic.
6. Check the probability of the test statistic occurring (given what?).
7. Choose to reject or not reject H0.

Step 1: Setting Hypotheses

For the Goodness of Fit test, we are Checking to see if the data come from a Normally Distributed Population:

H0: Data come from Normal Population
H1: Data do not come from Normal Population

For the Test of Independence:

H0: Ice Cream Preference is Independent from Gender
H1: Ice Cream Preference is not Independent from Gender

Notice that H0 always states that the data fit the assumed model. In a χ² test, however, you often want your data to fit the assumed model, so Rejecting H0 is usually bad.

Steps 3 & 4: Gather Counts -- Goodness of Fit

Here are our observed counts (given before) and our expected counts (found before):

z interval    Observed    Expected
< −3                 2         1.3
−3 → −2              4        21.4
−2 → −1            151       135.9
−1 → 0             335       341.3
0 → 1              354       341.3
1 → 2              141       135.9
2 → 3               13        21.4
3 <                  0         1.3

Step 5: Calculate χ² -- Goodness of Fit

Well, the equation for the χ² statistic is just like the song says:

$$\chi^2(df) = \sum_{i=1}^{C} \frac{(O_i - E_i)^2}{E_i}$$

So, in this case:

$$\begin{aligned}
\chi^2(8 - 1) &= \frac{(2 - 1.3)^2}{1.3} + \frac{(4 - 21.4)^2}{21.4} + \frac{(151 - 135.9)^2}{135.9} \\
&\quad + \frac{(335 - 341.3)^2}{341.3} + \frac{(354 - 341.3)^2}{341.3} + \frac{(141 - 135.9)^2}{135.9} \\
&\quad + \frac{(13 - 21.4)^2}{21.4} + \frac{(0 - 1.3)^2}{1.3} \\
&\approx 0.38 + 14.15 + 1.68 + 0.12 + 0.47 + 0.19 + 3.30 + 1.30 \\
\chi^2(7) &\approx 21.58
\end{aligned}$$

Step 7: Decision/Conclusion -- Goodness of Fit

If we check the χ² Distribution with 7 degrees of freedom, we will find that the α = .05 critical value is:

$\chi^2_{crit} = 14.07$

Furthermore, since the χ² test is a one-sided test, we will only reject if our test statistic exceeds our critical value.
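The goodness-of-fit arithmetic can be double-checked with a few lines of Python. This is only a sketch: the helper name `chi_square_statistic` is made up for illustration, the counts are copied from the tables in these notes, and the critical value of about 14.07 is typed in from a χ² table (df = 7, α = .05) rather than computed.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Counts for the eight z-score bins, copied from the IQ example.
observed = [2, 4, 151, 335, 354, 141, 13, 0]
expected = [1.3, 21.4, 135.9, 341.3, 341.3, 135.9, 21.4, 1.3]

statistic = chi_square_statistic(observed, expected)
df = len(observed) - 1   # only 1 row, so df = columns - 1
critical_value = 14.07   # assumption: table value for df = 7, alpha = .05

# One-sided test: reject only when the statistic exceeds the critical value.
reject_null = statistic > critical_value

print(f"chi-square({df}) = {statistic:.2f}, reject H0: {reject_null}")
```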
Thus: Since 14.07 < 21.58 = χ², p < α, we reject H0 and conclude that the data Do Not come from a Normal Distribution.

One of the problems with the χ² test is that it is very sensitive to departures from the distribution, so even though our data are sort of Normally Distributed, it’s not enough for the test.

Steps 3 & 4: Gather Counts -- Independence

Here are our observed counts (given before):

          Chocolate   Vanilla   Total
Female           50        10      60
Male             15        40      55
Total            65        50     115

Here are our expected counts (found before):

          Chocolate   Vanilla   Total
Female        33.92     26.08      60
Male          31.08     23.92      55
Total            65        50     115

Step 5: Calculate χ² -- Independence

Well, the equation for the χ² statistic is just like the song says:

$$\chi^2(df) = \sum_{i=1}^{C} \frac{(O_i - E_i)^2}{E_i}$$

So, in this case:

$$\begin{aligned}
\chi^2((2 - 1) \times (2 - 1)) &= \frac{(50 - 33.92)^2}{33.92} + \frac{(10 - 26.08)^2}{26.08} + \frac{(15 - 31.08)^2}{31.08} + \frac{(40 - 23.92)^2}{23.92} \\
\chi^2(1 \times 1) &= 7.62 + 9.91 + 8.32 + 10.81 \\
\chi^2(1) &= 36.66
\end{aligned}$$

Step 7: Decision/Conclusion -- Independence

If we check the χ² Distribution with 1 degree of freedom, we will find that the α = .05 critical value is:

$\chi^2_{crit} = 3.84$

Furthermore, since the χ² test is a one-sided test, we will only reject if our test statistic exceeds our critical value.
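The whole independence calculation, from the observed table to the decision, can be sketched in Python as follows. The variable names are illustrative, and the critical value 3.84 is typed in from a χ² table rather than computed. Note that row total × column total / N is the same thing as P(row) × P(column) × N. Because this sketch keeps the expected counts unrounded, its statistic comes out near 36.70 rather than the hand-rounded 36.66; the decision is the same either way.

```python
# Observed counts: rows are Female, Male; columns are Chocolate, Vanilla.
observed = [[50, 10],
            [15, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n_total = sum(row_totals)

# Under independence: expected count = row total x column total / N.
expected = [[r * c / n_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (O - E)^2 / E over every cell of the table.
statistic = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)  # (rows - 1) x (columns - 1)
critical_value = 3.84  # assumption: table value for df = 1, alpha = .05
reject_null = statistic > critical_value

print(f"chi-square({df}) = {statistic:.2f}, reject H0: {reject_null}")
```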
Thus: Since 3.84 < 36.66 = χ², p < α, we reject H0 and conclude that the variables are Not Independent from each other.

This case looks more obvious: there is some dependence between whether you are a male or female and the kind of ice cream you like. However, the test is still pretty sensitive.

χ² Assumptions

Here are some χ² assumptions:

1. No small expected frequencies.
   The total number of subjects should be at least 20.
   The expected cell frequencies should be at least 5.
   This is because the statistic we have is only approximately χ² distributed, so in order for the test to work well, we need a lot of people.
2. Independence of Observations.
   This just means that each individual is only in one cell of the table.
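The first assumption lends itself to a quick automated check. The sketch below is a hypothetical helper, not something from the lecture: the function name and its default thresholds (20 subjects, expected count of 5) simply encode the rules of thumb listed above. Applied to the two worked examples, it passes the ice cream table but flags the two tail bins of the IQ example, whose expected counts are only 1.3.

```python
def check_expected_counts(expected, n_total, min_total=20, min_expected=5):
    """Return a list of warnings; an empty list means no rule of thumb was violated."""
    warnings = []
    if n_total < min_total:
        warnings.append(f"only {n_total} subjects (want at least {min_total})")
    small = [e for e in expected if e < min_expected]
    if small:
        warnings.append(
            f"{len(small)} expected count(s) below {min_expected}: {small}"
        )
    return warnings


# Expected counts copied from the two worked examples.
ice_cream_expected = [33.92, 26.08, 31.08, 23.92]
iq_expected = [1.3, 21.4, 135.9, 341.3, 341.3, 135.9, 21.4, 1.3]

print(check_expected_counts(ice_cream_expected, 115))
print(check_expected_counts(iq_expected, 1000))
```

This only covers the small-count rule; independence of observations depends on how the data were collected and cannot be checked from the counts alone.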