12_ChiSquare_Sp08_BB

Outline
- Chi-Square Overview
- Goodness of Fit
- Test of Association/Independence

Nonparametric Statistics
- Nonparametric statistics make no assumptions about the shape of the population.
- Examples: $\chi^2$, binomial, sign test

Chi-Square Tests
- Hypothesis-testing procedure for nominal variables
- Focus on the number of people/items in each category (e.g., hair color, political party, gender)
- Nominal & ordinal data: for this class, we'll only be using tests for nominal data.
- Compare how well an observed distribution fits an expected distribution

Chi-Square Tests
- The expected distribution can be based on:
  - A theory
  - Prior results
  - Assumption of equal distribution across categories

When Do We Use Chi Square?
- Data in the form of frequency counts in different categories
- Nominal data: no systematic rank or order relationship among levels
- E.g., hair color, religion, major, profession, ethnic group, attachment style

Chi-Square Test for Goodness of Fit
- Single nominal variable
- Can have many different categories representing this variable, but only one nominal variable. For example...

Chi-Square ($\chi^2$) Statistic
$\chi^2 = \sum \frac{(O - E)^2}{E}$

Obtaining Expected Frequencies
- Expected frequencies: multiply the proportion in the population of interest by the number in your sample.
- If the population proportions are not known, then in some cases you compare the observed frequencies to an equal number of people in each category.

Chi-Square ($\chi^2$) Distribution
- For the $\chi^2$ goodness of fit, df = number of categories minus 1
- The cutoff chi-square values for various significance levels are found in tables
- The shape varies as the degrees of freedom change.
- Compare the obtained chi-square to a chi-square distribution
- Does the mismatch between observed and expected frequencies exceed what would be expected by chance alone?
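The two ways of obtaining expected frequencies described above can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original slides: the function names are hypothetical, and the 50/30/20 proportions and the sample size of 60 are made-up values chosen only to show the arithmetic.

```python
def expected_from_proportions(proportions, n):
    """Expected count per category: population proportion times sample size."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    return [p * n for p in proportions]


def expected_equal(num_categories, n):
    """Expected counts when every category is assumed equally likely."""
    return [n / num_categories] * num_categories


def gof_degrees_of_freedom(num_categories):
    """Goodness-of-fit df: number of categories minus 1."""
    return num_categories - 1


# Hypothetical population proportions for three categories, sample of 60.
print(expected_from_proportions([0.5, 0.3, 0.2], 60))
# No known population: spread the 60 people evenly over the 3 categories.
print(expected_equal(3, 60))
print(gof_degrees_of_freedom(3))
```

Checking that the proportions sum to 1 guards against the expected counts failing to add up to the sample size.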
$\chi^2$ Goodness of Fit Test
- Nominal data
- One variable
  - e.g., look/not look into a mirror
  - e.g., favorite candy: Reese's Peanut Butter Cups, Hershey's bar, Almond Joy, Butterfinger, Mr. Goodbar

GoF Hypothesis Testing
1) H0: Expected frequencies and observed frequencies are equal ($E = O$)
   H1: Expected and observed frequencies are not equal ($E \neq O$)
Remember... the expected distribution can be based on:
- A theory
- Prior results
- Assumption of equal distribution across categories

GoF Example
2) Nature of DV: nominal
3) Appropriate test statistic: chi-square goodness of fit, df = (# categories - 1) = 1
4) Type 1 error rate = .05; Type 2 error rate = don't worry about it!
5) Determining sample size: each cell must have an expected value of at least 5
6) Collect data
7) Calculate test statistic

Calculating Chi-Square ($\chi^2$) GoF
1) Determine the actual, observed frequency in each category
2) Determine the expected frequency in each category
3) In each category, take the observed minus the expected frequency
4) Square each of these differences
5) Divide each squared difference by the expected frequency for its category
Summing these values over all categories gives $\chi^2$.

Chi-Square Statistic

                        Observed   Expected
  Present at death          8         10
  Not present at death     12         10

7) $\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(8 - 10)^2}{10} + \frac{(12 - 10)^2}{10} = 0.80$, so $\chi^2(1, N = 20) = 0.80$

Our observed $\chi^2$ value is less than our critical value, so we retain the null hypothesis, p > 0.05.

GoF Example
8) Calculate effect size: don't worry about it for GoF.
9) Decision making: $\chi^2(1, N = 20) = 0.80$, p > .05
- Compare to the critical value of 3.84: does it exceed the CV?
- No: retain H0
- Oscar is not the death cat
Just like what we found with the binomial!

Important Point
We can use both the binomial and the Chi-Square Goodness of Fit test here because:
- we're dealing with nominal data.
- we have a dichotomous variable (e.g., present at death, not present at death).
- and (for the GoF) we have an expected value of 5 or more in each cell!
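The calculation steps and the decision rule above can be checked with a short Python sketch, using the counts from the example (8 and 12 observed, 10 and 10 expected). The function names are illustrative, not from the slides. The p-value helper is an added convenience that is valid only for df = 1, where the chi-square tail probability can be written with the complementary error function.

```python
import math


def chi_square_gof(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e < 5 for e in expected):
        raise ValueError("each expected frequency should be at least 5")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df1(chi_sq):
    """Upper-tail probability of a chi-square value (df = 1 only)."""
    return math.erfc(math.sqrt(chi_sq / 2.0))


CRITICAL_VALUE_DF1 = 3.84  # alpha = .05, df = 1, as given in the slides

chi_sq = chi_square_gof([8, 12], [10, 10])
print(round(chi_sq, 2))               # 0.8
print(chi_sq > CRITICAL_VALUE_DF1)    # False, so retain H0
print(p_value_df1(chi_sq) > 0.05)     # True
```

For more than two categories the df = 1 shortcut no longer applies, and the cutoff would come from a chi-square table instead.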
Chi Sq Example: Theory of Mind
- Understanding of the mental world: how individuals think about desires, intentions, emotions, and beliefs.
- "A belief about a belief."

Chi-Square ($\chi^2$) Goodness of Fit
- Variable = performance on ToM task (pass or fail)
- We expect 80% of the 4-year-olds to pass. What are your expected values?

          Observed   Expected
  Pass       32         ?
  Fail       18         ?
  N = 50

$\chi^2 = \sum \frac{(O - E)^2}{E}$

Chi-Square ($\chi^2$) Goodness of Fit
- We expect 80% of the 4-year-olds to pass, so the expected values are:

          Observed   Expected
  Pass       32         40
  Fail       18         10
  N = 50

Steps for Hypothesis Testing
1. H0: $O = E$ (observed # = expected #); H1: $O \neq E$ (observed # $\neq$ expected #)
2. Nominal data, 1 variable
3. Chi-square goodness of fit
4. Type I: $\alpha$ = 0.05; skip Type II
5. Sample size = 50; skip power
6. Collect data
7. Statistical significance?

$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(32 - 40)^2}{40} + \frac{(18 - 10)^2}{10} = 1.60 + 6.40 = 8.00$

Our observed $\chi^2$ value is greater than our critical value, so we reject the null hypothesis, p < 0.05.

GoF Example
8) Calculate effect size: don't worry about it for GoF.
9) Decision making: $\chi^2(1, N = 50) = 8.00$, p < .05
- Compare to the critical value of 3.84: does it exceed the CV?
- Yes: reject the null

Steps to Hypothesis Testing
We have statistical significance, therefore we reject the null hypothesis. Fewer 4-year-olds passed the Theory of Mind task than expected, $\chi^2(1, N = 50) = 8.00$, p < 0.05.

Chi-Square Test for Independence
- Two nominal variables
- Independence means no relation between variables
- Contingency table: lists the number of observations for each combination of categories
- To determine degrees of freedom...
- To determine expected frequencies...
$df = (N_{Columns} - 1)(N_{Rows} - 1)$
$E = \frac{R}{N}(C)$, where R is the row total, C is the column total, and N is the total number of observations

Effect Size for Chi-Square
- For a 2×2 chi-square, effect size is the phi coefficient: $\phi = \sqrt{\frac{\chi^2}{N}}$
- Correlation between two nominal variables
- Small = .10, Medium = .25, Large = .40

Chi Sq Example: Theory of Mind
- Let's take the same Theory of Mind problem as before, but adding another variable.
- Task performance: pass, fail
- Age: 3-year-olds, 4-year-olds

Chi Square Example: ToM
- Test of association
- 2 or more NOMINAL variables
- Are the variables correlated? Is there a relationship?

Contingency Table: Observed Values

                 Pass        Fail        Total
  4-year-olds    40 (80%)    10 (20%)    n = 50 (100%)
  3-year-olds    15 (30%)    35 (70%)    n = 50 (100%)
                                         N = 100 (100%)

On to the steps...

Example
1. H0: $\phi = 0$; H1: $\phi \neq 0$
2. Nominal data, 2 variables
3. Chi-square test of association
4. Type I: $\alpha$ = 0.05; $\beta$ = ?
5. Sample size = 100
   $\delta = MEI\sqrt{N - 1} = (0.25)\sqrt{99} = 2.49$ (use 2.40); power = 0.67, $\beta$ = 0.33
6. Collect data
7. Statistical significance?

Expected = (row total × column total) / N

$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(40 - 27.5)^2}{27.5} + \frac{(10 - 22.5)^2}{22.5} + \frac{(15 - 27.5)^2}{27.5} + \frac{(35 - 22.5)^2}{22.5} = 25.2525$

df = (# rows - 1)(# columns - 1) = 1

$\chi^2(1, N = 100) = 25.2525$, p < 0.05

Our observed $\chi^2$ value is greater than our critical value, so we reject the null hypothesis, p < 0.05.
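The expected frequencies and the test statistic for the contingency table above can be reproduced with a minimal Python sketch. It assumes the table is given as a list of rows of observed counts (4-year-olds, then 3-year-olds; pass, then fail). The function names are illustrative, not from the slides.

```python
def expected_table(observed):
    """Expected count per cell: row total * column total / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_independence(observed):
    """Return (chi-square, df) for a table of observed counts."""
    expected = expected_table(observed)
    chi_sq = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi_sq, df


observed = [[40, 10],   # 4-year-olds: pass, fail
            [15, 35]]   # 3-year-olds: pass, fail

print(expected_table(observed))   # [[27.5, 22.5], [27.5, 22.5]]
chi_sq, df = chi_square_independence(observed)
print(round(chi_sq, 4), df)       # 25.2525 1
```

Because the expected counts are built from the row and column totals, the same functions work for tables larger than 2×2, although the critical value then depends on the larger df.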
[Figure: chi-square distribution with the critical value $\chi^2_{Crit} = 3.841$ and the observed value 25.2525 marked]

Example
8) Calculate the observed effect size

$\phi_{OBS} = \sqrt{\frac{\chi^2}{N(S_{RC})}} = \sqrt{\frac{25.2525}{100(1)}} = 0.5025$

- Compare to MEI (0.25)
- Have practical significance!

Steps to Hypothesis Testing
We have both statistical & practical significance, therefore we reject the null hypothesis. Four-year-olds are more likely to pass the Theory of Mind task than 3-year-olds, $\chi^2(1, N = 100) = 25.2525$, p < 0.05, $\phi_{OBS} = 0.5025$.

In Conclusion...

When to Use Chi Square Tests
- Nominal data.
- Expected frequencies are at least 5 in each cell.
- Goodness of fit: "Good fit" between the observed and expected?
- Test of Association/Independence: Are the two variables "associated" with each other?

Assumptions of the Chi Square
- Independence of observations & variables
- Inclusion of non-occurrences

  Pass           Observed   Expected
  3-year-olds       15        27.5
  4-year-olds       40        27.5
  Total             55

Multiple Comparisons
- If you have a test with more than one degree of freedom, then it's difficult to know where the effect lies.
- Are they all different or just 1 & 2? Where does the difference lie?

[Table: rows X and Y, columns 1, 2, 3]
...
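The observed effect size reported in step 8 above can be verified with a short Python sketch. It assumes the 2×2 form of the phi coefficient, $\phi = \sqrt{\chi^2 / N}$, and the minimum effect of interest (MEI) of 0.25 from the example. The function name is illustrative.

```python
import math


def phi_coefficient(chi_sq, n):
    """Phi effect size for a 2x2 table: sqrt(chi-square / N)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(chi_sq / n)


MEI = 0.25  # minimum effect of interest used in the example

phi_obs = phi_coefficient(25.2525, 100)
print(round(phi_obs, 4))   # 0.5025
print(phi_obs >= MEI)      # True: practical significance
```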

This note was uploaded on 04/07/2008 for the course PSY 031 taught by Professor Dicorcia during the Spring '08 term at Tufts.
