Stat 250 Gunderson Lecture Notes
Relationships between Categorical Variables
Chapter 15: Chi-Square Analysis

Inference for Categorical Variables

Chapter 15 teaches us the inference techniques for analyzing count data. The three main tests described in the text that we will cover are:

1. Goodness of Fit Test: this test assesses whether a particular discrete model is a good fitting model for a discrete characteristic, based on a random sample from the population. E.g. Has the model for the method of transportation (drive, bike, walk, other) used by students to get to class changed from that of 5 years ago?
2. Test of Homogeneity: this test assesses whether two or more populations are homogeneous (alike) with respect to the distribution of some discrete (categorical) variable. E.g. Is the distribution of opinion on legal gambling the same for adult males versus adult females?
3. Test of Independence: this test helps us to assess if two discrete (categorical) variables are independent for a population, or if there is an association between the two variables. E.g. Is there an association between satisfaction with the quality of public schools (not satisfied, somewhat satisfied, very satisfied) and political party (Republican, Democrat, etc.)?

The first test is discussed in Section 15.3 in the text, but we will cover it first as it is the one-sample test for count data. The other two tests (homogeneity and independence) are actually the same test. Although the hypotheses are stated differently and the underlying assumptions about how the data are gathered are different, the steps for doing the two tests are exactly the same. Section 15.1 does not fully distinguish between these two tests (just briefly on page 638), but we will emphasize the difference between them.

All of these tests are based on an $X^2$ test statistic that, if the corresponding H0 is true and the assumptions hold, follows a chi-square distribution with some degrees of freedom, written $\chi^2(df)$. So our first discussion is to learn about the chi-square distribution: what the distribution looks like, some facts, and how to use Table A.5 to find various percentiles.

The Chi-Square Distribution

General Shape: skewed to the right, with values starting at 0.

If we have a chi-square distribution with df = degrees of freedom, then the ...
  Mean is equal to df
  Variance is equal to 2(df)
  Standard deviation is equal to $\sqrt{2(df)}$

These facts will serve as a useful frame of reference for making decisions. Table A.5 provides some upper-tail percentiles for chi-square distributions.

Try It! Consider the $\chi^2(4)$ distribution.
a. What is the mean for this distribution? 4
b. What is the median for this distribution? 3.36
c. How likely would it be to get a value of 4 or even larger? Draw a picture to help show it. Area is between 0.25 and 0.50.
d. How likely would it be to get a value of 10.3 or even larger? Draw a picture to help show it. Area is between 0.025 and 0.05. This is how bounds for a p-value will be found.

The BIG IDEA

The data consist of observed counts. We compute expected counts under H0; these counts are what we would expect (on average) if the corresponding H0 were true. Compare the observed and expected counts using the $X^2$ test statistic. The statistic will be a measure of how close the observed counts are to the expected counts under H0.
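The Try It! answers for the $\chi^2(4)$ distribution can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names `chi2_upper_tail` and `chi2_median` are hypothetical (they are not from the text or Table A.5), and the closed-form tail formula is assumed to be used only for whole-number df.

```python
import math

def chi2_upper_tail(x, df):
    """P(chi-square with df degrees of freedom >= x), for integer df >= 1."""
    z = x / 2.0
    if df % 2 == 0:
        # Even df: finite sum e^{-z} * sum_{j < df/2} z^j / j!
        return math.exp(-z) * sum(z**j / math.factorial(j) for j in range(df // 2))
    # Odd df: complementary error function plus a finite sum of half-integer powers
    tail = math.erfc(math.sqrt(z))
    tail += math.exp(-z) * sum(z**(j - 0.5) / math.gamma(j + 0.5)
                               for j in range(1, (df - 1) // 2 + 1))
    return tail

def chi2_median(df, lo=0.0, hi=100.0):
    """Bisection for the value with upper-tail area 0.5 (assumes median < hi)."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if chi2_upper_tail(mid, df) > 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

df = 4
print("mean:", df, " sd:", round(math.sqrt(2 * df), 2))
print("median:", round(chi2_median(df), 2))
print("P(value >= 4):", round(chi2_upper_tail(4, df), 3))
print("P(value >= 10.3):", round(chi2_upper_tail(10.3, df), 4))
```

The two tail areas fall inside the bounds read from Table A.5 above. Upper-tail areas like these are what turn the distance between observed and expected counts into a p-value.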
If this distance is large, we have support for the alternative Ha. With this in mind, we turn to our first chi-square test, the test of goodness of fit. We will derive the methodology for the test through an example. An overall summary of the test will be presented at the end.

Test of Goodness of Fit: Helps us assess if a particular discrete model is a good fitting model for a discrete characteristic, based on a random sample from the population.

Goodness of Fit Test Scenario: We have one population of interest, say all cars exiting a toll road that has four booths at the exit.
Question: Are the four booths used equally often?
Data: 1 random sample of 100 cars; we record one variable X, which booth was used (1, 2, 3, 4).

The table below summarizes the data in terms of the observed counts.

                   Booth 1   Booth 2   Booth 3   Booth 4
Observed # cars       26        20        28        26

Note: This is only a one-way frequency table, not a two-way table as will be used in the homogeneity and independence tests. We use the notation k = the number of categories or cells; here k = 4.

The null hypothesis: Let $p_i$ = proportion of cars using booth i.
H0: $p_1 = 0.25$, $p_2 = 0.25$, $p_3 = 0.25$, $p_4 = 0.25$.
Ha: not all probabilities specified in H0 are correct.

The null hypothesis specifies a particular discrete model (mass function) by listing the proportions (or probabilities) for each of the k outcome categories.

The one-way table provides the OBSERVED counts. Our next step is to compute the EXPECTED counts, under the assumption that H0 is true.

How to find the expected counts? There were 100 cars in the sample and 4 booths. If the booths are used equally often (H0 is true), then we would expect ...
  25 cars to use Booth #1 (How did you get the 25? 25% of 100, that is, np)
  25 cars to use Booth #2
  25 cars to use Booth #3
  25 cars to use Booth #4

Expected Counts: $E_i = n p_i$

Enter these expected counts in the parentheses in the table below.
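The rule $E_i = n p_i$ can be sketched in a few lines of Python. This is only an illustration under the stated H0 for the tollbooth sample; the function name `expected_counts` is a hypothetical helper, not something from the text.

```python
def expected_counts(n, null_probs):
    """Expected count for each category under H0: E_i = n * p_i."""
    if abs(sum(null_probs) - 1.0) > 1e-9:
        raise ValueError("null probabilities must sum to 1")
    return [n * p for p in null_probs]

observed = [26, 20, 28, 26]          # cars at Booths 1-4
n = sum(observed)                    # sample size, 100 cars
null_probs = [0.25, 0.25, 0.25, 0.25]  # H0: booths used equally often

expected = expected_counts(n, null_probs)
print(expected)
```

Each expected count is 25, and the expected counts add up to the sample size, just as the observed counts do.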
Observed Counts (Expected Counts)

                   Booth 1     Booth 2     Booth 3     Booth 4
Number of cars     26 (25)     20 (25)     28 (25)     26 (25)

The $X^2$ test statistic

Next we need our test statistic, our measure of how close the observed counts are to what we expect under the null hypothesis.

$$X^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E} = \frac{(26 - 25)^2}{25} + \frac{(20 - 25)^2}{25} + \frac{(28 - 25)^2}{25} + \frac{(26 - 25)^2}{25} = \frac{1 + 25 + 9 + 1}{25} = \frac{36}{25} = 1.44$$

Do you think a value of $X^2 = 1.44$ is large enough to reject H0? Let's find the p-value, the probability of getting an $X^2$ test statistic value as large or larger than the one we observed, assuming H0 is true. To do this we need to know the distribution of the $X^2$ test statistic under the null hypothesis.

If H0 is true, then $X^2$ has the $\chi^2$ distribution with degrees of freedom = k - 1.

Find the p-value for our tollbooth example:
Observed $X^2$ test statistic value = 1.44, df = 4 - 1 = 3.
Sketch the distribution to find bounds: the p-value is > 0.50.
Are the results statistically significant at the 5% significance level? NO
Conclusion at a 5% level: It appears that the 4 booths are used equally often.

Aside: Using our frame of reference for chi-square distributions. Recall that if we have a chi-square distribution with df degrees of freedom, then the mean is equal to df, and the standard deviation is equal to $\sqrt{2(df)}$. So, if H0 were true, we would expect the $X^2$ test statistic to be about 3, give or take about $\sqrt{2 \cdot 3} = 2.45$. Since we reject H0 for large values of $X^2$, and we only observed a value of 1.44, even less than expected under H0, we certainly do not have enough evidence to reject H0.

Goodness of Fit Test Summary

Assume: We have 1 random sample of size n. We measure one discrete response X that has k possible outcomes.
Test: H0: A specified discrete model for X: $p_1 = p_{10}, p_2 = p_{20}, \ldots, p_k = p_{k0}$
Ha: The probabilities are not as specified in the null hypothesis.
Test Statistic: $X^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$
where expected $= E_i = n p_{i0}$.
If H0 is true, then $X^2$ has a $\chi^2$ distribution with (k - 1) degrees of freedom, where k is the number of categories. The necessary conditions are: at least 80% of the expected counts are greater than 5 and none are less than 1. Be aware of the sample size (pg 656).

Try It! Crossbreeding Peas

For a genetics experiment in the cross breeding of peas, Mendel obtained the following data in a sample from the second generation of seeds resulting from crossing yellow round peas and green wrinkled peas.

n = 556     Yellow Round   Yellow Wrinkled   Green Round   Green Wrinkled
Observed        315              101             108              32
Expected       312.75           104.25          104.25           34.75

Expected counts: 556(9/16) = 312.75, etc.

Do these data support the theory that these four types should occur with probabilities 9/16, 3/16, 3/16, and 1/16 respectively? Use $\alpha$ = 0.01.

H0: $p_1 = 9/16$, $p_2 = 3/16$, $p_3 = 3/16$, $p_4 = 1/16$.

$$X^2 = \frac{(315 - 312.75)^2}{312.75} + \frac{(101 - 104.25)^2}{104.25} + \frac{(108 - 104.25)^2}{104.25} + \frac{(32 - 34.75)^2}{34.75} = 0.47$$

The p-value is > 0.50, so we cannot reject the null hypothesis. The data do not refute the theory. In fact, the results look almost too good. Mendel had a fictitious assistant, perhaps fictitious data too? Or did the assumptions not hold? Or did we just observe a very unusual sample?

Try It! Desired Vacation Place

The AAA travel agency would like to assess if the distribution of desired vacation place has changed from the model of 3 years ago. A random sample of 928 adults was polled by the polling company Ipsos during this past mid-May. One question asked was “Name the one place you would want to go for vacation if you had the time and the money.” The table below displays the model for the distribution of desired vacation place 3 years ago and the observed results based on the recent poll. Expected counts under the model are in parentheses.

                        1 = Hawaii    2 = Europe     3 = Caribbean   4 = Other      Totals
Model 3 years ago          10%           40%             20%            30%          100%
Obs Counts from poll    124 (92.8)    390 (371.2)    125 (185.6)    289 (278.4)      928

a. Give the null hypothesis to test if there has been a significant change in the distribution of desired vacation place from 3 years ago.
H0: $p_1 = 0.10$, $p_2 = 0.40$, $p_3 = 0.20$, $p_4 = 0.30$

b. The observed test statistic is about 31.6 and the corresponding p-value is less than 0.001 (from Table A.5). Interpret this p-value in terms of repeated random samples of 928 adults.
If repeated random samples of n = 928 adults were obtained and if the distribution of desired vacation place has not changed, we would expect to see an $X^2$ statistic of about 31.6 or larger in less than 0.1% of the repetitions.
Note: the phrase “if the distribution of desired vacation place has not changed” is saying “if the null hypothesis were true”. The biggest discrepancies between the observed counts and the expected counts under the null were for the Caribbean and Hawaii.

Test of Homogeneity: Helps us to assess if the distribution for one discrete (categorical) variable is the same for two or more populations.

Test of Homogeneity Scenario: We have two populations of interest, say preschool boys and preschool girls.
Question: Is Ice Cream Preference the same for boys and girls?
Data: 1 random sample of 75 preschool boys, 1 random sample of 75 preschool girls; the two random samples are independent.

The table below summarizes the data in terms of the observed counts.

Observed Counts:
Ice Cream Preference    Boys    Girls
Vanilla (V)              25      26
Chocolate (C)            30      23
Strawberry (S)           20      26
Total                    75      75

Note: The column totals here were known in advance, even before the ice cream preferences were measured. This is a key idea for how to distinguish between the test of homogeneity and the test of independence.

The null hypothesis: H0: The distribution of ice cream preference is the same for the two populations, boys and girls.
A more mathematical way to write this null hypothesis is:
H0: $P(X = i \mid \text{population } j) = P(X = i)$ for all i, j,
where X is the categorical variable, in this case, ice cream preference.

As we can see, the null hypothesis is stating that the distribution of ice cream preference does not depend on (is independent of) the population we select from, since the two distributions are the same. The null hypothesis looks like $P(A \mid B) = P(A)$, which is one definition of independent events, from our previous discussion of independence. This is why the test of homogeneity (comparing several populations) is really the same as the test of independence.

The assumptions are different, however. For our homogeneity (comparing several populations) test, we assume we have independent random samples, one from each population, and we measure 1 discrete (categorical) response. For the independence test (discussed later) we will assume we have just 1 random sample from 1 population, but we measure 2 discrete (categorical) responses.

Getting back to our ICE CREAM ...

The two-way table provides the OBSERVED counts. Our next step is to compute the EXPECTED counts, under the assumption that H0 is true.

How to find the expected counts? Let's look at those who preferred Strawberry first.

Strawberry: Since there were 46 children who preferred Strawberry overall, if the distributions for boys and girls are the same (H0 is true), then we would expect 23 of these children to be boys and the remaining 23 of these children to be girls. Note that our sample sizes were the same, 75 boys and 75 girls, 50% of each. If they were not 50-50, we would have to adjust the expected counts accordingly. Let's do the same for the Vanilla and Chocolate preferences.

Chocolate: Since there were 53 children who preferred Chocolate overall, if the distributions for boys and girls are the same (H0 is true), then we would expect 26.5 of these children to be boys and the remaining 26.5 of these children to be girls.
Vanilla: Since there were 51 children who preferred Vanilla overall, if the distributions for boys and girls are the same (H0 is true), then we would expect 25.5 of these children to be boys and the remaining 25.5 of these children to be girls.

Enter these expected counts in the parentheses in the table below.

Observed Counts (Expected Counts)
Ice Cream Preference    Boys         Girls        Total
Vanilla (V)             25 (25.5)    26 (25.5)      51
Chocolate (C)           30 (26.5)    23 (26.5)      53
Strawberry (S)          20 (23)      26 (23)        46
Total                   75           75            150

A Closer Look at the Expected Counts: Let's look at how we actually computed an expected count so we can develop a general rule. If H0 were true (i.e., no difference in preferences for boys versus girls), then our best estimate of P(a child prefers vanilla) is 51/150. Since we had 75 boys, under no difference in preference, we would expect 75 x (51/150) to prefer vanilla. That is,

$$\text{expected number of boys preferring vanilla} = \frac{(75)(51)}{150} = \frac{(\text{row total})(\text{column total})}{\text{Total } n} = 25.5$$

This quick recipe for computing the expected counts under the null hypothesis is called the Cross-Product Rule.

The $X^2$ test statistic

Next we need to compute our test statistic, our measure of how close the observed counts are to what we expect under the null hypothesis.

$$X^2 = \frac{(25 - 25.5)^2}{25.5} + \frac{(26 - 25.5)^2}{25.5} + \frac{(30 - 26.5)^2}{26.5} + \frac{(23 - 26.5)^2}{26.5} + \frac{(20 - 23)^2}{23} + \frac{(26 - 23)^2}{23} = 1.73$$
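The Cross-Product Rule and the test statistic for the ice cream table can be sketched in Python as follows. This is an illustrative check rather than the course's software; the function names `expected_two_way` and `x2_statistic` are hypothetical, and the table is entered with flavors as rows and boys/girls as columns.

```python
def expected_two_way(table):
    """Expected counts under H0: (row total)(column total) / total n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def x2_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all cells."""
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))

# Rows: Vanilla, Chocolate, Strawberry; columns: Boys, Girls
ice_cream = [[25, 26],
             [30, 23],
             [20, 26]]

expected = expected_two_way(ice_cream)
x2 = x2_statistic(ice_cream, expected)
print(expected)
print(round(x2, 2))
```

The expected counts reproduce the values in parentheses in the table, and the statistic rounds to the 1.73 computed by hand.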
There are 6 cells in the table, so there are 6 terms to add up in the test statistic. The larger the test statistic, the “bigger” the differences between what we observed and what we would expect to see if H0 were true. So the larger the test statistic, the more evidence we have against the null hypothesis.

Is a value of $X^2 = 1.73$ large enough to reject H0? We need to find the p-value, the probability of getting an $X^2$ test statistic value as large or larger than the one we observed, assuming H0 is true. To do this we need to know the distribution of the $X^2$ test statistic under the null hypothesis.

If H0 is true, then $X^2$ has the $\chi^2$ distribution with degrees of freedom = (r - 1)(c - 1).

Brief motivation for the degrees of freedom formula:
  If you knew that 50% were boys, you would know that 50% were girls (c - 1).
  If you knew, say, that 70% liked chocolate or vanilla, you would know that 30% liked strawberry (r - 1).

Find the p-value for our ice cream example:
Observed $X^2$ test statistic value = 1.73, df = (3 - 1)(2 - 1) = 2.
Make a sketch and find the bounds for the p-value: 0.25 < p-value < 0.50.
Decision at a 5% significance level: (circle one) Reject H0 / Fail to reject H0. Since the p-value is larger than 0.05, we fail to reject H0.
Conclusion: It appears that the distribution of ice cream preference is the same for the populations of boys and girls represented by these samples.

Test of Homogeneity Summary (Comparison of Several Populations)

Assume: We have c independent random samples of size $n_1, n_2, \ldots, n_c$ from c populations. We measure 1 discrete response X that has r possible outcomes.
Test: H0: The distribution for the response variable X is the same for all populations.
Test Statistic: $X^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$, where expected $= \frac{(\text{row total})(\text{column total})}{\text{Total } n}$.
If H0 is true, then $X^2$ has a $\chi^2$ distribution with (r - 1)(c - 1) degrees of freedom. The necessary conditions are: at least 80% of the expected counts are greater than 5 and none are less than 1.

Try It! What is your Decision?
For a chi-square test of homogeneity, there are 3 populations and 4 possible values of the discrete characteristic. If H0 is true, that is, the distribution for the response is the same for all 3 populations, what is the expected value of the test statistic?
The test statistic is $X^2$. If H0 is true, the test statistic will have a chi-square distribution with (3 - 1)(4 - 1) = 6 degrees of freedom. So if H0 is true, we would expect the test statistic to be about 6.

Try It! Treatment for Shingles

An article had the headline “For adults, chicken pox vaccine may stop shingles”. A clinical trial was conducted in which 420 subjects were randomly assigned to receive the chicken pox vaccine or a placebo vaccine. Some side effects of interest were swelling and rash around the injection site. Consider the following results for the swelling side effect.

Treatment Group * Swelling Status Crosstabulation (Count)

                                   Swelling Status
Treatment Group    major swelling   minor swelling   no swelling   Total
vaccine                  54               42             134        230
placebo                  16               32             142        190
Total                    70               74             276        420

Chi-Square Tests
                                 Value     df    Asymp. Sig. (2-sided)
Pearson Chi-Square              18.571      2          .00009
Likelihood Ratio                19.556      2          .00006
Linear-by-Linear Association    17.696      1          .00003
N of Valid Cases                   420

a. Give the name of the test to be used for assessing if the distribution of swelling status is the same for the two treatment populations.
Chi-square test of homogeneity

b. Based on the above data, among those chicken pox vaccinated subjects, what percent had major swelling around the injection site?
54/230 = 0.2348, or about 23.5%

c. Based on the above data, among those placebo vaccinated subjects, what percent had major swelling around the injection site?
16/190 = 0.0842, or about 8.4%

d. Assuming the distribution of swelling status is the same for the two treatment populations, how many chicken pox vaccinated subjects would you expect to have major swelling around the injection site? Show your work.
(230 x 70)/420 = 38.33

e. Use a level of 0.05 to assess if the distribution of swelling status is the same for the two treatment populations.
Test Statistic Value: 18.571     p-value: 0.00009
Thus, the distribution of swelling status (circle your answer): does / does not appear to be the same for the two treatment populations. Since the p-value is less than 0.05, it does not appear to be the same.

Test of Independence: Helps us to assess if two discrete (categorical) variables are independent for a population, or if there is an association between the two variables.

Test of Independence Scenario: We have one population of interest, say factory workers.
Question: Is there a relationship between smoking habits and whether or not a factory worker experiences hypertension?
Data: 1 random sample of 180 factory workers; we measure the two variables:
  X = hypertension status (yes or no)
  Y = smoking habit (non, moderate, heavy)

The table below summarizes the data in terms of the observed counts.

Observed Counts:
                         Y = Smoking Habit
X = Hyper Status     Non     Mod     Heavy    Total
Yes                   21      36       30       87
No                    48      26       19       93
Total                 69      62       49      180

Get the row and column totals. Note: neither the row nor the column totals were known in advance before measuring hypertension and smoking habit. We only know the overall total of 180.

The null hypothesis: H0: There is no association between smoking habit and hypertension status for the population of factory workers. (Or: The two factors, smoking habit and hypertension status, are independent for the population.)

A more mathematical way to write this null hypothesis is:
H0: $P(X = i \text{ and } Y = j) = P(X = i)\,P(Y = j)$
The null hypothesis looks like $P(A \text{ and } B) = P(A)\,P(B)$, which is one definition of independent events, from our previous discussion of independence.

Getting back to our FACTORY WORKERS ...

The two-way table provides the OBSERVED counts. Our next step is to compute the EXPECTED counts, under the assumption that H0 is true. The expected counts and the test statistic are found the same way as for the homogeneity test.

Cross-Product Rule: Expected Count $= \frac{(\text{row total})(\text{column total})}{\text{Total } n}$

Compute and enter these expected counts in the parentheses in the table below.

Observed Counts (Expected Counts):
                         Y = Smoking Habit
X = Hyper Status     Non            Mod            Heavy          Total
Yes                  21 (33.35)     36 (29.97)     30 (23.68)       87
No                   48 (35.65)     26 (32.03)     19 (25.32)       93
Total                69             62             49              180

The $X^2$ test statistic

Our measure of how close the observed counts are to what we expect under the null hypothesis.

$$X^2 = \frac{(21 - 33.35)^2}{33.35} + \frac{(36 - 29.97)^2}{29.97} + \frac{(30 - 23.68)^2}{23.68} + \frac{(48 - 35.65)^2}{35.65} + \frac{(26 - 32.03)^2}{32.03} + \frac{(19 - 25.32)^2}{25.32} = 14.5$$

Do you think a value of $X^2 = 14.5$ is large enough to reject H0? The next step is to find the p-value, the probability of getting an $X^2$ test statistic value as large or larger than the one we observed, assuming H0 is true. To do this we need to know the distribution of the $X^2$ test statistic under the null hypothesis.

If H0 is true, then $X^2$ has the $\chi^2$ distribution with degrees of freedom = (r - 1)(c - 1).

Aside: Using our frame of reference for chi-square distributions. If H0 were true, we would expect the $X^2$ test statistic to be about 2, give or take about $\sqrt{2 \cdot 2} = 2$. About how many standard deviations is the observed $X^2$ value of 14.5 from the expected value under H0? What do you think the decision will be?
(14.5 - 2)/2 = 6.25, about 6 standard deviations above the expected value under H0.

Find the p-value for our factory worker example:
Observed $X^2$ test statistic value = 14.5, df = (2 - 1)(3 - 1) = 2.
Find the p-value and use it to determine if the results are statistically significant at the 1% significance level. Sketch the distribution to show the bounds are: p-value < 0.001. So the results are statistically significant at the 1% level.
Conclusion at a 1% level: It appears that there is an association between smoking and hypertension for the population of factory workers represented by this sample.

Test of Independence Summary

Assume: We have 1 random sample of size n. We measure 2 discrete responses: X, which has r possible outcomes, and Y, which has c possible outcomes.
Test: H0: The two variables X and Y are independent for the population.
Test Statistic: $X^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$, where expected $= \frac{(\text{row total})(\text{column total})}{\text{Total } n}$.
If H0 is true, then $X^2$ has a $\chi^2$ distribution with (r - 1)(c - 1) degrees of freedom. The necessary conditions are: at least 80% of the expected counts are greater than 5 and none are less than 1.

Relationship between Age Group and Appearance Satisfaction

Are you satisfied with your overall appearance? A random sample of 150 women was surveyed. Their answer to this question (Yes or No) was recorded along with their age category (1 = under 30, 2 = 30 to 50, and 3 = over 50). SPSS was used to generate the following output from the data.

Are You Satisfied? * Age Group Crosstabulation
Count
                                  Age Group
Are You Satisfied?    under 30    30 to 50    over 50    Total
Yes                      38          30          34       102
No                       10          29           9        48
Total                    48          59          43       150

Chi-Square Tests
                                 Value     df    Asymp. Sig.
Pearson Chi-Square              13.149      2       .001
Likelihood Ratio                13.039      2       .001
Linear-by-Linear Association      .018      1       .893
N of Valid Cases                   150

a. Give the name of the test to be used for assessing if there is a relationship between age group and appearance satisfaction.
Chi-square test of independence

b. Assuming there is no relationship between age group and appearance satisfaction, how many women over 50 would you expect to be satisfied with their appearance? Show your work.
(102)(43)/150 = 29.24

c. Assuming there is no relationship between age group and appearance satisfaction, what is the expected value of the test statistic?
The expected value is 2, the degrees of freedom: (2 - 1)(3 - 1) = 2.

d. Use a level of 0.05 to assess if there is a significant relationship between age group and appearance satisfaction.
Test Statistic Value: 13.149     p-value: 0.001
Thus, there (circle your answer): does / does not appear to be an association between age group and appearance satisfaction. Since the p-value is less than 0.05, there does appear to be an association.

2x2 Tables: a special case of the two-proportion z test

Section 15.2 (pages 646-652) discusses some of the special circumstances that apply to analyzing 2x2 tables. The z-test for comparing two population proportions is the same as the chi-square test, provided the alternative is two-sided. The z-test would need to be performed for one-sided alternatives. When the conditions for the z-test or chi-square test are not met (sample sizes too small), there is an alternative test called Fisher's Exact Test.

Stat 250 Formula Card: Chi-Square Tests
Test of Independence & Test of Homogeneity
  Expected Count: $E = \text{expected} = \frac{(\text{row total})(\text{column total})}{\text{total } n}$
  Test Statistic: $X^2 = \sum \frac{(O - E)^2}{E} = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$
  df = (r - 1)(c - 1)

Test for Goodness of Fit
  Expected Count: $E_i = \text{expected} = n p_{i0}$
  Test Statistic: $X^2 = \sum \frac{(O - E)^2}{E} = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$
  df = k - 1

If Y follows a $\chi^2(df)$ distribution, then E(Y) = df and Var(Y) = 2(df).

Additional Notes
A place to ... jot down questions you may have and ask during office hours, take a few extra notes, write out an extra practice problem or summary completed in lecture, create your own short summary about this chapter.
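As a recap, the formula card can be sketched in Python and applied to two examples from these notes: the tollbooth goodness of fit test and the factory worker test of independence. This is a hedged illustration using only the standard library, not the SPSS procedure used in the notes. The function names `chi2_upper_tail`, `goodness_of_fit`, and `two_way_test` are hypothetical, and the tail-area formula assumes a whole-number df.

```python
import math

def chi2_upper_tail(x, df):
    """P(chi-square with df degrees of freedom >= x), for integer df >= 1."""
    z = x / 2.0
    if df % 2 == 0:
        return math.exp(-z) * sum(z**j / math.factorial(j) for j in range(df // 2))
    tail = math.erfc(math.sqrt(z))
    tail += math.exp(-z) * sum(z**(j - 0.5) / math.gamma(j + 0.5)
                               for j in range(1, (df - 1) // 2 + 1))
    return tail

def goodness_of_fit(observed, null_probs):
    """One-way table: E_i = n * p_i0, df = k - 1."""
    n = sum(observed)
    expected = [n * p for p in null_probs]
    x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return x2, df, chi2_upper_tail(x2, df)

def two_way_test(table):
    """Two-way table: E = (row total)(column total)/n, df = (r - 1)(c - 1)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    x2 = 0.0
    for row, r in zip(table, row_totals):
        for o, c in zip(row, col_totals):
            e = r * c / n
            x2 += (o - e) ** 2 / e
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return x2, df, chi2_upper_tail(x2, df)

# Tollbooth example: H0 says each of the 4 booths has probability 0.25
booth_x2, booth_df, booth_p = goodness_of_fit([26, 20, 28, 26], [0.25] * 4)

# Factory workers: rows = hypertension (Yes, No); columns = Non, Mod, Heavy
worker_x2, worker_df, worker_p = two_way_test([[21, 36, 30],
                                               [48, 26, 19]])

print(round(booth_x2, 2), booth_df, booth_p > 0.50)
print(round(worker_x2, 1), worker_df, worker_p < 0.001)
```

The same `two_way_test` function serves both the homogeneity and the independence settings, because the two tests share their expected counts, test statistic, and degrees of freedom; only the assumptions about how the data were gathered differ.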
This note was uploaded on 12/31/2011 for the course STATS 250 taught by Professor Gunderson during the Winter '10 term at University of Michigan.