Business Statistics, Fifth Edition
Ken Black

Chapter 12: Analysis of Categorical Data

Not to be reproduced without the written permission of F.B. Alt

12.1 CHI-SQUARE GOODNESS-OF-FIT TEST [PAGE 468]

• Chi-square tests use count data.
• Goodness-of-fit test for k proportions
• Concept: Are the proportions for k mutually exclusive and collectively exhaustive categories equal to specified values?
• Case 1: The population proportions are equal [Pages 470-471]

Example:
The Hershey Company wants to determine if customers have a preference for any of the following three candy bars:
A. Mr. Goodbar
B. Hershey’s Milk Chocolate
C. Krackel

From a random sample of 140 consumers, it was found that 43 preferred Mr. Goodbar, 53 preferred Hershey’s Milk Chocolate, and 44 preferred Krackel. Test the null hypothesis that customers have no preference for any of the three candy bars.

Null Hypothesis: H0: pA = ____, pB = ____, pC = ____

Procedure: Compare the actual or observed counts (O_i) in each category with the expected counts (E_i), assuming H0 is true.

• Black’s notation: f_o for O and f_e for E
• Goodness-of-fit test statistic: $\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$
• Degrees of freedom for the $\chi^2$ test statistic: k - 1
• Reject H0 if $\chi^2 > \chi^2_{\alpha, k-1}$, obtained from Table A.8, or reject H0 if p-value < α.

Example:
a. Calculate the expected frequencies, assuming that consumers have no preference.

The null hypothesis is: H0: pA = 1/3, pB = 1/3, pC = 1/3

Candy Bar   Hypothesized Proportion   Expected Count (E_i)   Observed Count (O_i)
A           1/3                       46.67 = 140(1/3)       43
B           1/3                       46.67                  53
C           1/3                       46.67                  44

Rejection Region (R.R.):

Conclusion:

• In Minitab: Stat > Tables > Chi-Square Goodness-of-Fit Test (One Variable)
    Observed counts (enter in C1 or any column)
    Equal proportions

• Minitab output follows.

Chi-Square Goodness-of-Fit Test for Observed Counts in Variable: O

                      Test                 Contribution
Category  Observed  Proportion  Expected     to Chi-Sq
A               43    0.333333   46.6667      0.288095
B               53    0.333333   46.6667      0.859524
C               44    0.333333   46.6667      0.152381

  N  DF  Chi-Sq  P-Value
140   2     1.3    0.522

Conclusion using p-value:

• Caveat: Conservative requirement: E_i ≥ 5.

• Case 2: The population proportions have specific values [Page 469]
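Case 1 and Case 2 differ only in the hypothesized proportions: in both, E_i = n p_i and the statistic is the sum of (O_i - E_i)^2 / E_i. A minimal Python sketch of that calculation follows. The function name `goodness_of_fit` is illustrative (it is not from the text or from Minitab), and the critical value is copied from a chi-square table rather than computed. It is run here on the Case 1 candy bar counts with equal proportions.

```python
def goodness_of_fit(observed, proportions):
    """Return (expected counts, per-category contributions, chi-square statistic)."""
    n = sum(observed)
    # Expected count under H0: E_i = n * p_i
    expected = [n * p for p in proportions]
    # Contribution of each category: (O_i - E_i)^2 / E_i
    contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
    return expected, contributions, sum(contributions)


# Case 1 counts: Mr. Goodbar, Hershey's Milk Chocolate, Krackel.
observed = [43, 53, 44]
expected, contributions, chi_sq = goodness_of_fit(observed, [1 / 3] * 3)

df = len(observed) - 1   # k - 1 categories free to vary
critical = 5.991         # chi-square table value for alpha = .05, df = 2

print("expected:", [round(e, 4) for e in expected])
print("chi-square:", round(chi_sq, 4), "df:", df)
print("reject H0 at 5%:", chi_sq > critical)
```

For these counts the statistic comes to about 1.3 on 2 degrees of freedom, in line with the Minitab output for Case 1. Passing a different list of proportions, for example [0.3, 0.5, 0.2], handles specified values in the same way.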
Example: From historical data, such as sales records, the Hershey Company knows that 30% of its customers prefer Mr. Goodbar, 50% prefer Hershey’s Milk Chocolate, and 20% prefer Krackel.

Suppose that marketing analysts recently sampled customers and found that 49 prefer Mr. Goodbar, 97 prefer Hershey’s Milk Chocolate, and 54 prefer Krackel. Have current preferences for these products changed from the known historical preferences?

Null Hypothesis: H0: pA = ____, pB = ____, pC = ____

Use a significance level of 5%.

Value of the Test Statistic: $\chi^2$ = 7.01

R.R.: Reject H0 if $\chi^2$ > ____

Conclusion:

• In Minitab: Stat > Tables > Chi-Square Goodness-of-Fit Test (One Variable)
    Observed counts (enter in C1 or any column)
    Specific proportions (enter in C2 or any column)

• The Minitab output follows.

Chi-Square Goodness-of-Fit Test for Observed Counts in Variable: O

                      Test                 Contribution
Category  Observed  Proportion  Expected     to Chi-Sq
1               49         0.3        60       2.01667
2               97         0.5       100       0.09000
3               54         0.2        40       4.90000

  N  DF   Chi-Sq  P-Value
200   2  7.00667    0.030

Conclusion using p-value:

Action:

Example: Is the percentage of each color in "M&M's"® packages as stated?

An online search revealed the following: “"M&M's"® Milk Chocolate
Candies: 30% brown, 20% each of yellow and red and 10% each of orange, green and blue. ...” Is this true?

Null Hypothesis: H0:

The frequencies for a sample of 200, together with the alleged proportions, follow.

Color    Frequency  Specified Proportion
Blue            13                   0.1
Brown           28                   0.3
Green           64                   0.1
Red             32                   0.2
Orange          52                   0.1
Yellow          11                   0.2

                      Test                 Contribution
Category  Observed  Proportion  Expected     to Chi-Sq
1               13         0.1        20        2.4500
2               28         0.3        60       17.0667
3               64         0.1        20       96.8000
4               32         0.2        40        1.6000
5               52         0.1        20       51.2000
6               11         0.2        40       21.0250

  N  DF   Chi-Sq  P-Value
200   5  190.142    0.000

[Figure: distribution plot, chi-square, df = 5]

Conclusion:

• Testing a Population Proportion by Using the Chi-Square Test [Page 474]

Example: A market research firm interviewed a large number of potential
automobile buyers by phone. One of the questions asked was whether the individual would prefer 0% financing or a $2000 rebate on the price of the car. Of the 1586 customers interviewed, 847 stated a preference for the 0% financing option.

State the null and alternative hypotheses (assuming that individuals are equally divided in preferring the 0% financing), the value of the test statistic, the rejection region and your conclusion. Use a significance level of 5%.

p = population proportion of individuals preferring 0% financing.

H0: p = ____ vs. Ha: p ≠ ____

T.S.: $Z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0)/n}}$ = (.534 - ____) / sqrt(____ / 1586) = 2.71

R.R.: At the .05 level, reject H0 if |Z| > 1.96 (Table A.5)

[Figure: number line marking the rejection region (R.R.) beyond 1.96 and the test statistic at 2.71]

Conclusion: ____ H0: p = .50 at the .05 level of significance.

The Minitab output follows:

Test and CI for One Proportion

Test of p = 0.5 vs p not = 0.5

Sample    X     N  Sample p  95% CI                Z-Value  P-Value
1       847  1586  0.534048  (0.509498, 0.558598)     2.71    0.007

Using the normal approximation.

• Using Chi-Square Approach

                  O     E
0% Financing    847   793 = ____
Rebate          739   793

Value of the Test Statistic: $\chi^2$ = ____

R.R.: Reject H0 if $\chi^2$ > ____

Conclusion:

The Minitab output follows:

Chi-Square Goodness-of-Fit Test for Observed Counts in Variable: O

                      Test                 Contribution
Category  Observed  Proportion  Expected     to Chi-Sq
1              847         0.5       793       3.67718
2              739         0.5       793       3.67718

   N  DF   Chi-Sq  P-Value
1586   1  7.35435    0.007

12.2 Contingency Analysis: Chi-Square Test of Independence [Page 479]

• Data are cross-classified according to two factors.
• Contingency table: presentation of sample data according to two factors.

Example 1: A personnel director for a large, research-oriented firm categorizes
colleges and universities as most desirable, good, adequate, and undesirable for purposes of hiring their graduates. Data are collected on a random sample of 156 recent graduates and each is rated by a supervisor as outstanding, average or poor. The results follow.

                          Rating
School           Outstanding  Average  Poor
Most Desirable            21       25     2
Good                      20       36    10
Adequate                   4       14     7
Undesirable                3        8     6

Example 2: A finance company wishes to learn whether marital status has any bearing on whether or not a new car loan becomes delinquent within the first year. The marital and loan status of a random sample of 950 loans is summarized in the following table:

                              Marital Status
                          Unmarried  Married  Total
Loan     Delinquent              29       47     76
Status   Not Delinquent         384      490    874
         Total                  413      537    950

Example 3: The table below is from a survey of American attitudes where 1397 randomly sampled Americans have been cross-classified both by their attitude to the death penalty and their attitude to gun registration.

                     Favors Death Penalty
                       Yes     No   Total
Favors Gun    Yes      784    236    1020
Regulation    No       311     66     377
              Total   1095    302    1397

• Are the two factors independent?
• “The chi-square test of independence can be used to analyze any level of data measurement, but it is particularly useful in analyzing nominal data.” [Page 479]
• Generic Contingency Table

[Table: generic contingency table, Factor A by Factor B]

• If events A and B are independent, then P(A and B) = ____
• To calculate expected frequencies, assume that Factors A and B are independent.
• Test statistic: $\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$
• df = (r - 1)(c - 1)

Example 2: A finance company wishes to learn whether marital status has any
bearing on whether or not a new car loan becomes delinquent within the first year. The marital and loan status of a random sample of 950 loans is summarized in the following table:

                              Marital Status
                          Unmarried  Married  Total
Loan     Delinquent              29       47     76
Status   Not Delinquent         384      490    874
         Total                  413      537    950

a. Precisely state the null and alternative hypotheses.
   H0: Loan status and marital status are ____ events.
   Ha: Loan status and marital status are ____ events.

b. Estimate the probability that a randomly selected individual has a non-delinquent loan. Please express your answer as a fraction.
   P(Non-Delinquent Loan) = ____ = 0.92

c. Estimate the probability that a randomly selected individual is married. Please express your answer as a fraction.
   P(Married) = ____ = 0.565

d. If the null hypothesis is true, estimate the probability that a randomly selected individual has a non-delinquent loan (NDL) and is married (M). Also, determine the expected frequency for this cell (non-delinquent loan and married).
   P(NDL and M) = ____ = 0.52 (0.52004 before rounding), so E22 = 950(0.52004) = 494.04

e. What is the value of the term in the chi-square statistic corresponding to this cell (non-delinquent loan and married)?
   $\frac{(O_{22} - E_{22})^2}{E_{22}} = \frac{(490 - 494.04)^2}{494.04} = 0.033$

f. Use the Minitab output to state the value of the test statistic, the rejection region and your conclusion. Use a significance level of 5%.

Chi-Square Test: U, M

Expected counts are printed below observed counts
Chi-Square contributions are printed below expected counts

            U       M   Total
    1      29      47      76
        33.04   42.96
        0.494   0.380

    2     384     490     874
       379.96  494.04
        0.043   0.033

Total     413     537     950

Chi-Sq = 0.950, DF = 1, P-Value = 0.330

Reject H0 if $\chi^2$ = 0.950 > ____ = 3.84, so (Do, Do not) Reject H0

Example 3: The table below is from a survey of American attitudes where 1397
randomly sampled Americans have been cross-classified both by their attitude to the death penalty and their attitude to gun registration.

                     Favors Death Penalty
                       Yes     No   Total
Favors Gun    Yes      784    236    1020
Regulation    No       311     66     377
              Total   1095    302    1397

a. For a randomly selected individual, what is the probability that the individual “Favors Gun Registration”?
   P(Favors Gun Registration) = ____ = .7301

b. For a randomly selected individual, what is the probability that the individual “Favors Death Penalty”?
   P(Favors Death Penalty) = ____ = .7838

c. If attitudes are independent, what is the probability that a randomly selected individual “Favors Gun Registration” and “Favors Death Penalty”?
   P(Favors Gun Registration and Favors Death Penalty) = (____)(____) = .5723

d. If attitudes are independent, what is the expected frequency for the (Yes, Yes) cell?
   E11 = 1397(.5723) = 799.50

e. Show how the corresponding term in the chi-square statistic is calculated for the (Yes, Yes) cell only.
   $\frac{(O_{11} - E_{11})^2}{E_{11}}$ = ____

f. Use the Minitab output to state the value of the test statistic, the rejection region and your conclusion based on the test statistic approach. Use a significance level of 5%.

Chi-Square Test: Yes, No

Expected counts are printed below observed counts
Chi-Square contributions are printed below expected counts

          Yes      No   Total
    1     784     236    1020
       799.50  220.50
        0.300   1.089

    2     311      66     377
       295.50   81.50
        0.813   2.947

Total    1095     302    1397

Chi-Sq = 5.150, DF = 1, P-Value = 0.023

T.S.:
R.R.:
Conclusion: (Do, Do not) Reject H0: Factors are independent

g. Precisely state your conclusion based on the p-value approach.
   Since p = ____ vs. α = .05, (Reject, Do Not Reject) H0: Factors are independent

Example 1: A personnel director for a large, research-oriented firm categorizes
colleges and universities as most desirable, good, adequate, and undesirable for purposes of hiring their graduates. Data are collected on a random sample of 156 recent graduates and each is rated by a supervisor as outstanding, average or poor. The results follow.

                          Rating
School           Outstanding  Average  Poor
Most Desirable            21       25     2
Good                      20       36    10
Adequate                   4       14     7
Undesirable                3        8     6

Chi-Square Test: O, A, P

Expected counts are printed below observed counts
Chi-Square contributions are printed below expected counts

            O       A       P   Total
    1      21      25       2      48
        14.77   25.54    7.69
        2.629   0.011   4.212

    2      20      36      10      66
        20.31   35.12   10.58
        0.005   0.022   0.031

    3       4      14       7      25
         7.69   13.30    4.01
        1.772   0.037   2.237

    4       3       8       6      17
         5.23    9.04    2.72
        0.951   0.121   3.938

Total      48      83      25     156

Chi-Sq = 15.967, DF = 6, P-Value = 0.014
2 cells with expected counts less than 5.

• In Minitab: Stat > Tables > Chi-Square Test (Two-Way Table in Worksheet)
    Observed counts (enter data in as many columns as necessary)
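
The expected counts and chi-square contributions in the two-way Minitab output can be reproduced by hand from the row and column totals. The following is a minimal Python sketch under stated assumptions: the function name `independence_test` is illustrative (not from the text or Minitab), and the critical value is copied from a chi-square table rather than computed. It uses the Example 1 school-by-rating counts.

```python
def independence_test(table):
    """Chi-square test of independence for an r x c table of observed counts.

    Returns (expected counts, chi-square statistic, degrees of freedom).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Under independence, E_ij = (row i total)(column j total) / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Sum (O_ij - E_ij)^2 / E_ij over every cell.
    chi_sq = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, chi_sq, df


# Example 1: rows are school categories; columns are Outstanding, Average, Poor.
ratings = [
    [21, 25, 2],    # Most Desirable
    [20, 36, 10],   # Good
    [4, 14, 7],     # Adequate
    [3, 8, 6],      # Undesirable
]
expected, chi_sq, df = independence_test(ratings)

# Count cells that fall below the conservative requirement E >= 5.
small_cells = sum(e < 5 for row in expected for e in row)
critical = 12.592   # chi-square table value for alpha = .05, df = 6

print("chi-square:", round(chi_sq, 3), "df:", df)
print("cells with expected count < 5:", small_cells)
print("reject H0 at 5%:", chi_sq > critical)
```

The same function applies to the 2 x 2 tables of Examples 2 and 3 by passing their four observed counts as two rows; the 5% critical value for df = 1 is 3.84, as in Example 2.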