CSSS/SOC/STAT 221: Statistical Concepts and Methods for the Social Sciences

Name: __________________________________________
Collaborators: __________________________________________

Problem Set 4: Using a chi-square test to identify differences in survival rates between different populations

Early on the morning of 15 April 1912, the ocean liner RMS Titanic sank in the North Atlantic on the ship's first voyage. Tragically, approximately 68% of the ship's passengers and crew perished. The table below gives the frequencies of those who survived the catastrophe and those who perished, separated according to passenger class (1st, 2nd, and 3rd class, plus crew).

Observed counts ($O_{i,j}$)

                Perished   Survived   Row totals
1st class            122        203          325
2nd class            167        118          285
3rd class            528        178          706
Crew                 673        212          885
Column totals       1490        711         2201

The overall survival proportion was approximately 32% ($p_{pooled} = 711/2201$), but some groups fared better than others: 62% of the 1st-class passengers and 41% of the 2nd-class passengers survived, in contrast to 25% of the 3rd-class passengers and 24% of the ship's crew.

Given this pattern in the sample, it is reasonable to wonder if different groups had unequal access to the means of survival. Conversely, we might wonder if every group was exposed to the same probability of survival, in which case the variability we see in the sample would be the result of random sampling error alone.

We can treat the probability of survival $\pi$ as the focus of a null hypothesis. If survival is independent of passenger class, then we can set our null value to $p_{pooled}$ and state our hypothesis as:

$H_0: \pi_{1st} = \pi_{2nd} = \pi_{3rd} = \pi_{crew} = p_{pooled}$

We can test this hypothesis with a chi-square test for independence. The first step is to calculate the expected counts of those who would have perished and those who would have survived in each group if the null hypothesis were true.
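As a quick check on the survival proportions quoted above, the sketch below recomputes them from the observed counts. It is only an illustration: the names `observed`, `survival_proportions`, and `pooled_proportion` are made up for this example and are not part of the problem set.

```python
# Observed counts from the problem set: group -> (perished, survived).
observed = {
    "1st class": (122, 203),
    "2nd class": (167, 118),
    "3rd class": (528, 178),
    "Crew": (673, 212),
}


def survival_proportions(table):
    """Return the proportion of survivors within each group."""
    return {
        group: survived / (perished + survived)
        for group, (perished, survived) in table.items()
    }


def pooled_proportion(table):
    """Return the overall (pooled) survival proportion across all groups."""
    total_survived = sum(survived for _, survived in table.values())
    total_people = sum(perished + survived for perished, survived in table.values())
    return total_survived / total_people


group_props = survival_proportions(observed)
pooled = pooled_proportion(observed)

for group, prop in group_props.items():
    print(f"{group}: {prop:.1%}")
print(f"Pooled: {pooled:.1%}")
```

Under the null hypothesis, each group's proportion would differ from the pooled value only through sampling error.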
The following table gives the expected frequencies, $E_{i,j}$, for each combination of levels of the passenger class variable and the survival outcome variable. These were calculated by multiplying the corresponding row total $R_i$ and column total $C_j$ in the above table, then dividing by the total sample size $n = 2201$:

$E_{i,j} = \frac{R_i \times C_j}{n}$

Expected counts for the crew are missing.

Row totals (1st class, 2nd class, 3rd class, crew): 325, 285, 706, 885
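The expected-count formula (row total times column total, divided by the sample size) can be sketched in a few lines of Python. This is a minimal illustration rather than a required method; the function name `expected_counts` and the list-of-rows layout are assumptions made for this example.

```python
def expected_counts(observed):
    """Expected counts under independence: row total * column total / n.

    `observed` is a list of rows, each row a list of counts
    (here: [perished, survived] for each group).
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Observed counts in the order 1st class, 2nd class, 3rd class, crew.
observed = [
    [122, 203],
    [167, 118],
    [528, 178],
    [673, 212],
]

expected = expected_counts(observed)

# Show the 1st-class row as a worked case: 325 * 1490 / 2201 and 325 * 711 / 2201.
print([round(value, 2) for value in expected[0]])
```

Each row of expected counts sums to the same row total as the observed table, which is a handy check when filling in the missing crew row by hand.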