Chapter 6 Lecture

CHAPTER 6 Two Categorical Variables’ Relationship

In Chapter 2, we looked for a relationship between two categorical variables by examining counts and percents in a two-way table.

What does it mean for two categorical variables to be related? Two categorical variables are related if the probability that an individual falls into a particular category of one variable depends on the category they fall into for the other variable.

Suppose we want to determine whether there is a relationship between religion (Christian, Jewish, Muslim, Other) and smoking. When we test for a relationship between these two variables, we are asking whether individuals of a particular religion are more or less likely to smoke than individuals of other religions. If they are, we say that Religion and Smoking are related, or associated.

Skip 6.2.

Confounding variables can also exist when we look at the relationship between two categorical variables (Simpson’s Paradox)!

Statistically significant – the difference between our sample data and previous population information is large enough that it is unlikely to have occurred by chance alone.

How can we test for statistical significance? We will use several different methods of inference in this class. One method of inference is a hypothesis test.

All hypothesis tests follow these basic steps:
1. Determine the null and alternative hypotheses.
2. Verify the necessary data conditions and, if they are met, summarize the data into an appropriate test statistic.
3. Assuming that the null hypothesis is true, find the p-value.
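As a rough illustration of the three steps above applied to the religion and smoking example, here is a minimal Python sketch. It rests on several assumptions that do not come from these notes: it uses a chi-square test of independence, which is only one common choice for two categorical variables; the counts are invented for illustration; the function names are hypothetical; the "expected counts of at least 5" condition is a commonly used rule of thumb; and the closed-form p-value applies only to 3 degrees of freedom (a 4 x 2 table).

```python
import math

# Hypothetical counts, invented for illustration only (not real survey data).
# Each row is a religion; the two columns are (smokers, nonsmokers).
observed = {
    "Christian": (40, 160),
    "Jewish": (15, 85),
    "Muslim": (20, 80),
    "Other": (25, 75),
}


def smoking_percents(table):
    """Percent of each row (religion) that falls in the first column (smokers)."""
    return {group: 100 * row[0] / sum(row) for group, row in table.items()}


def expected_counts(table):
    """Counts we would expect in each cell if the two variables were unrelated."""
    rows = list(table.values())
    n = sum(sum(row) for row in rows)
    col_totals = [sum(col) for col in zip(*rows)]
    return {
        group: tuple(sum(row) * col_total / n for col_total in col_totals)
        for group, row in table.items()
    }


def chi_square_statistic(table, expected):
    """Sum of (observed - expected)^2 / expected over every cell."""
    return sum(
        (o - e) ** 2 / e
        for group in table
        for o, e in zip(table[group], expected[group])
    )


def chi_square_p_value_df3(x):
    """Upper-tail area of a chi-square distribution with exactly 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Step 1: null hypothesis = religion and smoking are not related;
#         alternative hypothesis = religion and smoking are related.

# Step 2: check the (assumed) data condition, then compute the test statistic.
percents = smoking_percents(observed)
expected = expected_counts(observed)
conditions_met = all(e >= 5 for row in expected.values() for e in row)
statistic = chi_square_statistic(observed, expected)

# Step 3: assuming the null hypothesis is true, find the p-value.
p_value = chi_square_p_value_df3(statistic)

print("Percent who smoke, by religion:", percents)
print("Conditions met:", conditions_met)
print("Chi-square statistic:", statistic)
print("p-value:", p_value)
```

The percentages printed first are the same kind of conditional percents read from a two-way table in Chapter 2. The p-value then indicates how surprising differences of that size would be if religion and smoking were truly unrelated.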

