# 2.5 - Data analysis for two-way tables Two-way tables...

Analysis of two-way tables: data analysis for two categorical variables

Data analysis for two-way tables:

- Two-way tables
- Relationships between categorical variables
- Marginal distributions
- Conditional distributions
- Simpson’s paradox

## Two-way tables

Two-way tables organize data about two categorical variables. The cell entries together form the joint distribution of the two variables; the counts are often replaced with proportions (percentages). The example here is a 4 × 3 table with 12 cells and a total of 175,229 thousand.

## Marginal distributions

We can look at each categorical variable separately in a two-way table by studying the row totals and the column totals. These totals represent the marginal distributions, expressed as counts or percentages. (They are written as if in a margin.)

For the 2000 U.S. census example, the marginal proportions are 0.1590, 0.3314, 0.2538, and 0.2558 for the four-category variable and 0.2156, 0.4648, and 0.3196 for the three-category variable. Each set sums to 1.

The marginal distributions can then be displayed on separate bar graphs, typically expressed as percents instead of raw counts. Each graph represents only one of the two variables, completely ignoring the second one.
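
As a minimal sketch of how row and column totals give the marginal distributions, the following Python snippet works through a small table. The counts are hypothetical and chosen only for illustration; they are not the census figures, and the variable names are assumptions of this example.

```python
# Hypothetical 2 x 3 table of counts (illustration only, not the census data).
# Rows are the categories of one variable, columns the categories of the other.
counts = [
    [20, 30, 10],
    [15, 45, 30],
]

# Row totals, column totals, and the grand total.
row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]
grand_total = sum(row_totals)

# Joint distribution: each cell count divided by the grand total.
joint = [[cell / grand_total for cell in row] for row in counts]

# Marginal distributions: the totals divided by the grand total.
row_marginal = [t / grand_total for t in row_totals]
col_marginal = [t / grand_total for t in col_totals]

print("Row totals:", row_totals)
print("Column totals:", col_totals)
print("Row marginal distribution:", [round(p, 4) for p in row_marginal])
print("Column marginal distribution:", [round(p, 4) for p in col_marginal])
```

Each marginal distribution uses only the totals for its own variable, which is why a bar graph of it carries no information about the other variable.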

The two marginal distributions can be determined from the joint distribution, but the marginal distributions alone are not sufficient to recover the joint distribution unless the two variables are independent (to be defined and discussed later). We need conditional distributions to study the relationship between two categorical variables, including their possible independence.
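
To see why the marginal distributions alone do not determine the joint distribution, here is a small sketch in Python. Both 2 × 2 tables of proportions are hypothetical, and the helper names `margins` and `joint_if_independent` are introduced only for this example. The sketch takes for granted that, under independence, each joint proportion equals the product of its row and column marginal proportions.

```python
import math

# Two hypothetical joint distributions (proportions) with identical margins.
joint_a = [[0.25, 0.25],
           [0.25, 0.25]]
joint_b = [[0.40, 0.10],
           [0.10, 0.40]]


def margins(joint):
    """Return (row marginal, column marginal) of a table of proportions."""
    rows = [sum(r) for r in joint]
    cols = [sum(c) for c in zip(*joint)]
    return rows, cols


def joint_if_independent(rows, cols):
    """Joint table implied by the margins if the variables were independent."""
    return [[r * c for c in cols] for r in rows]


def same_table(t1, t2):
    """Cell-by-cell comparison with a small floating-point tolerance."""
    return all(
        math.isclose(x, y, abs_tol=1e-12)
        for r1, r2 in zip(t1, t2)
        for x, y in zip(r1, r2)
    )


rows_a, cols_a = margins(joint_a)
rows_b, cols_b = margins(joint_b)
product_table = joint_if_independent(rows_a, cols_a)

print("Margins of A:", rows_a, cols_a)
print("Margins of B:", rows_b, cols_b)
print("A matches the product of its margins:", same_table(joint_a, product_table))
print("B matches the product of its margins:", same_table(joint_b, product_table))
```

The two tables share the same margins, yet only the first equals the product of its margins. Knowing the margins therefore pins down the joint distribution only in the independent case.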

## Parental smoking

Does parental smoking influence the smoking habits of

