Categorical data and contingency tables
Kerby Shedden
Department of Statistics, University of Michigan
April 8, 2011
Proportions

Suppose we have an independent and identically distributed (iid) sample $X_1, \ldots, X_n$ of binary responses. For example, each $X_i$ may be an individual’s response to a yes/no question in a survey.

The distribution of each $X_i$ is characterized by the “success probability” $p = P(X_i = 1)$.

The mean and variance of each $X_i$ are

$$E X_i = p, \qquad \mathrm{var}(X_i) = p(1 - p).$$

The mean and variance of $\bar{X}$ are

$$E \bar{X} = p, \qquad \mathrm{var}(\bar{X}) = p(1 - p)/n.$$
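
As a quick numerical check of the two formulas for $\bar{X}$, the sketch below computes the mean and variance of the sample mean exactly, using the fact that the number of successes is Binomial($n$, $p$). The function name `sample_mean_moments` and the values $p = 0.3$, $n = 20$ are illustrative choices, not part of the original slides.

```python
from math import comb


def sample_mean_moments(p, n):
    """Exact mean and variance of the sample mean of n iid Bernoulli(p) draws.

    Hypothetical helper for illustration only.
    """
    # S = X_1 + ... + X_n is Binomial(n, p), and the sample mean is S / n.
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum((k / n) * w for k, w in enumerate(pmf))
    var = sum((k / n - mean) ** 2 * w for k, w in enumerate(pmf))
    return mean, var


# Illustrative values (assumed, not from the slides).
p, n = 0.3, 20
mean, var = sample_mean_moments(p, n)
print("mean of X-bar:", mean, "formula p:", p)
print("var of X-bar: ", var, "formula p(1-p)/n:", p * (1 - p) / n)
```

Up to floating-point rounding, the two printed pairs should agree, which matches $E\bar{X} = p$ and $\mathrm{var}(\bar{X}) = p(1-p)/n$.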

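We can form an approximate 95% confidence interval for $p$ in the same way we would form a CI for the expected value $EX$:

$$\bar{X} \pm 2\sigma/\sqrt{n}$$

or

$$\hat{p} \pm 2\sqrt{\hat{p}(1 - \hat{p})/n},$$

where $\hat{p} = \bar{X}$ is the sample proportion and $\sigma^2 = p(1 - p)$ is estimated by $\hat{p}(1 - \hat{p})$.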
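
A minimal sketch of this interval in Python follows. The function name `proportion_ci`, its arguments, and the example counts (40 successes out of 100) are assumptions made for illustration. The multiplier 2 is the one used on the slide, and the endpoints are not clipped to $[0, 1]$.

```python
from math import sqrt


def proportion_ci(successes, n, multiplier=2.0):
    """Approximate 95% CI for a proportion: p_hat +/- 2 * sqrt(p_hat (1 - p_hat) / n).

    Hypothetical helper for illustration only.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must lie between 0 and n")
    p_hat = successes / n                  # sample proportion, equal to X-bar
    se = sqrt(p_hat * (1 - p_hat) / n)     # estimated standard error of p_hat
    return p_hat - multiplier * se, p_hat + multiplier * se


# Illustrative data (assumed): 40 "yes" responses out of 100.
lower, upper = proportion_ci(40, 100)
print(f"approximate 95% CI: ({lower:.3f}, {upper:.3f})")
```

Because the interval relies on a normal approximation, it can behave poorly when $n$ is small or $\hat{p}$ is close to 0 or 1.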
