# note15 - STAT5044: Regression and Anova (Inyoung Kim)



## Outline

1. Categorical data analysis
2. Three measures of relationship between categorical variables
3. Testing independence in two-way contingency tables
## Describing Contingency Tables

- Introduce tables that display relationships between categorical variables.
- Define parameters that summarize their association.
- Parameters are used to compare groups on the proportions of responses in the outcome categories.
- The odds ratio has special importance, appearing as a parameter in models discussed later.
- The primary focus is to present parameters for nominal and ordinal multicategory variables.

## Contingency Tables

Let $X$ and $Y$ denote two categorical response variables, $X$ with $I$ categories and $Y$ with $J$ categories. Classifications of subjects on both variables have $IJ$ possible combinations. The response $(X, Y)$ of a subject chosen randomly from some population has a probability distribution.

A rectangular table having $I$ rows for the categories of $X$ and $J$ columns for the categories of $Y$ displays this distribution. The cells of the table represent the $IJ$ possible outcomes. When the cells contain frequency counts of outcomes for a sample, the table is called a *contingency table*, a term introduced by Karl Pearson (1904). Another name is *cross-classification table*. A contingency table with $I$ rows and $J$ columns is called an $I \times J$ table.
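The step from classified subjects to an $I \times J$ table of frequency counts can be sketched in a few lines of Python. This is only an illustration: the function name `contingency_table`, the category labels, and the six-subject sample are all hypothetical and do not come from the notes.

```python
from collections import Counter


def contingency_table(pairs, row_levels, col_levels):
    """Count how often each (x, y) combination occurs in the sample.

    Returns a list of I rows, each holding J frequency counts.
    """
    counts = Counter(pairs)
    # Counter returns 0 for combinations that never occur.
    return [[counts[(r, c)] for c in col_levels] for r in row_levels]


# Hypothetical sample: each subject is classified on X (levels a, b)
# and on Y (levels u, v, w), giving I = 2 and J = 3.
sample = [("a", "u"), ("a", "v"), ("b", "u"),
          ("a", "u"), ("b", "w"), ("b", "u")]

table = contingency_table(sample, row_levels=["a", "b"],
                          col_levels=["u", "v", "w"])

for label, row in zip(["a", "b"], table):
    print(label, row)
```

Every one of the $IJ = 6$ cells appears in the result, including cells whose count is zero, so the table always has the full rectangular shape.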

## Contingency Tables: Example

This table is a 2 × 3 contingency table. The columns give the myocardial infarction outcome.

| Group | Fatal attack | Nonfatal attack | No attack |
|---|---|---|---|
| Placebo | 18 | 171 | 10,845 |
| Aspirin | 5 | 99 | 10,933 |

This table is from a report on the relationship between aspirin use and heart attacks by the Physicians' Health Study Research Group at Harvard Medical School. Of the 11,034 physicians taking a placebo, 18 suffered fatal heart attacks over the course of the study, whereas of the 11,037 taking aspirin, 5 had fatal heart attacks.

Question: Does aspirin cause more fatal attacks than placebo?
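As a first descriptive look at the question, the counts in the table can be turned into within-group sample proportions. The following is a minimal sketch using only the six counts given above; the variable names are illustrative, and the calculation is a description of the sample, not a formal test.

```python
# Cell counts from the 2 x 3 table: rows are treatment groups,
# columns are myocardial infarction outcomes.
outcomes = ["Fatal attack", "Nonfatal attack", "No attack"]
counts = {
    "Placebo": [18, 171, 10845],
    "Aspirin": [5, 99, 10933],
}

# Row totals: number of physicians in each group.
row_totals = {group: sum(row) for group, row in counts.items()}

# Sample proportion of each outcome within each group.
proportions = {
    group: [n / row_totals[group] for n in row]
    for group, row in counts.items()
}

for group in counts:
    print(group, "n =", row_totals[group])
    for outcome, p in zip(outcomes, proportions[group]):
        print(f"  {outcome}: {p:.5f}")
```

The row totals reproduce the 11,034 and 11,037 physicians quoted in the text. The sample proportion of fatal attacks is roughly 0.0016 in the placebo group and roughly 0.0005 in the aspirin group.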
## Independence of Categorical Variables

When both variables are response variables, descriptions of the association can use their joint distribution, the conditional distribution of $Y$ given $X$, or the conditional distribution of $X$ given $Y$. The conditional distribution of $Y$ given $X$ relates to the joint distribution by

$$\pi_{j \mid i} = \frac{\pi_{ij}}{\pi_{i+}} \quad \text{for all } i \text{ and } j.$$

Two categorical response variables are defined to be independent if all joint probabilities equal the product of their marginal probabilities,

$$\pi_{ij} = \pi_{i+}\,\pi_{+j} \quad \text{for } i = 1, \ldots, I \text{ and } j = 1, \ldots, J.$$
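The two definitions above can be checked numerically on a small joint distribution. The sketch below is illustrative only: the helper names (`marginals`, `conditional_y_given_x`, `is_independent`) and both 2 × 2 joint distributions are made up for the example, and a small tolerance is assumed to absorb floating-point rounding.

```python
def marginals(joint):
    """Return the row marginals pi_{i+} and column marginals pi_{+j}."""
    row = [sum(r) for r in joint]
    col = [sum(c) for c in zip(*joint)]
    return row, col


def conditional_y_given_x(joint):
    """Return pi_{j|i} = pi_{ij} / pi_{i+} for every cell."""
    row, _ = marginals(joint)
    return [[p / row[i] for p in r] for i, r in enumerate(joint)]


def is_independent(joint, tol=1e-9):
    """Check pi_{ij} == pi_{i+} * pi_{+j} in every cell, up to tol."""
    row, col = marginals(joint)
    return all(
        abs(joint[i][j] - row[i] * col[j]) <= tol
        for i in range(len(row))
        for j in range(len(col))
    )


# Hypothetical joint distributions (each sums to 1).
independent = [[0.12, 0.28],
               [0.18, 0.42]]   # row marginals 0.4, 0.6; column marginals 0.3, 0.7
dependent = [[0.30, 0.10],
             [0.20, 0.40]]     # pi_11 = 0.30, but pi_{1+} * pi_{+1} = 0.4 * 0.5 = 0.20

print(is_independent(independent), is_independent(dependent))
print(conditional_y_given_x(independent))
print(conditional_y_given_x(dependent))
```

In the independent table the conditional distribution of $Y$ is the same in every row and equals the column marginals, since $\pi_{j \mid i} = \pi_{i+}\pi_{+j} / \pi_{i+} = \pi_{+j}$. In the dependent table the rows differ.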

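## Notation

| Row | Column 1 | Column 2 | Total |
|---|---|---|---|
| 1 | $\pi_{11}$ $(\pi_{1 \mid 1})$ | $\pi_{12}$ $(\pi_{2 \mid 1})$ | $\pi_{1+}$ $(1.0)$ |
| 2 | $\pi_{21}$ $(\pi_{1 \mid 2})$ | $\pi_{22}$ $(\pi_{2 \mid 2})$ | $\pi_{2+}$ $(1.0)$ |
| Total | $\pi_{+1}$ | $\pi_{+2}$ | $1.0$ |

Notation for joint, conditional, and marginal distributions for the 2 × 2 case. Conditional probabilities of the column variable given the row are shown in parentheses.

The cell frequencies are denoted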

## This note was uploaded on 01/02/2012 for the course STAT 5044 taught by Professor Staff during the Fall '11 term at Virginia Tech.
