Categorical Data Analysis
• Independent (Explanatory) Variable is Categorical (Nominal or Ordinal)
• Dependent (Response) Variable is Categorical (Nominal or Ordinal)
• Special Cases:
  – 2x2 (Each variable has 2 levels)
  – Nominal/Nominal
  – Nominal/Ordinal
  – Ordinal/Ordinal
Contingency Tables
• Tables representing all combinations of levels of explanatory and response variables
• Numbers in table represent counts of the number of cases in each cell
• Row and column totals are called marginal counts
Example – EMT Assessment of Kids
• Explanatory Variable – Child Age (Infant, Toddler, Preschool, Schoolage, Adolescent)
• Response Variable – EMT Assessment (Accurate, Inaccurate)

            Assessment
Age       Acc    Inac     Tot
Inf       168      73     241
Tod       230      73     303
Pre       254      53     307
Sch       379      58     437
Ado       652     124     776
Tot      1683     381    2064
Source: Foltin, et al (2002)
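The marginal counts in the table above can be recomputed from the cell counts. The following is a minimal Python sketch, not part of the original slides; the variable names (`counts`, `row_totals`, `col_totals`, `prop_accurate`) are illustrative assumptions, and the within-group proportion of accurate assessments is an extra summary added here for illustration.

```python
# Cell counts from the EMT assessment table: (accurate, inaccurate) per age group.
counts = {
    "Infant": (168, 73),
    "Toddler": (230, 73),
    "Preschool": (254, 53),
    "Schoolage": (379, 58),
    "Adolescent": (652, 124),
}

# Row marginal counts: one total per age group.
row_totals = {age: acc + inac for age, (acc, inac) in counts.items()}

# Column marginal counts: one total per assessment outcome.
col_totals = (
    sum(acc for acc, _ in counts.values()),
    sum(inac for _, inac in counts.values()),
)

# Grand total of all cases in the table.
grand_total = sum(row_totals.values())

# Proportion of accurate assessments within each age group (illustrative extra).
prop_accurate = {age: acc / row_totals[age] for age, (acc, _) in counts.items()}

for age in counts:
    print(f"{age:<11} total={row_totals[age]:>4}  accurate={prop_accurate[age]:.3f}")
print(f"Column totals: {col_totals}, grand total: {grand_total}")
```

The row and column totals produced this way should agree with the Tot row and Tot column of the table.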
2x2 Tables
• Each variable has 2 levels
  – Explanatory Variable – Groups (Typically based on demographics, exposure, or Trt)
  – Response Variable – Outcome (Typically presence or absence of a characteristic)
• Measures of association
  – Relative Risk (Prospective Studies)
  – Odds Ratio (Prospective or Retrospective)
  – Absolute Risk (Prospective Studies)
2x2 Tables: Notation

                 Outcome      Outcome      Group
                 Present      Absent       Total
Group 1          $n_{11}$     $n_{12}$     $n_{1.}$
Group 2          $n_{21}$     $n_{22}$     $n_{2.}$
Outcome Total    $n_{.1}$     $n_{.2}$     $n_{..}$
Relative Risk
• Ratio of the probability that the outcome characteristic is present for one group, relative to the other
• Sample proportions with characteristic from groups 1 and 2:

\[ \hat{\pi}_1 = \frac{n_{11}}{n_{1.}} \qquad \hat{\pi}_2 = \frac{n_{21}}{n_{2.}} \]
Relative Risk
• Estimated Relative Risk:

\[ RR = \frac{\hat{\pi}_1}{\hat{\pi}_2} \]

95% Confidence Interval for Population Relative Risk:

\[ \left( RR\, e^{-1.96\sqrt{v}},\; RR\, e^{1.96\sqrt{v}} \right) \qquad e = 2.71828 \qquad v = \frac{1-\hat{\pi}_1}{n_{11}} + \frac{1-\hat{\pi}_2}{n_{21}} \]
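The estimate and interval above translate directly into a few lines of code. This is a minimal Python sketch under the slide's notation, not part of the original slides; the function name `relative_risk_ci` and its parameter names are assumptions made for illustration.

```python
import math


def relative_risk_ci(n11, n1_total, n21, n2_total, z=1.96):
    """Return (RR, lower, upper) for a 2x2 table.

    n11, n21: counts with the outcome present in groups 1 and 2.
    n1_total, n2_total: group totals (n_{1.} and n_{2.}).
    z: normal critical value; 1.96 corresponds to a 95% interval.
    """
    # Sample proportions with the characteristic in each group.
    p1 = n11 / n1_total
    p2 = n21 / n2_total

    # Estimated relative risk.
    rr = p1 / p2

    # v as defined on the slide: (1 - p1)/n11 + (1 - p2)/n21.
    v = (1 - p1) / n11 + (1 - p2) / n21

    # Interval endpoints: RR * exp(-z*sqrt(v)) and RR * exp(+z*sqrt(v)).
    margin = z * math.sqrt(v)
    return rr, rr * math.exp(-margin), rr * math.exp(margin)


# Hypothetical counts, chosen only to exercise the function.
rr, lower, upper = relative_risk_ci(30, 100, 15, 100)
print(f"RR = {rr:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Because the interval is built on the log scale, it is not symmetric around RR: the lower and upper endpoints multiply to RR squared.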
Relative Risk
• Interpretation
  – Conclude that the probability that the outcome is present is higher (in the population) for group 1 if the entire interval is above 1
  – Conclude that the probability that the outcome is present is lower (in the population) for group 1 if the entire interval is below 1
  – Do not conclude that the probability of the outcome differs for the two groups if the interval contains 1
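The three interpretation rules can be written as a small decision function. This is a hedged Python sketch, not part of the original slides; the function name `interpret_rr_interval` and the returned wording are assumptions for illustration.

```python
def interpret_rr_interval(lower, upper):
    """Apply the slide's interpretation rules to a relative-risk interval."""
    if lower > 1:
        # Entire interval above 1.
        return "outcome probability higher for group 1"
    if upper < 1:
        # Entire interval below 1.
        return "outcome probability lower for group 1"
    # Interval contains 1.
    return "do not conclude the groups differ"


# Hypothetical intervals, one for each rule.
print(interpret_rr_interval(1.2, 3.4))
print(interpret_rr_interval(0.3, 0.8))
print(interpret_rr_interval(0.7, 1.9))
```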
Example – Coccidioidomycosis and TNFα antagonists
• Research Question: Risk of developing Coccidioidomycosis associated with arthritis therapy?
• Groups: Patients receiving tumor necrosis factor α (TNFα) antagonists versus Patients not receiving TNFα antagonists (all patients arthritic)

           COC    No COC    Total
TNFα         7       240      247
Other        4       734      738
Total       11       974      985
Source: Bergstrom, et al (2004)
Example – Coccidioidomycosis and TNFα antagonists
• Group 1: Patients on TNFα
• Group 2: Patients not on TNFα

\[ \hat{\pi}_1 = \frac{7}{247} = .0283 \qquad \hat{\pi}_2 = \frac{4}{738} = .0054 \]

\[ RR = \frac{\hat{\pi}_1}{\hat{\pi}_2} = \frac{.0283}{.0054} = 5.24 \qquad v = \frac{1-.0283}{7} + \frac{1-.0054}{4} = .3874 \]

\[ 95\%\ CI: \left( 5.24\, e^{-1.96\sqrt{.3874}},\; 5.24\, e^{1.96\sqrt{.3874}} \right) \equiv (1.55,\ 17.76) \]

Entire CI above 1 ⇒ Conclude higher risk if on TNFα
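The hand calculation can be retraced step by step in code. The sketch below is in Python and is not part of the original slides; the variable names are assumptions. It carries full precision throughout, whereas the slide rounds the proportions to four decimals before dividing, so the printed figures may differ slightly in the second decimal place.

```python
import math

# Counts from the Coccidioidomycosis table (Bergstrom, et al).
n11, n1_total = 7, 247   # COC cases and total, patients on TNFα
n21, n2_total = 4, 738   # COC cases and total, other patients

# Sample proportions with the outcome in each group.
p1 = n11 / n1_total
p2 = n21 / n2_total

# Estimated relative risk and the quantity v used in the interval.
rr = p1 / p2
v = (1 - p1) / n11 + (1 - p2) / n21

# 95% interval: RR * exp(-1.96*sqrt(v)) to RR * exp(+1.96*sqrt(v)).
margin = 1.96 * math.sqrt(v)
lower = rr * math.exp(-margin)
upper = rr * math.exp(margin)

print(f"p1 = {p1:.4f}, p2 = {p2:.4f}")
print(f"RR = {rr:.2f}, v = {v:.4f}")
print(f"95% CI = ({lower:.2f}, {upper:.2f})")
print("Entire CI above 1:", lower > 1)
```

The conclusion is unchanged by the rounding: the whole interval lies above 1.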
Odds Ratio
• Odds of an event is the probability it occurs divided by the probability it does not occur
• Odds ratio is the odds of the event for group 1 divided by the odds of the event for group 2
• Sample odds of the outcome for each group:

\[ odds_1 = \frac{n_{11}/n_{1.}}{n_{12}/n_{1.}} = \frac{n_{11}}{n_{12}} \qquad odds_2 = \frac{n_{21}}{n_{22}} \]
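The sample odds and their ratio can be computed directly from the cell counts. This is a minimal Python sketch, not part of the original slides; the function name `sample_odds` is an assumption, and the counts are reused from the Coccidioidomycosis table purely as an illustration.

```python
import math


def sample_odds(n_present, n_absent):
    """Sample odds of the outcome: count present divided by count absent."""
    return n_present / n_absent


# Counts reused from the Coccidioidomycosis table for illustration.
n11, n12 = 7, 240    # group 1 (TNFα): outcome present, outcome absent
n21, n22 = 4, 734    # group 2 (Other): outcome present, outcome absent

odds_1 = sample_odds(n11, n12)
odds_2 = sample_odds(n21, n22)

# Odds ratio: odds for group 1 divided by odds for group 2.
odds_ratio = odds_1 / odds_2

# Check that n11/n12 matches the probability form p/(1 - p) for group 1.
p1 = n11 / (n11 + n12)
assert math.isclose(p1 / (1 - p1), odds_1)

print(f"odds_1 = {odds_1:.4f}, odds_2 = {odds_2:.4f}, odds ratio = {odds_ratio:.2f}")
```

The group total cancels out of the odds, which is why only the two cell counts in a row are needed.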