13. Poisson Regression Analysis
We have so far considered situations where the outcome variable is numeric and Normally distributed, or binary. In clinical work one often encounters situations where the outcome variable is numeric but takes the form of counts. Often it is a count of rare events, such as the number of new cases of lung cancer occurring in a population over a certain period of time. The aim of regression analysis in such instances is to model the dependent variable Y, the estimate of the outcome, using some or all of the explanatory variables (in mathematical terminology, estimating the outcome as a function of some explanatory variables).
When the response variable had a Normal distribution we found that its mean could be linked to a set of explanatory variables using a linear function like $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k$.
In the case of binary regression the fact that probability lies between 0 and 1 imposes a constraint. The normality assumption of multiple linear regression is lost, and so also is the assumption of constant variance. Without these assumptions the F and t tests have no basis. The solution was to use the logistic transformation of the probability p, or logit p, such that $\log_e\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n$.
The $\beta$ coefficients could now be interpreted as increasing or decreasing the log odds of an event, and $\exp(\beta)$ (the odds multiplier) could be used as the odds ratio for a unit increase or decrease in the explanatory variable. In survival analysis we used the natural logarithm of the hazard ratio, that is $\log_e\left(h(t)/h_0(t)\right) = \beta_0 + \beta_1 X_1 + \dots + \beta_n X_n$.
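The reading of exp(β) as an odds multiplier in the logistic model can be checked numerically. The following is a minimal sketch with a single explanatory variable; the coefficient values and the helper names `logistic_probability` and `odds` are hypothetical and chosen only for illustration.

```python
import math

def logistic_probability(beta0, beta1, x):
    """Invert the logit: p = 1 / (1 + exp(-(beta0 + beta1 * x)))."""
    eta = beta0 + beta1 * x
    return 1.0 / (1.0 + math.exp(-eta))

def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)

# Hypothetical coefficients, not taken from any fitted model.
beta0, beta1 = -2.0, 0.7

# Odds at x = 3 and at x = 4 (a unit increase in the explanatory variable).
p_at_3 = logistic_probability(beta0, beta1, 3.0)
p_at_4 = logistic_probability(beta0, beta1, 4.0)
odds_ratio = odds(p_at_4) / odds(p_at_3)

# The ratio of the two odds should agree with exp(beta1).
print("odds ratio for a unit increase:", odds_ratio)
print("exp(beta1):                    ", math.exp(beta1))
```

The two printed values should agree up to floating-point error, whichever starting value of x is used.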
When the response variable is in the form of a count we face yet another constraint. Counts are non-negative integers, and for rare events the Poisson distribution (rather than the Normal) is more appropriate, since the Poisson mean is always greater than 0. So the logarithm of the response variable (strictly, of its expected value) is linked to a linear function of the explanatory variables such that $\log_e(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots$, and so $Y = (e^{\beta_0})(e^{\beta_1 X_1})(e^{\beta_2 X_2}) \cdots$. In other words, the typical Poisson regression model expresses the log outcome rate as a linear function of a set of predictors.
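The step from the log-linear form to the product of exponentials can be illustrated with a short numerical sketch. The coefficients and covariate values below are hypothetical, as are the function names; the point is only that exponentiating the sum gives the same expected count as multiplying the separate factors, and that a unit increase in one predictor multiplies the expected count by exp(β) for that predictor.

```python
import math

def expected_count(betas, xs):
    """Expected count exp(beta0 + beta1*x1 + beta2*x2 + ...)."""
    eta = betas[0] + sum(b * x for b, x in zip(betas[1:], xs))
    return math.exp(eta)

def expected_count_product(betas, xs):
    """Same quantity written as (e^beta0)(e^(beta1*x1))(e^(beta2*x2))..."""
    result = math.exp(betas[0])
    for b, x in zip(betas[1:], xs):
        result *= math.exp(b * x)
    return result

# Hypothetical coefficients (beta0, beta1, beta2) and covariate values (x1, x2).
betas = [-1.5, 0.4, 0.25]
xs = [2.0, 1.0]

mu_sum = expected_count(betas, xs)
mu_prod = expected_count_product(betas, xs)

# Raising x1 by one unit, with x2 held fixed, multiplies the expected count
# by exp(beta1).
rate_ratio = expected_count(betas, [3.0, 1.0]) / mu_sum

print("exp of the sum:        ", mu_sum)
print("product of the factors:", mu_prod)
print("rate ratio for x1 + 1: ", rate_ratio, "vs exp(beta1) =", math.exp(betas[1]))
```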
Assumptions in Poisson Regression
The assumptions include:
1. The logarithm of the disease rate changes linearly with equal increments in the exposure variable.
2. Changes in the rate from the combined effects of different exposures or risk factors are multiplicative.
3. At each level of the covariates the number of cases has variance equal to the mean.
4. Observations are independent.
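The equality of variance and mean stated in assumption 3 is a property of the Poisson distribution itself, and a small simulation can make it concrete. The sketch below uses only the Python standard library; the sampler is Knuth's multiplication method (adequate for small means), and the seed, sample size, and chosen means are arbitrary illustrative values.

```python
import math
import random

def poisson_draw(rng, mean):
    """Draw one Poisson variate by Knuth's multiplication method."""
    threshold = math.exp(-mean)
    k = 0
    product = rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k

def mean_and_variance(values):
    """Sample mean and unbiased sample variance."""
    n = len(values)
    m = sum(values) / n
    v = sum((x - m) ** 2 for x in values) / (n - 1)
    return m, v

rng = random.Random(2012)  # arbitrary seed, for reproducibility only
ratios = {}
for true_mean in (0.5, 2.0, 5.0):
    counts = [poisson_draw(rng, true_mean) for _ in range(20000)]
    m, v = mean_and_variance(counts)
    ratios[true_mean] = v / m
    print(f"true mean {true_mean}: sample mean {m:.3f}, "
          f"sample variance {v:.3f}, ratio {v / m:.3f}")
```

For Poisson data the variance-to-mean ratio should be close to 1 at every level; ratios well above 1 in real data would suggest overdispersion.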
Methods to identify violations of assumption (3), i.e. to determine whether variances are too large or too small, include plots of residuals versus the mean at different levels of the predictor variable. Recall that in normal linear regression, model diagnostics used plots of residuals against fitted values (fits). The same diagnostics can be used in Poisson regression.
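One common way to put residuals on a scale suited to this check is the Pearson residual, (observed − fitted) / √fitted, which has variance near 1 when the variance equals the mean. The sketch below computes these residuals and a summary dispersion statistic (sum of squared Pearson residuals divided by the residual degrees of freedom). The observed counts, the fitted means, and the number of parameters are hypothetical values made up for illustration, not the output of a real fit.

```python
import math

def pearson_residuals(observed, fitted):
    """Pearson residuals (y - mu) / sqrt(mu) for Poisson-type data."""
    return [(y - mu) / math.sqrt(mu) for y, mu in zip(observed, fitted)]

def dispersion_statistic(observed, fitted, n_params):
    """Sum of squared Pearson residuals over residual degrees of freedom."""
    residuals = pearson_residuals(observed, fitted)
    degrees_of_freedom = len(observed) - n_params
    return sum(r * r for r in residuals) / degrees_of_freedom

# Hypothetical observed counts and hypothetical fitted means from a model
# assumed to have two parameters (intercept and one slope).
observed = [2, 3, 6, 7, 12]
fitted = [2.0, 4.0, 5.0, 8.0, 11.0]

residuals = pearson_residuals(observed, fitted)
dispersion = dispersion_statistic(observed, fitted, n_params=2)

# Pairs of (fitted value, residual) are what a residual-versus-fit plot shows.
for mu, r in zip(fitted, residuals):
    print(f"fitted {mu:5.1f}  Pearson residual {r:+.3f}")
print("dispersion statistic:", dispersion)
```

A dispersion statistic near 1 is consistent with the variance-equals-mean assumption; values much larger than 1 point to variances that are too large, and values much smaller than 1 to variances that are too small. With only five made-up points the number here carries no real meaning.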
This note was uploaded on 02/15/2012 for the course GEO 6938 taught by Professor Staff during the Summer '08 term at University of Florida.