STA 841 Discrete data
Lecture 22
Final project due through email on December 4 at noon.
A one-dimensional table (can be considered an I × 1 table)
In a lottery system, each week six numbers are randomly drawn
from 1-54 with replacement.
For the Year 2

STA 841 Discrete data
Lecture 10
Recall the dilution assay example
# of dilutions (x)    density (r_x)    chance of getting a colony, p(x)
0                     r_0              1 - e^{-r_0 v}
1                     r_0 / 2          1 - e^{-r_0 v / 2}
2                     r_0 / 4          1 - e^{-r_0 v / 4}
...                   ...              ...
x                     r_0 / 2^x        1 - e^{-r_0 v / 2^x}
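The pattern in the dilution table can be checked numerically. The sketch below is only an illustration: it assumes that v is the sampled volume and that the colony count is Poisson with mean r_x v (so that the chance of at least one colony is 1 - e^{-r_x v}); the values of `r0` and `v` are hypothetical.

```python
import math


def colony_probability(x, r0, v):
    """Chance of at least one colony after x two-fold dilutions.

    Uses density r0 / 2**x and the formula 1 - exp(-density * v).
    """
    density = r0 / 2 ** x
    return 1.0 - math.exp(-density * v)


# Hypothetical starting density and volume, for illustration only.
r0, v = 4.0, 1.0
probs = [colony_probability(x, r0, v) for x in range(6)]

for x, p in enumerate(probs):
    print(f"x={x}  density={r0 / 2 ** x:.4f}  p(x)={p:.4f}")
```

Each extra dilution halves the density, so p(x) decreases toward 0 as x grows.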
At each dilution, we m

STA 841 Discrete data
Lecture 8
Review
Last time we introduced the notion of deviance, which is the
squared error distance in the context of exponential family
distributions.
From Hoeffding's formula, we see that we can the difference in
deviance be

STA 841 Discrete data
Lecture 9
Modeling event probabilities (choosing the link)
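For a binary response, how should we model the mean $\mu_i = \pi_i$?
Possibility I: The identity link
$P(Y_i = 1 \mid X) = \pi_i = X_{(i)}\beta$.
Undesirable, as the linear predictor is not constrained to [0, 1], so the modeled probability can fall outside that range.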
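A small numerical sketch of why the identity link is awkward for probabilities. The coefficients `b0`, `b1` and the covariate values are hypothetical; the logit link appears here only as a familiar bounded alternative for comparison, not as the next possibility discussed in the lecture.

```python
import math


def identity_link_prob(x, b0, b1):
    """Identity link: the 'probability' is the linear predictor itself."""
    return b0 + b1 * x


def logit_link_prob(x, b0, b1):
    """Logit link: the linear predictor is mapped into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


# Hypothetical coefficients and covariate values, for illustration only.
b0, b1 = 0.5, 0.2
xs = [-5.0, 0.0, 5.0]

identity_probs = [identity_link_prob(x, b0, b1) for x in xs]
logit_probs = [logit_link_prob(x, b0, b1) for x in xs]

for x, p_id, p_lg in zip(xs, identity_probs, logit_probs):
    print(f"x={x:5.1f}  identity={p_id:6.3f}  logit={p_lg:6.3f}")
```

With these values the identity link yields fitted "probabilities" below 0 and above 1 at the extreme covariate values, while the logit values stay strictly inside (0, 1).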
A

STA 841 Discrete data
Lecture 6
Convolution and exponential-dispersion families
Recall a one-parameter exponential family.
There is a one-to-one correspondence between the natural
parameter and all the moments of this distribution.
In particular, t

STA 841 Discrete data
Lecture 7
The role of the dispersion parameter
Note that the dispersion parameter drops out of the entire algorithm.
So in getting MLEs of the regression coefficients we don't need to worry about the
dispersion parameter at all.
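One way to see the cancellation is through a single weighted least squares step. This is a sketch under an assumption: that the algorithm in question is iteratively reweighted least squares, where the dispersion enters only as a common factor 1/phi on the working weights. The data, weights, and the name `phi` are hypothetical.

```python
def wls_fit(xs, zs, ws):
    """Weighted least squares for z = b0 + b1 * x via the 2x2 normal equations."""
    sw = sum(ws)
    swx = sum(w * x for w, x in zip(ws, xs))
    swxx = sum(w * x * x for w, x in zip(ws, xs))
    swz = sum(w * z for w, z in zip(ws, zs))
    swxz = sum(w * x * z for w, x, z in zip(ws, xs, zs))
    det = sw * swxx - swx * swx
    b0 = (swxx * swz - swx * swxz) / det
    b1 = (sw * swxz - swx * swz) / det
    return b0, b1


# Hypothetical working responses and base weights, for illustration only.
xs = [0.0, 1.0, 2.0, 3.0]
zs = [0.1, 0.9, 2.2, 2.8]
base_weights = [1.0, 2.0, 1.0, 3.0]

# Dividing every weight by the same dispersion value leaves the solution unchanged.
fits = {}
for phi in (0.5, 1.0, 4.0):
    scaled = [w / phi for w in base_weights]
    fits[phi] = wls_fit(xs, zs, scaled)
    print(phi, fits[phi])
```

The common factor cancels between numerator and denominator of the normal-equation solution, so the coefficient estimates agree for every value of `phi` up to rounding.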
Why do we even bother to have it in the model then?
We need it to appropria

STA 841 Discrete data
Lecture 3
Conditional association versus marginal association
In this class most of the models we consider take discrete values.
These lessons still apply: no matter what types of models we
are considering, need to keep these is

STA 841 Discrete data
Lecture 18
Contingency tables with ordered predictors
So far we have been considering the case where the predictor X
is unordered categorical.
In some applications, there may be an intrinsic order in some of
the margins in the

STA 841 Discrete data
Lecture 15
Prospective (cohort) vs retrospective sampling
Example: a prospective study of the impact of a certain
exposure on early-onset cancer
Possible exposure:
  proximity to high voltage line
  or smoking or not

STA 841 Discrete data
Lecture 2
What is the regression problem?
Purpose: to study how a (possibly vector-valued) response Y
depends on a set of explanatory variables X.
In particular, to infer what the distribution of Y given X is.
That is the in

STA 841 Discrete data
Lecture 4
Small sample exact CIs
For small sample sizes the asymptotics for the likelihood triad
often don't hold for data far from normal, such as discrete data.
But one can use the exact sampling distribution of test statis

STA 841 Discrete data
Lecture 11
Data augmentation for Bayesian inference
The latent variable view is very useful in Bayesian inference on
GLMs.
It makes the construction of MCMC samplers very simple.
In this context, probit is often a popular choi

STA 841 Discrete data
Lecture 13
Logistic regression and classification
Consider the classification setting where we have data from two
possible classes.
Let Y be the class label:
$X \mid Y = 1 \sim N(\mu_1, \Sigma)$ and $X \mid Y = 0 \sim N(\mu_0, \Sigma)$.
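Under this two-class Gaussian setup with a shared variance, Bayes' rule gives a class probability that is a logistic function of a linear predictor. The sketch below checks this numerically in the univariate case only; the means, standard deviation, and prior P(Y = 1) are hypothetical, and the function names are made up for illustration.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd**2) at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def posterior_bayes(x, mu1, mu0, sd, prior1):
    """P(Y = 1 | x) computed directly from Bayes' rule."""
    num = prior1 * normal_pdf(x, mu1, sd)
    den = num + (1.0 - prior1) * normal_pdf(x, mu0, sd)
    return num / den


def posterior_logistic(x, mu1, mu0, sd, prior1):
    """P(Y = 1 | x) written as a logistic function of intercept + slope * x."""
    slope = (mu1 - mu0) / sd ** 2
    intercept = math.log(prior1 / (1.0 - prior1)) - (mu1 ** 2 - mu0 ** 2) / (2.0 * sd ** 2)
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))


# Hypothetical class means, common standard deviation, and prior.
mu1, mu0, sd, prior1 = 2.0, 0.0, 1.5, 0.3
grid = [-3.0, -1.0, 0.0, 1.0, 2.5, 4.0]
pairs = [(posterior_bayes(x, mu1, mu0, sd, prior1),
          posterior_logistic(x, mu1, mu0, sd, prior1)) for x in grid]

for x, (pb, pl) in zip(grid, pairs):
    print(f"x={x:5.1f}  bayes={pb:.6f}  logistic={pl:.6f}")
```

The quadratic terms in x cancel because both classes share the same variance, which is what leaves a log odds that is linear in x.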
So the covariates X come from a two

STA 841 Discrete data
Lecture 19
Polytomous response
Sometimes a discrete response can take more than two values.
Nominal (unordered) response: occupation, blood group, newspaper choice.
Ordered response: hair color (light/medium/dark)

STA 841 Discrete data
Lecture 14
Proposal for final project. (One-page.)
Heterogeneity and correlation
Suppose each observation Yi is the sum of ni Bernoulli random
variables
$Y_i = \sum_{j=1}^{n_i} Y_{ij}$
where $Y_{ij} \in \{0, 1\}$ and $P(Y_{ij} = 1) = \pi_{ij}$.
So we have

STA 841 Discrete data
Lecture 5
Moments and cumulants of the exponentially tilted distributions
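The pdf (or pmf) of the exponentially tilted distribution is
$f_\theta(y) = e^{\theta y - K_0(\theta)} f_0(y)$.
Its MGF is
$M_\theta(t) = \int e^{ty} f_\theta(y)\,dy = \int e^{ty}\, \frac{e^{\theta y} f_0(y)}{M_0(\theta)}\,dy = \frac{M_0(t + \theta)}{M_0(\theta)}$.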
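The MGF identity for the tilted distribution, M_theta(t) = M_0(t + theta) / M_0(theta), can be checked numerically. This is a minimal sketch that uses a hypothetical base pmf on a small finite support (so the integral becomes a sum); the values of `theta` and `t` are arbitrary.

```python
import math

# Hypothetical base pmf f0 on a finite support, for illustration only.
support = [0, 1, 2, 3]
f0 = [0.1, 0.4, 0.3, 0.2]


def mgf(pmf, t):
    """Moment generating function E[exp(t * Y)] of a pmf on `support`."""
    return sum(p * math.exp(t * y) for y, p in zip(support, pmf))


def tilt(pmf, theta):
    """Exponentially tilted pmf: f0(y) * exp(theta * y) / M0(theta)."""
    m0 = mgf(pmf, theta)
    return [p * math.exp(theta * y) / m0 for y, p in zip(support, pmf)]


theta, t = 0.7, 0.3
f_theta = tilt(f0, theta)

lhs = mgf(f_theta, t)                       # MGF of the tilted pmf at t
rhs = mgf(f0, t + theta) / mgf(f0, theta)   # M0(t + theta) / M0(theta)
print(lhs, rhs)
```

Dividing by M_0(theta) is the same as multiplying by e^{-K_0(theta)}, since K_0 is the log of M_0; that normalization is what makes the tilted pmf sum to one.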

STA 841 Discrete data
Lecture 23
Example: High school alcohol-cigarette-marijuana use
Agresti Tables 8.3-8.6 (2nd Ed) or Tables 9.3-9.6.
Model comparison in Table 8.6/9.6. The conditional
independence models fit poorly. The homogeneous association
m

STA 841 Discrete data
Lecture 17
Conditional odds ratios in 2 × 2 × K tables
Now consider the case where we have three discrete random variables
(X, Y, Z).
Draw a 2 × 2 × K table.
Our interest is in the conditional dependence between X and Y
given Z.
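To make the distinction concrete, the sketch below computes the conditional X-Y odds ratio within each level of Z and compares it with the marginal odds ratio obtained by collapsing over Z. The counts are hypothetical, chosen only so that the two disagree; K = 2 here.

```python
def odds_ratio(table):
    """Odds ratio n11 * n22 / (n12 * n21) of a 2 x 2 table [[n11, n12], [n21, n22]]."""
    (n11, n12), (n21, n22) = table
    return (n11 * n22) / (n12 * n21)


# Hypothetical 2 x 2 x K counts: one 2 x 2 (X by Y) table per level of Z.
tables = [
    [[10, 20], [20, 40]],  # Z = 1
    [[40, 10], [8, 2]],    # Z = 2
]

# Conditional odds ratios: one per stratum of Z.
conditional_ors = [odds_ratio(t) for t in tables]

# Marginal table: sum the counts over the levels of Z.
marginal = [[sum(t[i][j] for t in tables) for j in range(2)] for i in range(2)]
marginal_or = odds_ratio(marginal)

print("conditional:", conditional_ors)
print("marginal:   ", marginal_or)
```

In this made-up example X and Y are conditionally independent given Z (both conditional odds ratios equal 1), yet the collapsed table shows an association, because Z is related to both X and Y.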
For exam

STA 841 Discrete data
Lecture 21
Poisson family and log-linear models for counts
Counts are another common type of discrete response.
The one-parameter exponential family that is most suited for
modeling counts is the Poisson family.
$p_\mu(y) = \frac{\mu^y e^{-\mu}}{y!}$
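As a quick sanity check on the Poisson pmf, the sketch below evaluates it on a truncated range of counts and confirms that it sums to about one and has mean (and variance) equal to its parameter. The parameter value and the truncation point are hypothetical choices.

```python
import math


def poisson_pmf(y, mean):
    """Poisson probability of count y for the given mean: mean**y * exp(-mean) / y!."""
    return mean ** y * math.exp(-mean) / math.factorial(y)


# Hypothetical mean; counts above 100 carry negligible mass for this value.
mean = 3.5
ys = range(101)

total = sum(poisson_pmf(y, mean) for y in ys)
first_moment = sum(y * poisson_pmf(y, mean) for y in ys)
variance = sum((y - mean) ** 2 * poisson_pmf(y, mean) for y in ys)

print(total, first_moment, variance)
```

The equality of mean and variance is the feature that distinguishes the Poisson family from other count models.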

STA 841 Discrete data
Lecture 24
Course project due December 4 at noon. Please email me.
Please hand in a paper copy to Kaoru Irie by 5pm on
December 4.
More models on square contingency tables
Next let us look at several more models for I

STA 841 Discrete data
Lecture 20
Latent variable formulation for ordinal responses
Let us generalize the latent variable formulation for binary
responses.
For each observation with covariate x, let U be an unobserved
variable such that
$U = x^\top \beta + \epsilon$
whe

STA 841 Discrete data
Lecture 12
Talk about HW2.
Example: Relative potency
In some bioassay studies we may be investigating multiple
toxins.
For two toxins G and D, if the response-dose relationship can be
characterized with a logit model on

STA 841 Generalized linear models (and more)
Lecture 1
Two types of applied statistical and scientific investigation
Structured vs unstructured.
Structured: with well-defined goals in mind when collecting
data. For example,
Difference between identified su