STAT 645: Biostatistics, Fall 2012 - Assignment 2
1. For the pollution data in pollute data.csv [Note: There is at least one missing value in these
data. Just remove any records with at least one missing value prior to doing the following
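The record-removal step described above can be sketched as follows. This is only an illustration: the inline CSV text, its column names (`so2`, `temp`, `wind`), and the set of missing-value codes are all invented stand-ins, not the contents of the actual pollution file.

```python
import csv
import io

# Hypothetical stand-in for the pollution file; column names and values are invented.
csv_text = """so2,temp,wind
10,70.3,6.0
,61.0,9.1
13,55.6,NA
"""

def is_missing(value):
    """Treat empty fields and a few common NA codes as missing (an assumption)."""
    return value is None or value.strip() in {"", "NA", "NaN", "."}

rows = list(csv.DictReader(io.StringIO(csv_text)))

# Keep only records in which every field is present (complete-case analysis).
complete = [row for row in rows if not any(is_missing(v) for v in row.values())]
n_dropped = len(rows) - len(complete)
print(f"kept {len(complete)} of {len(rows)} records, dropped {n_dropped}")
```

Reporting how many records were dropped is a useful habit, since a large count would itself be worth commenting on.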
STAT 645: Biostatistics, Fall 2011 - Assignment 1
1. For the weight by height data set, contained in the file wt ht data.csv:
(a) Based on the exploratory analysis we carried out in class, what regression model would
you use to carry out inference
STAT 645: Biostatistics, Fall 2012 - Assignment 5
1. For the onset data in onset data.csv:
(a) Carry out an exploratory analysis of the relationships between onset and the 3 covariates
tx, prior, and age. Briefly comment on your findings.
STAT 645: Biostatistics, Fall 2012 - Assignment 6
1. Carry out an analysis of the Salmonella virulence protein data by fitting this model:
$y_{jkr} = \mu + P_j + M_k + \epsilon_{jkr}$,
where $y_{jkr}$ is the intensity for peptide j in mutant group k (k = 1 is control, and
STAT 645: Biostatistics, Fall 2012 - Assignment 4
1. Carry out a simulation to explore the effect of unequal variances on regression inference.
Specifically, for the model:
$y_i = \beta_0 + \beta_1 x_i + \epsilon_i$
use the following:
Sample size: n = 100.
Number of simul
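One possible shape for such a simulation is sketched below. Only the model and n = 100 come from the assignment text. The number of simulations (2000), the coefficient values, the uniform design for x, and both error standard-deviation functions are assumptions chosen for illustration. The sketch estimates how often the usual 95% confidence interval for the slope covers the true value, with and without constant error variance.

```python
import random
import statistics

def slope_ci_covers(x, sd_fn, beta0, beta1, rng, t_crit=1.984):
    """Simulate one data set; report whether the usual 95% CI for the slope covers beta1.

    t_crit is the approximate 0.975 quantile of the t distribution with 98 df.
    """
    n = len(x)
    y = [beta0 + beta1 * xi + rng.gauss(0.0, sd_fn(xi)) for xi in x]
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # least-squares slope
    b0 = y_bar - b1 * x_bar             # least-squares intercept
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se_b1 = (rss / (n - 2) / sxx) ** 0.5  # standard error assuming equal variances
    return abs(b1 - beta1) <= t_crit * se_b1

def coverage(sd_fn, n=100, n_sims=2000, seed=645):
    """Fraction of simulated data sets whose slope CI covers the true slope."""
    rng = random.Random(seed)
    x = [rng.uniform(0.0, 10.0) for _ in range(n)]  # assumed design, held fixed
    hits = sum(slope_ci_covers(x, sd_fn, 1.0, 2.0, rng) for _ in range(n_sims))
    return hits / n_sims

equal = coverage(lambda xi: 2.0)                    # constant error SD
unequal = coverage(lambda xi: 0.1 + 0.2 * xi ** 2)  # error SD grows with x (assumed form)
print(f"coverage, equal variances:   {equal:.3f}")
print(f"coverage, unequal variances: {unequal:.3f}")
```

Under equal variances the coverage should sit near the nominal 95%. When the error variance grows with x in this way, the equal-variance standard error tends to be too small, so coverage falls below nominal.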
STAT 645: Biostatistics, Fall 2012 - Assignment 3
1. Consider the two-sample t-test with equal variances. Suppose you apply this to a sample
of size n = 60 (30 individuals in each of the two comparison groups). Also, suppose that
the mean in bot
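For reference, the equal-variance (pooled) two-sample t statistic can be computed as in this sketch. The group sizes of 30 come from the problem statement. The simulated standard normal data, the seed, and the function name `pooled_t` are assumptions made only for illustration.

```python
import random
import statistics

def pooled_t(sample_a, sample_b):
    """Two-sample t statistic with a pooled variance estimate, plus its degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance, divisor n - 1
    var_b = statistics.variance(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    se = (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5
    t_stat = (statistics.fmean(sample_a) - statistics.fmean(sample_b)) / se
    return t_stat, df

# Illustrative data only: two groups of 30 drawn from the same normal distribution.
rng = random.Random(3)
group_a = [rng.gauss(0.0, 1.0) for _ in range(30)]
group_b = [rng.gauss(0.0, 1.0) for _ in range(30)]
t_stat, df = pooled_t(group_a, group_b)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```

The statistic is then compared with a t distribution on n - 2 = 58 degrees of freedom.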
STAT 645: Biostatistics, Fall 2012 - Assignment 7
1. For the surv times data.txt data on DoStat:
(a) Compute separate Kaplan-Meier survival curves for each of the two treatment groups.
Make two plots, one each for the two treatment groups, showi
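A minimal sketch of the Kaplan-Meier product-limit estimate, computed separately for each group, is given below. The times, censoring indicators, and group labels are invented and do not come from the course data file. Plotting is left out.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time.

    events[i] is 1 if the event was observed at times[i] and 0 if censored.
    Subjects censored at time t are counted as still at risk at t.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose recorded time equals t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve

# Hypothetical data: (time, event indicator) pairs for two treatment groups.
groups = {
    "A": ([2, 3, 3, 5, 8], [1, 1, 0, 1, 0]),
    "B": ([1, 4, 6, 6, 9], [1, 0, 1, 1, 1]),
}
curves = {name: kaplan_meier(t, e) for name, (t, e) in groups.items()}
for name, curve in curves.items():
    print(name, [(t, round(s, 3)) for t, s in curve])
```

Each returned curve is a step function: the estimate stays constant between event times and drops only where an event was observed.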
Aside: What's a likelihood? Here's a simple example
Generally, the likelihood is just the product of densities for each observed sample value.
Set equal to zero
(after some algebra)
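As one worked instance of these steps, assume (this is an assumption, since the example used in the original slides is not recoverable here) an independent sample $y_1, \dots, y_n$ from a normal distribution with unknown mean $\mu$ and known variance $\sigma^2$. The likelihood is the product of the densities, and maximizing its logarithm gives the sample mean:

```latex
\begin{align*}
L(\mu) &= \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}}
          \exp\!\left\{-\frac{(y_i-\mu)^2}{2\sigma^2}\right\} \\
\ell(\mu) &= \log L(\mu)
          = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i-\mu)^2 \\
\frac{d\ell}{d\mu} &= \frac{1}{\sigma^2}\sum_{i=1}^{n}(y_i-\mu) = 0 \\
\hat{\mu} &= \frac{1}{n}\sum_{i=1}^{n} y_i = \bar{y}
\end{align*}
```

The second derivative, $-n/\sigma^2$, is negative, so the stationary point is a maximum.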
Correlated Data Regression
Usually, we assume independence:
Recall: Independence of
Then what about
The case under independence
Consider a sam
X: Degree type
Is there a difference in mean
income by degree?
Z = 0:
Z = 1:
Different slopes: Interaction
Correlated Data Regression
is response at time
Ideal design for studying change.
For estimation and inference, we can apply a familiar strategy:
If we could estimate this, we could do inference.
So, how to estimate
n - p in general, p = number of model coefficients
General approach to testing a single paramete
Salmonella Mutant Virulence Data:
Control group: 6 replicates.
Mutant groups (13 of them): 3 replicates each.
j: peptide, k: group, r: replicate
So, wildtype is the reference group
Consider a sample for which some observations are missing. How might that happen,
and what impact would it have on inference?
Missing completely at random (MCAR): T