Lab 1
PM 518A
Introduction to STATA
TAs
Charles Lacson
jlacson@usc.edu
Ye Feng (grader)
yefeng@usc.edu
Agenda
Who has had programming experience and
with which programs?
Overview of STATA
Getting started
Log files
Calculating Odds Ratios and Confidence
In

Lab 4
-Analyzing matched data
The mcc command, whose syntax parallels that of the cc command, calculates the maximum likelihood
estimate of the odds ratio of disease (y) for exposed persons relative to unexposed persons.
- It can only be used to analyze 1:1 matched 2x2 data
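As a minimal sketch of this kind of analysis, the immediate form mcci takes the four cell counts of a 1:1 matched-pair table directly, so no dataset is needed. The counts below are hypothetical and chosen only for illustration.

```stata
* Hypothetical 1:1 matched-pair counts (not from any course dataset):
*   20 pairs: case exposed,   control exposed
*   30 pairs: case exposed,   control unexposed
*   10 pairs: case unexposed, control exposed
*   40 pairs: case unexposed, control unexposed
mcci 20 30 10 40

* The matched odds ratio uses only the discordant pairs: 30/10
display r(or)
```

With pair-level data in memory, the equivalent call would be mcc followed by the case-exposure and control-exposure variables.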

Case-Control Study of Low Birthweight
Because infant mortality and birth defect rates are very high in babies born of low
birthweight, this birth outcome is of major concern. This study was conducted
within a particular medical center to determine whether

Basics of Likelihood Theory
1. To address a particular research question, we design a study and collect data on
n (number) subjects. We characterize the data as
$x_1, \ldots, x_n$ ($x_1$ = data for subject 1, ..., $x_n$ = data for subject n)
and $x_i \sim f(x; \theta)$
(i.e., each $x_i$ is a rando

Conditional Logistic Regression for Matched Case-Control Data
Recall that in the unconditional logistic regression model:
$\operatorname{logit}(P[D \mid z]) = \alpha + \beta z$
In this model, $\alpha$ is a nuisance parameter (i.e., our real interest is in
estimating the log odds ratio parameters, $\beta$)
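To make the phrase "log odds ratio parameters" concrete, here is a short worked step for a single binary exposure z. The symbols for the intercept ($\alpha$) and slope ($\beta$) are notation assumed here for illustration.

```latex
\begin{align*}
\operatorname{logit}(P[D \mid z=1]) - \operatorname{logit}(P[D \mid z=0])
  &= (\alpha + \beta) - \alpha = \beta, \\
\text{so}\quad
\mathrm{OR}
  &= \frac{P[D \mid z=1] \,/\, (1 - P[D \mid z=1])}
          {P[D \mid z=0] \,/\, (1 - P[D \mid z=0])}
   = e^{\beta}.
\end{align*}
```

The intercept cancels in the difference, which is why it carries no information about the exposure effect.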

Lab 2
Inputting ASCII Data
Formats and Labels
Creating Categorical Variables
Saving STATA Datasets
Using the CC procedure (Odds Ratios)
Using the TABODDS procedure
Agenda
Address Homework 1 questions
Reading in data (ASCII, xls, dta) into Stata
Useful com

Overview of Epidemiologic Analysis
1. Primary concern with discrete, binary events (e.g., mortality, new
diagnosis of breast cancer).
a. An incident event is a new occurrence of a certain disease.
The number of men free of heart disease at age 40 who have

Poisson Regression
SMR analyses are geared to summarizing rates across strata, comparing disease
rates in a crude way to some standard rates, and testing for differences among a
small number of exposure groups.
These methods, however, break down where the

Lab #6 - Introduction to Cohort Studies
The ir command is used to analyze incidence-rate data.
It requires three variables:
1. The number of events
2. The exposure
3. The number of person-years of follow-up
Syntax:
ir event exposure py
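A minimal sketch of the same analysis without a dataset: the immediate form iri takes the event counts and person-time totals directly. The numbers below are hypothetical and chosen only for illustration.

```stata
* Hypothetical counts (for illustration only):
*   exposed:   30 events over 1000 person-years
*   unexposed: 15 events over 1500 person-years
iri 30 15 1000 1500

* Incidence rate ratio = (30/1000) / (15/1500)
display r(irr)
```

The argument order for iri is exposed events, unexposed events, exposed person-time, unexposed person-time.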
Note: Exposure "0" is

Proportional Hazards (Cox) Regression: Proportional Hazards
Assumption, Estimation and Graphing of Survival Curves, Tied Event
Times, Model Diagnostics
More on Proportional Hazards Assumption
Last week, we covered three methods to evaluate the proportiona

Analysis of Individual Survival-Time Data
Individual survival-time data: Each data record represents an individual subject
(time of entry into cohort, time of exit from cohort, status at exit, exposure
variables, confounding variables, etc.).
Motivation:

Multivariate Analysis of Case-Control Data
Unconditional Logistic Regression (for unmatched case-control data)
As we have seen in previous lectures, the analytic techniques (2x2 tables, 2xK tables,
matched analysis of dichotomous exposure variables, and s

Lab 5
Unconditional logistic regression:
Syntax:
logistic casevar indepvars, options
Note that casevar is the disease status (dichotomous outcome), coded as 0 or 1, with 1 indicating
the outcome whose probability you are modeling.
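A minimal sketch of the syntax in use. It assumes Stata's own example dataset lbw, fetched with webuse (which needs an internet connection), rather than any dataset from this course. The variable names low, smoke, and age come from that example dataset.

```stata
* Assumption: Stata's example low-birthweight data, not the course data
webuse lbw, clear

* low = 1 for a low-birthweight infant (the outcome being modeled), 0 otherwise
logistic low smoke age

* Same model, reporting coefficients (log odds ratios) instead of odds ratios
logistic low smoke age, coef
```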
Key options:
coef gives re

Lab 5 Exercise
Use the STATA procedure CLOGIT and the Leisure World dataset (matched case-control
study of endometrial cancer) to answer the questions below. You will have to read in a new
variable from leisure.dat (duration of estrogen use, in months). I

Exact Logistic Regression for Case-Control Data
In unconditional and conditional logistic regression, the estimation of odds ratios and
confidence intervals as well as hypothesis testing depends on large-sample asymptotic
theory. For example, likelihood r

Cohort Sampling (Hosmer et al., 9.4); Competing Risks (Hosmer et al., 9.6);
Parametric Survival Analysis
Cohort Sampling
When we have very large cohorts, or the need to collect extra exposure data,
methods of sampling from the cohort may be required.
1.

Introduction to Cohort Studies
Identify an exposed group (free of disease)
Follow forward in time to determine disease outcome
Prospective cohort
Ideal method:
Identify cohort
time (t)
Disease?
Retrospective (historical) cohort: more common method of

Proportional Hazards (Cox) Regression Analysis: Left Truncation
(Staggered Entry), Time-Varying Covariates, Evaluation of
Proportional Hazards Assumption
Recall from last week, a model expressing the individual hazard rate as a
function of a baseline hazar