Assignment 5 Statistics 231
Due: Tuesday, March 29
You can use the following R code to calculate confidence intervals and p-values for both
blocked and unblocked comparative investigations.
For blocked investigations, the data are stored one row per block with the response
variate in two columns y1 and y2 corresponding to the two values of the explanatory
variate. Use the R code
t.test(y1, y2, paired=TRUE)
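As a minimal sketch of the blocked case, the example below runs the paired test on invented data and pulls out the confidence interval and p-value. The values of y1 and y2 are simulated purely for illustration and are not from any assignment data set; the names res, d and res_d are hypothetical helpers.

```r
# Hypothetical blocked data: 8 blocks, one row per block (simulated values)
set.seed(231)
y1 <- rnorm(8, mean = 50, sd = 5)       # response under the first value of the explanatory variate
y2 <- y1 + rnorm(8, mean = 2, sd = 1)   # response under the second value, same blocks

res <- t.test(y1, y2, paired = TRUE)
print(res$conf.int)   # 95% confidence interval for the mean within-block difference
print(res$p.value)    # p-value for the hypothesis of no mean difference

# The paired test is the same as a one-sample t-test on the within-block differences
d <- y1 - y2
res_d <- t.test(d)
print(all.equal(res$p.value, res_d$p.value))
```

The last lines show why blocking matters: the analysis is carried out on the within-block differences, so block-to-block variation drops out.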
For unblocked investigations, the data are stored one row per unit. Suppose the response
variate is y and the explanatory variate is x. Use the R code
t.test(y~x, var.equal=TRUE)
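A corresponding sketch for the unblocked case, again on simulated data (the group labels "A" and "B", the sample sizes and all values are assumptions made for illustration). It also recomputes the pooled-variance t statistic by hand to show what var.equal asks for.

```r
# Hypothetical unblocked data: one row per unit (simulated values)
set.seed(231)
x <- factor(rep(c("A", "B"), each = 10))                  # explanatory variate with two values
y <- rnorm(20, mean = ifelse(x == "A", 10, 12), sd = 2)   # response variate

res <- t.test(y ~ x, var.equal = TRUE)
print(res$conf.int)   # 95% confidence interval for the difference in means (A minus B)
print(res$p.value)

# Hand calculation of the pooled-variance t statistic
yA <- y[x == "A"]
yB <- y[x == "B"]
nA <- length(yA)
nB <- length(yB)
sp2 <- ((nA - 1) * var(yA) + (nB - 1) * var(yB)) / (nA + nB - 2)   # pooled variance
t_manual <- (mean(yA) - mean(yB)) / sqrt(sp2 * (1 / nA + 1 / nB))
print(all.equal(unname(res$statistic), t_manual))
```

With var.equal set to TRUE the test uses nA + nB - 2 degrees of freedom; leaving it out makes R fall back to the Welch approximation, which does not assume equal standard deviations.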
1. Long-term exposure to radon, a naturally occurring radioactive gas, is thought to cause
lung cancer. In a case-control study, researchers selected 549 lung cancer patients
from a cancer registry and 621 community controls (individuals without lung cancer).
Each subject had lived in a single family dwelling with a basement for at least 15
years. The researchers then measured the concentration of radon (Bq/m³) in the
homes of the 1150 subjects in the sample. They also measured other variates,
especially smoking history, since smoking is a known cause of lung cancer. You can
access the data with the R command
source('http://uwangel.uwaterloo.ca/AngelUploadsuwangel/Content/MRG041122144725_admin/_assoc/04C00F4D113D4ABE958960DCA3358257/sourceAss5q1.txt')
The variate names and values are:
subject
type: case (1) or control (0)
exposure: radon concentration
smoking: heavy smoker(2), moderate smoker(1), nonsmoker(0)
a) Suppose the target population is all people. Define what it means to say that “radon
causes lung cancer”.
Hold all other explanatory variates fixed for all people (the target population).
Set the radon concentration experienced by all people and determine the proportion
who get lung cancer.
Change the radon concentration experienced by all people and again determine the
proportion who get lung cancer.
If the proportion of people who get lung cancer has changed, the change in radon
concentration caused the change in lung cancer rate, or more informally “radon
causes lung cancer.”
b) Ignoring smoking history, construct histograms for the radon concentrations for the
cases and controls. Does a Gaussian model seem appropriate?
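One possible way to draw the two histograms is sketched below. Because the object created by the source command is not shown here, the data frame radon is a hypothetical stand-in filled with simulated values; only the column names type and exposure and the group sizes come from the question. With the real data, the same hist calls would be applied to the loaded variates.

```r
# Hypothetical stand-in for the study data (simulated, NOT the real measurements)
set.seed(231)
radon <- data.frame(
  type     = rep(c(1, 0), times = c(549, 621)),        # 1 = case, 0 = control
  exposure = rlnorm(549 + 621, meanlog = 4, sdlog = 0.6)
)

cases    <- radon$exposure[radon$type == 1]
controls <- radon$exposure[radon$type == 0]

# Side-by-side histograms, one per group
op <- par(mfrow = c(1, 2))
hist(cases,    main = "Cases",    xlab = "Radon concentration (Bq/m^3)")
hist(controls, main = "Controls", xlab = "Radon concentration (Bq/m^3)")
par(op)
```

A Gaussian model is plausible when each histogram is roughly symmetric and bell-shaped; a long right tail would suggest trying a transformation such as the logarithm before using t.test.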
 Spring '10
 Marsh