Winter 2011/12 Term 1
Statistics 200
Assignment #2
Due: 5pm on Friday Oct 14, 2011
(Total mark: 50)
1. Consider the data in the file alcohol.xls. These are data from a British government survey of household
spending. We are interested in the relationship between household spending on tobacco products
and alcoholic beverages. This problem illustrates the effect that a single influential observation might
have on a least-squares regression fit.
Note: You may do this question by hand or by using Excel.
(a) Draw a scatter plot of "Alcohol" as a function of "Tobacco". (1 mark)
(b) Compute the intercept and slope for the corresponding least squares linear regression line. (1 mark)
(c) Add the regression line to the previous plot. Also obtain the residual plot. Is a linear relationship
between these variables reasonable? Do you see any atypical or outlying point? (3 marks)
(d) Repeat (a), (b) and (c) above without using the observation from Northern Ireland. (5 marks)
(e) Which of the two linear regression lines (the one obtained with all the data and the one without
Northern Ireland) provides a better fit to the data from the first ten regions in the dataset? Briefly
justify your answer. (2 marks)
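
For those who prefer to check their hand or Excel calculations in code, the steps in parts (b) to (e) can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python rather than Excel, the function names (fit_line, residuals, sse) are hypothetical, and the numbers are made up for the example and are NOT the values in alcohol.xls. The last point plays the role of a single influential observation.

```python
# Illustrative sketch only: the data below are invented, not taken from alcohol.xls.

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope is S_xy / S_xx; intercept makes the line pass through (x_bar, y_bar).
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def residuals(x, y, intercept, slope):
    """Observed minus fitted values, in the same order as the data."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def sse(x, y, intercept, slope):
    """Sum of squared residuals of a given line on the points (x, y)."""
    return sum(r ** 2 for r in residuals(x, y, intercept, slope))


# Made-up data: five points on a straight line plus one influential point.
x_all = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]
y_all = [2.0, 4.0, 6.0, 8.0, 10.0, 0.0]

# The same data with the last (influential) observation left out.
x_sub, y_sub = x_all[:-1], y_all[:-1]

a_all, b_all = fit_line(x_all, y_all)   # fit using every observation
a_sub, b_sub = fit_line(x_sub, y_sub)   # fit without the influential point

print("all points:      intercept =", a_all, "slope =", b_all)
print("without outlier: intercept =", a_sub, "slope =", b_sub)

# Compare how well each line fits the non-outlying points only.
print("SSE on subset, line from all points:", sse(x_sub, y_sub, a_all, b_all))
print("SSE on subset, line without outlier:", sse(x_sub, y_sub, a_sub, b_sub))
```

Plotting the values returned by residuals against x gives the residual plot asked for in part (c). Comparing the two sums of squared residuals on the same subset of points is one possible way to support an answer to part (e).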
2. A school nurse is interested in studying the relationship between blood type and dairy allergy in
children. She retrieves the health records of all 785 children in the school where she works. Information
about blood type and whether each child has a dairy allergy is summarized in the following two-way