homework7-solutions

# homework7-solutions - Statistics 5021 – Homework 7


Statistics 5021 – Homework 7

There are 20 total points. This homework is due Thursday, April 28 in your lab section. In this homework, you will analyze datasets (preferably using R).

1. Two numerical characteristics were measured for 18 countries. The first characteristic, mortality, is the mortality rate from heart disease per thousand, and the second characteristic, consumption, is the average per capita consumption of wine in liters. These data are available for download on Moodle in the file “wine.txt”. Our interest is to relate mortality with consumption using the simple linear regression model, where mortality is the response and consumption is the predictor.

(a) Produce a scatter plot of the response versus the predictor. Based on this scatter plot, would a simple linear regression model be appropriate?

Solution:

    > wine=read.table("wine.txt")
    > plot(mortality ~ consumption, data=wine)

[Figure: scatter plot of mortality (vertical axis) versus consumption (horizontal axis) for the 18 countries]

The scatter plot indicates that a simple linear regression model is not appropriate, since the points appear to form a curved pattern.

(b) Instead of predicting mortality with consumption, consider predicting log(mortality) with log(consumption), where log() is the natural logarithm. Assuming that we read the dataset in as:

    wine = read.table("wine.txt")

we can produce a scatter plot of log(mortality) versus log(consumption) using the command:

    plot(log(mortality) ~ log(consumption), data=wine)

Produce this scatter plot.

Solution:

    > plot(log(mortality) ~ log(consumption), data=wine)

[Figure: scatter plot of log(mortality) (vertical axis) versus log(consumption) (horizontal axis)]

(c) Estimate the parameters of the simple linear regression model where the response is log(mortality) and the predictor is log(consumption).

i. What assumptions must we make about the data?

Solution: Let $(x_i, y_i)$ denote the log consumption and the log mortality rate (from heart disease) observed for the $i$th country. We assume that $(x_1, y_1), \ldots, (x_{18}, y_{18})$ are a realization of $(x_1, Y_1), \ldots, (x_{18}, Y_{18})$, where $Y_i = \beta_0 + \beta_1 x_i + E_i$ for $i = 1, \ldots, 18$ and $E_1, \ldots, E_{18}$ are a random sample from $N(0, \sigma^2)$.

ii. We estimate the model parameters using the commands:

    mod = lm(log(mortality) ~ log(consumption), data=wine)
    summary(mod)

Using these commands, compute the sample regression function. What is it estimating?...
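The fitting step in part (c) can be tried without access to “wine.txt”. The following is a minimal sketch on simulated data, not the homework's dataset or its answer: the data frame `wine_sim`, the object `mod_sim`, and the values of `beta0`, `beta1`, and `sigma` are all hypothetical and chosen only for illustration. It generates 18 observations from the assumed log-log model, fits that model with `lm()`, and checks that the fitted line evaluated by hand agrees with `predict()`.

```r
# Simulated stand-in for wine.txt; these are NOT the real data.
set.seed(1)
n <- 18
consumption <- runif(n, min = 3, max = 75)

# Hypothetical "true" parameters, used only to generate the example.
beta0 <- 2.5
beta1 <- -0.35
sigma <- 0.2

# Model on the log scale: log(mortality) = beta0 + beta1 * log(consumption) + E
mortality <- exp(beta0 + beta1 * log(consumption) + rnorm(n, mean = 0, sd = sigma))
wine_sim <- data.frame(mortality = mortality, consumption = consumption)

# Same call as in the homework, applied to the simulated data.
mod_sim <- lm(log(mortality) ~ log(consumption), data = wine_sim)
b <- coef(mod_sim)                 # b[1] estimates beta0, b[2] estimates beta1
s <- summary(mod_sim)$sigma        # residual standard error, estimates sigma

# Sample regression function evaluated at consumption = 10 (log scale).
new_obs <- data.frame(consumption = 10)
pred_log <- unname(predict(mod_sim, newdata = new_obs))
by_hand <- unname(b[1] + b[2] * log(10))

print(b)
print(all.equal(pred_log, by_hand))
```

In this sketch the sample regression function is `b[1] + b[2] * log(consumption)`, which estimates the mean of log(mortality) at a given consumption under the model assumptions stated in part (c)i.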