Stat 224 Pset 3
October 29 2015
1. Model Form: The following scatterplot matrix indicates that there seems
to be no substantial collinearity between the predictor variables (other than
between days and i79, which must be the case because of the definition
HSTD 334/Stat 224
Applied Regression Analysis
MIDTERM EXAM 4 November 2014
NAME: ____ Student ID (if any): ____
INSTRUCTIONS: You have 1 hour 20 minutes (full class period) to work on this exam.
Questions are of varying difficulty (so don't overthink the easie
STAT 224 / HSTD 324
Problem Set 2
Fall 2014
Possible points 64.
1. [15 pts total] Exercise 3.3
Table 3.10 shows the scores in the final examination F and the scores in the two preliminary
examinations P1 and P2 for 22 students in a statistics course. The da
STAT 224 / HSTD 324
Problem Set 3
Fall 2014
Possible points 67
1. [12 pts, 3 pts for each part] Exercise 4.1(a)
Check to see whether or not the standard regression assumptions are valid for the following
data set:
The Milk Production data described in Sec
STAT 224 / HSTD 324
Solutions to Problem Set 1
Fall 2014
Possible points 90.
1. [6 pts, 2 pts each] Exercise 2.2
(a) Disagree. Cov(Y, X) can take any value between −∞ and +∞, but the correlation measure
cor(Y, X) must be between -1 and 1.
(b) Disagree. If Cov(
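The point in part (a) can be checked numerically. The following is a minimal Python sketch (Python is used only for illustration, and the data are made up rather than taken from the textbook): rescaling Y changes the covariance by the same factor, while the correlation stays the same and remains inside [-1, 1].

```python
import math

def covariance(xs, ys):
    """Sample covariance with the n - 1 denominator."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance divided by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

# Made-up illustrative data (an assumption, not from any course data set).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

# Changing the units of Y by a factor of 1000 scales the covariance by 1000 ...
y_scaled = [1000 * v for v in y]
cov_xy = covariance(x, y)
cov_scaled = covariance(x, y_scaled)

# ... but leaves the correlation unchanged.
r_xy = correlation(x, y)
r_scaled = correlation(x, y_scaled)

print(cov_xy, cov_scaled, r_xy, r_scaled)
```

Because the covariance carries the units of both variables, it has no fixed range; dividing by the two standard deviations removes the units, which is why the correlation is bounded.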
STAT 224 / HSTD 324
Problem Set 5
Fall 2014
Possible points 45
1. [15 pts total] Exercise 6.7
Oil Production Data: The data in Table 6.19 are the annual world crude oil production in
millions of barrels for the period 1880-1988. The data are taken from M
STAT 224 / HSTD 324
Problem Set 1
Fall 2014
Possible points 90.
1. [6 pts, 2 pts each] Modified Exercise 2.2
Explain why you would or would not agree with each of the following statements:
(a) Cov(Y, X) and Cor(Y, X) can take values between −∞ and +∞.
(b) If Co
STAT 224 / HSTD 324
Problem Set 4
Fall 2014
Possible points 78.
1. [20 pts total] Modified from Exercise 5.4
Perform a thorough analysis of the Education Expenditures data in Tables 5.12, 5.13, and
5.14 using the ideas presented in Section 5.7. You are exp
STAT 224 / HSTD 324
Solutions to Problem Set 2
Fall 2014
Possible points 64.
1. [15 pts total] Exercise 3.3
. use http://www.ats.ucla.edu/stat/stata/examples/chp/p076, clear
(a) The scatter plots indicate positive associations between preliminary exams and
STAT 224 / HSTD 324
Solutions to Problem Set 5
Fall 2014
Possible points 45.
1. [15 pts total] Exercise 6.7
. use http://www.ats.ucla.edu/stat/stata/examples/chp/p179, clear
[Figure: axis labeled Barrels, ticks from 0 to 20000]
(a) [2 pt] Construct a scatter plot of the oil p
More on Simple Linear Regression
Problem: Mercury levels in fish tissue for largemouth bass in
the Wacamaw and Lumber Rivers
Rivers in North Carolina contain small concentrations of mercury,
which can accumulate in fish over their lifetimes because mercur
PBHS 32400 / STAT 22400 Autumn 2015 J. Dignam
Adjusting for Unequal Variance (Heteroscedastic Errors)
We looked at some transformations that deal with situations
where the response variable is not normally distributed but rather
comes from a distribution
STAT 224 / PBHS 324
Problem Set 1
Autumn 2015
Possible points 80
1. [4 pts, 2 pts each] Modified Exercise 2.2
Explain why you would or would not agree with each of the following statements:
(a) Cov(Y, X) and Cor(Y, X) can take values between −∞ and +∞.
(b) If
HW2 STATA Help
To compute correlations among variables
cor y var1 var2
To fit a regression and run an F test for the significance of a variable
regress y var1 var2
test var1
To test whether the coefficients of two variables are jointly different from zero
test var1 var2
To generate a
PBHS 32400 / STAT 22400
Regression Models for a Probability of Response Outcome
Until now we have talked about regression models where the
response variable Y was continuous and (approximately) normally
distributed. We now consider the case where Y is a bin
PBHS 32400 / STAT 22400
Variable (Model) Selection
Thus far we have mostly worked with example problems where
predictor variables were identified in advance. All or most of these
had some value towards constructing the linear model. Often in
modeling, we
PBHS 32400 / STAT 22400
Multicollinearity in Multiple Regression
What is multicollinearity? Example from Tables 9.1 and 9.2 of
C&H: Equal Educational Opportunity (EEO) data:
Measurements were taken in 1965 for 70 random schools. The
level of student achie
A Generalized Approach for Many Model Types
Noting and taking advantage of commonalities among linear
models for different response variable types, Nelder and
Wedderburn and later McCullagh (UChicago) and Nelder
developed Generalized Linear Models
This ap
* Example: create dummy variables (HW4 Q2)
* load dataset
use http://www.ats.ucla.edu/stat/stata/examples/chp/p148, clear
* create dummy (indicator) variables for the fertilizers
* there are three ways to create them; you can use any one of the following:
Confounding
Confounded in ordinary English means confused or perplexed.
The statistical use is essentially the same: our attempt to estimate
the true effect of weight on heart disease was unsuccessful
because the effect was mixed-up with the effect of a
PBHS 32400 / STAT 22400 J. Dignam
More on Logistic Regression
Recall that logistic regression provides a way of relating predictor
variables X to a binary (0,1) outcome variable Y. This outcome
variable is an indicator for yes/no, event/non-event, etc
Th
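As a rough illustration of how a logistic model ties predictors to a binary outcome, the sketch below uses the standard logit form, log(p / (1 - p)) = b0 + b1 x, with p = P(Y = 1 | x). It is written in Python for illustration only, and the coefficients b0 and b1 are hypothetical values, not estimates from any course data set.

```python
import math

def logistic(eta):
    """Inverse logit: map a linear predictor to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def logit(p):
    """Log-odds of a probability p."""
    return math.log(p / (1.0 - p))

# Hypothetical coefficients chosen only for illustration (an assumption).
b0, b1 = -2.0, 0.5

# Probability of the event at x = 0, 1, ..., 8 under the assumed model.
probs = [logistic(b0 + b1 * x) for x in range(9)]

# exp(b1) is the multiplicative change in the odds per one-unit increase in x.
odds_ratio = math.exp(b1)

for x, p in enumerate(probs):
    print(x, round(p, 3))
```

The linear predictor can take any real value, but the fitted probabilities always stay strictly between 0 and 1, which is what makes this form suitable for a (0,1) outcome.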
More Multiple Linear Regression
Testing in Multiple Regression - subsets of coefficients
The hypothesis setup is:
$H_0: \beta_j = \beta_k = \dots = \beta_m = 0$
$H_1$: at least one of these $\neq 0$
We use properties of the sum of squared errors
$\sum_i (y_i - \hat{y}_i)^2$
from the different mod
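A test of this kind is commonly carried out as a partial F test that compares the sum of squared errors of a reduced model (with the tested coefficients set to zero) against that of the full model: F = [(SSE_reduced - SSE_full) / q] / [SSE_full / (n - p - 1)], where q is the number of coefficients tested and p is the number of predictors in the full model. The sketch below is a minimal pure-Python illustration of that comparison, assuming this is the statistic the notes go on to describe; the data and the helper names (solve, sse) are made up for the example.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def sse(columns, y):
    """Fit least squares with an intercept on the given predictor columns; return the SSE."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)] for a in range(k)]
    xty = [sum(row[a] * yi for row, yi in zip(design, y)) for a in range(k)]
    beta = solve(xtx, xty)  # normal equations
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

# Made-up data for illustration only (an assumption, not course data).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0]
x3 = [5.0, 3.0, 8.0, 1.0, 7.0, 2.0, 6.0, 4.0]
y = [4.2, 3.1, 8.5, 6.9, 11.0, 12.8, 14.1, 16.3]

full = [x1, x2, x3]   # full model: intercept + x1 + x2 + x3
reduced = [x1]        # reduced model under H0: coefficients of x2 and x3 are zero

n, p = len(y), len(full)
q = p - len(reduced)  # number of coefficients being tested
sse_full = sse(full, y)
sse_reduced = sse(reduced, y)
f_stat = ((sse_reduced - sse_full) / q) / (sse_full / (n - p - 1))

print(sse_reduced, sse_full, f_stat)
```

The statistic is then compared with an F distribution on q and n - p - 1 degrees of freedom; in Stata the same comparison is what `test var1 var2` reports after `regress`.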
Stat 224 Practice Final Exam, Spring 2016
Name:
Question 1: ___ of 6 pts
Question 2: ___ of 15 pts
Question 3: ___ of 12 pts
Question 4: ___ of 9 pts
Question 5: ___ of 35 pts
Question 6: ___ of 15 pts
Total: ___ of 92 pts
1. The summary for a linear fit in R is given below.
Coeffic
Stat 224, Spring 2016, Homework 1
Due online on Monday, April 11 at 11:59pm.
1. (Modified Exercise 2.2)
Explain why you would or would not agree with each of the following statements:
(a) Cov(Y, X) and Cor(Y, X) can take values between −∞ and +∞.
(b) If Cov(