Chapter 6: Statistical Diagnostics
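Model:
$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_k x_{ki} + \varepsilon_i.$
The residual associated with each $y_i$ is
Basic assumptions:
(1) Model is correct
(2) Errors are uncorrelated
(3) $E(\varepsilon_i) = 0$
(4) $\mathrm{Var}(\varepsilon_i) = \sigma^2$
(5) $\varepsilon_i \sim$ Normal distribution, required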
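As a minimal sketch of the model above in the special case k = 1, the following Python code fits the line by least squares and computes each residual as the observed response minus the fitted value. The data and the function names `fit_simple_ols` and `residuals` are hypothetical and chosen only for illustration; they do not come from the course notes.

```python
# Sketch: least squares fit for k = 1 and the resulting residuals.
# The data below are made up for illustration only.

def fit_simple_ols(x, y):
    """Return (b0, b1), the least squares intercept and slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def residuals(x, y, b0, b1):
    """Residual for each observation: observed y minus fitted value."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_simple_ols(x, y)
res = residuals(x, y, b0, b1)

print("intercept:", round(b0, 4), "slope:", round(b1, 4))
print("residuals:", [round(r, 4) for r in res])
print("sum of residuals:", round(sum(res), 10))
```

When the model includes an intercept, the residuals sum to zero up to rounding error. Plotting them against the fitted values is a common first check on assumptions (3) and (4).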
STAT 2110: Assignment 3
Due date: 7th December, 2011
NOTE: Please drop your assignment in the STAT2110 mailbox
(NOT my mailbox)!
1. The following data contain the rent and other characteristics for 36
apartments. The goal of the study is to develop a mod
STAT 2110: Assignment 2
Due date: 22nd November, 2011
Do ALL programming in SAS.
1. The age $y$ of a clay as a function of the amount (in units) $x$ of a certain mineral is
given by the model $y = \beta_0 + \beta_1 x + \varepsilon$, where $\varepsilon \sim N(0, 100)$. During an archeological
excavation
STAT2120 CDA/28.Mar.2013
Weekly Review 10
We finished Chapter 5 this week and Assignment 4 consists of 5.3, 5.10, 5.15, 5.16, 5.18,
5.27 and 5.28; the due date is April 15. Enjoy it!
Chapter 5 continues our discussion on logistic regression. Section 5.1.2 m
STAT2120 CDA/18.April.2013
Weekly Review 12
This review for Chapter 8 will not be as detailed as those for previous chapters because we
already had another PDF file that summarizes the key issues. However, I still expect that you
will read the corresponding
STAT2120 CDA/22.Mar.2013
Weekly Review 9
Because of the mid-term test, this week we had only two hours for discussion and we
finished Chapter 4. Assignment 3 consists of 4.5, 4.7a, 4.7c, 4.12, 4.13, 4.15, 4.20 and 4.29,
and the due date is April 8, 2013. En
STAT2120 CDA/15.Mar.2013
Weekly Review 8
In the last review we discussed the following table:
Table 4.4: Development of AIDS symptoms by AZT use and race

                        Symptoms
Race      AZT use     Yes     No
White     Yes          14     93
          No           32     81
Black     Yes          11     52
          No           12     43
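
One quick way to read Table 4.4 is through the sample odds ratio of symptoms for AZT users versus non-users within each race. The following Python sketch computes these two conditional odds ratios from the counts in the table. The dictionary layout and the function name `conditional_odds_ratio` are illustrative choices, not part of the course material.

```python
# Sketch: conditional odds ratios (AZT yes vs. AZT no) by race for Table 4.4.
# Each entry is (symptoms yes, symptoms no), copied from the table.
table_4_4 = {
    "White": {"AZT yes": (14, 93), "AZT no": (32, 81)},
    "Black": {"AZT yes": (11, 52), "AZT no": (12, 43)},
}


def conditional_odds_ratio(stratum):
    """Odds of symptoms with AZT divided by odds of symptoms without AZT."""
    azt_sym, azt_no_sym = stratum["AZT yes"]
    ctl_sym, ctl_no_sym = stratum["AZT no"]
    return (azt_sym * ctl_no_sym) / (azt_no_sym * ctl_sym)


odds_ratios = {race: conditional_odds_ratio(s) for race, s in table_4_4.items()}

for race, value in odds_ratios.items():
    print(race, round(value, 3))
```

Both sample odds ratios are below 1, which means the estimated odds of developing symptoms are lower for AZT users in each race group.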
Using dummy variables
STAT2120 CDA/7.Feb.2013
Weekly Review 4
In the last review I introduced the following problem. A person claimed that he was able
to distinguish Coke and Pepsi. We gave him a glasses of Coke and b glasses of Pepsi, where
a + b = n, in random order to taste
STAT2120 CDA/7.Mar.2013
Weekly Review 7
We have finished more than half of this course and we have six more teaching weeks. Chapter
3 was done on Monday and then we started our deeper discussions of various GLMs. Selected
problems for Chapter 3 are 3.3, 3.4, 3
STAT2120 CDA/22.Feb.2013
Weekly Review 5
Because of the Chinese New Year holiday, we had only two hours this week, during which
we finished Chapter 2 and started Chapter 3.
Let us consider three-way tables, i.e. tables with three variables. Suppose we have
STAT2120 CDA/11.April.2013
Weekly Review 11
This week we started and finished Chapter 7 (up to Section 7.3). Here is
Assignment 5: problems 7.1, 7.3, 7.4, 7.6, 7.7, 7.10 and 7.16. The due date is 25 April 2013.
Chapter 7 gives details of anothe
The SAS System
14:06 Thursday, January 31, 2013
The FREQ Procedure
Table of poured by guess

poured      guess
Frequency
Percent
Row Pct
Col Pct          1        2    Total
1                3        1        4
             37.50    12.50    50.00
             75.00    25.00
             75.00    25.00
2                1        3        4
             12.50    37.50    50.00
             25.00    75.00
             25.00
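
For a small table like this one, with cell counts 3, 1, 1, 3, an exact test is the usual choice. The following Python sketch computes a one-sided Fisher exact p-value from the hypergeometric distribution. It assumes that both margins are fixed at 4, which is an assumption about the design rather than something stated in the output, and the function name `fisher_exact_one_sided` is hypothetical.

```python
# Sketch: one-sided Fisher exact test for a 2 x 2 table with fixed margins.
from math import comb


def fisher_exact_one_sided(n11, row1_total, col1_total, n):
    """P(N11 >= n11) under the hypergeometric null distribution."""
    largest = min(row1_total, col1_total)
    denominator = comb(n, col1_total)
    numerator = sum(
        comb(row1_total, k) * comb(n - row1_total, col1_total - k)
        for k in range(n11, largest + 1)
    )
    return numerator / denominator


# Counts from the "poured by guess" table: n11 = 3, both margins equal 4, n = 8.
p_value = fisher_exact_one_sided(n11=3, row1_total=4, col1_total=4, n=8)
print(round(p_value, 4))
```

The numerator counts the tables with 3 or 4 correct guesses in the first row, namely 16 + 1 = 17 out of the 70 possible tables, so the p-value is 17/70.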
The SAS System
16:27 Wednesday, February 20, 2013
The GENMOD Procedure
Model Information
Data Set                      WORK.GLM
Distribution                  Binomial
Link Function                 Identity
Response Variable (Events)    disease
Response Variable (Trials)
Number of
Number of
Number of
Number of
The SAS System
14:06 Thursday, January 31, 2013
The FREQ Procedure
Table of gender by party

gender      party
Frequency
Expected
Percent
Row Pct
Col Pct          1        2        3    Total
1              762      327      468     1557
            703.67   319.65   533.68
             27.64    11.86    16.97    56.47
             48.94    21.00    30.06
The SAS System
14:06 Thursday, January 31, 2013
The FREQ Procedure
Table of group by mi

group       mi
Frequency
Expected
Percent
Row Pct
Col Pct          1        2    Total
1              189    10845    11034
            146.48    10888
              0.86    49.14    49.99
              1.71    98.29
             64.51    49.80
2              104    10933    11037
            146.52
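
The Expected entries in this output can be checked by hand. The following Python sketch recomputes the expected frequencies under independence (row total times column total divided by the grand total) from the four observed counts shown above, and then forms the Pearson chi-squared statistic and the sample odds ratio. The variable names are illustrative only.

```python
# Sketch: expected counts, Pearson chi-squared and odds ratio for group by mi.
observed = [
    [189, 10845],  # group 1: mi = 1, mi = 2
    [104, 10933],  # group 2: mi = 1, mi = 2
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency under independence for each cell.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson statistic: sum over cells of (observed - expected)^2 / expected.
x2 = sum(
    (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(2)
    for j in range(2)
)

odds_ratio = (observed[0][0] * observed[1][1]) / (observed[0][1] * observed[1][0])

print("expected:", [[round(e, 2) for e in row] for row in expected])
print("X^2:", round(x2, 2))
print("odds ratio:", round(odds_ratio, 3))
```

The expected counts in the first column reproduce the 146.48 and 146.52 printed by SAS. The statistic has 1 degree of freedom for a 2 x 2 table.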
The SAS System
14:06 Thursday, January 31, 2013
The FREQ Procedure
Table of malform by alcohol

malform     alcohol
Frequency
Percent
Row Pct
Col Pct          0      0.5      1.5        4        7    Total
1            17066    14464      788      126       37    32481
             52.39    44.40     2.42     0.39     0.11    99.71
             52.54    44.53     2.
The SAS System
17:29 Friday, February 22, 2013
The GENMOD Procedure
Model Information
Data Set              WORK.CRAB
Distribution          Poisson
Link Function         Log
Dependent Variable    satell

Number of Observations Read    173
Number of Observations Used    173

Criteria For Assessin
Review
The purpose of this review is to summarize the main issues that we have discussed.
Nevertheless, it is not exhaustive.
1. Contingency Tables
(a) A 2 × 2 Table

               Heart Attack
Gender      Yes      No      Total
Female      n11      n12     n1+
Male        n21      n22     n2+
Total       n+1      n+2     n++
i. Re
STAT2120 CDA/11.April.2013
Chapter 8.
Models for Matched Pairs
Consider Table 8.1:

                     Cut living standards
Pay higher taxes     Yes      No     Total
Yes                  227     132       359
No                   107     678       785
Total                334     810      1144
Our interest is the null hypothesis that there is marginal homogeneit
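
A standard test of marginal homogeneity for a 2 x 2 table of matched pairs is McNemar's test, which uses only the two off-diagonal counts. The following Python sketch applies it to Table 8.1. It is an illustration under the assumption that this is the test intended here, and the variable names are hypothetical.

```python
# Sketch: McNemar's test for Table 8.1, using the off-diagonal counts only.
from math import erfc, sqrt

n12 = 132  # pay higher taxes: yes, cut living standards: no
n21 = 107  # pay higher taxes: no, cut living standards: yes

# Under marginal homogeneity, z is approximately standard normal.
z = (n12 - n21) / sqrt(n12 + n21)

# Two-sided p-value from the standard normal distribution:
# P(|Z| >= z) = erfc(z / sqrt(2)).
p_value = erfc(abs(z) / sqrt(2))

print("z:", round(z, 3))
print("z squared:", round(z * z, 3))
print("two-sided p-value:", round(p_value, 3))
```

The squared statistic can equivalently be referred to a chi-squared distribution with 1 degree of freedom.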
STAT2120 CDA/1.Mar.2013
Weekly Review 6
In the last review we introduced the probit model, which was once a very popular model.
However, nowadays, its dominance has been taken over by another model, which came from
the same argument that I used for the to
STAT2120 CDA/31.Jan.2013
Weekly Review 3
Let us consider again Table 2.5 on page 38:
Table 2.5: Cross classification of Party identification by gender

                    Party Identification
Gender     Democrat   Independent   Republican
Female          762           327          468
Male            484           239          447
Total          1246           566
STAT 2110 REGRESSION ANALYSIS
Instructor:
Tang, Man-Lai
e-mail: [email protected]
Office: FSC 1201
Prerequisite:
STAT1131-2 Probability & Statistics I & II and
MATH1121 Linear Algebra I
Time & Place:
Tue 8:30-10:20
Wed 15:30-16:20
Office Hours:
Will
Chapter 5: Model Building
1. Polynomial regression
Case 1. Suppose we have only one regressor/independent
variable, x.
If we observe a nonlinear relationship between x and y, we
may consider the following polynomial regression:
$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_p x^p$
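
As a minimal sketch of how such a polynomial can be fitted, the following Python code builds the design matrix with columns 1, x, x^2, ..., x^p and solves the least squares normal equations by Gaussian elimination. The function names `solve_linear_system` and `fit_polynomial` and the data are hypothetical. In practice a statistical package such as SAS would be used, and the normal equations can be numerically unstable for large p.

```python
# Sketch: polynomial regression via the normal equations (X'X) b = X'y.

def solve_linear_system(a, b):
    """Solve a b-vector system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (m[i][n] - known) / m[i][i]
    return solution


def fit_polynomial(x, y, degree):
    """Return least squares coefficients [b0, b1, ..., b_degree]."""
    design = [[xi ** j for j in range(degree + 1)] for xi in x]
    p = degree + 1
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Made-up noiseless data from y = 1 + 2x + 3x^2, so the fit should recover it.
x = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
y = [1 + 2 * xi + 3 * xi ** 2 for xi in x]

coef = fit_polynomial(x, y, degree=2)
print([round(c, 6) for c in coef])
```

Because the model is linear in the coefficients, the same least squares machinery as in multiple linear regression applies, with the powers of x playing the role of separate regressors.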
Chapter 3: The Simple Linear Regression Model
Part II
1. Simple regression with fixed intercept
Suppose we observe a set of n pairs of observations $(x_1, y_1), \ldots, (x_n, y_n)$.
Suppose the intercept $\beta_0$ is known.
After some algebra, the least squares estimate
Chapter 1
Introduction: Matrix Algebra and Some Useful
Distributions
1. Matrix arithmetic
A matrix (over the real line $\mathbb{R}$) is a rectangular array of elements from $\mathbb{R}$.
A matrix with m rows and n columns is said to be of dimension/size $m \times n$.
$M_{m \times n}(\mathbb{R})$ denotes the
Regression Analysis
Assignment 3
Ex 1.
(a) From the correlation matrix, Rent is highly correlated with Size, N Bed and
Age, and N Bed is highly correlated with Size. As we can see from the scatter
plots, there is a linear relationship between Rent and Siz
Chapter 2: The Simple Linear Regression Model
Part I
1. What is regression?
The word was originally used by the English statistician Galton.
In his 1885 paper to the Royal Anthropological Institute, he derived
regression and used it interchangeably with the
STAT 2110 Regression Analysis: Assignment 1
Due date: 19 October, 2011
1. (Regression without an intercept). As discussed in class we almost always include
an intercept. It is possible however to carry out a regression without an intercept.
In that case t
Regression Analysis
Assignment 2
Ex 1. The least squares regression line is
$\hat{y} = 905.47619 - 42.85714\,x.$
(a) The prediction for the age of the last day is $\hat{y}_3 = 476.9048$. The error in
prediction is $\hat{\varepsilon}_3 = 3.0952$, and its variance is
$\mathrm{var}(\hat{\varepsilon}_3) = \sigma^2 (1 - h_3)$ = 100 (1 - 0.
Chapter 7: Nonlinear Regression
7.1 Introduction
Basic idea: Like linear regression (model), nonlinear regression
tries to relate a response $y$ to a vector of predictor variables $x = (x_1, x_2, \ldots, x_k)^t$.
Feature: Nonlinear regression is characterized by the fa