Stat 151a Linear Models
Homework 2 Solutions
October 5, 2015
1. Defining $a = (a_1, \dots, a_n)^T$, we have from the question $E(a^T y) = \beta_1$. Since $E(y) = X\beta$, we have
$$a^T X \beta = \beta_1,$$
or, another way of writing the same thing,
$$(X^T a)^T \beta = \beta_1.$$
Defining a new vector $c$ as $X^T a$
Homework One
Statistics 151a (Linear Models)
Due on 16 September, 2015
06 September, 2015
1. Consider simple linear regression where there is one response variable y and an
explanatory variable x and there are n subjects with values y1 , . . . , yn and x1
STAT 151A Linear Modeling
Homework 1 Solution
March 5, 2016
Problem 1
Suppose that the means and standard deviations of $Y$ and $X$ are the same: $\bar Y = \bar X$ and $S_Y = S_X$.
(a) Show that under these circumstances,
$$B_{Y|X} = B_{X|Y} = r_{XY},$$
where $B_{Y|X}$ is the least-squares
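Not part of the original solution, but the claim in (a) is easy to check numerically. The sketch below (illustrative data and variable names) forces $Y$ to have exactly the same mean and SD as simulated $X$ data, then compares both least-squares slopes with the sample correlation $r_{XY}$:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y0 = 0.6 * x + rng.normal(size=200)
# Rescale so that Y has exactly the same mean and SD as X (the premise of (a)).
y = (y0 - y0.mean()) / y0.std() * x.std() + x.mean()

cxy = ((x - x.mean()) * (y - y.mean())).mean()   # sample covariance (ddof = 0)
b_yx = cxy / x.var()                             # least-squares slope of Y on X
b_xy = cxy / y.var()                             # least-squares slope of X on Y
r = cxy / (x.std() * y.std())                    # sample correlation r_XY

assert np.isclose(b_yx, r) and np.isclose(b_xy, r)
```

All three quantities share the numerator $\mathrm{Cov}(x,y)$, and equal SDs make the denominators agree, which is exactly the algebraic content of the proof.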
Stat 151a Linear Models
Homework 3 Solutions
October 14, 2015
1. For an orthonormal set $u_1, \dots, u_n$ we will freely use the observation (which you should be familiar with by now) that
$$\langle u_i, u_j \rangle = u_i^T u_j = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j. \end{cases}$$
(a) Since $u_1, \dots$
Fall 2013 Statistics 151 (Linear Models) : Lecture Six
Aditya Guntuboyina
17 September 2013
We again consider $Y = X\beta + e$ with $Ee = 0$ and $\mathrm{Cov}(e) = \sigma^2 I_n$. $\beta$ is estimated by solving the normal equations $X^T X \beta = X^T Y$.
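As an illustration (not from the lecture), the normal equations $X^T X \beta = X^T Y$ can be solved directly on simulated data and compared against a library least-squares routine; the design, coefficients, and names below are all made up for the demo:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])  # design with intercept
beta = np.array([1.0, 2.0, -1.0, 0.5])
Y = X @ beta + rng.normal(size=n)

# Solve the normal equations X^T X beta = X^T Y directly ...
beta_ne = np.linalg.solve(X.T @ X, X.T @ Y)
# ... and compare with the numerically stabler least-squares routine.
beta_ls, *_ = np.linalg.lstsq(X, Y, rcond=None)

assert np.allclose(beta_ne, beta_ls)
```

In practice one uses a QR-based routine like `lstsq` rather than forming $X^T X$, which squares the condition number.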
1 The Regression Plane
If we get a new sub
Homework Two
Statistics 151a (Linear Models)
Due on 30 September 2015
21 September, 2015
1. Consider the linear model $Y = X\beta + e$ with $\beta = (\beta_0, \beta_1, \dots, \beta_p)^T$. Suppose I can
find real numbers $a_1, \dots, a_n$ such that
$$E(a_1 Y_1 + \dots + a_n Y_n) = \beta_1.$$
Show th
Fall 2013 Statistics 151 (Linear Models) : Lecture Twenty One
Aditya Guntuboyina
12 November 2013
1 Generalized Linear Models
We have n observations on a response variable y1 , . . . , yn and on each of p explanatory variables xij for
i = 1, . . . , n and
Fall 2013 Statistics 151 (Linear Models) : Lecture Twenty Six
Aditya Guntuboyina
05 December 2013
1 Classification Trees
We looked at regression trees in the class. The idea behind classification trees is similar.
The classification tree is constructed top down
Stat 151 Fall 2015
Homework 1 Solutions
September 17, 2015
1. (a)
$$\hat\beta_1 = \frac{\sum_i (x_i - \bar x)(y_i - \bar y)}{\sum_i (x_i - \bar x)^2} = \frac{\mathrm{Cov}(x, y)}{\mathrm{Var}(x)}, \qquad \hat\beta_0 = \bar y - \hat\beta_1 \bar x.$$
(b) For the regression of $x$ on $y$,
$$\hat\gamma_1 = \frac{\sum_i (x_i - \bar x)(y_i - \bar y)}{\sum_i (y_i - \bar y)^2} = \frac{\mathrm{Cov}(x, y)}{\mathrm{Var}(y)}, \qquad \hat\gamma_0 = \bar x - \hat\gamma_1 \bar y.$$
(c) From the above expressions we see that
$$\hat\beta_1 \hat\gamma_1 = \frac{\mathrm{Cov}(x, y)^2}{\mathrm{Var}(x)\,\mathrm{Var}(y)}.$$
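A quick numerical check of part (c) on simulated data (not part of the original solution): the product of the slope from regressing $y$ on $x$ and the slope from regressing $x$ on $y$ equals the squared sample correlation.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=100)
y = 1.5 * x + rng.normal(size=100)

cxy = ((x - x.mean()) * (y - y.mean())).mean()  # sample covariance
beta1 = cxy / x.var()    # slope from regressing y on x
gamma1 = cxy / y.var()   # slope from regressing x on y
r = cxy / (x.std() * y.std())

assert np.isclose(beta1 * gamma1, r ** 2)
```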
Fall 2013 Statistics 151 (Linear Models) : Lecture Twenty Two
Aditya Guntuboyina
14 November 2013
We will again look at fitting the logistic regression model to data. But before that, let us digress a
little and take a brief look at weighted least squares.
Stat 151 Fall 2015
Homework 4 Solutions
November 9, 2015
1. (a) We have i.i.d. observations $X_1, \dots, X_n$ from a distribution with unknown variance $\sigma^2$. We shall use the bootstrap to
calculate a confidence interval for
Algorithm 1: bootstrap confidence interv
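The algorithm itself is truncated in the text, so the sketch below is an assumption: a minimal percentile-bootstrap confidence interval for the variance, on simulated data, taking the 2.5% and 97.5% quantiles of the resampled variances as the interval endpoints.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(loc=5.0, scale=2.0, size=100)   # observed sample; true variance is 4

B = 2000
boot_vars = np.empty(B)
for b in range(B):
    # Resample the data with replacement and recompute the statistic.
    resample = rng.choice(x, size=x.size, replace=True)
    boot_vars[b] = resample.var(ddof=1)

# Percentile interval: the 2.5% and 97.5% quantiles of the bootstrap variances.
lo, hi = np.quantile(boot_vars, [0.025, 0.975])
print(f"95% bootstrap CI for the variance: ({lo:.2f}, {hi:.2f})")
```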
Stat 151 Fall 2015
Homework 5 Solutions
December 6, 2015
1. The first 9 plots are shown in Figure 1.
bfat = read.table('bodyfat_corrected.txt', header = TRUE)
# fitting linear model and getting diagnostics
lmodel = lm(BODYFAT ~ AGE + WEIGHT + HEIGHT + THIGH, data = bfat)
Homework Four
Statistics 151a (Linear Models)
Due on 04 November 2015
26 October, 2015
1.
a) Suppose $X_1, \dots, X_n$ are i.i.d. observations from a distribution with unknown variance $\sigma^2$. Describe a
bootstrap-based algorithm to compute a 95% confidence interval
Homework Five
Statistics 151a (Linear Models)
Due on 18 November 2015
06 November, 2015
1. In the Bodyfat dataset, consider the linear model
$$\text{BODYFAT} = \beta_0 + \beta_1\,\text{AGE} + \beta_2\,\text{WEIGHT} + \beta_3\,\text{HEIGHT} + \beta_4\,\text{THIGH} + e$$
In R, plot the following graphs (9 points = one for each graph
Homework Three
Statistics 151a (Linear Models)
Due on 14 October 2015
06 October, 2015
1. Suppose u1 , . . . , un form an orthonormal basis of Rn .
a) For every $y \in \mathbb{R}^n$, show that the following is true (1.5 points):
$$y = \sum_{i=1}^{n} (u_i^T y)\, u_i.$$
b) Show that
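The expansion $y = \sum_i (u_i^T y) u_i$ in part (a) can be verified numerically. The sketch below builds an orthonormal basis from the QR factorization of a random matrix, which is a convenience for the demo rather than the homework's construction:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 6
# The Q factor of a random full-rank matrix gives an orthonormal basis u_1, ..., u_n.
U, _ = np.linalg.qr(rng.normal(size=(n, n)))

# Check u_i^T u_j = 1 if i = j and 0 otherwise.
assert np.allclose(U.T @ U, np.eye(n))

# Check the expansion y = sum_i (u_i^T y) u_i for a random y.
y = rng.normal(size=n)
expansion = sum((U[:, i] @ y) * U[:, i] for i in range(n))
assert np.allclose(expansion, y)
```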
Fall 2013 Statistics 151 (Linear Models) : Lecture Twenty
Aditya Guntuboyina
7 November 2013
1 Generalized Linear Models
We have so far studied linear models. We have n observations on a response variable y1 , . . . , yn and on
each of p explanatory varia
Fall 2013 Statistics 151 (Linear Models) : Lecture Nineteen
Aditya Guntuboyina
5 November 2013
1 Criteria Based Variable Selection
We have so far looked at the following criteria.
1. Adjusted $R^2$
2. AIC
3. BIC
4. Mallows's $C_p$
2 Mallows's $C_p$
Mallows's $C_p$ is defined
Midterm One
Statistics 151, Fall 2013
08 October, 2013
1. Last year, 80 students took this particular course at Berkeley of whom 20 were freshmen, 20 were
sophomores, 20 juniors and 20 seniors. In R, I have saved the scores for the 20 freshmen in the
vect
Fall 2013 Statistics 151 (Linear Models) : Lecture Three
Derek Bean
05 September 2013
1 Linear Algebra Review, cont'd
Result: for a matrix $A$, $\mathrm{rank}(A) + \dim(K(A)) =$ no. of columns in $A$.
Definition: Matrix $A$ is full rank if $\mathrm{rank}(A) =$ no. of columns in $A$ (i.e. d
Fall 2013 Statistics 151 (Linear Models) : Lecture Five
Aditya Guntuboyina
12 September 2013
1 Least Squares Estimate of $\beta$ in the Linear Model
The linear model is
$$Y = X\beta + e \quad \text{with } Ee = 0 \text{ and } \mathrm{Cov}(e) = \sigma^2 I_n,$$
where $Y$ is an $n \times 1$ vector containing all the values of the
Fall 2013 Statistics 151 (Linear Models) : Lecture Four
Aditya Guntuboyina
10 September 2013
1 Recap
1.1 The Regression Problem
There is a response variable $y$ and $p$ explanatory variables $x_1, \dots, x_p$. The goal is to understand the relationship between $y$
Statistics 151a - Linear Modelling: Theory and
Applications
Adityanand Guntuboyina
Department of Statistics
University of California, Berkeley
29 August 2013
The Regression Problem
This class deals with the regression problem where the goal is to
u
Fall 2013 Statistics 151 (Linear Models) : Lecture Twelve
Aditya Guntuboyina
10 October 2013
1 Regression Diagnostics
We now talk about regression diagnostics. I follow the treatment in Christensen's book (Plane Answers
to Complex Questions), Chapter 13 ve
Fall 2013 Statistics 151 (Linear Models) : Lecture Seven
Aditya Guntuboyina
19 September 2013
1 Last Class
We looked at:
1. Fitted Values: $\hat Y = X\hat\beta = HY$ where $H = X(X^T X)^{-1} X^T$. $\hat Y$ is the projection of $Y$ onto the
column space of $X$.
2. Residuals: $\hat e = Y - \hat Y = (I - H)Y$
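As a numerical aside (not in the lecture notes), the projection interpretation of the hat matrix $H = X(X^TX)^{-1}X^T$ can be checked on simulated data: $H$ is symmetric and idempotent, and the residuals $(I-H)Y$ are orthogonal to the column space of $X$.

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 30, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
Y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=n)

H = X @ np.linalg.inv(X.T @ X) @ X.T   # hat matrix H = X (X^T X)^{-1} X^T
Yhat = H @ Y                           # fitted values: projection of Y
resid = (np.eye(n) - H) @ Y            # residuals (I - H) Y

assert np.allclose(H @ H, H)           # H is idempotent (a projection)
assert np.allclose(H, H.T)             # ... and symmetric
assert np.allclose(X.T @ resid, 0)     # residuals orthogonal to col(X)
```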
Fall 2013 Statistics 151 (Linear Models) : Lecture Eleven
Aditya Guntuboyina
03 October 2013
1 One Way Analysis of Variance
Consider the model
$$y_{ij} = \mu_i + e_{ij} \quad \text{for } i = 1, \dots, t \text{ and } j = 1, \dots, n_i,$$
where $e_{ij}$ are i.i.d. normal random variables with mean zero
Fall 2013 Statistics 151 (Linear Models) : Lecture Thirteen
Aditya Guntuboyina
15 October 2013
1 Regression Diagnostics
For regression diagnostics, we need to know about the following quantities:
1. Leverage
2. Standardized or Studentized Residuals
3. Pre
Fall 2013 Statistics 151 (Linear Models) : Lecture Sixteen
Derek Bean
24 October 2013
We went over some diagnostic tools, mostly graphical, for checking the reasonableness of certain key
assumptions of the linear model:
Normality: are the errors normally
Fall 2013 Statistics 151 (Linear Models) : Lecture Fifteen
Derek Bean
22 October 2013
1
Recap
Let $(x_1, y_1), \dots, (x_n, y_n)$ be $n$ predictor-response pairs; let $X$ be the $n \times (p+1)$ design matrix with $i$th
row $(1, x_i^T)$ (so an intercept is included) and l
Fall 2013 Statistics 151 (Linear Models) : Lecture Seventeen
Aditya Guntuboyina
29 October 2013
1 Variable Selection
Consider a regression problem with a response variable $y$ and $p$ explanatory variables $x_1, \dots, x_p$. Should
we just go ahead and fit a lin
Fall 2013 Statistics 151 (Linear Models) : Lecture Eighteen
Aditya Guntuboyina
31 October 2013
1 Criteria Based Variable Selection
If there are $p$ explanatory variables, then there are $2^p$ possible linear models. In criteria-based variable
selection, we fit a
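A minimal sketch of criteria-based selection (not from the lecture): enumerate all $2^p$ subsets of the variables and score each fitted model with a Gaussian AIC, $n\log(\mathrm{RSS}/n) + 2(\text{number of parameters})$. The data-generating coefficients here are illustrative.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(6)
n, p = 80, 4
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] - 1.0 * X[:, 2] + rng.normal(size=n)  # only variables 0 and 2 matter

def aic(subset):
    """Gaussian AIC (up to constants) for the model using the given columns."""
    Xs = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    rss = np.sum((y - Xs @ beta) ** 2)
    return n * np.log(rss / n) + 2 * (len(subset) + 1)

# Enumerate all 2^p subsets and keep the one with smallest AIC.
subsets = [c for k in range(p + 1) for c in combinations(range(p), k)]
best = min(subsets, key=aic)
print("AIC-best subset:", best)
```

Exhaustive enumeration is only feasible for small $p$; for larger problems one falls back on stepwise or penalized methods.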
Estimation in linear models
$$y = X\beta + \epsilon, \qquad y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix},$$
$X$ is an $n \times k$ matrix of known constants, called the design matrix (or model matrix).
$\beta$ is a $k \times 1$ vector of unknown parameters.
Gauss-Markov condition:
$$E(\epsilon) = 0, \qquad \mathrm{cov}(\epsilon) = \sigma^2 I.$$
Method of least squares
To estimat
Test for goodness of fit of a normal linear model
How can we tell if a model fits the data? If the model is correct then $s^2$
should be an unbiased estimate of $\sigma^2$.
Since $s^2$ is computed from the chosen model, it needs to be compared to
some model-free estimate
For an $m \times n$ matrix of constants $A$ and an $m \times 1$ constant vector $b$,
$$E(Ay + b) = A\,E(y) + b, \qquad \mathrm{cov}(Ay + b) = A\,\mathrm{cov}(y)\,A^T.$$
For an $n \times 1$ constant vector $a$,
$$E(a^T y) = a^T E(y),$$
$$\mathrm{cov}(y) = E\left([y - E(y)][y - E(y)]^T\right),$$
$$\mathrm{var}(a^T y) = E\left(a^T [y - E(y)][y - E(y)]^T a\right) = E\left((a^T [y - E(y)])^2\right).$$
Suppose $y$ is a random vector $(y_1, \dots, y_n)^T$. Let
$\mu = E(y) = (E(y_1), \dots, E(y_n))^T$ and let $V = \mathrm{cov}(y)$ be the covariance
matr
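The moment rules $E(Ay + b) = A\mu + b$ and $\mathrm{cov}(Ay + b) = AVA^T$ can be checked by Monte Carlo (an illustration, not from the slides); the matrix $A$, shift $b$, and the distribution of $y$ below are arbitrary choices for the demo.

```python
import numpy as np

rng = np.random.default_rng(7)
n, m, N = 3, 2, 200_000
A = rng.normal(size=(m, n))            # fixed m x n matrix of constants
b = rng.normal(size=m)                 # fixed shift

# Draw many realisations of a random vector y with known mean mu and covariance V.
mu = np.array([1.0, -1.0, 0.5])
L = np.array([[1.0, 0.0, 0.0], [0.5, 1.0, 0.0], [0.2, -0.3, 1.0]])
V = L @ L.T
y = mu + rng.normal(size=(N, n)) @ L.T  # each row has mean mu, covariance V

z = y @ A.T + b                         # z = A y + b for each sample
# Up to Monte Carlo error: E(Ay + b) = A mu + b and cov(Ay + b) = A V A^T.
assert np.allclose(z.mean(axis=0), A @ mu + b, atol=0.05)
assert np.allclose(np.cov(z, rowvar=False), A @ V @ A.T, atol=0.1)
```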
Multiple comparisons (simultaneous confidence intervals)
Under a one-way layout model
$$y_{ih} = \mu_i + \epsilon_{ih}, \quad 1 \le i \le k,\ 1 \le h \le n_i, \qquad \epsilon_{ih}\ \text{i.i.d.}\ N(0, \sigma^2),$$
a $100(1-\alpha)\%$ confidence interval for $\mu_i - \mu_j$ is
$$\bar y_{i\cdot} - \bar y_{j\cdot} \pm t_{n-k;\,\alpha/2}\, s \sqrt{\frac{1}{n_i} + \frac{1}{n_j}},$$
since
$$P\Big(\bar y_{i\cdot} - \bar y_{j\cdot} - t_{n-k;\,\alpha/2}\, s \sqrt{\tfrac{1}{n_i} + \tfrac{1}{n_j}} \le \mu_i - \mu_j
MLE of $\beta$ (or $\mu = X\beta$)
Hereafter, for simplicity, suppose $\phi = 1$.
Log-likelihood function:
$$l(\beta, y) = \sum_{i=1}^{n} \left\{ w_i \left[ y_i \theta_i - b(\theta_i) \right] + c(y_i, w_i) \right\}$$
Usually there is no closed-form solution for the maximum likelihood
estimate; an iterative method is needed.
Since $\theta_i =$
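The slides truncate before naming the iterative method; iteratively reweighted least squares (IRLS) is the standard choice for GLMs, so the sketch below is an assumption along those lines, fitted to a simulated logistic model with prior weights $w_i = 1$ and canonical (logit) link:

```python
import numpy as np

rng = np.random.default_rng(8)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([-0.5, 1.5])
prob = 1 / (1 + np.exp(-X @ beta_true))
y = rng.binomial(1, prob)

# IRLS: each step solves a weighted least-squares problem with
# working response z and weights w.
beta = np.zeros(2)
for _ in range(25):
    eta = X @ beta
    mu = 1 / (1 + np.exp(-eta))        # mean function for the logit link
    w = mu * (1 - mu)                  # GLM working weights
    z = eta + (y - mu) / w             # working response
    beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * z))

print("IRLS estimate:", beta)
```

For the canonical link, each IRLS step coincides with a Newton step on the log-likelihood.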