
# lecture9 - ISYE6414 Summer 2010 Lecture 9 Shrinkage Methods...


ISYE6414 Summer 2010, Lecture 9: Shrinkage Methods. Dr. Kobi Abayomi, July 6, 2010

## 1 Orthogonalization

The best scenario for the observed data $x$ in multiple regression is $x_j \perp x_k$ for every $j \neq k$: the predictors are mutually orthogonal, and hence linearly independent. Remember that a linear regression is a conditional expectation of the response variable $Y$ given the observed data $X = x$. If the predictors are mutually orthogonal, they form an orthogonal basis for $Y$.

Perfect collinearity is the opposite extreme: the predictors $x$ form a degenerate (rank-deficient) basis for $Y$. In this situation the regression coefficient estimates $\hat{\beta}$ are non-identifiable; under near collinearity they remain identifiable, but their variance is inflated.

The following methods are designed to mitigate the effects of collinearity (linear dependence) between the predictors by replacing them with components, that is, linear combinations, constructed to be mutually orthogonal.

## 2 Principal Components Regression (PCR)

Recall that $x^T x = \hat{\Sigma}$ is the estimate of the covariance matrix of the predictors (for centered predictors and $n$ observations, up to the factor $1/(n-1)$). Call $\lambda = (\lambda_1, \ldots, \lambda_k)$ the eigenvalues of $\hat{\Sigma}$, and $e$ the matrix whose columns $e_1, \ldots, e_k$ are its eigenvectors. As such $\hat{\Sigma} = e \Lambda e^T$, with $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_k)$. The $j$th principal component of $x$ is

$$z_j = e_j^T x = e_{j1} x_1 + \cdots + e_{jk} x_k.$$
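The eigendecomposition and the definition of the components can be checked numerically. The following is a minimal sketch, not part of the original notes: the simulated data and all variable names (`X`, `Xc`, `Sigma_hat`, `E`, `lambda`, `Z`) are assumptions made for illustration, and the covariance estimate uses centered predictors with the $1/(n-1)$ scaling.

```r
# Minimal sketch: principal components from the eigendecomposition of the
# sample covariance matrix. The simulated data are purely illustrative.
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)   # nearly collinear with x1
x3 <- rnorm(n)
X  <- cbind(x1, x2, x3)

Xc        <- scale(X, center = TRUE, scale = FALSE)  # center each predictor
Sigma_hat <- crossprod(Xc) / (n - 1)                 # same matrix as cov(X)

eig    <- eigen(Sigma_hat, symmetric = TRUE)
E      <- eig$vectors   # columns are the eigenvectors e_j
lambda <- eig$values    # eigenvalues, in decreasing order

# Spectral decomposition: Sigma_hat = E diag(lambda) E^T
Sigma_rebuilt <- E %*% diag(lambda) %*% t(E)

# Column j of Z holds z_j = e_j^T x for every (centered) observation
Z <- Xc %*% E

max(abs(Sigma_hat - Sigma_rebuilt))  # numerically zero
round(cov(Z), 6)                     # diagonal holds lambda, off-diagonal is 0
```

Because `x2` is almost a copy of `x1`, the smallest eigenvalue is close to zero, which is how near collinearity shows up in the spectrum of $\hat{\Sigma}$.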

Constructing the principal components as the inner products of the eigenvectors and the predictors yields principal components $z_1, \ldots, z_k$ with these properties:

$$\mathrm{Var}(z_j) = \lambda_j, \qquad \mathrm{Cov}(z_j, z_l) = 0 \quad (j \neq l),$$

which follow from the orthogonality of the eigenvectors. This is the Principal Component Analysis (PCA) procedure.

Principal Component Regression (PCR) replaces the predictors in the regression equation with their mutually orthogonal linear combinations: replace $\hat{y} = x^T \hat{\beta}$ with $\hat{y} = z^T \hat{\beta}^*$. The goal in the PCR program is to remove linear dependence and to express $\hat{y}$ via a small number of components.

Here's an example in R:

    library(faraway)
    data(meatspec)
    ### data on fat content of 215 samples of meat
    ### with 100 channel spectrum of absorbances
    ### predict fat content from spectrum data
    ### variables 1-100 range of spectrum
    ### training sample is first 172 observations
    model1 <- lm(fat ~ ., meatspec[1:172, ])
    summary(model1)$r.squared

    ### use RMSE as a stat for gof for training and test sample
    rmse <- function(x, y) sqrt(mean((x - y)^2))
    rmse(fitted(model1), meatspec$fat[1:172])
    rmse(predict(model1, meatspec[173:215, ]), meatspec$fat[173:215])
    ### bad performance for test sample
    ### fit to any data is just that!
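The replacement of predictors by components described above can be sketched in base R. This is a hedged illustration rather than the lecture's own continuation: `pcr_fit()` and `pcr_predict()` are hypothetical helpers written for this example, the data are simulated so that the block runs without the `faraway` package, and the number of components (`ncomp = 3`) is an assumption tied to how the data are generated.

```r
# Minimal PCR sketch in base R. pcr_fit() and pcr_predict() are hypothetical
# helpers written for this illustration, not functions from any package.
pcr_fit <- function(X, y, ncomp) {
  pca <- prcomp(X, center = TRUE, scale. = FALSE)  # PCA of the predictors
  Z   <- pca$x[, seq_len(ncomp), drop = FALSE]     # first ncomp component scores
  fit <- lm(y ~ Z)                                 # regress y on the components
  list(pca = pca, coef = coef(fit), ncomp = ncomp)
}

pcr_predict <- function(object, newX) {
  # Project new data onto the training components, then apply the coefficients
  Znew <- predict(object$pca, newdata = newX)[, seq_len(object$ncomp), drop = FALSE]
  drop(cbind(1, Znew) %*% object$coef)
}

rmse <- function(x, y) sqrt(mean((x - y)^2))

# Simulated data (an assumption): 20 noisy mixtures of 3 latent signals,
# so the predictors are strongly collinear.
set.seed(1)
n <- 150
p <- 20
latent <- matrix(rnorm(n * 3), n, 3)
X <- latent %*% matrix(rnorm(3 * p), 3, p) + matrix(rnorm(n * p, sd = 0.05), n, p)
y <- drop(latent %*% c(2, -1, 0.5)) + rnorm(n, sd = 0.5)

train <- 1:100
test  <- 101:150
fit   <- pcr_fit(X[train, ], y[train], ncomp = 3)
pred  <- pcr_predict(fit, X[test, ])
rmse(pred, y[test])   # test-sample RMSE using 3 components
```

On the lecture's data, the analogous call would presumably be `pcr_fit(meatspec[1:172, 1:100], meatspec$fat[1:172], ncomp = m)` for some chosen `m`, assuming the 100 spectrum channels occupy the first 100 columns; the excerpt does not say how many components to keep.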

## This note was uploaded on 09/01/2011 for the course ISYE 6414 taught by Professor Staff during the Fall '08 term at Georgia Tech.
