# In reality models are almost never correct so there



- ... model is correct for a given problem, then the least squares prediction $\hat{f}$ is unbiased, and has the lowest variance among all unbiased estimators that are linear functions of $y$.
- But there can be (and often exist) biased estimators with smaller MSE.
- Generally, by regularizing (shrinking, dampening, controlling) the estimator in some way, its variance will be reduced; if the corresponding increase in bias is small, this will be worthwhile.
- Examples of regularization: subset selection (forward, backward, all subsets); ridge regression, the lasso.
- In reality models are almost never correct, so there is an additional model bias between the closest member of the linear model class and the truth.

ESL Chapter 3: Linear Methods for Regression, Trevor Hastie and Rob Tibshirani

## Model Selection

Often we prefer a restricted estimate because of its reduced estimation variance.

[Figure: schematic of MODEL SPACE and RESTRICTED MODEL SPACE. Labels: Truth, Realization, Closest fit, Closest fit in population, Shrunken fit, Model bias, Estimation Bias, Estimation Variance.]

## Example: Analysis of time series data

Two approaches:

- Frequency domain (Fourier): see discussion of wavelet smoothing.
- Time domain. Main tool is the auto-regressive (AR) model of order $k$:

$$
y_t = \beta_1 y_{t-1} + \beta_2 y_{t-2} + \cdots + \beta_k y_{t-k} + \epsilon_t
$$

Fit by linear least squares regression on lagged data:

$$
\begin{aligned}
y_t &= \beta_1 y_{t-1} + \beta_2 y_{t-2} + \cdots + \beta_k y_{t-k} + \epsilon_t \\
y_{t-1} &= \beta_1 y_{t-2} + \beta_2 y_{t-3} + \cdots + \beta_k y_{t-k-1} + \epsilon_{t-1} \\
&\;\;\vdots \\
y_{k+1} &= \beta_1 y_k + \beta_2 y_{k-1} + \cdots + \beta_k y_1 + \epsilon_{k+1}
\end{aligned}
$$

## Example: NYSE data

Time series of 6200 daily measurements, 1962-1987.

- `volume`: log(trading volume), the outcome
- `volume.Lj`: log(trading volume) on day $-j$, $j = 1, 2, 3$
- `ret.Lj`: $\Delta \log(\text{Dow Jones})$ on day $-j$, $j = 1, 2, 3$
- `aret.Lj`: $|\Delta \log(\text{Dow Jones})|$ on day $-j$, $j = 1, 2, 3$
- `vola.Lj`: volatility on day $-j$, $j = 1, 2, 3$

Source: Weigend and LeBaron (1994).

We randomly selected a training set of size 50 and a test set of size 500, from the first 600 observations.

[Figure: scatterplots of the NYSE variables; surviving panel labels: `volume`, `volume.L1` ...]
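The lagged least squares fit of the AR model described in the time series example can be sketched in a few lines. This is only an illustrative sketch, not the slides' own code: the function names (`lagged_design`, `solve_linear_system`, `fit_ar`, `simulate_ar`), the simulated series, and the coefficients `true_beta` are all assumptions made for the demo. It uses the standard library only and solves the normal equations directly, which is reasonable for small $k$.

```python
import random


def lagged_design(y, k):
    """Build rows (y[t-1], ..., y[t-k]) and targets y[t] for t = k, ..., len(y) - 1."""
    X = [[y[t - j] for j in range(1, k + 1)] for t in range(k, len(y))]
    target = [y[t] for t in range(k, len(y))]
    return X, target


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix [A | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_ar(y, k):
    """Least squares estimate of (beta_1, ..., beta_k) via the normal equations."""
    X, target = lagged_design(y, k)
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(k)] for a in range(k)]
    Xty = [sum(row[a] * t for row, t in zip(X, target)) for a in range(k)]
    return solve_linear_system(XtX, Xty)


def simulate_ar(beta, n, seed=0):
    """Simulate an AR(k) series with standard normal noise (illustrative data only)."""
    rng = random.Random(seed)
    k = len(beta)
    y = [0.0] * k  # zero start-up values, dropped from the returned series
    for _ in range(n):
        mean = sum(b * y[-j] for j, b in enumerate(beta, start=1))
        y.append(mean + rng.gauss(0.0, 1.0))
    return y[k:]


true_beta = [0.5, -0.3]  # hypothetical coefficients, chosen so the series is stationary
series = simulate_ar(true_beta, n=5000)
beta_hat = fit_ar(series, k=2)
print(beta_hat)  # estimates should land near true_beta
```

Each row of the design matrix corresponds to one equation in the stacked system, with the most recent lag first, and no intercept is included because the AR model as written has none. With NumPy available, `numpy.linalg.lstsq` applied to the same design matrix would be the more usual route.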

## This note was uploaded on 03/28/2014 for the course STATS 315A taught by Professor Tibshirani,r during the Winter '10 term at Stanford.
