# mar2 - STA 414/2104, Mar 2, 2010

STA 414/2104, Mar 2, 2010

## Notes

- No Office Hour today, Tuesday Mar 2
- Practise test questions posted
- HW 2 due date Tuesday Mar 9
- HW 2 questions and HW 1 answers Thursday, Mar 4
- Statistical Society of Canada case studies http://www.ssc.ca/documents/case_studies/2010/

## § 7.3 Estimating expected prediction error

- $\mathrm{Err}_{\mathcal{T}} = E\{L(Y, \hat f(X)) \mid \mathcal{T}\}$ and $\mathrm{Err} = E_{\mathcal{T}}[E\{L(Y, \hat f(X)) \mid \mathcal{T}\}]$
- know that $\mathrm{err} = \frac{1}{N} \sum L\{y_i, \hat f(x_i)\}$ is too small
- can show that on average, $\mathrm{err}$ should be inflated by $\frac{2 d \sigma^2}{N}$
- where $d$ is the number of inputs

## ... can show (§ 7.4)

- Define $\mathrm{Err}_{\mathrm{in}} = \frac{1}{N} \sum_{i=1}^{N} E\{L(Y_i, \hat f(x_i)) \mid \mathcal{T}\}$
- new $Y_i$ at each of the training inputs $x_1, \dots, x_N$
- and define $\omega = E_y(\mathrm{Err}_{\mathrm{in}} - \mathrm{err})$
- expectation over $y$ instead of $\mathcal{T}$ ($x$'s fixed)
- $\omega = \frac{2}{N} \sum_{i=1}^{N} \mathrm{cov}(\hat y_i, y_i)$
- leading to (7.22)
- $E_y(\mathrm{Err}_{\mathrm{in}}) = E_y(\mathrm{err}) + \frac{2}{N} \sum_{i=1}^{N} \mathrm{cov}(\hat y_i, y_i) = E_y(\mathrm{err}) + \frac{2}{N} d \sigma^2$
- for linear fits with $d$ basis functions

## ... can show

- $E_y(\mathrm{Err}_{\mathrm{in}}) = E_y(\mathrm{err}) + \frac{2}{N} \sum_{i=1}^{N} \mathrm{cov}(\hat y_i, y_i) = E_y(\mathrm{err}) + \frac{2}{N} d \sigma^2$
- (§ 7.5) $\mathrm{err}$ estimates $E_y(\mathrm{err})$; $\frac{2}{N} d \hat\sigma^2$ estimates the 2nd term
- with log-likelihood loss function, the result is
- as $N \to \infty$, $-2 E\{\log \Pr(Y; \hat\theta)\} \simeq -\frac{2}{N} \sum_{i=1}^{N} \log \Pr(y_i; \hat\theta) + \frac{2d}{N}$
- which motivates $\mathrm{AIC} = -\frac{2}{N} \ell(\hat\theta; y) + \frac{2d}{N}$
- e.g. in (7.30) $d = d(\alpha)$ depends on a smoothing parameter
- (§ 7.6) with smoothing splines replace $d$ by $\mathrm{trace}\, S_\lambda$
- skip § 7.7, 8, 9

## Direct estimation of EPE § 7.10

- another way to estimate prediction error
- $\mathrm{Err} = E_{\mathcal{T}} E\{L(Y, \hat f(X)) \mid \mathcal{T}\}$
- $\mathrm{CV}(\hat f) = \frac{1}{N} \sum_{i=1}^{N} L(y_i, \hat f^{-\kappa(i)}(x_i))$
- leave-one-out ($N$-fold):
- $\frac{1}{N} \sum L(y_i, \hat f^{-i}(x_i)) = \text{(e.g.)}\ \frac{1}{N} \sum \{y_i - \hat f^{-i}(x_i)\}^2$
- $y_i - \hat f^{-i}(x_i) = \dfrac{y_i - \hat f(x_i)}{1 - S_{ii}}$
- for any linear smoother $\hat f = S y$
- $\mathrm{CV} = \frac{1}{N} \sum \left\{ \dfrac{y_i - \hat f(x_i)}{1 - S_{ii}} \right\}^2$
- $\mathrm{GCV} = \frac{1}{N} \sum \left\{ \dfrac{y_i - \hat f(x_i)}{1 - \mathrm{tr}(S)/N} \right\}^2$

## comments

- § 7.10.2,3: CV must be carried out before any model simplification that uses the $y$'s
- both CV and K-fold CV seem to estimate expected prediction error but not prediction error
- CV can be quite variable (especially leave one out)
- p.231: using AIC, CV or GCV to choose smoothing parameter seems to overfit the data (?)
- Example: biasvariance.R

## Projection pursuit regression (...
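The leave-one-out identity from § 7.10, $y_i - \hat f^{-i}(x_i) = \{y_i - \hat f(x_i)\}/(1 - S_{ii})$, can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a straight-line least-squares fit as the linear smoother, simulated data, and the standard closed form $S_{ii} = 1/N + (x_i - \bar x)^2 / \sum_j (x_j - \bar x)^2$ for that fit. The function names `fit_line` and `cv_scores` and the simulated data are hypothetical and do not come from the notes or from biasvariance.R.

```python
import random


def fit_line(xs, ys):
    """Least-squares intercept and slope for a straight-line fit."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    slope = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    return ybar - slope * xbar, slope


def cv_scores(xs, ys):
    """Return (shortcut CV, brute-force CV, GCV, trace of S) for a line fit."""
    n = len(xs)
    xbar = sum(xs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)

    # Fit once on all the data and compute ordinary residuals.
    a, b = fit_line(xs, ys)
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]

    # Diagonal of the smoother matrix S (assumed closed form for a line fit).
    s_ii = [1.0 / n + (x - xbar) ** 2 / sxx for x in xs]
    trace_s = sum(s_ii)

    # Shortcut CV: inflate each residual by 1 / (1 - S_ii).
    cv_shortcut = sum((r / (1.0 - s)) ** 2 for r, s in zip(resid, s_ii)) / n

    # GCV: replace each S_ii by the average tr(S) / N.
    gcv = sum((r / (1.0 - trace_s / n)) ** 2 for r in resid) / n

    # Brute-force leave-one-out: refit N times, each time without point i.
    total = 0.0
    for i in range(n):
        a_i, b_i = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        total += (ys[i] - (a_i + b_i * xs[i])) ** 2
    cv_brute = total / n

    return cv_shortcut, cv_brute, gcv, trace_s


# Simulated data (an assumption made purely for illustration).
random.seed(0)
xs = [random.uniform(0.0, 10.0) for _ in range(30)]
ys = [1.0 + 0.5 * x + random.gauss(0.0, 1.0) for x in xs]

cv_shortcut, cv_brute, gcv, trace_s = cv_scores(xs, ys)
print("CV (shortcut):   ", cv_shortcut)
print("CV (brute force):", cv_brute)
print("GCV:             ", gcv)
print("tr(S):           ", trace_s)
```

In this setting the shortcut and brute-force values should agree up to rounding, and tr(S) should equal 2, the number of fitted coefficients. The point of the shortcut is that one fit replaces N refits.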