CSE 6740 Lecture 18
How Do I Ensure Generalization? (Model Selection and Combination)

Alexander Gray
[email protected]
Georgia Institute of Technology

Today

1. The bootstrap
2. Model combination methods
The Bootstrap

The magic of resampling methods.

Empirical Distribution Function

Let $X_1, \ldots, X_N \sim F$ be IID. The empirical distribution function $\hat{F}_N$ is the CDF that puts mass $1/N$ at each data point $X_i$:

$$\hat{F}_N(x) = \frac{\sum_{i=1}^{N} I(X_i \le x)}{N} \tag{1}$$

where $I(X_i \le x) = 1$ if $X_i \le x$, otherwise 0.
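As a minimal sketch of definition (1), assuming NumPy is available, the empirical distribution function can be evaluated by counting the fraction of data points at or below each query point. The function name `ecdf` and the sample values are illustrative and do not come from the lecture.

```python
import numpy as np

def ecdf(data, x):
    """Evaluate the empirical distribution function of `data` at points `x`."""
    data = np.sort(np.asarray(data, dtype=float))
    # With side="right", searchsorted returns the number of data points <= x.
    counts = np.searchsorted(data, np.asarray(x, dtype=float), side="right")
    return counts / data.size

sample = [2.0, 1.0, 3.0, 2.0]  # hypothetical data, N = 4
values = ecdf(sample, [0.5, 1.0, 2.0, 2.5, 3.0])
print(values)  # each data point contributes a jump of 1/N
```

The repeated value 2.0 produces a jump of 2/N at that point, which is what putting mass 1/N on each data point implies.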
Empirical Distribution Function

For any fixed value $x$,

$$E\big(\hat{F}_N(x)\big) = F(x), \tag{2}$$

$$V\big(\hat{F}_N(x)\big) = \frac{F(x)\,(1 - F(x))}{N}, \tag{3}$$

$$\hat{F}_N(x) \xrightarrow{p} F(x). \tag{4}$$

The convergence also holds uniformly in $x$; this is called the Glivenko-Cantelli Theorem: if $X_1, \ldots, X_N \sim F$, then

$$\sup_x \big|\hat{F}_N(x) - F(x)\big| \xrightarrow{p} 0. \tag{5}$$
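The pointwise mean and variance can be checked with a short derivation, sketched here under the IID assumption stated earlier. For fixed $x$, each indicator $I(X_i \le x)$ is a Bernoulli random variable with success probability $F(x)$, so

$$E\big(\hat{F}_N(x)\big) = \frac{1}{N} \sum_{i=1}^{N} E\big(I(X_i \le x)\big) = \frac{N F(x)}{N} = F(x),$$

$$V\big(\hat{F}_N(x)\big) = \frac{1}{N^2} \sum_{i=1}^{N} V\big(I(X_i \le x)\big) = \frac{N F(x)\,(1 - F(x))}{N^2} = \frac{F(x)\,(1 - F(x))}{N}.$$

Because the estimator is unbiased and its variance goes to 0 as $N \to \infty$, the pointwise convergence in probability follows from Chebyshev's inequality.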

Bootstrap

The bootstrap lets us pretend that we can draw samples from the underlying distribution. We can use it to estimate test errors and confidence intervals, which are effectively statements about unseen data from the underlying distribution.

Let $T = g(X_1, \ldots, X_N)$ be a statistic, that is, some function of the data. Suppose we want to know $V_F(T)$, the variance of $T$. For example, if $T = \bar{X}$ then $V_F(T) = \sigma^2 / N$, where $\sigma^2 = \int (x - \mu)^2 \, dF(x)$ and $\mu = \int x \, dF(x)$. Thus the variance of $T$ is a function of $F$.
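The $\sigma^2 / N$ expression for the sample mean follows from the IID assumption alone; a short derivation:

$$V_F(\bar{X}) = V_F\!\left(\frac{1}{N} \sum_{i=1}^{N} X_i\right) = \frac{1}{N^2} \sum_{i=1}^{N} V_F(X_i) = \frac{N \sigma^2}{N^2} = \frac{\sigma^2}{N}.$$

The second equality uses independence (variances of independent terms add), and the third uses the fact that every $X_i$ has the same variance $\sigma^2$ under $F$.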
Bootstrap Variance Estimation

There are two steps:

1. Estimate $V_F(T)$ with $V_{\hat{F}_N}(T)$.
2. Approximate $V_{\hat{F}_N}(T)$ by drawing samples from $\hat{F}_N$.

For $T = \bar{X}$, Step 1 gives $V_{\hat{F}_N}(T) = \hat{\sigma}^2 / N$, where $\hat{\sigma}^2 = \frac{1}{N} \sum_{i=1}^{N} (X_i - \bar{X})^2$. In this case we are done. But when we don't know the form of $V_{\hat{F}_N}(T)$, we have to approximate it using samples.
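For the sample mean, Step 1 has a closed form, so it can be computed directly. The sketch below assumes NumPy; the function name `plugin_variance_of_mean` and the data are illustrative only.

```python
import numpy as np

def plugin_variance_of_mean(data):
    """Plug-in estimate of the variance of the sample mean: sigma_hat^2 / N."""
    data = np.asarray(data, dtype=float)
    n = data.size
    # Variance under the empirical distribution: divide by N, not N - 1.
    sigma2_hat = np.mean((data - data.mean()) ** 2)
    return sigma2_hat / n

sample = [1.0, 2.0, 3.0, 4.0]  # hypothetical data
est = plugin_variance_of_mean(sample)
print(est)  # sigma_hat^2 = 1.25, so the estimate is 1.25 / 4
```

Dividing by N rather than N - 1 matches the variance of the empirical distribution itself, which is what the plug-in principle calls for.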

Bootstrap Variance Estimation

Suppose we draw an IID sample $T_1, \ldots, T_B$ from the distribution of $T$, which we'll call $G$. By the law of large numbers, as $B \to \infty$,

$$\bar{T} = \frac{1}{B} \sum_{b=1}^{B} T_b \xrightarrow{p} \int t \, dG(t) = E(T), \tag{6}$$

i.e. if we draw a large sample from $G$, we can use the sample mean to approximate $E(T)$. Similarly, we can use the sample variance to approximate $V(T)$:

$$\frac{1}{B} \sum_{b=1}^{B} (T_b - \bar{T})^2 \xrightarrow{p} V(T). \tag{7}$$
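To illustrate (6) and (7), here is a small simulation for a case where we can sample from $G$ directly. It assumes, purely for illustration, that the data are standard normal and that $T$ is the mean of $N = 10$ points, so that $E(T) = 0$ and $V(T) = 1/10$ are known and can be compared against. The values of `N`, `B`, and the seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
N, B = 10, 20_000  # data set size and number of Monte Carlo draws (assumed)

# Each row is one data set of size N; T_b is the mean of row b.
draws = rng.standard_normal(size=(B, N))
T = draws.mean(axis=1)

t_bar = T.mean()                   # approximates E(T) = 0, as in (6)
v_hat = np.mean((T - t_bar) ** 2)  # approximates V(T) = 1/N, as in (7)
print(t_bar, v_hat)
```

With a large B, both approximations should land close to the known values, and the Monte Carlo error shrinks as B grows.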
Bootstrap Variance Estimation

How do we get at the distribution of $T$? All we have are $X$ values. We can talk about their distribution, $F$. If we could sample values $X_1^*, \ldots, X_N^*$ from $F$, we could compute $T(X_1^*, \ldots, X_N^*)$.
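Since $F$ is unknown, Step 2 of the procedure draws from the empirical distribution instead. Because that distribution puts mass $1/N$ on each data point, sampling from it amounts to sampling the observed data with replacement. The following is a minimal sketch of that idea, not the lecture's own code: the function name `bootstrap_variance`, the default `B`, the seeds, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def bootstrap_variance(data, statistic, B=2000, seed=0):
    """Approximate the variance of `statistic` by resampling `data`."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n = data.size
    t_star = np.empty(B)
    for b in range(B):
        # Draw X*_1, ..., X*_N from the empirical distribution.
        resample = rng.choice(data, size=n, replace=True)
        t_star[b] = statistic(resample)
    # Sample variance of the B replicated statistics.
    return np.mean((t_star - t_star.mean()) ** 2)

# Hypothetical data: 50 draws from a normal distribution.
sample = np.random.default_rng(1).normal(loc=0.0, scale=2.0, size=50)

v_boot_median = bootstrap_variance(sample, np.median)  # no simple closed form
v_boot_mean = bootstrap_variance(sample, np.mean)
v_plugin = np.mean((sample - sample.mean()) ** 2) / sample.size
print(v_boot_median, v_boot_mean, v_plugin)
```

For the mean, the bootstrap value should be close to the closed-form plug-in estimate, which is a useful sanity check. The median is the kind of statistic where no simple formula is available and resampling is needed.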

This note was uploaded on 04/03/2010 for the course CSE 6740 taught by Professor Staff during the Fall '08 term at Georgia Tech.
