Generalization Bounds and Stability

9.520 Class 14, 03 April 2006

Sasha Rakhlin

Plan

- Generalization bounds
- Stability
- Generalization bounds using stability

Algorithms

We define an algorithm $A$ to be a mapping from a training set $S = \{z_1, \ldots, z_n\}$ to a function $f_S$. Here $z_i = (x_i, y_i)$. Throughout the next several lectures, we assume that $A$ is deterministic and that $A$ does not depend on the ordering of the points in the training set. How can we measure the quality of $f_S$?

Risks

Recall that in Lecture 2 we defined the true (expected) risk

$$I[f_S] = \mathbb{E}_{(x,y)}\left[ V(f_S(x), y) \right] = \int V(f_S(x), y)\, d\mu(x, y)$$

and the empirical risk

$$I_S[f_S] = \frac{1}{n} \sum_{i=1}^{n} V(f_S(x_i), y_i).$$

Note: the true and empirical risks are denoted in Bousquet & Elisseeff as $R(A, S)$ and $R_{\mathrm{emp}}(A, S)$, respectively, to emphasize the algorithm that produced $f_S$.

Note: we will denote the loss function as $V(f, z)$ or as $V(f(x), y)$, where $z = (x, y)$.

Generalization Bounds

Our goal is to choose an algorithm $A$ so that $I[f_S]$ will be small. This is difficult because we cannot measure $I[f_S]$. We can, however, measure $I_S[f_S]$. A generalization bound is a (probabilistic) bound on how big the defect

$$D[f_S] = I[f_S] - I_S[f_S]$$

can be. If we can bound the defect and we observe that $I_S[f_S]$ is small, then $I[f_S]$ must be small.

Properties of Generalization Bounds, I

What will a generalization bound depend on? A generalization bound is a way of saying that the performance of a function on the training set has to be similar to its performance on future examples. For this reason, generalization bounds are always probabilistic: they hold with some (high) probability, to take into account the (small) chance that you will see a very unrepresentative training set.

Properties of Generalization Bounds, II

Generalization bounds depend on some measure of the size of the hypothesis space we allow ourselves to choose from.
As the hypothesis space gets smaller, the generalization bound gets tighter (but the empirical performance will often get worse). As the hypothesis space gets bigger, the generalization bound gets looser.
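The risk and defect definitions above can be made concrete in a short sketch. Everything here is an illustrative assumption, not from the lecture: the squared loss as the choice of $V$, the toy distribution $y = 2x + \text{noise}$, and the fixed hypothesis standing in for $f_S$. The true risk $I[f_S]$ is unobservable, so the sketch approximates it with a large fresh sample.

```python
import random

def V(fx, y):
    """Squared loss, one common choice of V(f(x), y) (an assumption here)."""
    return (fx - y) ** 2

def empirical_risk(f, sample):
    """I_S[f] = (1/n) * sum_i V(f(x_i), y_i)."""
    return sum(V(f(x), y) for x, y in sample) / len(sample)

def draw(n, rng):
    """Toy distribution, assumed for illustration: y = 2x + Gaussian noise."""
    out = []
    for _ in range(n):
        x = rng.uniform(-1, 1)
        out.append((x, 2 * x + rng.gauss(0, 0.1)))
    return out

rng = random.Random(0)
f = lambda x: 2 * x          # a fixed hypothesis, standing in for f_S
train = draw(20, rng)        # a small training set S
fresh = draw(100_000, rng)   # large fresh sample: Monte Carlo proxy for I[f]

# Defect D[f_S] = I[f_S] - I_S[f_S], with I[f_S] approximated on fresh data
defect = empirical_risk(f, fresh) - empirical_risk(f, train)
print(f"I_S[f_S] = {empirical_risk(f, train):.4f}, defect = {defect:.4f}")
```

With the hypothesis matching the noiseless regression function, both risks are close to the noise variance (0.01 here), so the observed defect is small; a bad hypothesis or a much smaller hypothesis-selection procedure overfitting the 20 points would make the defect larger.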
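The dependence on hypothesis-space size can be illustrated with the classical finite-class bound (Hoeffding's inequality plus a union bound), a standard result rather than a formula from this lecture: for a loss bounded in $[0, 1]$ and a finite hypothesis space $H$, with probability at least $1 - \delta$, every $f \in H$ satisfies $|I[f] - I_S[f]| \le \sqrt{\ln(2|H|/\delta) / (2n)}$. The sketch below just evaluates that expression to show the bound loosening as $|H|$ grows and tightening as $n$ grows.

```python
import math

def uniform_deviation_bound(H_size, n, delta=0.05):
    """Hoeffding + union bound for a finite class and [0,1]-bounded loss:
    with prob. >= 1 - delta, sup_{f in H} |I[f] - I_S[f]| <= this value."""
    return math.sqrt(math.log(2 * H_size / delta) / (2 * n))

# Larger hypothesis space -> looser bound (at fixed n = 1000)
for H_size in (10, 1_000, 1_000_000):
    print(f"|H| = {H_size:>9}: bound = {uniform_deviation_bound(H_size, 1000):.4f}")
```

Note the logarithmic dependence on $|H|$: the bound grows slowly with the size of the class but shrinks as $1/\sqrt{n}$, which is the usual shape of such uniform-deviation guarantees.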
This note was uploaded on 11/11/2011 for the course BIO 9.07 taught by Professor Ruth Rosenholtz during the Spring '04 term at MIT.