7bregman

7bregman - EE236C (Spring 2008-09) 7. Gradient methods with...


EE236C (Spring 2008-09)

7. Gradient methods with generalized distances

- Bregman distances
- variant of Nesterov's method
- example

Gradient method and extension

basic gradient method for minimizing $f$ (lecture 1)

\[ x^+ = \operatorname*{argmin}_{z} \left( f(x) + \nabla f(x)^T (z - x) + \frac{1}{2t} \|z - x\|_2^2 \right) \]

extension for minimizing $f + g$ over $C$ (lectures 4-5)

\[ x^+ = \operatorname*{argmin}_{z \in C} \left( f(x) + \nabla f(x)^T (z - x) + \frac{1}{2t} \|z - x\|_2^2 + g(z) \right) = S_t\bigl(x - t \nabla f(x)\bigr) \]

- $g$ a simple nondifferentiable function; $C$ a simple convex set
- interesting if projection/thresholding operation $S_t$ is inexpensive

Generalization

replace $(1/2) \|z - x\|_2^2$ with generalized distance function $d(z, x)$

basic gradient update

\[ \operatorname*{argmin}_{z} \left( f(x) + \nabla f(x)^T (z - x) + \frac{1}{t} d(z, x) \right) \]

extension with projection/thresholding

\[ \operatorname*{argmin}_{z \in C} \left( f(x) + \nabla f(x)^T (z - x) + \frac{1}{t} d(z, x) + g(z) \right) \]

potential benefits

- select $d(z, x)$ to fit the curvature of $f$, or geometry of $C$
- simplify the thresholding/projection

Bregman distance functions

Bregman distance associated with strictly convex, differentiable $h$:

\[ d(x, y) = h(x) - h(y) - \nabla h(y)^T (x - y) \]

$h$ is called the kernel function of $d$

properties

- convex in $x$ for fixed $y$
- $d(x, y) \geq 0$ for all $x$, $y$; $d(x, y) = 0$ if and only if $x = y$
- not a real distance (not symmetric)
- $d(x, y) \geq (\mu/2) \|x - y\|_2^2$ if $h$ is strongly convex with constant $\mu$

first two properties follow from (strict) convexity of $h$

Examples

quadratic function: $h(x) = \|x\|_2^2 / 2$

\[ d(x, y) = \frac{1}{2} \|x - y\|_2^2 \]

negative entropy: $h(x) = \sum_{i=1}^n x_i \log x_i$ with $\operatorname{dom} h = \mathbf{R}^n_{++}$

\[ d(x, y) = \sum_{i=1}^n \bigl( x_i \log(x_i / y_i) - x_i + y_i \bigr) \]

the relative entropy or Kullback-Leibler divergence
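The definition and the two examples above can be checked numerically. The following is a minimal sketch, not part of the lecture: the helper names (`bregman`, `quad`, `negent`, `kl`) and the test vectors are assumptions chosen for illustration, and vectors are plain Python lists with strictly positive entries so that the negative entropy is defined.

```python
import math

def bregman(h, grad_h, x, y):
    """Bregman distance d(x, y) = h(x) - h(y) - grad_h(y)^T (x - y)."""
    inner = sum(g * (xi - yi) for g, xi, yi in zip(grad_h(y), x, y))
    return h(x) - h(y) - inner

# Quadratic kernel h(x) = ||x||_2^2 / 2, with gradient x.
def quad(x):
    return 0.5 * sum(xi * xi for xi in x)

def quad_grad(x):
    return list(x)

# Negative entropy kernel h(x) = sum_i x_i log x_i on x > 0,
# with gradient components log(x_i) + 1.
def negent(x):
    return sum(xi * math.log(xi) for xi in x)

def negent_grad(x):
    return [math.log(xi) + 1.0 for xi in x]

# Closed-form relative entropy from the notes.
def kl(x, y):
    return sum(xi * math.log(xi / yi) - xi + yi for xi, yi in zip(x, y))

# Hypothetical test points (any strictly positive vectors would do).
x = [0.2, 0.5, 1.3]
y = [0.7, 0.4, 0.9]

d_quad = bregman(quad, quad_grad, x, y)          # half squared Euclidean distance
d_ent = bregman(negent, negent_grad, x, y)       # relative entropy of x w.r.t. y
d_ent_rev = bregman(negent, negent_grad, y, x)   # arguments swapped
d_same = bregman(negent, negent_grad, x, x)      # distance from x to itself

half_sq = 0.5 * sum((xi - yi) ** 2 for xi, yi in zip(x, y))
print(d_quad, half_sq)
print(d_ent, kl(x, y))
print(d_ent_rev, d_same)
```

For these points the generic formula agrees with both closed forms, the distance of a point to itself is zero, and swapping the arguments of the entropy distance changes its value, which illustrates why $d$ is not a true distance.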
logarithm barrier: $h(x) = -\sum_{i=1}^n \log x_i$ with $\operatorname{dom} h = \mathbf{R}^n_{++}$

\[ d(x, y) = \sum_{i=1}^n \bigl( x_i / y_i - \log(x_i / y_i) \bigr) - n \]

inverse barrier: $h(x) = \sum_{i=1}^n 1/x_i$ with...
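As a check on the logarithm-barrier distance, it can be derived by substituting the kernel into the Bregman definition $d(x, y) = h(x) - h(y) - \nabla h(y)^T (x - y)$. This worked step is an addition for illustration; it takes $h(x) = -\sum_i \log x_i$, the sign that makes $h$ convex on the positive orthant.

```latex
\begin{align*}
h(x) &= -\sum_{i=1}^n \log x_i, \qquad \bigl(\nabla h(y)\bigr)_i = -\frac{1}{y_i}, \\
d(x, y) &= h(x) - h(y) - \nabla h(y)^T (x - y) \\
        &= -\sum_{i=1}^n \log x_i + \sum_{i=1}^n \log y_i
           + \sum_{i=1}^n \frac{x_i - y_i}{y_i} \\
        &= \sum_{i=1}^n \left( \frac{x_i}{y_i} - \log \frac{x_i}{y_i} \right) - n .
\end{align*}
```

Each summand $x_i/y_i - \log(x_i/y_i) - 1$ is nonnegative and vanishes only at $x_i = y_i$, consistent with the general properties of a Bregman distance.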

This note was uploaded on 01/25/2010 for the course EE 236 taught by Professor Staff during the Spring '08 term at UCLA.
