[…] differentiable. Then the Bregman
divergence associated with $f$ is the function $D_f : \mathbf{R}^n \times \mathbf{R}^n \to \mathbf{R}$ given by
$$D_f(x, y) = f(x) - f(y) - \nabla f(y)^T (x - y).$$
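The definition above translates almost directly into code. The following is a minimal sketch in pure Python, not part of the exercise: the helper name `bregman_divergence`, the test function f(x) = sum_i exp(x_i), and the vectors `x` and `y` are all illustrative assumptions. It evaluates the divergence in both argument orders and at x = y.

```python
import math

def bregman_divergence(f, grad_f, x, y):
    """Return f(x) - f(y) - grad_f(y)^T (x - y) for vectors given as lists."""
    g = grad_f(y)
    inner = sum(gi * (xi - yi) for gi, xi, yi in zip(g, x, y))
    return f(x) - f(y) - inner

# Illustrative choice (an assumption, not from the exercise):
# f(x) = sum_i exp(x_i), which is strictly convex and differentiable.
def f_exp(x):
    return sum(math.exp(xi) for xi in x)

def grad_f_exp(x):
    return [math.exp(xi) for xi in x]

# Arbitrary test vectors (hypothetical data).
x = [0.5, -1.0, 2.0]
y = [1.5, 0.0, -0.5]

d_xy = bregman_divergence(f_exp, grad_f_exp, x, y)
d_yx = bregman_divergence(f_exp, grad_f_exp, y, x)
d_xx = bregman_divergence(f_exp, grad_f_exp, x, x)

print("D_f(x, y) =", d_xy)
print("D_f(y, x) =", d_yx)
print("D_f(x, x) =", d_xx)
```

For this choice of f, both divergences come out positive but unequal, so a Bregman divergence is in general not symmetric in its arguments and is therefore not a metric.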
(a) Show that $D_f(x, y) \geq 0$ for all $x, y \in \operatorname{dom} f$.
(b) Show that if $f = \|\cdot\|_2^2$, then $D_f(x, y) = \|x - y\|_2^2$.
(c) Show that if $f(x) = \sum_{i=1}^n x_i \log x_i$ (negative entropy), with $\operatorname{dom} f = \mathbf{R}^n_+$ (with $0 \log 0$ taken to be 0), then
$$D_f(x, y) = \sum_{i=1}^n \left( x_i \log(x_i / y_i) - x_i + y_i \right),$$
the Kullback-Leibler divergence between $x$ and $y$.
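The two closed forms in parts (b) and (c) can be sanity-checked numerically before attempting the proofs. The sketch below is an illustration only: the helper names and the strictly positive test vectors are assumptions, and agreement on one pair of vectors is of course not a proof.

```python
import math

def bregman_divergence(f, grad_f, x, y):
    """Return f(x) - f(y) - grad_f(y)^T (x - y) for vectors given as lists."""
    g = grad_f(y)
    inner = sum(gi * (xi - yi) for gi, xi, yi in zip(g, x, y))
    return f(x) - f(y) - inner

# Part (b): f is the squared Euclidean norm, with gradient 2x.
def sq_norm(x):
    return sum(xi * xi for xi in x)

def grad_sq_norm(x):
    return [2.0 * xi for xi in x]

# Part (c): negative entropy, using the convention 0 log 0 = 0.
def neg_entropy(x):
    return sum(xi * math.log(xi) if xi > 0 else 0.0 for xi in x)

def grad_neg_entropy(x):
    # Gradient is log(x_i) + 1; only defined for strictly positive entries.
    return [math.log(xi) + 1.0 for xi in x]

def kl_divergence(x, y):
    """Closed form claimed in part (c), for strictly positive x and y."""
    return sum(xi * math.log(xi / yi) - xi + yi for xi, yi in zip(x, y))

# Arbitrary strictly positive test vectors (hypothetical data).
x = [0.2, 0.5, 1.3]
y = [0.7, 0.1, 0.9]

d_sq = bregman_divergence(sq_norm, grad_sq_norm, x, y)
sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))

d_ent = bregman_divergence(neg_entropy, grad_neg_entropy, x, y)
kl = kl_divergence(x, y)

print("squared norm:     ", d_sq, "vs", sq_dist)
print("negative entropy: ", d_ent, "vs", kl)
```

The test vectors are kept strictly positive because the gradient of the negative entropy involves log(y_i), which is undefined at y_i = 0.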
(d) Bregman projection. The previous parts suggest that Bregman divergences can be viewed as
generalized ‘distances’, i.e., functions that measure how similar two vectors are. This suggests
solving geometric problems that measure distance between vectors using a Bregman divergence
rather than Euclidean distance.
[…]
$$\begin{array}{ll} \text{minimize} & D_f(x, y) \\ \text{subject to} & x \in C, \end{array}$$
with variable $x \in \mathbf{R}^n$, is a convex optimization problem (assuming $C$ is convex).
(e) Duality. Show that $D_g(y^*, x^*) = D_f(x, y)$, where $g = f^*$ and $z^* = \nabla f(z)$. You can assume...
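The duality identity in part (e) can be checked numerically for one concrete function. The sketch below assumes f is the negative entropy from part (c), whose conjugate is g(z) = sum_i exp(z_i - 1) (a standard conjugate pair, used here as an assumption rather than derived). The helper names and test vectors are illustrative, and a numerical match on one example does not replace the proof.

```python
import math

def bregman_divergence(f, grad_f, x, y):
    """Return f(x) - f(y) - grad_f(y)^T (x - y) for vectors given as lists."""
    g = grad_f(y)
    inner = sum(gi * (xi - yi) for gi, xi, yi in zip(g, x, y))
    return f(x) - f(y) - inner

# f: negative entropy on strictly positive vectors.
def f(x):
    return sum(xi * math.log(xi) for xi in x)

def grad_f(x):
    return [math.log(xi) + 1.0 for xi in x]

# g = f^*: conjugate of the negative entropy (assumed standard result).
def g(z):
    return sum(math.exp(zi - 1.0) for zi in z)

def grad_g(z):
    return [math.exp(zi - 1.0) for zi in z]

# Arbitrary strictly positive test vectors (hypothetical data).
x = [0.2, 0.5, 1.3]
y = [0.7, 0.1, 0.9]

# Dual points z^* = grad f(z), as defined in part (e).
x_star = grad_f(x)
y_star = grad_f(y)

lhs = bregman_divergence(g, grad_g, y_star, x_star)  # D_g(y^*, x^*)
rhs = bregman_divergence(f, grad_f, x, y)            # D_f(x, y)

print("D_g(y*, x*) =", lhs)
print("D_f(x, y)   =", rhs)
```

Note the swapped argument order on the dual side: the identity pairs D_f(x, y) with D_g(y^*, x^*), not with D_g(x^*, y^*).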