5 Optimization

Optimization plays an increasingly important role in machine learning. For instance, many machine learning algorithms minimize a regularized risk functional:

    min_f J(f) := λ Ω(f) + R_emp(f)                              (5.1)

with the empirical risk

    R_emp(f) := (1/m) Σ_{i=1}^{m} l(f(x_i), y_i).                (5.2)

Here the x_i are the training instances and the y_i are the corresponding labels. The loss function l measures the discrepancy between y_i and the prediction f(x_i). Finding the optimal f involves solving an optimization problem.

This chapter provides a self-contained overview of some basic concepts and tools from optimization, especially geared towards solving machine learning problems. In terms of concepts, we will cover topics related to convexity, duality, and Lagrange multipliers. In terms of tools, we will cover a variety of optimization algorithms including gradient descent, stochastic gradient descent, Newton's method, and quasi-Newton methods. We will also look at some specialized algorithms tailored towards solving Linear Programming and Quadratic Programming problems, which often arise in machine learning.

5.1 Preliminaries

Minimizing an arbitrary function is, in general, very difficult, but if the objective function to be minimized is convex then things become considerably simpler. As we will see shortly, the key advantage of dealing with convex functions is that a local optimum is also the global optimum. Therefore, well-developed tools exist to find the global minimum of a convex function. Consequently, many machine learning algorithms are now formulated as convex optimization problems. We briefly review the concept of convex sets and functions in this section.

5.1.1 Convex Sets

Definition 5.1 (Convex Set) A subset C of R^n is said to be convex if (1 − λ)x + λy ∈ C whenever x ∈ C, y ∈ C, and 0 < λ < 1.
Intuitively, this means that the line segment joining any two points x and y of the set C lies inside C (see Figure 5.1). It is easy to see (Exercise 5.1) that intersections of convex sets are also convex.

Fig. 5.1. The convex set (left) contains the line joining any two points that belong to the set. A non-convex set (right) does not satisfy this property.

A vector sum Σ_i λ_i x_i is called a convex combination if λ_i ≥ 0 and Σ_i λ_i = 1. Convex combinations are helpful in defining a convex hull:

Definition 5.2 (Convex Hull) The convex hull, conv(X), of a finite subset X = {x_1, ..., x_n} of R^n consists of all convex combinations of x_1, ..., x_n.

5.1.2 Convex Functions

Let f be a real-valued function defined on a set X ⊆ R^n. The set

    {(x, μ) : x ∈ X, μ ∈ R, μ ≥ f(x)}                            (5.3)

is called the epigraph of f. The function f is defined to be a convex function if its epigraph is a convex set in R^{n+1}. An equivalent, and more commonly used, definition (Exercise 5.5) is as follows (see Figure 5.2 for geometric intuition):

Definition 5.3 (Convex Function) A function f defined on a set...
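The convexity definitions above can be probed numerically. The helper below is an illustrative sketch, not part of the chapter: it samples the standard pointwise inequality f((1 − λ)x + λy) ≤ (1 − λ)f(x) + λf(y), which is equivalent to the epigraph of f being a convex set. A found violation certifies non-convexity; finding none is only evidence of convexity, not a proof.

```python
import random

def violates_convexity(f, points, trials=1000, tol=1e-9, seed=0):
    """Search for x, y in points and 0 < lam < 1 such that
    f((1 - lam) * x + lam * y) > (1 - lam) * f(x) + lam * f(y) + tol.
    Returns True on a counterexample (f is certainly not convex on these
    points); False only means no counterexample was found in `trials` tries."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.choice(points), rng.choice(points)
        lam = rng.random()
        z = (1 - lam) * x + lam * y
        # Compare the function value on the chord point against the chord itself.
        if f(z) > (1 - lam) * f(x) + lam * f(y) + tol:
            return True
    return False
```

For example, f(t) = t² and f(t) = |t| pass the probe on any sample, while the concave f(t) = −t² yields a counterexample almost immediately.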
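Returning to the chapter's opening example, the regularized risk minimization of (5.1)–(5.2) can be sketched concretely. Everything below is an illustrative assumption rather than the chapter's prescription: a linear model f(x) = ⟨w, x⟩, squared loss l(a, b) = (a − b)², the regularizer Ω(w) = ‖w‖², and plain gradient descent with a fixed step size.

```python
import numpy as np

def empirical_risk(w, X, y):
    """R_emp(w) = (1/m) * sum_i l(f(x_i), y_i), here with squared loss."""
    return np.mean((X @ w - y) ** 2)

def objective(w, X, y, lam):
    """J(w) = lam * Omega(w) + R_emp(w), with Omega(w) = ||w||^2 assumed."""
    return lam * (w @ w) + empirical_risk(w, X, y)

def minimize(X, y, lam=0.1, eta=0.1, steps=500):
    """Gradient descent on J(w); w parameterizes the linear predictor f(x) = <w, x>."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        # Gradient of J: 2*lam*w from the regularizer, plus the risk gradient.
        grad = 2.0 * lam * w + 2.0 * X.T @ (X @ w - y) / len(y)
        w -= eta * grad
    return w
```

Because this J is convex in w (a sum of a convex quadratic and a convex regularizer), any local minimum gradient descent converges to is the global one, which is exactly why the convexity machinery of this section matters for such problems.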
This note was uploaded on 02/23/2012 for the course STAT 598 taught by Professor Staff during the Spring '08 term at Purdue University-West Lafayette.
