SVM_regression

Course: ECE 5990, Fall 2009
School: UCCS
Introduction: SVM for nonlinear regression

The SVM nonlinear estimator is more robust than a least-squares estimator; that is, it is insensitive to small changes in the model. With robustness as a design goal, any quantitative measure of robustness must be concerned with the maximum degradation of performance. An optimal robust estimation procedure minimizes the maximum degradation and is therefore a minimax procedure of some kind.

The loss function has the form

$$L(d, y) = |d - y|$$

where $d$ is the desired response and $y$ is the estimator output. To construct a support vector machine for approximating a desired response $d$, we may use an extension of this loss function:

$$L_\varepsilon(d, y) = \begin{cases} |d - y| - \varepsilon, & \text{for } |d - y| \ge \varepsilon \\ 0, & \text{otherwise} \end{cases}$$

where $\varepsilon$ is a prescribed parameter. The function $L_\varepsilon(d, y)$ is called the $\varepsilon$-insensitive loss function.

[Figure: $L_\varepsilon(d, y)$ plotted against $d - y$, with $-\varepsilon$ and $+\varepsilon$ marked on the horizontal axis.]

Support Vector Machines for Nonlinear Regression

Consider a nonlinear regression model in which the dependence of a scalar $d$ on a vector $x$ is described by

$$d = f(x) + \nu$$

The function $f$ and the statistics of the additive noise $\nu$ are unknown. All that we have is a set of training data $\{(x_i, d_i)\}_{i=1}^{N}$, where $x_i$ is a sample value of the input vector $x$ and $d_i$ is the corresponding value of the model output $d$. The problem is to provide an estimate of the dependence of $d$ on $x$.

We postulate an estimate of $d$, denoted by $y$, which is expanded in terms of a set of nonlinear basis functions $\{\varphi_j(x)\}_{j=0}^{m_1}$ as follows:

$$y = \sum_{j=0}^{m_1} w_j \varphi_j(x) = w^T \varphi(x)$$

where

$$\varphi(x) = [\varphi_0(x), \varphi_1(x), \varphi_2(x), \ldots, \varphi_{m_1}(x)]^T$$

$$w = [w_0, w_1, \ldots, w_{m_1}]^T$$

It is assumed that $\varphi_0(x) = 1$, so that the weight $w_0$ represents the bias $b$. We want to minimize the empirical risk

$$R_{emp} = \frac{1}{N} \sum_{i=1}^{N} L_\varepsilon(d_i, y_i)$$

subject to the inequality

$$\|w\| \le c_0$$

where $c_0$ is a constant.
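The loss and empirical risk defined above can be sketched in a few lines of plain Python. This is a minimal illustration, not the course's implementation: the function names, the polynomial basis $\varphi(x) = [1, x, x^2]$, and the toy data are all assumptions made for the example.

```python
def eps_insensitive_loss(d, y, eps):
    """Return |d - y| - eps when |d - y| >= eps, and 0 otherwise."""
    return max(abs(d - y) - eps, 0.0)


def estimator_output(w, phi_x):
    """Output y = w^T phi(x) for one already-transformed input phi(x)."""
    return sum(w_j * phi_j for w_j, phi_j in zip(w, phi_x))


def empirical_risk(w, features, targets, eps):
    """Average epsilon-insensitive loss over the N training pairs."""
    losses = [
        eps_insensitive_loss(d_i, estimator_output(w, phi_i), eps)
        for phi_i, d_i in zip(features, targets)
    ]
    return sum(losses) / len(losses)


# Hypothetical toy data. The basis phi(x) = [1, x, x^2] is an assumption;
# phi_0(x) = 1, so w[0] plays the role of the bias b.
xs = [0.0, 1.0, 2.0]
features = [[1.0, x, x * x] for x in xs]
targets = [1.0, 2.5, 5.0]
w = [1.0, 0.0, 1.0]  # candidate estimator y = 1 + x^2

risk = empirical_risk(w, features, targets, eps=0.25)
```

With these values only the middle sample falls outside the $\varepsilon$-tube (its error is 0.5), so it alone contributes to the risk; errors smaller than $\varepsilon$ cost nothing, which is the sense in which the loss is insensitive.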
We reformulate this constrained optimization problem by introducing two sets of nonnegative slack variables $\{\xi_i\}_{i=1}^{N}$ and $\{\xi'_i\}_{i=1}^{N}$ that are defined as follows:

$$d_i - w^T \varphi(x_i) \le \varepsilon + \xi_i, \quad i = 1, 2, \ldots, N$$

$$w^T \varphi(x_i) - d_i \le \varepsilon + \xi'_i, \quad i = 1, 2, \ldots, N$$

$$\xi_i \ge 0, \quad i = 1, 2, \ldots, N$$

$$\xi'_i \ge 0, \quad i = 1, 2, \ldots, N$$

The slack variables describe the $\varepsilon$-insensitive loss function. The constrained optimization problem may therefore be viewed as equivalent to minimizing the cost function

$$\Phi(w, \xi, \xi') = \frac{1}{2} w^T w + C \sum_{i=1}^{N} (\xi_i + \xi'_i)$$

subject to the four sets of constraints listed above, where $C$ is a user-specified parameter.

The Lagrangian function is defined as

$$J(w, \xi, \xi', \alpha, \alpha', \gamma, \gamma') = \frac{1}{2} w^T w + C \sum_{i=1}^{N} (\xi_i + \xi'_i) - \sum_{i=1}^{N} \alpha_i \left[ w^T \varphi(x_i) - d_i + \varepsilon + \xi_i \right] - \sum_{i=1}^{N} \alpha'_i \left[ d_i - w^T \varphi(x_i) + \varepsilon + \xi'_i \right] - \sum_{i=1}^{N} (\gamma_i \xi_i + \gamma'_i \xi'_i)$$

where $\alpha_i$, $\alpha'_i$, $\gamma_i$, and $\gamma'_i$ are the Lagrange multipliers. $J$ is to be minimized with respect to the weight vector $w$ and the slack variables $\xi$ and $\xi'$, and maximized with respect to $\alpha$, $\alpha'$ and with respect to $\gamma$, $\gamma'$. Setting the partial derivatives of $J$ with respect to $w$, $\xi_i$, and $\xi'_i$ to zero gives

$$w = \sum_{i=1}^{N} (\alpha_i - \alpha'_i) \varphi(x_i)$$

$$\gamma_i = C - \alpha_i$$

$$\gamma'_i = C - \alpha'_i$$

Now...
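The slack-variable formulation and the stationarity condition for $w$ can be checked numerically with a short Python sketch. Everything named here is hypothetical: the helper functions, the basis $\varphi(x) = [1, x]$, the toy data, and in particular the multiplier values, which are picked by hand to illustrate the formula and are not the solution of the dual problem.

```python
def minimal_slacks(w, features, targets, eps):
    """Smallest nonnegative slacks that satisfy both constraints for a fixed w."""
    xi, xi_prime = [], []
    for phi_i, d_i in zip(features, targets):
        y_i = sum(w_j * p_j for w_j, p_j in zip(w, phi_i))
        xi.append(max(d_i - y_i - eps, 0.0))        # d_i - y_i <= eps + xi_i
        xi_prime.append(max(y_i - d_i - eps, 0.0))  # y_i - d_i <= eps + xi'_i
    return xi, xi_prime


def cost(w, xi, xi_prime, C):
    """Cost Phi(w, xi, xi') = 1/2 w^T w + C * sum(xi_i + xi'_i)."""
    penalty = sum(a + b for a, b in zip(xi, xi_prime))
    return 0.5 * sum(w_j * w_j for w_j in w) + C * penalty


def weights_from_multipliers(alpha, alpha_prime, features):
    """Stationarity condition: w = sum_i (alpha_i - alpha'_i) phi(x_i)."""
    m = len(features[0])
    return [
        sum((a - ap) * phi_i[j]
            for a, ap, phi_i in zip(alpha, alpha_prime, features))
        for j in range(m)
    ]


# Hypothetical toy problem with the assumed basis phi(x) = [1, x].
features = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
targets = [0.0, 1.5, 1.0]
w = [0.0, 1.0]  # candidate estimator y = x

xi, xi_prime = minimal_slacks(w, features, targets, eps=0.25)
phi_value = cost(w, xi, xi_prime, C=2.0)

# Hand-picked multipliers, for illustration only (not a dual solution).
w_from_alpha = weights_from_multipliers([0.0, 0.5, 0.0], [0.0, 0.0, 0.25], features)
```

For a fixed $w$, the smallest feasible slacks satisfy $\xi_i + \xi'_i = L_\varepsilon(d_i, y_i)$, and at most one of the pair is nonzero for each sample. That is why minimizing the slack sum is equivalent to minimizing the $\varepsilon$-insensitive loss over the training set.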

