
### Homework 8

Course: EMGT 378, Fall 2009
School: Missouri S&T


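Introduction to Neural Networks, Homework 8

Student: Dazhi Li
Student ID: 153045

#### E12.1

We want to train the network shown in Figure E12.1 on the training set $\{p_1 = [-2],\ t_1 = [0.8]\}$, $\{p_2 = [2],\ t_2 = [1]\}$, where each pair is equally likely to occur. Write a MATLAB M-file to create a contour plot for the mean squared error performance index.

ANSWER: The error for each training pair is

$$e = t - a = t - \operatorname{logsig}(wp + b)$$

Because the two pairs are equally likely, the mean squared error is the average of the two squared errors.

    % Evaluate the mean squared error on a grid of (w, b) values.
    wv = -3:0.1:3;
    bv = -3:0.1:3;
    mse = zeros(length(bv), length(wv));
    for i = 1:length(wv)
        for j = 1:length(bv)
            e = [0.8 - logsig(wv(i)*(-2) + bv(j)); 1 - logsig(wv(i)*2 + bv(j))];
            % Rows index b and columns index w, matching meshgrid below.
            mse(j,i) = (e' * e) / 2;
        end
    end
    [w, b] = meshgrid(wv, bv);
    subplot(1,2,1); meshc(w, b, mse);
    axis([-3 3 -3 3 0 1.5]); xlabel('w'); ylabel('b'); zlabel('mse')
    subplot(1,2,2); [C, h] = contour(w, b, mse); clabel(C, h)
    xlabel('w'); ylabel('b')

#### E12.4

For the function of Exercise E12.3,

$$F(\mathbf{x}) = \frac{1}{2}\mathbf{x}^T \begin{bmatrix} 10 & -6 \\ -6 & 10 \end{bmatrix} \mathbf{x} + \begin{bmatrix} 4 & 4 \end{bmatrix} \mathbf{x},$$

perform three iterations of the variable learning rate algorithm, with initial guess $\mathbf{x}_0 = [-1;\ -2.5]$. Plot the algorithm trajectory on a contour plot of $F(\mathbf{x})$. Use the algorithm parameters $\alpha = 0.4$, $\gamma = 0.1$, $\eta = 1.5$, $\rho = 0.5$, $\xi = 5\%$.

ANSWER: The first step is to evaluate the function at the initial guess:

    x0 = [-1; -2.5]; A = [10 -6; -6 10]; d = [4 4];
    0.5 * x0' * A * x0 + d * x0
    ans = 7.2500

The next step is to find the gradient:

$$\nabla F(\mathbf{x}) = A\mathbf{x} + \mathbf{d} = \begin{bmatrix} 10 & -6 \\ -6 & 10 \end{bmatrix}\mathbf{x} + \begin{bmatrix} 4 \\ 4 \end{bmatrix} = \begin{bmatrix} 10x_1 - 6x_2 + 4 \\ -6x_1 + 10x_2 + 4 \end{bmatrix}$$

The gradient at the initial guess is:

$$\mathbf{g}_0 = \nabla F(\mathbf{x})\big|_{\mathbf{x}=\mathbf{x}_0} = \begin{bmatrix} 10(-1) - 6(-2.5) + 4 \\ -6(-1) + 10(-2.5) + 4 \end{bmatrix} = \begin{bmatrix} 9 \\ -15 \end{bmatrix}$$

With the initial learning rate of $\alpha = 0.4$, the tentative first step of the algorithm is

$$\Delta\mathbf{x}_0 = \gamma\,\Delta\mathbf{x}_{-1} - (1-\gamma)\,\alpha\,\mathbf{g}_0 = 0.1\begin{bmatrix} 0 \\ 0 \end{bmatrix} - (1-0.1)(0.4)\begin{bmatrix} 9 \\ -15 \end{bmatrix} = \begin{bmatrix} -3.24 \\ 5.40 \end{bmatrix}$$

$$\mathbf{x}_1' = \mathbf{x}_0 + \Delta\mathbf{x}_0 = \begin{bmatrix} -1 \\ -2.5 \end{bmatrix} + \begin{bmatrix} -3.24 \\ 5.40 \end{bmatrix} = \begin{bmatrix} -4.24 \\ 2.90 \end{bmatrix}$$

To verify that this is a valid step we must test the value of the function at this new point:

    x1t = [-4.24; 2.90];
    0.5 * x1t' * A * x1t + d * x1t
    ans = 200.3540

This is more than 5% larger than $F(\mathbf{x}_0)$. Therefore this tentative step is rejected, the learning rate is reduced, and the momentum coefficient is set to zero.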
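The first tentative step above can be checked numerically. The following is a minimal sketch rather than the full algorithm: it reuses the parameter names from the solution script (`alfa`, `gamma_0`, `ksi`), while `F`, `dx_prev`, `x1t`, and `rejected` are names chosen here for illustration. The rejection test is written as a relative increase over $F(\mathbf{x}_0)$, which is valid here because $F(\mathbf{x}_0) > 0$.

```matlab
% Sketch: first tentative step of the variable learning rate algorithm.
A = [10 -6; -6 10];
d = [4 4];                              % row vector, as in the solution
F = @(x) 0.5 * x' * A * x + d * x;      % quadratic from Exercise E12.3

x0 = [-1; -2.5];
alfa = 0.4; gamma_0 = 0.1; ksi = 0.05;  % learning rate, momentum, threshold

g0 = A * x0 + d';                       % gradient at the initial guess
dx_prev = [0; 0];                       % no previous step at iteration 0
dx0 = gamma_0 * dx_prev - (1 - gamma_0) * alfa * g0;
x1t = x0 + dx0;                         % tentative new point

% Reject the step when F grows by more than ksi relative to F(x0).
rejected = F(x1t) > (1 + ksi) * F(x0);
```

Running this should reproduce the gradient and the tentative point of the hand calculation and set `rejected` to true.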
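$$\mathbf{x}_1 = \mathbf{x}_0, \quad F(\mathbf{x}_1) = F(\mathbf{x}_0) = 7.2500, \quad \alpha = \rho\alpha = 0.5(0.4) = 0.2, \quad \gamma = 0$$

Then a new tentative step is computed (momentum is zero):

$$\Delta\mathbf{x}_1 = -\alpha\,\mathbf{g}_1 = -0.2\begin{bmatrix} 9 \\ -15 \end{bmatrix} = \begin{bmatrix} -1.8 \\ 3 \end{bmatrix}$$

$$\mathbf{x}_2' = \mathbf{x}_1 + \Delta\mathbf{x}_1 = \begin{bmatrix} -1 \\ -2.5 \end{bmatrix} + \begin{bmatrix} -1.8 \\ 3 \end{bmatrix} = \begin{bmatrix} -2.8 \\ 0.5 \end{bmatrix}$$

    x2t = [-2.8; 0.5];
    0.5 * x2t' * A * x2t + d * x2t
    ans = 39.6500

Again this is more than 5% larger than $F(\mathbf{x}_1)$, so the tentative step is rejected, the learning rate is reduced, and the momentum coefficient is set to zero.

$$\mathbf{x}_2 = \mathbf{x}_1, \quad F(\mathbf{x}_2) = F(\mathbf{x}_1) = 7.2500, \quad \alpha = \rho\alpha = 0.5(0.2) = 0.1, \quad \gamma = 0$$

Now a new tentative step is computed (momentum is zero):

$$\Delta\mathbf{x}_2 = -\alpha\,\mathbf{g}_2 = -0.1\begin{bmatrix} 9 \\ -15 \end{bmatrix} = \begin{bmatrix} -0.9 \\ 1.5 \end{bmatrix}$$

$$\mathbf{x}_3' = \mathbf{x}_2 + \Delta\mathbf{x}_2 = \begin{bmatrix} -1 \\ -2.5 \end{bmatrix} + \begin{bmatrix} -0.9 \\ 1.5 \end{bmatrix} = \begin{bmatrix} -1.9 \\ -1 \end{bmatrix}$$

    x3t = [-1.9; -1];
    0.5 * x3t' * A * x3t + d * x3t
    ans = 0.0500

This is less than $F(\mathbf{x}_2)$. Therefore this step is accepted, the momentum is reset to its original value, and the learning rate is increased.

$$\mathbf{x}_3 = \mathbf{x}_3' = \begin{bmatrix} -1.9 \\ -1 \end{bmatrix}, \quad \alpha = \eta\alpha = 1.5(0.1) = 0.15, \quad \gamma = 0.1$$

This completes the third iteration. The following script runs the algorithm until the gradient is small and plots the trajectory:

    clear
    % Contour plot of F(x) = 5*x1^2 - 6*x1*x2 + 5*x2^2 + 4*x1 + 4*x2
    [x1, x2] = meshgrid(-3:0.1:3);
    z = 5*(x1.^2) - 6*(x1.*x2) + 5*(x2.^2) + 4.*x1 + 4.*x2;
    figure; contour(x1, x2, z); hold on;
    % Algorithm parameters
    alfa = 0.4; gamma_0 = 0.1; eta = 1.5; rou = 0.5; ksi = 0.05;
    gamma = gamma_0;                 % current momentum coefficient
    x0 = [-1 -2.5]'; A = [10 -6; -6 10]; d = [4 4]';
    fold = 0.5*x0'*A*x0 + d'*x0;
    dx = zeros(2,1); xold = x0; g0 = A*xold + d;
    while norm(g0) > 0.001
        dx = gamma*dx - (1-gamma)*alfa*g0;   % tentative step
        xt = xold + dx;
        ft = 0.5*xt'*A*xt + d'*xt;
        if ft - fold > ksi*abs(fold)
            % Increase of more than ksi: reject, shrink the rate, drop momentum
            alfa = alfa*rou; gamma = 0;
        else
            % Accept the step and draw it on the contour plot
            plot([xold(1) xt(1)], [xold(2) xt(2)]);
            if ft <= fold
                % No increase: restore momentum and grow the learning rate
                alfa = alfa*eta; gamma = gamma_0;
            end
            xold = xt; fold = ft;
        end
        g0 = A*xold + d;
    end

[Figure: trajectory of the variable learning rate algorithm on a contour plot of F(x); both axes range from -3 to 3.]

#### E12.5

For the function of Exercise E12.3, perform one iteration of the conjugate gradient algorithm, with initial guess $\mathbf{x}_0 = [-1;\ -2.5]$. For the linear minimization use interval location by function evaluation and interval reduction by the Golden Section search.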
Plot the path of the search on a contour plot of $F(\mathbf{x})$.

ANSWER: Write the function of Exercise E12.3 in an M-file:

    function [a] = F123(x0)
    % F123  Evaluate F(x) of Exercise E12.3 at the column vector x0.
    A = [10 -6; -6 10];
    d = [4 4];
    a = 0.5 * x0' * A * x0 + d * x0;

The gradient of this function is:

$$\nabla F(\mathbf{x}) = A\mathbf{x} + \mathbf{d} = \begin{bmatrix} 10 & -6 \\ -6 & 10 \end{bmatrix}\mathbf{x} + \begin{bmatrix} 4 \\ 4 \end{bmatrix} = \begin{bmatrix} 10x_1 - 6x_2 + 4 \\ -6x_1 + 10x_2 + 4 \end{bmatrix}$$

As with steepest descent, the first search direction for the conjugate gradient algorithm is the negative of the gradient:

$$\mathbf{p}_0 = -\mathbf{g}_0 = -\nabla F(\mathbf{x})\big|_{\mathbf{x}=\mathbf{x}_0} = \begin{bmatrix} -9 \\ 15 \end{bmatrix}$$

For the first iteration we need to minimize $F(\mathbf{x})$ along the line

$$\mathbf{x}_1 = \mathbf{x}_0 + \alpha_0\mathbf{p}_0 = \begin{bmatrix} -1 \\ -2.5 \end{bmatrix} + \alpha_0\begin{bmatrix} -9 \\ 15 \end{bmatrix}$$

The first step is interval location. Assume that the initial step size is $\varepsilon = 0.075$. Then the interval location proceeds as follows. At $a_1 = 0$:

    F123([-1;-2.5])
    ans = 7.2500

At $b_1 = \varepsilon = 0.075$:

    F123([-1;-2.5]+0.075*[-9;15])
    ans = -2.5375

At $b_2 = 2\varepsilon = 0.15$:

    F123([-1;-2.5]+0.15*[-9;15])
    ans = 14.0000

Since the function increases between two consecutive evaluations, we know that the minimum must occur in the interval $[0,\ 0.15]$.

The next step in the linear minimization is interval reduction using the Golden Section search. With $a_1 = 0$, $b_1 = 0.15$, and $\tau = 0.618$, this proceeds as follows:

$$c_1 = a_1 + (1-\tau)(b_1 - a_1) = 0 + 0.382(0.15 - 0) = 0.0573$$

$$d_1 = b_1 - (1-\tau)(b_1 - a_1) = 0.15 - 0.382(0.15 - 0) = 0.0927$$

with $F_a = 7.2500$ and $F_b = 14$. The interior points give:

    Fc = F123([-1;-2.5]+0.0573*[-9;15])
    ans = -2.6009
    Fd = F123([-1;-2.5]+0.0927*[-9;15])
    ans = -1.0079

Since $F_c < F_d$, we have

$$a_2 = a_1 = 0, \quad b_2 = d_1 = 0.0927, \quad d_2 = c_1 = 0.0573$$

$$c_2 = a_2 + (1-\tau)(b_2 - a_2) = 0 + 0.382(0.0927 - 0) = 0.0354$$

with $F_d = F(c_1) = -2.6009$ and

    Fc = F123([-1;-2.5]+0.0354*[-9;15])
    ans = -0.6500

This time $F_c > F_d$, therefore

$$a_3 = c_2 = 0.0354, \quad b_3 = b_2 = 0.0927, \quad c_3 = d_2 = 0.0573$$

$$d_3 = b_3 - (1-\tau)(b_3 - a_3) = 0.0927 - 0.382(0.0927 - 0.0354) = 0.0708$$

This routine continues until $b_{k+1} - a_{k+1} < tol$.
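The interval reduction above can be run to completion with a short loop. The following is a minimal sketch under stated assumptions: the tolerance `tol = 1e-4` and the names `phi`, `dpt`, and `alpha_min` are chosen here for illustration and do not come from the original solution. The constant `tau` is computed exactly rather than rounded to 0.618, so intermediate points may differ slightly from the hand calculation.

```matlab
% Sketch: Golden Section interval reduction along the line x0 + s*p0.
A = [10 -6; -6 10];
d = [4 4];
F = @(x) 0.5 * x' * A * x + d * x;   % same quadratic as F123
x0 = [-1; -2.5];
p0 = -(A * x0 + d');                 % first search direction, -g0
phi = @(s) F(x0 + s * p0);           % F restricted to the search line

tau = (sqrt(5) - 1) / 2;             % approximately 0.618
tol = 1e-4;                          % assumed stopping tolerance
a = 0; b = 0.15;                     % bracket from interval location

c   = a + (1 - tau) * (b - a);  Fc = phi(c);
dpt = b - (1 - tau) * (b - a);  Fd = phi(dpt);   % "d" point, renamed to avoid clashing with d

while (b - a) > tol
    if Fc < Fd
        % Minimum lies in [a, dpt]: drop the right part, reuse c as the new d point.
        b = dpt; dpt = c; Fd = Fc;
        c = a + (1 - tau) * (b - a);  Fc = phi(c);
    else
        % Minimum lies in [c, b]: drop the left part, reuse the d point as the new c.
        a = c; c = dpt; Fc = Fd;
        dpt = b - (1 - tau) * (b - a);  Fd = phi(dpt);
    end
end

alpha_min = (a + b) / 2;             % estimate of the minimizing step size
x1 = x0 + alpha_min * p0;            % point reached after one line search
```

Each pass evaluates the function only once, because one interior point is always carried over from the previous interval.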
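Running the script gives:

    E125
    a1 = 0.0653

[Figure: path of the search on a contour plot of F(x); both axes range from -3 to 3.]

#### E12.6

We want to use the network of Figure E12.2 to approximate the function $g(p) = 1 + \sin(p\pi/4)$ for $-2 \le p \le 2$. The initial network parameters are chosen to be $\mathbf{w}^1(0) = [-0.27;\ -0.41]$, $\mathbf{b}^1(0) = [-0.48;\ -0.13]$, $\mathbf{w}^2(0) = [0.09\ \ {-0.17}]$, $b^2(0) = [0.48]$. To create the training set we sample the function $g(p)$ at the points $p = 1$ and $p = 0$. Find the Jacobian matrix for the first step of the LMBP algorithm. (Some of the information you will need has been computed in the example starting on page 11-14.)

ANSWER: The first step is to propagate the inputs through the network and compute the errors. In the notation below the subscript is the index of the training pair and the superscript is the layer.

For the first input, $\mathbf{a}_1^0 = p_1 = [1]$. From page 11-15:

$$\mathbf{n}_1^1 = \begin{bmatrix} -0.75 \\ -0.54 \end{bmatrix}, \quad \mathbf{a}_1^1 = \begin{bmatrix} 0.321 \\ 0.368 \end{bmatrix}$$

$$n_1^2 = \begin{bmatrix} 0.09 & -0.17 \end{bmatrix}\begin{bmatrix} 0.321 \\ 0.368 \end{bmatrix} + 0.48 = 0.4463, \quad a_1^2 = 0.4463$$

The error would be

$$e_1 = t_1 - a_1^2 = \left(1 + \sin(1 \cdot \pi/4)\right) - a_1^2 = 1.261$$

For the second input, $\mathbf{a}_2^0 = p_2 = [0]$:

$$\mathbf{n}_2^1 = \begin{bmatrix} -0.27 \\ -0.41 \end{bmatrix}[0] + \begin{bmatrix} -0.48 \\ -0.13 \end{bmatrix} = \begin{bmatrix} -0.48 \\ -0.13 \end{bmatrix}, \quad \mathbf{a}_2^1 = \operatorname{logsig}\left(\begin{bmatrix} -0.48 \\ -0.13 \end{bmatrix}\right) = \begin{bmatrix} 0.3823 \\ 0.4675 \end{bmatrix}$$

$$n_2^2 = \begin{bmatrix} 0.09 & -0.17 \end{bmatrix}\begin{bmatrix} 0.3823 \\ 0.4675 \end{bmatrix} + 0.48 = 0.4349, \quad a_2^2 = 0.4349$$

The error would be $e_2 = t_2 - a_2^2 = \left(1 + \sin(0 \cdot \pi/4)\right) - a_2^2 = 0...$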
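The forward pass for both training inputs can be reproduced with a short script. This is a minimal sketch, assuming a 1-2-1 network with a log-sigmoid hidden layer and a linear output layer, and assuming the second-layer weights are `[0.09 -0.17]` as used in the computation above. The helper `logsig_fn` is a stand-in defined here so that the sketch does not depend on the toolbox `logsig`; the names `W1`, `W2`, `P`, and `e` are also illustrative.

```matlab
% Sketch: propagate p = 1 and p = 0 through the network and collect the errors.
logsig_fn = @(n) 1 ./ (1 + exp(-n));   % stand-in for the toolbox logsig

W1 = [-0.27; -0.41];  b1 = [-0.48; -0.13];   % first layer (initial values)
W2 = [0.09 -0.17];    b2 = 0.48;             % second layer; sign of W2(2) assumed
g  = @(p) 1 + sin(p * pi / 4);               % function to approximate

P = [1 0];                    % training inputs
e = zeros(1, numel(P));       % one error per training pair
for q = 1:numel(P)
    a1 = logsig_fn(W1 * P(q) + b1);   % hidden-layer output
    a2 = W2 * a1 + b2;                % linear output layer
    e(q) = g(P(q)) - a2;              % error = target - network output
end
```

The vector `e` then holds the two errors that the LMBP step works from.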

