Module 12 Machine Learning
Lesson 39 Neural Networks - III
12.4.4 Multi-Layer Perceptrons

In contrast to perceptrons, multilayer networks can learn not only multiple decision boundaries, but the boundaries may also be nonlinear. The typical architecture of a multi-layer perceptron (MLP) is shown below.

[Figure: MLP architecture with input nodes, internal (hidden) nodes, and output nodes]

To make nonlinear partitions of the input space, each unit must compute a nonlinear function of its inputs (unlike the perceptron). One solution is to use the sigmoid unit. Another reason for using sigmoids is that, unlike linear threshold units, they are continuous and therefore differentiable at all points.

[Figure: a sigmoid unit with inputs x0 = 1, x1, ..., xn, weights w0, w1, ..., wn, net = Σ_i w_i x_i, and output O = σ(net) = 1 / (1 + e^(−net))]

The function σ is called the sigmoid or logistic function. It has the following property:

dσ(y)/dy = σ(y) (1 − σ(y))

The output of the unit is therefore

O(x1, x2, …, xn) = σ(W · X), where σ(W · X) = 1 / (1 + e^(−W · X))
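The behaviour of a single sigmoid unit can be illustrated with a few lines of code. The following is a minimal sketch, assuming NumPy; the function names sigmoid, sigmoid_derivative, and sigmoid_unit are chosen here for illustration and are not part of the lesson.

```python
import numpy as np

def sigmoid(net):
    """Logistic function: sigma(net) = 1 / (1 + e^(-net))."""
    return 1.0 / (1.0 + np.exp(-net))

def sigmoid_derivative(y):
    """d sigma(y)/dy = sigma(y) * (1 - sigma(y))."""
    s = sigmoid(y)
    return s * (1.0 - s)

def sigmoid_unit(x, w):
    """Output of one unit: O = sigma(W . X), with x[0] = 1 as the bias input."""
    net = np.dot(w, x)          # net = w0*x0 + w1*x1 + ... + wn*xn
    return sigmoid(net)

# Example: a unit with bias input x0 = 1 and two real-valued inputs.
x = np.array([1.0, 0.5, -1.2])   # x0, x1, x2
w = np.array([0.1, 0.4, 0.3])    # w0, w1, w2
print(sigmoid_unit(x, w))
```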
12.4.4.1 Back-Propagation Algorithm

Multi-layer perceptrons can be trained using the back-propagation algorithm described next.

Goal: To learn the weights for all links in an interconnected multilayer network.

We begin by defining our measure of error:

E(W) = ½ Σ_d Σ_k (t_kd − o_kd)²

where k ranges over the output nodes and d over the training examples. The idea is again to use gradient descent over the space of weights to minimize this error; gradient descent may converge to a local minimum, so there is no guarantee of finding the global minimum.

Algorithm:
1. Create a network with n_in input nodes, n_hidden internal nodes, and n_out output nodes.
2. Initialize all weights to small random numbers.
3. Until the error is small, for each example X do:
   - Propagate example X forward through the network.
   - Propagate the errors backward through the network.

Forward Propagation

Given example X, compute the output of every node, layer by layer (input, internal, output), applying the sigmoid function at each unit, until the output nodes are reached. A sketch of this pass is shown below.
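The following is a minimal sketch of forward propagation through a network with one hidden layer, assuming NumPy; the names W_hidden, W_output, forward, and error are illustrative and not taken from the lesson.

```python
import numpy as np

def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))

def forward(x, W_hidden, W_output):
    """Compute the output of every node, layer by layer, until the output nodes."""
    o_hidden = sigmoid(W_hidden @ x)         # outputs of the internal (hidden) nodes
    o_output = sigmoid(W_output @ o_hidden)  # outputs of the output nodes
    return o_hidden, o_output

def error(t, o):
    """Contribution of one example to E(W) = 1/2 * sum_d sum_k (t_kd - o_kd)^2."""
    return 0.5 * np.sum((t - o) ** 2)

# Example: 3 inputs (including the bias x0 = 1), 4 hidden units, 2 outputs.
rng = np.random.default_rng(0)
W_hidden = rng.normal(scale=0.1, size=(4, 3))   # small random initial weights
W_output = rng.normal(scale=0.1, size=(2, 4))
x = np.array([1.0, 0.5, -1.2])
t = np.array([1.0, 0.0])
o_hidden, o_output = forward(x, W_hidden, W_output)
print(error(t, o_output))
```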
Backward Propagation

A. For each output node k compute the error term:
   δ_k = O_k (1 − O_k)(t_k − O_k)

B. For each hidden unit h compute the error term:
   δ_h = O_h (1 − O_h) Σ_k W_kh δ_k

C. Update each network weight:
   W_ji = W_ji + ΔW_ji, where ΔW_ji = η δ_j X_ji
   (X_ji is the input from node i to node j, and W_ji is the corresponding weight.)

A momentum term, proportional to the weight change at the previous iteration, may also be added to ΔW_ji to speed up convergence.
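The following is a minimal sketch of one backward-propagation step for the two-layer network used in the forward-propagation sketch above, assuming NumPy and a learning rate eta; variable names such as backward_step are illustrative. Momentum is omitted to keep the example short.

```python
import numpy as np

def backward_step(x, t, W_hidden, W_output, eta=0.1):
    # Forward pass (same computation as in the previous sketch).
    o_hidden = 1.0 / (1.0 + np.exp(-(W_hidden @ x)))
    o_output = 1.0 / (1.0 + np.exp(-(W_output @ o_hidden)))

    # A. Error term for each output node k: delta_k = O_k (1 - O_k)(t_k - O_k)
    delta_out = o_output * (1.0 - o_output) * (t - o_output)

    # B. Error term for each hidden unit h: delta_h = O_h (1 - O_h) sum_k W_kh delta_k
    #    (computed with the weights before they are updated)
    delta_hidden = o_hidden * (1.0 - o_hidden) * (W_output.T @ delta_out)

    # C. Weight updates: Delta W_ji = eta * delta_j * X_ji
    W_output += eta * np.outer(delta_out, o_hidden)
    W_hidden += eta * np.outer(delta_hidden, x)
    return W_hidden, W_output
```

Repeating this step over all training examples, until the error is small, corresponds to step 3 of the algorithm above.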