Lecture 3: Feedforward neural networks. Backpropagation

COMP652 Lecture 3, September 12, 2007

Outline:
- Network architecture
- Backpropagation algorithm
- Tweaks: avoiding local minima, choosing learning rates, encoding the inputs and outputs

The need for networks

Sigmoid units are very similar to perceptrons, but provide a soft threshold. Their expressive power, however, is the same as that of perceptrons: they are limited to linearly separable instances.

[Figure: two scatter plots of +/- instances in the (x1, x2) plane; (a) a linearly separable data set, (b) a data set that is not linearly separable.]

Example: Logical functions of two variables

[Figure: mean squared error vs. number of epochs. Left: a sigmoid unit learning the AND function, with curves for learning rates 0.1, 0.01, and 0.5. Right: squared error for a sigmoid unit with no hidden layer attempting the XOR function.]

One sigmoid neuron can learn the AND function (left) but not the XOR function (right). In order to learn to discriminate in data sets that are not linearly separable, we need networks of sigmoid units.

Example: A network representing the XOR function

[Figure: a 2-2-1 network with input units 1 and 2, hidden units 3 and 4, and output unit 5, labeled with weights w31, w32, w41, w42, w51, w52 and bias weights w30, w40, w50, together with their learned magnitudes; a table of the hidden outputs o3, o4 and the network output for each input pair; and the learning curve (squared error vs. number of epochs) for the 2-2-1 architecture.]

Feedforward neural networks

A feedforward neural network is a collection of units (neurons) with sigmoid or sigmoid-like activations, arranged in layers. Layer 0 is the input layer; its units just copy the inputs (by convention). The last layer, K, is called the output layer, since its units provide the result of the network. Layers 1, ..., K−1 are usually called hidden layers (their presence cannot be detected from outside the network).

Why this name?

In feedforward networks, the outputs of units in layer k become inputs for units in layers with index greater than k. There are no cross-connections between units in the same layer, and no backward (recurrent) connections from layers downstream. Typically, units in layer k provide input only to units in layer k+1. In fully connected networks, all units in layer k are connected to all units in layer k+1.

Notation

w_{j,i} is the weight on the connection from unit i to unit j. By convention, x_{j,0} = 1 for all j (a constant input whose weight acts as the bias). The output of unit j, denoted o_j, is computed using a sigmoid:

o_j = σ(w_j^T x_j)

where w_j is the vector of weights on the connections entering unit j and x_j is the vector of inputs to unit j. By the definition of the connections, ...
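The single sigmoid unit discussed above can be sketched in code. This is a minimal sketch with hand-picked weights for AND; the lecture obtains such weights by training, so the exact values here are illustrative assumptions, not from the slides:

```python
import math

def sigmoid(z):
    # Soft threshold: maps any real input into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

# Hand-picked (not learned) weights for the AND function.
# w0 is the weight on the constant input x0 = 1, matching the
# convention x_{j,0} = 1 from the notes.
w0, w1, w2 = -30.0, 20.0, 20.0

def and_unit(x1, x2):
    # Single sigmoid unit: output = sigmoid(w . x) with the bias folded in.
    return sigmoid(w0 * 1.0 + w1 * x1 + w2 * x2)
```

The output is close to 1 only when both inputs are 1. No single-unit weight setting can do the same for XOR, which is exactly why a hidden layer is needed.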
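The 2-2-1 XOR network from the slide can be sketched as a forward pass. The slide's learned weight values lost their signs in extraction, so the weights below are hand-constructed stand-ins (hidden unit 3 acts like OR, hidden unit 4 like NAND, and output unit 5 ANDs them), not the lecture's learned values:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative weights, indexed (bias, weight on first input, weight on second):
W = {
    3: (-10.0,  20.0,  20.0),   # hidden unit 3 ~ OR(x1, x2)
    4: ( 30.0, -20.0, -20.0),   # hidden unit 4 ~ NAND(x1, x2)
    5: (-30.0,  20.0,  20.0),   # output unit 5 ~ AND(o3, o4)
}

def xor_net(x1, x2):
    # Forward pass: hidden layer first, then the output unit,
    # each computing o_j = sigmoid(w_j . x_j).
    o3 = sigmoid(W[3][0] + W[3][1] * x1 + W[3][2] * x2)
    o4 = sigmoid(W[4][0] + W[4][1] * x1 + W[4][2] * x2)
    o5 = sigmoid(W[5][0] + W[5][1] * o3 + W[5][2] * o4)
    return o5
```

As in the slide's output table, the network's output is near 0 for inputs (0,0) and (1,1) and near 1 for (0,1) and (1,0).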
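The notation o_j = σ(w_j^T x_j), together with the fully connected layer structure, suggests a generic forward-pass sketch. The function name `forward` and the list-of-weight-vectors representation are assumptions for illustration, not the lecture's notation:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(layers, x):
    """Forward pass through a fully connected feedforward network.

    `layers` is a list (one entry per non-input layer) of weight vectors;
    layers[k][j] holds the weights for unit j of that layer, bias weight
    first, matching the convention x_{j,0} = 1. Each unit computes
    o_j = sigmoid(w_j . x_j).
    """
    o = list(x)  # layer 0 just copies the inputs
    for weights in layers:
        inputs = [1.0] + o  # prepend the constant input x_{j,0} = 1
        o = [sigmoid(sum(w_i * x_i for w_i, x_i in zip(w_j, inputs)))
             for w_j in weights]
    return o
```

With XOR-style weights (two hidden units, one output unit), `forward` reproduces the 2-2-1 network's behavior; the same function handles any number of fully connected layers.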
COMP652, Fall '07. Instructor: Precup.