ml-lecture03

ml-lecture03 - Lecture 3: Feed-forward neural networks. Backpropagation


Lecture 3: Feed-forward neural networks. Backpropagation
COMP-652, September 12, 2007

Outline:
- Network architecture
- Backpropagation algorithm
- Tweaks: avoiding local minima, choosing learning rates, encoding the inputs and outputs

The need for networks
- Sigmoid units are very similar to perceptrons, but provide a soft threshold.
- Their expressive power, however, is the same as that of perceptrons: limited to linearly separable instances.
- (Figure: (a) a set of + and - examples in the x1-x2 plane that a single line can separate; (b) an XOR-like arrangement of + and - examples that no single line can separate.)

Example: Logical functions of two variables
- (Figures: mean squared error vs. number of epochs for a sigmoid unit trained on the AND function with learning rates 0.1, 0.01 and 0.5 (left); squared error vs. number of epochs for a sigmoid unit with no hidden layer trained on XOR (right).)
- One sigmoid neuron can learn the AND function (left) but not the XOR function (right); a code sketch reproducing this is given at the end of these notes.
- In order to learn discriminations in data sets that are not linearly separable, we need networks of sigmoid units.

Example: A network representing the XOR function
- (Figure: a 2-2-1 network with input units 1 and 2, hidden units 3 and 4, and output unit 5. The learned weights w30, w31, w32, w40, w41, w42, w50, w51, w52 are shown on the connections, together with a table of the activations o3, o4 and the network output for each input pair: the output is close to 0 for inputs (0,0) and (1,1) and close to 1 for (0,1) and (1,0). A second plot shows the learning curve, squared error vs. number of epochs, for XOR with the 2-2-1 architecture.)
- A forward-pass sketch through such a network is given at the end of these notes.

Feed-forward neural networks
- A collection of units (neurons) with sigmoid or sigmoid-like activations, arranged in layers.
- Layer 0 is the input layer; its units just copy the inputs (by convention).
- The last layer, K, is called the output layer, since its units provide the result of the network.
- Layers 1, ..., K-1 are usually called hidden layers (their presence cannot be detected from outside the network).

Why this name?
- In feed-forward networks, the outputs of units in layer k become inputs for units in layers with index greater than k.
- There are no cross-connections between units in the same layer.
- There are no backward (recurrent) connections from layers downstream.
- Typically, units in layer k provide input only to units in layer k+1.
- In fully connected networks, all units in layer k are connected to all units in layer k+1.

Notation
- w_{j,i} is the weight on the connection from unit i to unit j.
- By convention, the bias input x_{j,0} = 1 for all j.
- The output of unit j, denoted o_j, is computed using a sigmoid: o_j = σ(w_j^T x_j), where w_j is the vector of weights on the connections entering unit j and x_j is the vector of inputs to unit j. (A layered forward pass implementing this is sketched at the end of these notes.)
- By the definition of the connections, ...
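The first sketch below is not from the lecture: it is a minimal illustration of the AND vs. XOR point above, training a single sigmoid unit by batch gradient descent on mean squared error. The function name train_sigmoid_unit, the learning rate, and the epoch count are assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_sigmoid_unit(X, y, lr=0.5, epochs=5000):
    """Batch gradient descent on mean squared error for a single sigmoid unit."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])      # prepend the bias input x_{j,0} = 1
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        o = sigmoid(Xb @ w)                             # unit output for every training example
        grad = Xb.T @ ((o - y) * o * (1 - o)) / len(y)  # gradient of MSE (up to a constant factor)
        w -= lr * grad
    return w

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Xb = np.hstack([np.ones((4, 1)), X])

w_and = train_sigmoid_unit(X, np.array([0., 0., 0., 1.]))
w_xor = train_sigmoid_unit(X, np.array([0., 1., 1., 0.]))
print(np.round(sigmoid(Xb @ w_and), 2))  # moves toward [0, 0, 0, 1]: AND is learnable
print(np.round(sigmoid(Xb @ w_xor), 2))  # stays around 0.5: XOR is not linearly separable
```

With zero initial weights the XOR gradients cancel exactly and the unit's outputs stay at 0.5, consistent with the plateauing XOR error curve described for the right-hand plot.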
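The extracted text does not make clear which of the slide's learned weight values belongs to which connection of the 2-2-1 XOR network, so this sketch uses hand-picked weights of my own (an assumption, not the lecture's values) purely to show how a forward pass through units 3, 4 and 5 produces the XOR truth table.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hand-picked weights (w_{j,0} is the bias), not the values from the slide:
# hidden unit 3 behaves like OR(x1, x2), hidden unit 4 like AND(x1, x2),
# and output unit 5 like (o3 AND NOT o4), which is XOR.
w3 = np.array([-5.0, 10.0, 10.0])    # w30, w31, w32
w4 = np.array([-15.0, 10.0, 10.0])   # w40, w41, w42
w5 = np.array([-5.0, 10.0, -10.0])   # w50, w53, w54

def forward(x1, x2):
    x = np.array([1.0, x1, x2])                 # bias input x_{j,0} = 1 by convention
    o3 = sigmoid(w3 @ x)
    o4 = sigmoid(w4 @ x)
    o5 = sigmoid(w5 @ np.array([1.0, o3, o4]))  # hidden outputs feed the output unit
    return o3, o4, o5

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    o3, o4, o5 = forward(x1, x2)
    print(f"{x1} {x2} -> o3={o3:.3f} o4={o4:.3f} output={o5:.3f}")
```

For the four inputs this prints outputs near 0.01, 0.99, 0.99, 0.01, matching the near-0/near-1 pattern reported in the slide's table.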
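Finally, a minimal sketch (again with names of my own choosing) of the forward pass through a fully connected feed-forward network, tying together the notation above: layer 0 copies the inputs, each subsequent layer receives the previous layer's outputs prefixed with the constant bias input 1, and every unit j computes o_j = σ(w_j^T x_j).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_pass(weights, x):
    """weights[k] is a matrix whose j-th row is w_j for unit j of layer k+1;
    column 0 holds the bias weight w_{j,0}. Layer 0 just copies the inputs."""
    o = np.asarray(x, dtype=float)
    for W in weights:
        x_j = np.concatenate(([1.0], o))   # inputs to the next layer, with x_{j,0} = 1
        o = sigmoid(W @ x_j)               # o_j = sigma(w_j^T x_j) for every unit j at once
    return o

# Example: a fully connected 2-3-1 architecture with random weights.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(3, 3)),   # 3 hidden units, each with a bias and 2 input weights
           rng.normal(size=(1, 4))]   # 1 output unit, with a bias and 3 hidden-unit weights
print(forward_pass(weights, [0.0, 1.0]))
```

Stacking the per-unit dot products into one matrix-vector product per layer is only a convenience: it computes exactly the per-unit o_j = σ(w_j^T x_j) described in the notation above.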