# NN_perceptron_1 - Single Layer Neural Network Xingquan(Hill...

Single Layer Neural Network Xingquan (Hill) Zhu

Outline
• Perceptron for classification
• Perceptron training rule
• Why does the perceptron training rule work?
• Gradient descent learning rule
• Incremental stochastic gradient descent
• Delta rule (Adaline: Adaptive Linear Element)

Perceptron: architecture
We consider a feed-forward NN with one layer. It is sufficient to study single-layer perceptrons with just one neuron.

Single layer perceptrons
Generalization to single-layer perceptrons with more neurons is easy because:
• The output units are independent of each other.
• Each weight affects only one of the outputs.
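The independence of the output units can be illustrated with a small sketch. The weight layout (one row per output neuron, bias stored first), the example values, and the names `sign` and `layer_output` are assumptions made for illustration; they do not come from the slides.

```python
def sign(v):
    # Assumed convention: sign(0) is treated as +1.
    return 1 if v >= 0 else -1

def layer_output(weights, x):
    """Outputs of a single layer; weights[j] = [w0, w1, ..., wn] for neuron j."""
    return [
        sign(row[0] + sum(w * xi for w, xi in zip(row[1:], x)))
        for row in weights
    ]

# Hypothetical weights for two output neurons over two inputs.
weights = [
    [-1.0, 1.0, 1.0],  # neuron 0: w0, w1, w2
    [1.0, 1.0, 1.0],   # neuron 1: w0, w1, w2
]
x = [1, -1]

before = layer_output(weights, x)
weights[0][1] = 5.0  # change one weight that belongs to neuron 0 only
after = layer_output(weights, x)
```

Only the first output changes between `before` and `after`, because the modified weight feeds neuron 0 alone. This is why each neuron can be studied and trained on its own.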
Perceptron: Neuron Model
The (McCulloch-Pitts) perceptron is a single-layer NN with a non-linear activation function ϕ, the sign function.
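A minimal sketch of this neuron model follows: a weighted sum of the inputs plus a bias, passed through the sign function. The function names, the example weights, and the choice to map a sum of exactly 0 to +1 are assumptions; the slides do not specify them.

```python
def sign(v):
    # Assumed convention: sign(0) is treated as +1.
    return 1 if v >= 0 else -1

def perceptron_output(w, w0, x):
    """Return +1 or -1 for input x, given weights w and bias w0."""
    v = w0 + sum(wi * xi for wi, xi in zip(w, x))  # weighted sum
    return sign(v)                                  # non-linearity

# Hypothetical weights, chosen only to exercise both outputs.
y_pos = perceptron_output([0.5, -0.5], 0.1, [1, 1])
y_neg = perceptron_output([0.5, -0.5], 0.1, [-1, 1])
```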

Perceptron for Classification
The perceptron is used for binary classification.
• Given training examples of classes $C_1$ and $C_2$, train the perceptron so that it classifies the training examples correctly:
  If the output of the perceptron is +1, the input is assigned to class $C_1$.
  If the output is -1, the input is assigned to class $C_2$.
Perceptron Training
How can we train a perceptron for a classification task? We try to find values for the weights such that the training examples are correctly classified. Geometrically, we try to find a hyperplane that separates the examples of the two classes.

Perceptron Geometric View
The equation below describes a (hyper-)plane in the input space consisting of real-valued 2D vectors. The plane splits the input space into two regions, each of them describing one class.

$$\sum_{i=1}^{2} w_i x_i + w_0 = 0$$

[Figure: the $x_1$-$x_2$ plane, split by the decision boundary $w_1 x_1 + w_2 x_2 + w_0 = 0$ into the decision region for $C_1$ and the decision region for $C_2$.]
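The geometric view can be sketched in code by checking on which side of the boundary a point falls. The function name `decision_region` and the example weights are hypothetical. The positive side is labeled $C_1$ to match the convention that an output of +1 means class $C_1$.

```python
def decision_region(w1, w2, w0, x1, x2):
    """Report which side of the line w1*x1 + w2*x2 + w0 = 0 a point lies on."""
    v = w1 * x1 + w2 * x2 + w0
    if v > 0:
        return "C1"
    if v < 0:
        return "C2"
    return "boundary"

# Hypothetical weights: the boundary is the line x1 = x2.
w1, w2, w0 = 1.0, -1.0, 0.0
below_diagonal = decision_region(w1, w2, w0, 2.0, 1.0)
above_diagonal = decision_region(w1, w2, w0, 1.0, 2.0)
on_line = decision_region(w1, w2, w0, 1.5, 1.5)
```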
Example: AND
Here is a representation of the AND function.
• For the output, white means false and black means true.
• For the input, -1 means false and +1 means true.

-1 AND -1 = false
-1 AND +1 = false
+1 AND -1 = false
+1 AND +1 = true

Example: AND continued
A linear decision surface separates the false instances from the true instances.
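One such separating line can be checked directly. The weights below ($w_1 = w_2 = 1$, $w_0 = -1$) are just one of many possible choices and are not taken from the slides; the output encoding (-1 for false, +1 for true) is also an assumption that mirrors the input encoding.

```python
def sign(v):
    # Assumed convention: sign(0) is treated as +1.
    return 1 if v >= 0 else -1

# AND truth table: inputs and target, with -1 = false and +1 = true.
AND_EXAMPLES = [
    ((-1, -1), -1),
    ((-1, +1), -1),
    ((+1, -1), -1),
    ((+1, +1), +1),
]

# Hypothetical hand-picked weights: boundary x1 + x2 - 1 = 0.
w1, w2, w0 = 1, 1, -1

outputs = [sign(w1 * x1 + w2 * x2 + w0) for (x1, x2), _ in AND_EXAMPLES]
targets = [t for _, t in AND_EXAMPLES]
```

The weighted sums for the four inputs are -3, -1, -1, and 1, so only (+1, +1) lands on the positive side of the line.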
Example: AND continued
Watch a perceptron learn the AND function.
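In place of the animation, the learning process can be sketched as follows. This assumes the standard perceptron training rule $w_i \leftarrow w_i + \eta\,(t - o)\,x_i$, which the outline names but which has not been defined at this point in the slides. The zero initial weights, the learning rate, the epoch limit, and all function names are likewise assumptions.

```python
def sign(v):
    # Assumed convention: sign(0) is treated as +1.
    return 1 if v >= 0 else -1

def predict(w, w0, x):
    return sign(w0 + sum(wi * xi for wi, xi in zip(w, x)))

def train_perceptron(examples, eta=0.5, max_epochs=100):
    """Return (w, w0, epochs); epochs is None if no error-free pass occurred."""
    w = [0.0, 0.0]
    w0 = 0.0
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for x, t in examples:
            o = predict(w, w0, x)
            if o != t:
                errors += 1
                # Move the weights toward the target for this example.
                w = [wi + eta * (t - o) * xi for wi, xi in zip(w, x)]
                w0 += eta * (t - o)
        if errors == 0:
            return w, w0, epoch
    return w, w0, None

AND_EXAMPLES = [((-1, -1), -1), ((-1, 1), -1), ((1, -1), -1), ((1, 1), 1)]
w, w0, epochs = train_perceptron(AND_EXAMPLES)
predictions = [predict(w, w0, x) for x, _ in AND_EXAMPLES]
```

Because AND is linearly separable, training stops after a pass with no misclassified examples.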

Example: XOR
Here is the XOR function:

-1 XOR -1 = false
-1 XOR +1 = true
+1 XOR -1 = true
+1 XOR +1 = false

Perceptrons cannot learn such linearly inseparable functions.
Example: XOR continued
Watch a perceptron try to learn XOR.
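The failure on XOR can be reproduced with the same kind of sketch. As before, the update rule $w_i \leftarrow w_i + \eta\,(t - o)\,x_i$, the initial weights, the learning rate, and the epoch limit are assumptions rather than details given in the slides.

```python
def sign(v):
    # Assumed convention: sign(0) is treated as +1.
    return 1 if v >= 0 else -1

def predict(w, w0, x):
    return sign(w0 + sum(wi * xi for wi, xi in zip(w, x)))

def train_perceptron(examples, eta=0.5, max_epochs=100):
    """Return (w, w0, epochs); epochs is None if no error-free pass occurred."""
    w = [0.0, 0.0]
    w0 = 0.0
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for x, t in examples:
            o = predict(w, w0, x)
            if o != t:
                errors += 1
                w = [wi + eta * (t - o) * xi for wi, xi in zip(w, x)]
                w0 += eta * (t - o)
        if errors == 0:
            return w, w0, epoch
    return w, w0, None

XOR_EXAMPLES = [((-1, -1), -1), ((-1, 1), 1), ((1, -1), 1), ((1, 1), -1)]
w, w0, epochs = train_perceptron(XOR_EXAMPLES)

# Count the examples that the final weights still get wrong.
misclassified = sum(1 for x, t in XOR_EXAMPLES if predict(w, w0, x) != t)
```

No line separates the two XOR classes, so every pass over the data contains at least one error and `epochs` stays `None` however large `max_epochs` is made.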

Example

-1 -1 -1 -1 -1 -1 -1 -1
-1 -1 +1 +1 +1 +1 -1 -1
-1 -1 -1 -1 -1 +1 -1 -1
-1 -1 -1 +1 +1 +1 -1 -1
-1 -1 -1 -1 -1 +1 -1 -1
-1 -1 -1 -1 -1 +1 -1 -1
-1 -1 +1 +1 +1 +1 -1 -1
-1 -1 -1 -1 -1 -1 -1 -1
Example
How to train a perceptron to recognize this 3? Assign -1 to the weights of input values that are equal to -1, +1 to the weights of input values that are equal to +1, and -63 to the bias. Each of the 64 inputs of a "perfect" 3 then contributes +1, so the weighted sum is 64 - 63 = 1. Every pixel that differs lowers the sum by 2, so the sum is at most -1 for all other patterns. The perceptron therefore outputs +1 only for the perfect 3.
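The arithmetic can be checked with a short sketch. The grid is written with `#` for +1 and `.` for -1 purely for readability; that encoding and the helper names are assumptions.

```python
# The "perfect" 3 as an 8x8 grid ('#' = +1, '.' = -1; assumed encoding).
PERFECT_3 = [
    "........",
    "..####..",
    ".....#..",
    "...###..",
    ".....#..",
    ".....#..",
    "..####..",
    "........",
]

def to_inputs(rows):
    """Flatten the grid into a list of 64 values in {-1, +1}."""
    return [1 if ch == "#" else -1 for row in rows for ch in row]

def weighted_sum(weights, bias, x):
    return bias + sum(w * xi for w, xi in zip(weights, x))

x_perfect = to_inputs(PERFECT_3)
weights = list(x_perfect)  # weight i equals input i of the perfect 3
bias = -63

v_perfect = weighted_sum(weights, bias, x_perfect)  # 64 - 63

# Flip one pixel: one term changes from +1 to -1, so the sum drops by 2.
x_flipped = list(x_perfect)
x_flipped[0] = -x_flipped[0]
v_flipped = weighted_sum(weights, bias, x_flipped)  # 62 - 63
```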

Example

-1 -1 -1 -1 -1 -1 -1 -1
-1 -1 +1 +1 +1 +1 -1 -1
-1 -1 -1 -1 -1 +1 -1 -1
-1 -1 -1 +1 +1 +1 -1 -1
-1 +1 -1 -1 -1 +1 -1 -1
-1 -1 -1 -1 -1 +1 -1 -1
-1 -1 +1 +1 +1 +1 -1 -1
-1 -1 -1 -1 -1 -1 -1 -1
Example
What if a slightly different 3 is to be recognized, like the one in the previous slide?
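A sketch of what the hand-set weights do with the slightly different 3 is given below. The `#`/`.` grid encoding and the helper names are assumptions. The relaxed bias of -61 at the end is only one hypothetical remedy, shown for illustration; it is not necessarily the answer the slides go on to give.

```python
# '#' = +1, '.' = -1 (assumed encoding for readability).
PERFECT_3 = [
    "........",
    "..####..",
    ".....#..",
    "...###..",
    ".....#..",
    ".....#..",
    "..####..",
    "........",
]
# Same pattern with one extra pixel set in the fifth row.
NOISY_3 = [
    "........",
    "..####..",
    ".....#..",
    "...###..",
    ".#...#..",
    ".....#..",
    "..####..",
    "........",
]

def to_inputs(rows):
    return [1 if ch == "#" else -1 for row in rows for ch in row]

def weighted_sum(weights, bias, x):
    return bias + sum(w * xi for w, xi in zip(weights, x))

weights = to_inputs(PERFECT_3)  # hand-set weights from the perfect 3
x_noisy = to_inputs(NOISY_3)

differing = sum(1 for a, b in zip(weights, x_noisy) if a != b)
v_strict = weighted_sum(weights, -63, x_noisy)   # 62 - 63: rejected
v_relaxed = weighted_sum(weights, -61, x_noisy)  # 62 - 61: accepted
```

With the original bias the noisy 3 is rejected even though it differs in a single pixel, which shows how rigid the hand-set template is.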

