Lecture 7: Artificial Neural Networks

- Introduction, or how the brain works
- The neuron as a simple computing element
- The perceptron
- Summary

Introduction
IBM's supercomputer Deep Blue was capable of analyzing 200 million positions per second, and it appeared to be displaying thought. Chess-playing programs must be able to improve their performance with experience; in other words, a machine must be capable of learning.

Introduction, or how the brain works
Machine learning involves adaptive mechanisms that enable computers to learn from experience, learn by example, and learn by analogy. Learning capabilities can improve the performance of an intelligent system over time. The most popular approaches to machine learning are artificial neural networks and genetic algorithms. This lecture is dedicated to neural networks.
A neural network can be defined as a model of reasoning based on the human brain. The brain consists of a densely interconnected set of nerve cells, or basic information-processing units, called neurons. The human brain incorporates nearly 10 billion neurons and 60 trillion connections, called synapses, between them. By using multiple neurons simultaneously, the brain can perform its functions much faster than the fastest computers in existence today.
Each neuron has a very simple structure, but an army of such elements constitutes tremendous processing power. A neuron consists of a cell body (the soma), a number of fibers called dendrites, and a single long fiber called the axon.

Biological neural network
[Figure: a biological neuron, showing the soma, dendrites, axon, and synapses.]

Signals are propagated from one neuron to another by electrochemical reactions. Chemical substances released from the synapses cause a change in the electrical potential of the cell body. When the potential reaches its threshold, an electrical pulse (action potential) is sent down through the axon. The pulse spreads out and eventually reaches synapses, causing them to increase or decrease their potential.

Our brain can be considered a highly complex, nonlinear, parallel information-processing system. Learning is a fundamental and essential characteristic of biological neural networks. The ease with which they can learn led to attempts to emulate a biological neural network in a computer.

How do artificial nets model the brain?

An artificial neural network consists of a number of very simple processors, also called neurons, which are analogous to the biological neurons in the brain. The neurons are connected by weighted links passing signals from one neuron to another. The output signal is transmitted through the neuron's outgoing connection. The outgoing connection splits into a number of branches that transmit the same signal. The outgoing branches terminate at the incoming connections of other neurons in the network.

How does an artificial neural network learn?

The neurons are connected by links, and each link has a numerical weight associated with it. Weights express the strength, or importance, of each neuron input. A neural network learns through repeated adjustments of these weights.

Architecture of a typical artificial neural network

[Figure: a feedforward network with an input layer, a middle layer, and an output layer; input signals enter on the left and output signals leave on the right.]
Analogy between biological and artificial neural networks
Biological Neural Network | Artificial Neural Network
Soma                      | Neuron
Dendrite                  | Input
Axon                      | Output
Synapse                   | Weight

To build an artificial neural network:
- First, decide how many neurons are to be used and how they are connected (the architecture).
- Then decide which learning algorithm to use.
- Finally, train the neural network: initialize the weights of the network and update them from the set of training examples.

The weights are modified to bring the network's input/output behavior into line with that of the environment.

Artificial Neural Networks
[Figure: model of an artificial neuron. Input signals x1, x2, ..., xn arrive over links with weights w1, w2, ..., wn; the neuron forms the net input X = x1 w1 + x2 w2 + ... + xn wn, applies the threshold θ, and sends the output signal Y along each outgoing branch.]

How does the neuron determine its output?

The neuron computes the weighted sum of the input signals and compares the result with a threshold value θ. If the net input is less than the threshold, the neuron output is −1; if the net input is greater than or equal to the threshold, the neuron becomes activated and its output attains the value +1.

Activation function

The neuron uses the following transfer, or activation, function:
    X = Σ_{i=1}^{n} x_i w_i

    Y = +1 if X ≥ θ
        −1 if X < θ

This type of activation function is called a sign function.
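The sign-activation neuron just defined, along with the other activation functions introduced next in the lecture, can be sketched in code (a rough sketch; the function names and the example weights below are mine, not from the slides):

```python
import math

def sign(x):
    # +1 if x >= 0, -1 otherwise (the sign function above)
    return 1 if x >= 0 else -1

def step(x):
    # 1 if x >= 0, 0 otherwise
    return 1 if x >= 0 else 0

def sigmoid(x):
    # 1 / (1 + e^(-x))
    return 1.0 / (1.0 + math.exp(-x))

def linear(x):
    return x

def neuron(inputs, weights, theta):
    """Weighted sum X = sum(x_i * w_i), thresholded by the sign function."""
    X = sum(x * w for x, w in zip(inputs, weights))
    return sign(X - theta)

print(neuron([1, 1], [0.5, -0.3], theta=0.1))  # 1: net input 0.2 >= 0.1
print(neuron([0, 1], [0.5, -0.3], theta=0.1))  # -1: net input -0.3 < 0.1
```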
Activation functions of a neuron

[Figure: graphs of the step, sign, sigmoid, and linear activation functions.]

    Y_step    = 1 if X ≥ 0;  0 if X < 0
    Y_sign    = +1 if X ≥ 0; −1 if X < 0
    Y_sigmoid = 1 / (1 + e^(−X))
    Y_linear  = X

Step and sign functions are used for decision making and pattern recognition; the sigmoid function for back-propagation networks; the linear function for linear approximation.

Can a single neuron learn a task?

In 1958, Frank Rosenblatt introduced a training algorithm that provided the first procedure for training a simple ANN: a perceptron. The perceptron is the simplest form of a neural network. It consists of a single neuron with adjustable synaptic weights and a hard limiter.

Single-layer two-input perceptron
[Figure: single-layer two-input perceptron — inputs x1 and x2, with weights w1 and w2, feed a linear combiner; a hard limiter with threshold θ produces the output Y.]

The Perceptron

The operation of Rosenblatt's perceptron is based on the McCulloch and Pitts neuron model. The model consists of a linear combiner followed by a hard limiter. The weighted sum of the inputs is applied to the hard limiter, which produces an output equal to +1 if its input is positive and −1 if it is negative.

The aim of the perceptron is to classify inputs x1, x2, ..., xn into one of two classes, say A1 and A2. In the case of an elementary perceptron, the n-dimensional space is divided by a hyperplane into two decision regions. The hyperplane is defined by the linearly separable function:

    Σ_{i=1}^{n} x_i w_i − θ = 0

Linear separability in the perceptrons
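As a concrete illustration of such a separating hyperplane, a point can be assigned to class A1 or A2 according to which side of the surface Σ x_i w_i − θ = 0 it lies on. The weights and threshold below are hypothetical example values, not taken from the slides:

```python
def classify(x, w, theta):
    # A1 if the point lies on or above the hyperplane sum(x_i * w_i) - theta = 0
    return "A1" if sum(xi * wi for xi, wi in zip(x, w)) - theta >= 0 else "A2"

# Hypothetical weights: the boundary x1 + x2 - 1.5 = 0 isolates the point (1, 1)
w, theta = [1.0, 1.0], 1.5
print(classify([1, 1], w, theta))  # A1
print(classify([0, 1], w, theta))  # A2
```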
[Figure: (a) two-input perceptron — the line x1 w1 + x2 w2 − θ = 0 separates class A1 from class A2 in the (x1, x2) plane; (b) three-input perceptron — the plane x1 w1 + x2 w2 + x3 w3 − θ = 0 divides the (x1, x2, x3) space.]

Perceptrons

How does a perceptron learn?
- A perceptron has initial (often random) weights, typically in the range [−0.5, 0.5].
- Apply an established training data set.
- Calculate the error as expected output minus actual output: e = Y_expected − Y_actual. If the error e(p) is positive, we need to increase perceptron output Y(p); if it is negative, we need to decrease Y(p).
- Adjust the weights to reduce the error.

Perceptrons

How do we adjust a perceptron's weights to produce Y_expected?

- If e is positive, we need to increase Y_actual (and vice versa).
- Use this formula: w_i = w_i + Δw_i, where Δw_i = α · x_i · e, and
  - α is the learning rate (a value between 0 and 1)
  - e is the calculated error
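A one-line sketch of this update rule, applied to a single correction step; the sample values mirror the AND example used later in this lecture:

```python
def delta_update(weights, inputs, error, alpha):
    """Delta rule: w_i <- w_i + alpha * x_i * e."""
    return [w + alpha * x * error for w, x in zip(weights, inputs)]

# Error e = +1 on input (1, 1) with alpha = 0.1
new_w = delta_update([0.2, -0.1], [1, 1], error=1, alpha=0.1)
# new_w is approximately [0.3, 0.0]
```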
23 Perceptron Example – Perceptron AND A ND Use threshold Θ = 0.2 and learning rate α = 0.1 Train a perceptron to recognize logical Train AND A ND
Epoch 1 Inputs x1 0 0 1 1 0 0 1 1 0 0 x2 0 1 0 1 0 1 0 1 0 1 Desired output Yd 0 0 0 1 0 0 0 1 0 0 Initial weights w1 w2 0. 3 0. 3 0. 3 0. 2 0. 3 0. 3 0. 3 0. 2 0. 2 0. 2 − 0. 1 − 0. 1 − 0. 1 − 0. 1 0.0 0.0 0.0 0.0 0.0 0.0 Actual output Y 0 0 1 0 0 0 1 1 0 0 Error e 0 0 −1 1 0 0 −1 0 0 0 Final weights w1 w2 0. 3 0. 3 0. 2 0. 3 0. 3 0. 3 0. 2 0. 2 0. 2 0. 2 − 0. 1 − 0. 1 − 0. 1 0.0 0.0 0.0 0.0 0.0 0.0 0.0
24 2 3 Epoch 1 Perceptron Example – Perceptron AND A ND
Inputs x1 0 0 1 1 x2 0 1 0 1 Desired output Yd 0 0 0 1 Initial weights w1 w2 0. 3 0. 3 0. 3 0. 2 − 0. 1 − 0. 1 − 0. 1 − 0. 1 0 0 1 0 0 −1 1 0. 3 0. 2 0. 3 0 Desired 0 output 1 Yd 0 0 0 1 0 0 0 1 0 0 0. 3 itia0.0 In l 0.weights 0 3 0. 0. 2 0. w w0
1 2 Actual Error Final output weights Use threshold Θ = 0.2 and Y e w1 w2 learning rate α = 0.1 0 0. 3 0. 1
− − 0. 1 − 0. 1 0.0 Train a perceptron to recognize logical Train AND 0 0 0 0.3 0.0 0 A ND 2 0 0. 3 0.0
0 1 Inputs 0 Epoch 1 x1 x 2 1 1 1 3 0 0 1 1 0 0 1 1 0 0 0 1 0 1 0 1 0 1 0 1 Ac0 al tu 1 output 1 Y 0 0 1 0 0 0 1 1 0 0 E0 r rro 1 − 0 e 0 0 −1 − 1 0 0 − −1 0 0 0 0. 3Final0.0 0.weights 0 2 0. 0. 2 0. w w0
1 2 0. 3 2 0. 3 2 0. 3 2 0. 2 1 0. 3 2 0. 3 2 0. 3 2 0. 2 1 0.1 2 2 0. 1 1 − 0.0 1 − 0.0 1 − 0.0 1 − 0.0 0.0 1 0.0 1 0.0 1 0.0 1 0.0 1 0.0 1 0. 3 2 0. 3 2 0. 2 1 0. 3 2 3 0. 2 3 0. 2 2 0. 1 2 0. 1 0.1 2 2 0. 1 1 − 0.0 1 − 0.0 1 − 0.0 0.0 1 0.0 1 0.0 1 0.0 1 0.0 1 0.0 1 0.0 1
25 2 4 3 5 Perceptron Example – AND A ND
2 0 0 1 1 0 0 1 1 0 1 0 1 0 0 0 1 0. 3 0. 3 0. 3 0. 2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 1 1 0 0 −1 0 0. 3 0. 3 0. 2 0. 2 0 1 1 1 0 1 0 0 1 0. 3 0. 3 0. 2 − 0. 1 − 0. 1 − 0. 1 0 1 0 0 −1 1 0. 3 0. 2 0. 3 − 0. 1 − 0. 1 0.0 0.0 0.0 0.0 0.0 Repeat until convergence
0 0 0 1 Inputs 0 Epoch 1 1 x1 x 2 1 1 5 0 0 1 1 0 1 0 1 4 0 0 Desired 0 output 1 Yd 0 0 0 1 0. 2 0.1 0. 2 itia0.1 In l 0.weights 1 2 0. 0. 1 0.1 w1 w2 0. 3 1 1 0. 3 0. 3 1 0. 2 1 − 0.1 − 0.1 − 0. 1 − 0.1 =0.01 0. 0.0 0.0 0.0 0.0 3 – ii.e. final weights do not change and no error .e. no
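The epoch-1 rows above can be reproduced with a short script (a sketch; the variable names are mine):

```python
# Parameters from the example: theta = 0.2, alpha = 0.1,
# initial weights w1 = 0.3, w2 = -0.1
w1, w2, theta, alpha = 0.3, -0.1, 0.2, 0.1
data = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1)]  # (x1, x2, desired)

for x1, x2, yd in data:
    y = 1 if x1 * w1 + x2 * w2 >= theta else 0  # hard limiter with threshold
    e = yd - y                                  # error
    w1 += alpha * x1 * e                        # delta-rule update
    w2 += alpha * x2 * e
    print(x1, x2, yd, y, e, round(w1, 1), round(w2, 1))
# Epoch 1 ends with weights (0.3, 0.0), matching the table's last row
```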
Perceptron Example – AND

Two-dimensional plot of the logical AND operation:

[Figure: the AND points in the (x1, x2) plane; a straight line separates the single output-1 point (1, 1) from the three output-0 points.]

A single perceptron can be trained to recognize any linearly separable function.

- Can we train a perceptron to recognize logical OR?
- How about logical exclusive-OR (i.e., XOR)?

Perceptron – OR and XOR

Two-dimensional plots of logical OR and XOR:
[Figure: (b) OR (x1 ∪ x2) — a straight line can separate the output-0 point (0, 0) from the three output-1 points; (c) Exclusive-OR (x1 ⊕ x2) — no straight line can separate the output-1 points from the output-0 points.]

Perceptron's training algorithm
Step 1: Initialisation

Set initial weights w1, w2, ..., wn and threshold θ to random numbers in the range [−0.5, 0.5].

Step 2: Activation

Activate the perceptron by applying inputs x1(p), x2(p), ..., xn(p) and desired output Yd(p). Calculate the actual output at iteration p = 1:
    Y(p) = step[ Σ_{i=1}^{n} x_i(p) w_i(p) − θ ]

where n is the number of the perceptron inputs, and step is a step activation function.

Step 3: Weight training

Update the weights of the perceptron:

    w_i(p + 1) = w_i(p) + Δw_i(p)

where Δw_i(p) is the weight correction at iteration p. The weight correction is computed by the delta rule:
    Δw_i(p) = α · x_i(p) · e(p)

Step 4: Iteration

Increase iteration p by one, go back to Step 2, and repeat the process until convergence.
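Steps 1 to 4 can be sketched as a single training loop. This is a sketch, not the lecture's code: the random initialisation of Step 1 is replaced by fixed starting weights so the run is reproducible, and exact fractions stand in for decimals to avoid floating-point effects exactly at the threshold:

```python
from fractions import Fraction as F

def train_perceptron(data, weights, theta, alpha, max_epochs=100):
    """Activate, apply the delta rule, and iterate until an error-free epoch."""
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for inputs, desired in data:
            # Step 2: activation with a step function
            net = sum(x * w for x, w in zip(inputs, weights))
            actual = 1 if net >= theta else 0
            e = desired - actual
            # Step 3: weight training by the delta rule
            weights = [w + alpha * x * e for w, x in zip(weights, inputs)]
            errors += abs(e)
        # Step 4: stop once an epoch produces no error
        if errors == 0:
            return weights, epoch
    return weights, None

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, epoch = train_perceptron(AND, [F(3, 10), F(-1, 10)],
                            theta=F(2, 10), alpha=F(1, 10))
print([float(x) for x in w], epoch)  # [0.1, 0.1] 5
```

Run on the AND data with the example's starting weights (0.3, −0.1), the loop converges in epoch 5 to weights (0.1, 0.1), matching the table that follows.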
Example of perceptron learning: the logical operation AND
  Epoch | x1 x2 | Yd | w1   w2  | Y |  e | w1   w2
    1   | 0  0  | 0  | 0.3 −0.1 | 0 |  0 | 0.3 −0.1
        | 0  1  | 0  | 0.3 −0.1 | 0 |  0 | 0.3 −0.1
        | 1  0  | 0  | 0.3 −0.1 | 1 | −1 | 0.2 −0.1
        | 1  1  | 1  | 0.2 −0.1 | 0 |  1 | 0.3  0.0
    2   | 0  0  | 0  | 0.3  0.0 | 0 |  0 | 0.3  0.0
        | 0  1  | 0  | 0.3  0.0 | 0 |  0 | 0.3  0.0
        | 1  0  | 0  | 0.3  0.0 | 1 | −1 | 0.2  0.0
        | 1  1  | 1  | 0.2  0.0 | 1 |  0 | 0.2  0.0
    3   | 0  0  | 0  | 0.2  0.0 | 0 |  0 | 0.2  0.0
        | 0  1  | 0  | 0.2  0.0 | 0 |  0 | 0.2  0.0
        | 1  0  | 0  | 0.2  0.0 | 1 | −1 | 0.1  0.0
        | 1  1  | 1  | 0.1  0.0 | 0 |  1 | 0.2  0.1
    4   | 0  0  | 0  | 0.2  0.1 | 0 |  0 | 0.2  0.1
        | 0  1  | 0  | 0.2  0.1 | 0 |  0 | 0.2  0.1
        | 1  0  | 0  | 0.2  0.1 | 1 | −1 | 0.1  0.1
        | 1  1  | 1  | 0.1  0.1 | 1 |  0 | 0.1  0.1
    5   | 0  0  | 0  | 0.1  0.1 | 0 |  0 | 0.1  0.1
        | 0  1  | 0  | 0.1  0.1 | 0 |  0 | 0.1  0.1
        | 1  0  | 0  | 0.1  0.1 | 0 |  0 | 0.1  0.1
        | 1  1  | 1  | 0.1  0.1 | 1 |  0 | 0.1  0.1

(The first weight pair in each row is the initial weights at that iteration; the last pair is the final weights after the delta-rule update.)

Threshold: θ = 0.2; learning rate: α = 0.1.

Two-dimensional plots of basic logical operations
[Figure: (a) AND (x1 ∩ x2); (b) OR (x1 ∪ x2); (c) Exclusive-OR (x1 ⊕ x2), plotted in the (x1, x2) plane.]

A perceptron can learn the operations AND and OR, but not Exclusive-OR. Points in the input space where the function output is 1 are indicated by black dots, and points where the output is 0 are indicated by white dots.
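The Exclusive-OR claim can be checked empirically with the same kind of training loop (a sketch; the parameter values are my own choices): no matter how many epochs we run, at least one input is misclassified in every epoch, because no single line separates the XOR classes.

```python
def run_epochs(data, w, theta, alpha, epochs):
    """Train for a fixed number of epochs; return the total error per epoch."""
    totals = []
    for _ in range(epochs):
        total = 0
        for x, yd in data:
            y = 1 if sum(xi * wi for xi, wi in zip(x, w)) >= theta else 0
            e = yd - y
            w = [wi + alpha * xi * e for wi, xi in zip(w, x)]
            total += abs(e)
        totals.append(total)
    return totals

XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
errors = run_epochs(XOR, [0.3, -0.1], theta=0.2, alpha=0.1, epochs=100)
print(min(errors))  # never reaches 0: XOR is not linearly separable
```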
This note was uploaded on 01/04/2010 for the course MSC CP 1312 taught by Professor Ms.nireshfathima during the Fall '09 term at Unity.