Sec. 12.4: Function approximation

[Fig. 12.2: A linear discrimination between two classes.]

... then the output is 1, indicating membership of a class; otherwise it is 0, indicating exclusion from the class. Clearly, w · x describes a hyperplane, and the goal of perceptron learning is to find a weight vector, w, that results in correct classification for all training examples.

The perceptron is an example of a linear threshold unit (LTU). A single LTU can only recognise one kind of pattern, provided that the input space is linearly separable. If we wish to recognise more than one pattern, several LTUs can be combined. In this case, instead of having a vector of weights, we have an array. The output will now be a vector, u = Wx, where each element of u indicates membership of a class and each row in W is the set of weights for one LTU. This architecture is called a pattern associator.

LTUs can only produce linear discriminant functions and, consequently, they are limited in the kinds of classes that can be learned. However, it was found that by cascading pattern associators, it is possible to approximate decision surfaces that are of a higher order than simple hyperplanes. In a cascaded system, the outputs of one pattern associator are fed into the inputs of another.

To facilitate learning, a further modification must be made. Rather than using a simple threshold, as in the perceptron, multi-layer networks usually use a non-linear threshold such as a sigmoid function. Like perceptron learning, back-propagation attempts to reduce the errors between the output of the network and the desired result. Despite the non-linear threshold, multi-layer networks can still be thought of as describing a complex collection of hyperplanes that approximate the required decision surface.
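The three architectures described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the text's own formulation: the function names, the strict "greater than threshold" test, the constant 1 appended to each input as a bias term, and all weight values are chosen by hand for the example. In particular, the cascade weights are not learned by back-propagation; they are picked so that the cascade computes exclusive-or, a pattern that no single LTU can recognise because it is not linearly separable.

```python
import math

def ltu(w, x, threshold=0.0):
    """Linear threshold unit: output 1 if w . x exceeds the threshold, else 0.
    The strict comparison and the default threshold are assumptions."""
    s = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if s > threshold else 0

def pattern_associator(W, x, threshold=0.0):
    """One LTU per row of W; returns the output vector u."""
    return [ltu(row, x, threshold) for row in W]

def sigmoid(z):
    """Smooth, non-linear replacement for the hard threshold."""
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_layer(W, x):
    """Like pattern_associator, but each unit applies a sigmoid."""
    return [sigmoid(sum(wi * xi for wi, xi in zip(row, x))) for row in W]

def cascade(W1, W2, x):
    """Feed the outputs of one layer into the inputs of the next.
    A constant 1.0 is appended to the hidden outputs as a bias input."""
    hidden = sigmoid_layer(W1, x)
    return sigmoid_layer(W2, hidden + [1.0])

# Inputs are (x1, x2, 1): the trailing 1 is a bias term (an assumption).
inputs = [[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]]

# Hand-picked weights: row 0 behaves like AND, row 1 like OR.
W = [[1.0, 1.0, -1.5],
     [1.0, 1.0, -0.5]]
for x in inputs:
    print(x[:2], "pattern associator:", pattern_associator(W, x))

# Hand-picked (not learned) cascade weights that approximate XOR.
W1 = [[10.0, 10.0, -15.0],   # hidden unit 0: roughly AND
      [10.0, 10.0, -5.0]]    # hidden unit 1: roughly OR
W2 = [[-20.0, 20.0, -10.0]]  # output: OR and not AND
for x in inputs:
    print(x[:2], "cascade:", round(cascade(W1, W2, x)[0], 3))
```

With these weights the cascade's output is close to 1 for the inputs (0, 1) and (1, 0) and close to 0 for (0, 0) and (1, 1), which is a decision surface that a single hyperplane cannot produce.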
Ch. 12: Knowledge Representation

[Fig. 12.3: A Pole Balancer. Figure labels: θ, x.]

12.4.1 Discussion

Function approximation methods can often produce quite accurate classifiers because they are capable of constructing complex decision surfaces. The observation language for algorithms of this class is usually a vector of numbers. Often preprocessing will convert raw data into a suitable form. For example, Pomerleau (1989) accepts raw data from a camera mounted on a moving vehicle and selects portions of the image to process for input to a neural net that learns how to steer the vehicle.

The knowledge acquired by such a system is stored as weights in a matrix. Therefore, the hypothesis language is usually an array of real numbers. Thus, the results of learning are not easily available for inspection by a human reader. Moreover, the design of a network usually requires informed guesswork on the part of the user in order to obtain satisfactory results. Although some effort has been devoted to extracting meaning from networks, they still communicate little about the data.

Connectionist learning algorithms are still computationally expensive. A critical factor

This note was uploaded on 08/10/2011 for the course IT 331 taught by Professor Nevermind during the Spring '11 term at King Abdulaziz University.
