Stat841f09 - Wiki Course Notes

# 3 convergence rates depend on the size of the gap


im/M L09/le ct0126.pdf) and Issues Affecting Convergence

1. The output of a perceptron can take only one of two values (+1 or -1); that is, it can be used only for two-class classification.
2. If the data are not linearly separable, the Perceptron algorithm will not converge, since no linear classifier classifies all of the points correctly.
3. Convergence rates depend on the size of the gap between classes. If the gap is large, the algorithm converges quickly; if the gap is small, it converges slowly. This problem can be alleviated by using a basis expansion technique: instead of looking for a hyperplane in the original space, we look for one in the enlarged space obtained by applying some basis functions.
4. If the classes are separable, there exist infinitely many solutions to the Perceptron problem, all of which are separating hyperplanes.
5. The speed of convergence also depends on the value of the learning rate. A larger learning rate can yield quicker convergence, but if it is too large, the algorithm may “skip over” the minimum it is trying to find and possibly oscillate forever between two points, one before and one after the minimum.
6. A perfect separation is not always available, or even desirable. If observations from different classes share the same input, a model that separates the training data perfectly is likely overfitting and will generally have poor predictive performance.
7. The perceptron convergence theorem (http://annet.eeng.nuim.ie/intro/course/chpt2/convergence.shtml) states that if there exists an exact solution (in other words, if the training data set is linearly separable), then the perceptron learning algorithm is guaranteed to find an exact solution in a finite number of steps. Proofs of this theorem can be found, for example, in Rosenblatt (1962), Block (1962), Nilsson (1965), Minsky and Papert (1969), Hertz et al. (1991), and Bishop (1995a).
Note, however, that the number o...
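
Points 2, 3, and 7 above can be checked numerically. The following is a minimal sketch, not the course's own code: the function names `perceptron` and `two_blobs`, the synthetic data, and the `max_epochs` cap are all assumptions made for illustration. It counts how many weight updates the perceptron rule needs on two separable clouds of points placed `gap` apart, and shows that on non-separable (XOR-style) data the loop only stops because of the cap.

```python
import numpy as np


def perceptron(X, y, lr=1.0, max_epochs=10_000):
    """Perceptron rule for labels y in {-1, +1}.

    Returns (weights, number_of_updates, converged). The bias is handled
    by appending a constant 1 to every input vector.
    """
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    w = np.zeros(Xb.shape[1])
    updates = 0
    for _ in range(max_epochs):
        mistakes = 0
        for xi, yi in zip(Xb, y):
            if yi * (w @ xi) <= 0:      # misclassified (or on the boundary)
                w += lr * yi * xi       # move the hyperplane toward xi
                updates += 1
                mistakes += 1
        if mistakes == 0:               # a full pass with no errors
            return w, updates, True
    return w, updates, False            # stopped by the cap, not by convergence


def two_blobs(gap, n=50, seed=0):
    """Hypothetical toy data: two unit squares separated by `gap` along x."""
    rng = np.random.default_rng(seed)
    left = rng.uniform(0, 1, size=(n, 2)) - np.array([1 + gap / 2, 0.0])
    right = rng.uniform(0, 1, size=(n, 2)) + np.array([gap / 2, 0.0])
    X = np.vstack([left, right])
    y = np.concatenate([-np.ones(n), np.ones(n)])
    return X, y


if __name__ == "__main__":
    for gap in (2.0, 0.05):
        X, y = two_blobs(gap)
        _, updates, converged = perceptron(X, y)
        print(f"gap={gap}: converged={converged}, updates={updates}")

    # XOR labels are not linearly separable, so no pass is ever error-free.
    X_xor = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
    y_xor = np.array([1.0, 1.0, -1.0, -1.0])
    _, updates, converged = perceptron(X_xor, y_xor, max_epochs=100)
    print(f"XOR: converged={converged}, updates={updates}")
```

For separable data the update count is consistent with the classical mistake bound of at most (R/γ)² updates when the weights start at zero, where R bounds the norm of the (bias-augmented) inputs and γ is the margin of some separating unit-norm hyperplane. Shrinking the gap shrinks γ and loosens the bound, which is the behaviour described in point 3.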

## This document was uploaded on 03/07/2014.
