This preview shows page 1. Sign up to view the full content.
Unformatted text preview: f steps required to achieve convergence could still be substantial, and in practice, until convergence is achieved we will not be able to distinguish
between a nonseparable problem and one that is simply slow to converge.
Comme nt on gradie nt de s ce nt algorithm
Consider yourself on the peak and you want to get to the land as fast as possible. So which direction should you step? Intuitively it should be the direction in which the height
decreases fastest, which is given by the gradient. However, if the mountain has a saddle shape and you initially stand in the middle, then you will finally arrive at the saddle
point (local minimum) and get stuck there.
In addition, note that in the final form of our gradient descent algorithm, we get rid of the summation over (all data points). Actually, this is an alternative of the original
gradient descent algorithm (sometimes called batch gradient descent) known as Stochastic gradient descent, where we approximate the true gradient by only evaluating on a
single training example. This means that
gets improved by computation of only one sample. When there is a large data set, say, population database, it's very timeconsuming to do summation over millions of samples. By Stochastic gradient descent, we can treat the problem sample by sample and still get decent result in practice. A Perceptron applet can be found at http://isl.ira.uka.de/neuralNetCourse/2004/VL_11_5/Perceptron.html . Neural Networks (NN) - October 28, 2009
A neural network (http://en.wikipedia.org/wiki/Neural_network) is a two stage regression for classification model. It can be represented by a network diagram. It is a
parallel, distributed information processing structure consisting of processing elements interconnected together with signal channels called connections. Each processing
element has a single output connection with branches that "fan out" onto as many connections as desired, each carrying the same signal - the processing element output
 A neural network resembles the...
View Full Document
- Winter '13