Stat841f09 - Wiki Course Notes




...network achieves a stable solution. Choosing too large a learning rate may make the system unstable and cause the weights and the objective function to diverge, while too small a learning rate may lead to very slow convergence (a very long learning phase). The advantage of a sufficiently small learning rate, however, is that it can guarantee convergence. Thus, in general, it is better to choose a relatively small learning rate to ensure stability; a value between 0.01 and 0.7 is usual. If the learning rate is appropriate, the algorithm is guaranteed to converge to a local minimum, but not necessarily to the global minimum, which would be better. Furthermore, there can be many local minima.

## How to determine the number of hidden units

Here we mainly discuss how to estimate the number of hidden units at the very beginning. Naturally, this initial estimate should then be refined using cross-validation (CV), leave-one-out (LOO), or other complexity control methods.

Basically, if the patterns are well separated, a few hidden units are enough. If the patterns are drawn from a highly complicated mixture model, more hidden units are needed. The number of hidden units determines the size of the model, and therefore the total number of weights in the model. Typically, the number of weights should not be larger than the number of training points, say N; a total of about N/10 weights is sometimes a good choice. In practice, however, many well-performing models use more hidden units.

## Dimensionality reduction application

One possible application of neural networks is dimensionality reduction, alongside other techniques such as PCA, MDS, LLE and Isomap.
Consider the configuration shown in Figure 1. As we move forward through the layers of this neural network, the number of nodes is reduced until we reach a layer whose number of nodes equals the desired dimensionality. Note, however, that in the first few layers the number of nodes need not be strictly decreasing, as long as the network finally reaches a layer with less...
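To make the narrowing architecture concrete, here is a minimal sketch in Python. The layer sizes `[8, 10, 4, 2]`, the tanh activation, the random untrained weights, and the helper names `init_layers`, `forward` and `count_weights` are all assumptions made for illustration; the notes do not specify them, and Figure 1 is not reproduced here. The sketch only shows how an input is mapped to a layer with the desired number of nodes, and how the layer sizes fix the total number of weights.

```python
import math
import random


def init_layers(sizes, seed=0):
    """Create random (untrained) weights and zero biases for each layer."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(x, layers):
    """Propagate x through every layer, using tanh as an assumed activation."""
    for weights, biases in layers:
        x = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
    return x


def count_weights(sizes):
    """Total number of weights, counting one bias per non-input node."""
    return sum((n_in + 1) * n_out for n_in, n_out in zip(sizes, sizes[1:]))


# Hypothetical sizes: 8 inputs, a wider first hidden layer (10 nodes),
# then narrowing to the desired dimensionality (2 nodes).
sizes = [8, 10, 4, 2]
layers = init_layers(sizes)
code = forward([0.1 * i for i in range(8)], layers)

print(len(code))             # dimensionality of the final layer
print(count_weights(sizes))  # model size implied by the layer sizes
```

Note that the second layer is wider than the input, which is allowed as long as a later layer reaches the target dimensionality. The weight count returned by `count_weights` can be compared with the number of training points N when judging whether the model is too large.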
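The learning-rate trade-off described earlier in these notes can be illustrated on a toy problem. The sketch below is an assumed example, not the course's network: it runs plain gradient descent on the one-dimensional objective f(w) = w^2 (gradient 2w), using a hypothetical helper `gradient_descent`. For this particular objective each step multiplies w by (1 - 2*lr), so the iterates shrink when 0 < lr < 1 and grow otherwise; for other objectives the threshold depends on their curvature.

```python
def gradient_descent(lr, w0=1.0, steps=50):
    """Minimize f(w) = w**2 with a fixed learning rate; return the final w."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * w     # derivative of w**2
        w = w - lr * grad  # standard gradient-descent update
    return w


# Compare a small, a moderate, and a too-large learning rate (assumed values).
for lr in (0.01, 0.7, 1.1):
    print(lr, gradient_descent(lr))
```

In this toy setting, lr = 0.01 is still far from the minimum at w = 0 after 50 steps (slow convergence), lr = 0.7 is essentially at the minimum, and lr = 1.1 ends up much farther from the minimum than where it started (divergence).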

## This document was uploaded on 03/07/2014.
