network achieves a stable solution. Choosing too large a learning rate may make the system unstable and cause the weights and
objective function to diverge, while too small a learning rate may lead to very slow convergence (a very long learning phase). However, the advantage of a small
learning rate is that it can guarantee convergence. Thus it is generally better to choose a relatively small learning rate to ensure stability. Usually, the learning rate is chosen between
0.01 and 0.7.
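The effect of the learning rate can be seen on a toy problem. The sketch below is only an illustration, not the network training discussed in these notes: it assumes a one-dimensional objective f(w) = w**2, and the function name gradient_descent and the three learning rates are chosen here for the example.

```python
def gradient_descent(lr, w0=1.0, steps=50):
    """Minimise the toy objective f(w) = w**2 by plain gradient descent.

    Assumed setup for illustration only: the gradient is 2*w, so each
    step multiplies w by (1 - 2*lr).
    """
    w = w0
    for _ in range(steps):
        grad = 2.0 * w      # derivative of w**2
        w -= lr * grad      # gradient descent update
    return w


# Small rate: converges slowly. Moderate rate: converges quickly.
# Rate above 1.0 on this objective: |1 - 2*lr| > 1, so w diverges.
for lr in (0.01, 0.7, 1.1):
    w = gradient_descent(lr)
    print(f"lr={lr}: |w| after 50 steps = {abs(w):.3e}")
```

On this particular objective the iterates shrink only when 0 < lr < 1. The threshold depends on the curvature of the objective, so it should not be read as a general rule for neural networks.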
If the learning rate is appropriate, the algorithm is guaranteed to converge to a local minimum, but not necessarily to the global minimum, which would be preferable. Furthermore, there can be
many local minima.
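A small example makes the dependence on the starting point concrete. The sketch below assumes a hypothetical non-convex objective f(w) = w**4 - 3*w**2 + w, which has two minima, and runs the same gradient descent from two starting points. The objective, the function names and the step size are assumptions made for illustration.

```python
def f(w):
    """Hypothetical non-convex objective with two minima."""
    return w**4 - 3.0 * w**2 + w


def f_grad(w):
    """Derivative of f."""
    return 4.0 * w**3 - 6.0 * w + 1.0


def descend(w0, lr=0.01, steps=1000):
    """Plain gradient descent on f, starting from w0."""
    w = w0
    for _ in range(steps):
        w -= lr * f_grad(w)
    return w


w_left = descend(-2.0)   # ends in the deeper minimum, near w = -1.30
w_right = descend(2.0)   # ends in the shallower minimum, near w = 1.13

print(f"start -2.0 -> w = {w_left:.3f}, f(w) = {f(w_left):.3f}")
print(f"start +2.0 -> w = {w_right:.3f}, f(w) = {f(w_right):.3f}")
```

Both runs converge, but only the run started at -2.0 reaches the lower of the two minima. Restarting from several initial weights is a common way to reduce this sensitivity.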
How to determine the number of hidden units
Here we mainly discuss how to make an initial estimate of the number of hidden units. This estimate should then be refined using cross-validation (CV), leave-one-out (LOO) or other
complexity control methods.
wikicoursenote.com/w/index.php?title=Stat841&printable=yes 10/09/2013 Stat841 Wiki Course Notes
Basically, if the patterns are well separated, a few hidden units are enough. If the patterns are drawn from a highly complicated mixture model, more hidden units are
needed.
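The estimate discussed in this section ties the number of hidden units to the total number of weights, which is kept below the number of training points N. A rough sketch of that bookkeeping follows. It assumes a single hidden layer with bias terms and reads N/10 as a budget on the number of weights. The notes do not spell out either assumption, and the function names are hypothetical.

```python
def count_weights(n_inputs, n_hidden, n_outputs):
    """Total weights of a one-hidden-layer network, biases included (assumed)."""
    input_to_hidden = (n_inputs + 1) * n_hidden      # +1 for each hidden bias
    hidden_to_output = (n_hidden + 1) * n_outputs    # +1 for each output bias
    return input_to_hidden + hidden_to_output


def max_hidden_units(n_inputs, n_outputs, weight_budget):
    """Largest hidden-layer size whose weight count fits within weight_budget."""
    per_unit = n_inputs + 1 + n_outputs   # weights added by one hidden unit
    fixed = n_outputs                     # output biases, independent of size
    return max(0, (weight_budget - fixed) // per_unit)


# Example with made-up sizes: N = 5000 training points, 10 inputs, 3 outputs.
n_train = 5000
budget = n_train // 10                    # assumed reading of the N/10 heuristic
h = max_hidden_units(10, 3, budget)
print(h, count_weights(10, h, 3))
```

This gives only a starting value. The final number of hidden units would still be chosen by cross-validation or a similar method.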
The number of hidden units determines the size of the model, and therefore the total number of weights in the model. Typically, the number of weights
should not be larger than the number of training data points, say N. Thus N/10 is sometimes a good choice. However, in practice, many well-performing models use more
hidden units.
Dimensionality reduction application
One possible application of Neural Networks is to perform dimensionality reduction, like other techniques,
e.g., PCA, MDS, LLE and Isomap.
Consider the configuration shown in figure 1. As we move forward through the layers of this neural
network, the number of nodes is reduced until we reach a layer whose number of nodes equals the
desired dimensionality. Note, however, that in the first few layers the number of nodes need not be
strictly decreasing, as long as the network eventually reaches a layer with less...
This document was uploaded on 03/07/2014.
 Winter '13
