the optimal solution in every case, but is simple to implement. To be more specific, random values near zero are a good choice for the initial
weights (usually drawn from [-1, 1]). In this case, the model evolves from a nearly linear one to a nonlinear one, as desired. An alternative is to use an orthogonal least squares
method to find the initial weights [11]. Regression is performed on the weights and output using a linear approximation of [...], which finds the optimal weights in the linear
model. Back propagation is used afterward to find the optimal solution, since the NN is nonlinear.
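As a rough illustration of the random initialization described above, the following sketch draws small uniform weights and checks that a sigmoid hidden layer then behaves almost linearly. The function name `init_weights`, the `scale` of 0.1, the layer sizes, and the choice of a logistic sigmoid are all assumptions made for this example, not values taken from the notes.

```python
import numpy as np

def init_weights(n_in, n_out, scale=0.1, seed=0):
    """Draw an (n_in, n_out) weight matrix uniformly from [-scale, scale].

    The scale of 0.1 is an illustrative assumption; the notes only say
    'near zero', usually within [-1, 1].
    """
    rng = np.random.default_rng(seed)
    return rng.uniform(-scale, scale, size=(n_in, n_out))

W = init_weights(4, 3)
x = np.array([0.5, -1.0, 2.0, 0.1])

net = x @ W                            # net input to each hidden unit
hidden = 1.0 / (1.0 + np.exp(-net))    # logistic sigmoid activation

# Near zero the sigmoid is close to its tangent line 0.5 + net / 4,
# so the freshly initialized layer is approximately linear in its input.
linear_approx = 0.5 + net / 4.0
print(np.max(np.abs(hidden - linear_approx)))
```

With small weights the printed gap between the sigmoid and its tangent line is tiny, which is one way to read the claim that the model starts out nearly linear and becomes nonlinear only as the weights grow during training.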
Why should all initial weights be randomized and small?
The error back-propagated through the network is proportional to the values of the weights. If all the weights are the same, the back-propagated errors will
also be the same, so all of the weights will be updated by the same amount. Thus, identical initial weights prevent the network from learning.
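The symmetry argument can be checked numerically. The sketch below assumes a hypothetical network with one sigmoid hidden layer, a single linear output, and a squared-error loss (none of which is specified in the notes), and compares the hidden-layer gradients for identical versus random initial weights.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hidden_weight_grads(W1, w2, x, target):
    """Gradient of 0.5 * (y - target)**2 with respect to W1.

    Assumed architecture: sigmoid hidden layer, one linear output unit.
    """
    h = sigmoid(x @ W1)                  # hidden activations, shape (n_hidden,)
    y = h @ w2                           # scalar network output
    err = y - target
    delta_h = err * w2 * h * (1.0 - h)   # error back-propagated to each hidden unit
    return np.outer(x, delta_h)          # shape (n_in, n_hidden)

x = np.array([0.2, -0.4, 0.7])
target = 1.0

# Case 1: every weight has the same value.
g_same = hidden_weight_grads(np.full((3, 4), 0.5), np.full(4, 0.5), x, target)
same_cols_identical = np.allclose(g_same, g_same[:, [0]])

# Case 2: small random weights.
rng = np.random.default_rng(0)
W1_rand = rng.uniform(-0.1, 0.1, size=(3, 4))
w2_rand = rng.uniform(-0.1, 0.1, size=4)
g_rand = hidden_weight_grads(W1_rand, w2_rand, x, target)
rand_cols_identical = np.allclose(g_rand, g_rand[:, [0]])

print(same_cols_identical, rand_cols_identical)
```

In the first case every hidden unit receives exactly the same gradient, so the units remain copies of one another after each update; random initialization breaks that tie.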
Since the weight updates in the back propagation algorithm are proportional to the derivative of the activation function, it is important to consider how the net input affects its
value. The derivative is at its maximum when the activation function equals 0.5 and approaches its minimum as the activation function approaches 0 or 1, in which case the
associated weights will change very little. Thus, if we choose small initial weights, the net input stays near zero and the activation function stays near 0.5, where the weight change is largest.
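The statement about the derivative can be made explicit under the assumption that the activation is the logistic sigmoid, which is consistent with the values 0, 0.5, and 1 quoted above. The symbols z (net input) and \sigma (activation) are introduced here for illustration only.

```latex
\[
\begin{aligned}
\sigma(z)  &= \frac{1}{1 + e^{-z}}, \\
\sigma'(z) &= \sigma(z)\,\bigl(1 - \sigma(z)\bigr), \\
\sigma'(z) &\le \tfrac{1}{4}, \quad \text{with equality iff } \sigma(z) = \tfrac{1}{2} \ (\text{i.e. } z = 0), \\
\sigma'(z) &\to 0 \quad \text{as } \sigma(z) \to 0 \text{ or } \sigma(z) \to 1 .
\end{aligned}
\]
```

The product a(1 - a) on the interval [0, 1] peaks at a = 1/2, which is why net inputs near zero give the largest weight updates and saturated units barely change.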
How to set learning rates
The learning rate is usually a constant. If we use on-line learning, as a form of stochastic approximation process, the learning rate should decrease as the number of iterations increases. In typical feedforward NNs with hidden units, the objective function has many local and global optima, so the optimal learning rate often changes dramatically during
the training process. The larger the learning rate, the larger the weight changes on each epoch, and the quicker the network can learn. However, the size of the learning rate
can also influence whether the...
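A decreasing learning rate of the kind mentioned for on-line learning can be sketched as follows. The schedule eta0 / (1 + t / tau), the names `decayed_learning_rate` and `descent_with_decay`, and the toy objective are all assumptions chosen for illustration; the notes do not prescribe a particular schedule.

```python
def decayed_learning_rate(eta0, t, tau=100.0):
    """Hypothetical schedule: eta_t = eta0 / (1 + t / tau).

    eta0 is the initial rate, t the iteration index, and tau controls
    how quickly the rate shrinks (all illustrative choices).
    """
    return eta0 / (1.0 + t / tau)

def descent_with_decay(grad, w0, eta0, steps, tau=100.0):
    """Gradient descent on a scalar weight with the decaying rate above."""
    w = w0
    for t in range(steps):
        w = w - decayed_learning_rate(eta0, t, tau) * grad(w)
    return w

# Toy objective E(w) = 0.5 * (w - 3)**2, whose gradient is w - 3.
w_final = descent_with_decay(lambda w: w - 3.0, w0=0.0, eta0=0.5, steps=200)
print(w_final)
```

Early iterations take large steps while later ones take progressively smaller steps, which is the general behavior a stochastic approximation scheme relies on.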
This document was uploaded on 03/07/2014.
 Winter '13
