In computing errors, all trainable weights are feed-forward (FF) only, so we can apply the standard backpropagation algorithm.
The weights from the copy layer to the hidden layer play a special role in error computation.
The error signal they receive comes from the hidden units, and so depends on the error at the hidden units at time t.
Activations in the copy units, however, are just the activations of the hidden units at time t-1.
So, in training, we are considering a gradient of an error function which is determined by the activations at the present and the previous time steps.
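The point about the copy-to-hidden weights can be made concrete with a minimal sketch of one training step for a small network with a copy layer (often called an Elman network). Everything named here is an assumption for illustration and not part of the slides: the helper `elman_step`, the tanh hidden units, the linear output, the squared-error loss, the learning rate, and the toy sequence. The copy units are treated as ordinary inputs, so plain backpropagation applies, and the update of each copy-to-hidden weight multiplies the hidden error at time t by a hidden activation from time t-1.

```python
import math
import random

def elman_step(x, context, target, W_in, W_ctx, W_out, lr=0.1):
    """One forward/backward pass of a small copy-layer network (hypothetical helper).

    x       : input activations at time t
    context : copy-unit activations, i.e. the hidden activations at time t-1
    target  : desired output at time t
    The weight matrices are lists of rows and are updated in place.
    Returns (hidden activations at time t, squared error before the update).
    """
    n_in, n_hid, n_out = len(x), len(W_in), len(W_out)
    # Forward pass: the copy units feed the hidden layer like extra inputs.
    hidden = [math.tanh(sum(W_in[i][k] * x[k] for k in range(n_in)) +
                        sum(W_ctx[i][j] * context[j] for j in range(n_hid)))
              for i in range(n_hid)]
    output = [sum(W_out[o][i] * hidden[i] for i in range(n_hid))
              for o in range(n_out)]
    # Backward pass: standard backpropagation (linear output, tanh hidden units).
    d_out = [output[o] - target[o] for o in range(n_out)]
    d_hid = [(1.0 - hidden[i] ** 2) *
             sum(W_out[o][i] * d_out[o] for o in range(n_out))
             for i in range(n_hid)]
    error = 0.5 * sum(d * d for d in d_out)
    # Gradient-descent updates (all deltas were computed before any weight changes).
    for o in range(n_out):
        for i in range(n_hid):
            W_out[o][i] -= lr * d_out[o] * hidden[i]
    for i in range(n_hid):
        for k in range(n_in):
            W_in[i][k] -= lr * d_hid[i] * x[k]
        for j in range(n_hid):
            # Error at hidden unit i (time t) times copy unit j (hidden unit j at t-1).
            W_ctx[i][j] -= lr * d_hid[i] * context[j]
    return hidden, error

# Usage sketch: predict the next element of a toy alternating sequence.
rng = random.Random(0)

def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

W_in, W_ctx, W_out = rand_matrix(2, 1), rand_matrix(2, 2), rand_matrix(1, 2)
context = [0.0, 0.0]                      # copy units start at zero
sequence = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
for t in range(len(sequence) - 1):
    hidden, err = elman_step([sequence[t]], context, [sequence[t + 1]],
                             W_in, W_ctx, W_out)
    context = hidden                      # copy hidden(t) into the copy units
```

Note that no error is propagated back through the copy operation itself: the sketch treats the previous hidden activations as fixed data, which is the simplification that lets standard backpropagation be used.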
ECE 517: Reinforcement Learning in AI

Real-Time Recurrent Networks (RTRL) (Zipser et al. ’89)
In deriving a gradient-based update rule, we now make network connectivity very unconstrained.
Suppose we have a set of input units, $I = \{x_k(t),\ 0 < k < m\}$, and a set of other units, $U = \{y_k(t),\ 0 < k < n\}$, which can be hidden or output units.
To index an arbitrary unit in the network we can use

$$
z_k(t) =
\begin{cases}
x_k(t) & \text{if } k \in I \\
y_k(t) & \text{if } k \in U
\end{cases}
$$
Let $W$ be the weight matrix with $n$ rows and $n+m$ columns, where $w_{i,j}$ is the weight to unit $i$ (which is in $U$) from unit $j$ (which is in $I$ or $U$).

RTRL
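The notation above can be sketched in a few lines of code. The slides do not say how the combined index is ordered or how the units are updated, so the following are assumptions made only for illustration: inputs come first in z(t), followed by the other units, and each unit applies a logistic squashing function to its weighted sum, which is the usual choice in presentations of RTRL. The helper names `concat_units` and `step` are hypothetical.

```python
import math

def concat_units(x, y):
    """Build z(t): the m inputs x_k(t) first, then the n unit activations y_k(t).

    ASSUMPTION: inputs occupy the first m positions; the slides fix no ordering.
    """
    return list(x) + list(y)

def step(W, x, y):
    """Advance the n non-input units by one time step.

    W has n rows (one per receiving unit in U) and n + m columns (one per
    sending unit in I or U), so W[i][j] is the weight to unit i from unit j.
    ASSUMPTION: y_i(t+1) = f(sum_j W[i][j] * z_j(t)) with f the logistic function.
    """
    z = concat_units(x, y)
    n = len(y)
    if len(W) != n or any(len(row) != len(z) for row in W):
        raise ValueError("W must have n rows and n + m columns")
    net = [sum(W[i][j] * z[j] for j in range(len(z))) for i in range(n)]
    return [1.0 / (1.0 + math.exp(-s)) for s in net]

# Usage sketch: m = 2 inputs, n = 3 fully connected units.
m, n = 2, 3
W = [[0.0] * (n + m) for _ in range(n)]
W[0][0] = 2.0     # weight to unit 0 from input 0
W[1][m + 0] = 1.0 # weight to unit 1 from unit 0 (a recurrent connection)
y = [0.0, 0.0, 0.0]
for x in ([1.0, 0.0], [0.0, 1.0]):
    y = step(W, x, y)
```

Because every row of W spans both inputs and units, any unit may receive from any other unit, including itself, which is what makes the connectivity unconstrained.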
R...
