RTRL (cont.)

Units compute their activations in the now familiar way, by first computing the weighted sum of their inputs:

    net_k(t) = \sum_{l \in U \cup I} w_{kl} z_l(t)

where the only new element in the formula is the introduction of the temporal index t. Units then apply a non-linear function to their net input:

    y_k(t+1) = f_k(net_k(t))

Usually both hidden and output units will have non-linear activation functions. Note that external input at time t does not influence the output of any unit until time t+1. The network is thus a discrete dynamical system.
RTRL (cont.)

Some of the units in U are output units, for which a target is defined. A target may not be defined in every single time step. For example, if we are presenting a string to the network to be classified as either grammatical or ungrammatical, we may provide a target only for the last symbol in the string. In defining an error over the outputs, therefore, we need to make the error time-dependent too. Let T(t) be the set of indices k in U for which there exists a target value d_k(t) at time t, so that the error is

    e_k(t) = d_k(t) - y_k(t)   if k \in T(t)
    e_k(t) = 0                 otherwise
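A hedged sketch of this time-dependent error in Python (again not from the lecture): targets at time t are assumed to be given as a possibly empty dictionary mapping unit index k to d_k(t), standing in for the set T(t); the helper name is hypothetical.

    import numpy as np

    def time_dependent_error(y_t, targets_t):
        # e_k(t) = d_k(t) - y_k(t) where a target exists, 0 otherwise.
        # y_t       : (n_units,) unit outputs at time t
        # targets_t : dict {k: d_k(t)} for k in T(t); may be empty
        e_t = np.zeros_like(y_t)
        for k, d_k in targets_t.items():
            e_t[k] = d_k - y_t[k]
        return e_t

    # Grammaticality example: a target is supplied only at the last symbol.
    y_t = np.array([0.3, 0.8])
    print(time_dependent_error(y_t, {}))        # mid-string step: all-zero error
    print(time_dependent_error(y_t, {1: 1.0}))  # final symbol: target on unit 1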
