# RTRL (cont.): the error function and gradient accumulation


## RTRL (cont.)

The error function for a single time step is defined as

$$E(\tau) = \frac{1}{2} \sum_{k \in U} \left[ e_k(\tau) \right]^2$$

The error function we wish to minimize is the sum of this error over all past steps of the network:

$$E_{\mathrm{total}}(t_0, t_1) = \sum_{\tau = t_0 + 1}^{t_1} E(\tau)$$

Since the total error is the sum of all previous errors and the error at this time step, the gradient of the total error is likewise the sum of the gradient for this time step and the gradient for the previous steps.

*ECE 517: Reinforcement Learning in AI — slide 18*

## RTRL (cont.)

Hence, the gradient can be expressed recursively as

$$\nabla_W E_{\mathrm{total}}(t_0, t+1) = \nabla_W E_{\mathrm{total}}(t_0, t) + \nabla_W E(t+1)$$

As a time series is presented to the network, we can accumulate the values of the gradient, or equivalently, of the weight changes. We thus keep track of the value

$$\Delta w_{ij}(t) = -\mu \, \frac{\partial E(t)}{\partial w_{ij}}$$

After the network has been presented with the entire series, we alter each weight by

$$\sum_{t = t_0 + 1}^{t_1} \Delta w_{ij}(t)$$

## RTRL (cont.)

We therefore need an algorithm that computes

$$\frac{\partial E(t)}{\partial w_{ij}} = \sum_{k \in U} \frac{\partial E(t)}{\partial y_k(t)} \, \frac{\partial y_k(t)}{\partial w_{ij}} = \sum_{k \in U} e_k(t) \, \frac{\partial y_k(t)}{\partial w_{ij}}$$

at each time step. Since we know $e_k(t)$ at all times (the difference between our targets and outputs), we only need to find a way to compute the second factor.

It is important to understand what the latter expresses. It is essentially a measure of the sensitivity of the activation of unit $k$ at time $t$ to a small change in the value of $w_{ij}$. It takes into account the effect of such a change in the weight over the entire network trajectory from $t_0$ to $t$. Note that $w_{ij}$ does not h...
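The accumulation scheme above can be sketched in code. This is a minimal NumPy illustration, not the course's implementation: it assumes a small fully recurrent network in the Williams–Zipser style, where the sensitivities $\partial y_k / \partial w_{ij}$ are carried forward in a tensor `p` and the weight changes $\Delta w_{ij}(t)$ are summed over the series before the weights are altered. All names (`n_units`, `W`, `p`, `mu`) and the toy data are illustrative assumptions.

```python
import numpy as np

# Minimal RTRL sketch (assumed Williams-Zipser-style fully recurrent net).
rng = np.random.default_rng(0)
n_units, n_in = 3, 2          # recurrent units and external inputs (toy sizes)
n_z = n_in + n_units          # z(t) = [x(t); y(t)]

W = rng.normal(scale=0.1, size=(n_units, n_z))
y = np.zeros(n_units)
# p[k, i, j] = dy_k/dw_ij -- the "second factor" the slides ask us to compute
p = np.zeros((n_units, n_units, n_z))
grad_acc = np.zeros_like(W)   # accumulated Delta w_ij(t) over the series
mu = 0.1

T = 10
xs = rng.normal(size=(T, n_in))       # toy input series
ds = rng.normal(size=(T, n_units))    # toy target series

for t in range(T):
    z = np.concatenate([xs[t], y])
    s = W @ z
    y_new = np.tanh(s)
    # Sensitivity recursion:
    # p_ij^k(t+1) = f'(s_k) * (sum_l w_kl * p_ij^l(t) + delta_ki * z_j(t))
    fprime = 1.0 - y_new ** 2
    p_new = np.einsum('kl,lij->kij', W[:, n_in:], p)
    for i in range(n_units):
        p_new[i, i, :] += z           # the delta_ki * z_j term
    p = fprime[:, None, None] * p_new
    y = y_new
    # Delta w_ij(t) = -mu * dE(t)/dw_ij = mu * sum_k e_k(t) * dy_k/dw_ij
    # (with e_k = d_k - y_k, so dE/dy_k = -e_k)
    e = ds[t] - y
    grad_acc += mu * np.einsum('k,kij->ij', e, p)

# After the entire series, alter each weight by the accumulated sum
W += grad_acc
```

Note the memory cost this makes visible: the sensitivity tensor `p` has one entry per (unit, weight) pair, which is why RTRL is rarely practical for large networks.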

