L10-StochOptControl



…absorbed (by the $V(x,t)$ field; I'll explain) we weight it to zero.

Nick Jones, Elements of Stochastic Optimal Control: ICDNS

Using diffusions to solve a large class of control problems II

$$\partial_t \rho = -\frac{V}{\lambda}\,\rho - \partial_x (b\,\rho) + \frac{1}{2}\sum_{ij} \nu_{ij}\,\frac{\partial^2 \rho}{\partial x_i\,\partial x_j}$$

$$J(x,t) = -\lambda \log \int dy\; \rho(y,T \mid x,t)\, \exp(-\phi(y)/\lambda)$$

$$J(x,t) \approx -\lambda \log \frac{1}{N} \sum_{i \in \text{unabsorbed}} \exp(-\phi(y_i)/\lambda)$$

$$\psi(x,t) \approx \frac{1}{N} \sum_{i \in \text{unabsorbed}} \exp(-\phi(y_i)/\lambda)$$

We can then deduce our optimal controls from our optimal cost-to-go:

$$u(x,t) = -R^{-1} B^{\top} \partial_x J(x,t)$$

In the case where the optimal control is unique we can approximate $u(x,t)$ directly through the form:

$$u(x,t_j) \approx \frac{1}{\psi(x,t)}\,\frac{1}{N} \sum_{i \in \text{unabsorbed}} \exp(-\phi(y_i)/\lambda)\,\xi_j$$

where we are considering the discrete time of our simulation and $\xi_j$ is the perturbation at time $j$. This is particularly cute since it tells us that we can interpret the noise perturbations which are successful at steering the particle as controls. We are now set for the practical.

Remark

I've just taken us through what is sometimes called Path-Integral Control [3]. It relied on a constraint (inversely) connecting the cost of control in a direction to whether the noise was strong in that direction (scaled by a parameter which can be interpreted as a temperature). It turns out th...
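As a warm-up for the practical, the sampling estimates of $\psi$, $J$ and $u$ above can be sketched in a few lines of NumPy. This is a minimal one-dimensional sketch under assumptions that are mine rather than the slides': the function name `path_integral_estimate` and its arguments are hypothetical, the uncontrolled diffusion is stepped with Euler-Maruyama, a particle is absorbed in each step with probability $V\,dt/\lambda$, the perturbation in the weighted sum is read as each trajectory's own first noise increment, and that increment is divided by `dt` so that $u$ comes out as a drift.

```python
import numpy as np


def path_integral_estimate(x0, t0, T, b, V, phi, lam, nu,
                           n_paths=100_000, n_steps=20, seed=0):
    """Monte Carlo estimate of psi, J and u at (x0, t0) in one dimension.

    Hypothetical helper, not from the slides. b(x, t) is the drift, V(x, t)
    the absorbing field, phi(y) the end cost, lam the temperature and nu the
    noise variance per unit time.
    """
    rng = np.random.default_rng(seed)
    dt = (T - t0) / n_steps
    x = np.full(n_paths, float(x0))
    alive = np.ones(n_paths, dtype=bool)
    first_noise = None
    t = t0
    for j in range(n_steps):
        # Assumption: absorb each particle with probability V dt / lam.
        p_absorb = np.clip(V(x, t) * dt / lam, 0.0, 1.0)
        alive &= rng.random(n_paths) >= p_absorb
        # Euler-Maruyama step of the uncontrolled diffusion.
        xi = rng.normal(0.0, np.sqrt(nu * dt), size=n_paths)
        if j == 0:
            first_noise = xi  # perturbation applied at the starting time
        x = x + b(x, t) * dt + xi
        t += dt
    # Absorbed trajectories are weighted to zero.
    weights = np.where(alive, np.exp(-phi(x) / lam), 0.0)
    psi = weights.mean()
    J = -lam * np.log(psi)
    # Assumption: divide the weighted noise increment by dt to get a drift.
    u = (weights * first_noise).mean() / (psi * dt)
    return psi, J, u


if __name__ == "__main__":
    psi, J, u = path_integral_estimate(
        x0=1.0, t0=0.0, T=1.0,
        b=lambda x, t: 0.0 * x,      # no drift
        V=lambda x, t: 0.0 * x,      # no absorption
        phi=lambda y: 0.5 * y ** 2,  # quadratic end cost
        lam=1.0, nu=1.0,
    )
    print(psi, J, u)
```

With no drift, no absorption and a quadratic end cost the Gaussian integral can be done by hand, which gives a check on the sampler: for $\lambda = \nu = 1$, $T - t = 1$ and $x = 1$ one gets $\psi = 2^{-1/2} e^{-1/4}$ and, taking $R = B = 1$, $u = -\partial_x J = -1/2$.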