L10-StochOptControl

# Nick Jones: Elements of Stochastic Optimal Control


...his system around in direction $i$ ($R_{ii}$ small). This is natural for systems generically: we might think that it is easier to control a system to something close to where it might have gone anyway [12].

Nick Jones Elements of Stochastic Optimal Control: ICDNS

## A more general class of system II

- Dynamics: $dx = (b(x,t) + Bu)\,dt + d\xi$
- Cost: $w(x,u,t) = \tfrac{1}{2} u^T R u + Q(x,t)$
- Final cost: $W(x) = \phi(x_T)$
- Constraint: $\nu = \lambda B R^{-1} B^T$

We thus have a simple control (and relatively simple noise) but a complex system that we would like to control, with complex costs. We have also introduced a constraint which is strong but natural. I recommend reading Ref. [3], though it is not required (beyond what you need to understand the practical).

## A more general class of system III

With the same dynamics, cost, final cost and constraint, we can write the stochastic HJB equation as

$$
-\partial_t J(x,t) = \min_u \left( \tfrac{1}{2} u^T R u + Q(x,t) + (b + Bu)^T \partial_x J(x,t) + \tfrac{1}{2} \operatorname{Tr}\!\left( \nu(x,t,u)\, \partial_x^2 J(x,t) \right) \right).
$$

Optimizing over $u$ yields

$$
u(x,t) = -R^{-1} B^T \partial_x J(x,t).
$$

Plugging this optimal control back into the stochastic HJB equation, defining $J(x,t) = -\lambda \log \psi(x,t)$ and using the constraint $\nu = \lambda B R^{-1} B^T$ (which makes the terms quadratic in $\partial_x \psi$ cancel) yields an equation linear in $\psi$:

$$
\partial_t \psi = \left( \frac{Q}{\lambda} - b^T \partial_x - \tfrac{1}{2} \operatorname{Tr}\!\left( \nu(x,t,u)\, \partial_x^2 \right) \right) \psi .
$$

This can be solved backwards in time starting from $\psi(x,T) = \exp(-\phi(x)/\lambda)$ (since $J(x,T) = \phi(x_T)$).
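To make the backward solve concrete, here is a minimal one-dimensional sketch in Python. It is not taken from the slides: the function names `solve_psi_backward` and `optimal_control`, the explicit Euler time stepping, the finite-difference grid and the crude boundary treatment (edge values held fixed) are all assumptions chosen for illustration. It integrates the linear equation for $\psi$ from $t = T$ down to $t = 0$ and then recovers the control from $u = -R^{-1} B\, \partial_x J = \lambda R^{-1} B\, \partial_x \psi / \psi$.

```python
import numpy as np

def solve_psi_backward(b, Q, phi, lam, nu, x, T, n_steps):
    """Integrate d(psi)/dt = (Q/lam - b d/dx - 0.5*nu d2/dx2) psi from t=T to t=0.

    Illustrative 1-D sketch: scalar state, constant nu, uniform grid x,
    explicit Euler in time. b and Q are callables of (x, t); phi of x.
    """
    dx = x[1] - x[0]
    dt = T / n_steps
    if dt > dx**2 / nu:
        raise ValueError("time step too large for the explicit scheme")
    psi = np.exp(-phi(x) / lam)          # terminal condition psi(x, T)
    t = T
    for _ in range(n_steps):
        dpsi = np.gradient(psi, dx)      # central first derivative
        d2psi = np.zeros_like(psi)
        d2psi[1:-1] = (psi[2:] - 2.0 * psi[1:-1] + psi[:-2]) / dx**2
        # Right-hand side of the linear equation, i.e. d(psi)/dt at time t
        rhs = Q(x, t) / lam * psi - b(x, t) * dpsi - 0.5 * nu * d2psi
        new_psi = psi - dt * rhs         # step from t to t - dt
        # Assumption: hold the edge values fixed (adequate on a wide grid)
        new_psi[0], new_psi[-1] = psi[0], psi[-1]
        psi = new_psi
        t -= dt
    return psi

def optimal_control(psi, x, lam, B, R):
    """u = -R^{-1} B dJ/dx with J = -lam * log(psi), for scalar B and R."""
    dx = x[1] - x[0]
    return lam * B / R * np.gradient(psi, dx) / psi

# Example: no drift, no state cost, quadratic end cost phi(x) = x^2 / 2.
B, R, lam, T = 1.0, 1.0, 1.0, 1.0
nu = lam * B * B / R                     # the constraint nu = lam * B R^{-1} B^T
x = np.linspace(-6.0, 6.0, 241)
zero = lambda x, t: np.zeros_like(x)
psi0 = solve_psi_backward(zero, zero, lambda x: 0.5 * x**2, lam, nu, x, T, 1000)
u0 = optimal_control(psi0, x, lam, B, R)
```

In this special case the uncontrolled end state is Gaussian, so $\psi(x,0) = \exp\!\left(-x^2 / (2(\lambda + \nu T))\right) / \sqrt{1 + \nu T/\lambda}$ and $u(x,0) = -x/(1+T)$ for the parameters above, which gives a simple check on the numerics. An explicit scheme was chosen only for readability; an implicit or Crank-Nicolson step would relax the time-step restriction.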