Lec4 - MIT OpenCourseWare http/ocw.mit.edu 16.323 Principles of Optimal Control Spring 2008 For information about citing these materials or our


MIT OpenCourseWare
http://ocw.mit.edu

16.323 Principles of Optimal Control
Spring 2008

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

16.323 Lecture 4

HJB Equation

• DP in continuous time
• HJB Equation
• Continuous LQR

Factoids: for symmetric $R$

$$\frac{\partial\, \mathbf{u}^T R \mathbf{u}}{\partial \mathbf{u}} = 2 \mathbf{u}^T R, \qquad \frac{\partial R \mathbf{u}}{\partial \mathbf{u}} = R$$

Spr 2008 16.323 4–1

DP in Continuous Time

• Have analyzed a couple of approximate solutions to the classic control problem of minimizing:

$$\min J = h(\mathbf{x}(t_f), t_f) + \int_{t_0}^{t_f} g(\mathbf{x}(t), \mathbf{u}(t), t)\, dt$$

subject to

$$\begin{aligned}
\dot{\mathbf{x}} &= a(\mathbf{x}, \mathbf{u}, t) \\
\mathbf{x}(t_0) &= \text{given} \\
m(\mathbf{x}(t_f), t_f) &= 0 \quad \text{set of terminal conditions} \\
\mathbf{u}(t) &\in \mathcal{U} \quad \text{set of possible constraints}
\end{aligned}$$

• Previous approaches discretized in time, state, and control actions
– Useful for implementation on a computer, but now want to consider the exact solution in continuous time
– Result will be a nonlinear partial differential equation called the Hamilton-Jacobi-Bellman equation (HJB), a key result.

• First step: consider cost over the interval $[t, t_f]$, where $t \le t_f$, of any control sequence $\mathbf{u}(\tau)$, $t \le \tau \le t_f$

$$J(\mathbf{x}(t), t, \mathbf{u}(\tau)) = h(\mathbf{x}(t_f), t_f) + \int_{t}^{t_f} g(\mathbf{x}(\tau), \mathbf{u}(\tau), \tau)\, d\tau$$

– Clearly the goal is to pick $\mathbf{u}(\tau)$, $t \le \tau \le t_f$, to minimize this cost.

$$J^\star(\mathbf{x}(t), t) = \min_{\substack{\mathbf{u}(\tau) \in \mathcal{U} \\ t \le \tau \le t_f}} J(\mathbf{x}(t), t, \mathbf{u}(\tau))$$

June 18, 2008

• Approach:
– Split time interval $[t, t_f]$ into $[t, t + \Delta t]$ and $[t + \Delta t, t_f]$, and are specifically interested in the case where $\Delta t \to 0$
– Identify the optimal cost-to-go $J^\star(\mathbf{x}(t + \Delta t), t + \Delta t)$
– Determine the “stage cost” in time $[t, t + \Delta t]$
– Combine above to find best strategy from time $t$.
– Manipulate result into HJB equation.
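The cost functional and the interval split described in the approach above can be sketched numerically. The following is a minimal illustration, not part of the lecture: the scalar dynamics, running cost, terminal cost, and control signal are all assumed for the example, and the helper name `cost` is hypothetical. It evaluates the cost of one fixed (non-optimal) control over $[t, t_f]$ with forward Euler, then evaluates the same cost as a stage cost on $[t, t + \Delta t]$ plus the cost of the remaining interval.

```python
import math


def cost(x_start, t_start, t_end, u, a, g, h, dt=1e-3):
    """Forward-Euler estimate of h(x(t_end), t_end) + integral of g over [t_start, t_end].

    Returns the cost and the state reached at t_end.
    """
    n_steps = int(round((t_end - t_start) / dt))
    x, t, running = x_start, t_start, 0.0
    for _ in range(n_steps):
        u_t = u(t)
        running += g(x, u_t, t) * dt   # accumulate the integral of g
        x = x + a(x, u_t, t) * dt      # Euler step of x_dot = a(x, u, t)
        t += dt
    return h(x, t_end) + running, x


# Illustrative (assumed) scalar problem, not taken from the lecture.
a = lambda x, u, t: -x + u            # dynamics
g = lambda x, u, t: x**2 + u**2       # running cost
h = lambda x, t: 5.0 * x**2           # terminal cost
no_terminal = lambda x, t: 0.0        # the stage cost carries no terminal term
u = lambda t: -0.5 * math.cos(t)      # some admissible (not optimal) control

t, t_f, delta_t, x_t = 0.0, 1.0, 0.1, 1.0

# Cost of the whole interval [t, t_f].
J_full, _ = cost(x_t, t, t_f, u, a, g, h)

# Same cost split into a stage cost on [t, t + delta_t] and the remainder.
stage, x_next = cost(x_t, t, t + delta_t, u, a, g, no_terminal)
J_tail, _ = cost(x_next, t + delta_t, t_f, u, a, g, h)

print(J_full, stage + J_tail)
```

The two printed values should agree up to floating-point rounding, because the split only regroups the same integral. The sketch fixes the control, so it illustrates the additive structure of the cost only; the minimization over the control is not attempted here.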
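• Split:

$$\begin{aligned}
J^\star(\mathbf{x}(t), t) &= \min_{\substack{\mathbf{u}(\tau) \in \mathcal{U} \\ t \le \tau \le t_f}} \left\{ h(\mathbf{x}(t_f), t_f) + \int_{t}^{t_f} g(\mathbf{x}(\tau), \mathbf{u}(\tau), \tau)\, d\tau \right\} \\
&= \min_{\substack{\mathbf{u}(\tau) \in \mathcal{U} \\ t \le \tau \le t_f}} \left\{ h(\mathbf{x}(t_f), t_f) + \int_{t}^{t + \Delta t} g(\mathbf{x}, \mathbf{u}, \tau)\, d\tau + \int_{t + \Delta t}^{t_f} g(\mathbf{x}, \mathbf{u}, \tau)\, d\tau \right\}
\end{aligned}$$

• Implicit here that at time $t + \Delta t$, the system will be at state $\mathbf{x}(t + \Delta t)$.
– But from the principle of optimality, we can write that the optimal cost-to-go from this state is:

$$J^\star(\mathbf{x}(t + \Delta t), t + \Delta t)$$

• Thus can rewrite the cost calculation as:

$$J^\star(\mathbf{x}(t), t) = \min_{\substack{\mathbf{u}(\tau) \in \mathcal{U} \\ t \le \tau \le t + \Delta t}} \left\{ \int_{t}^{t + \Delta t} g(\mathbf{x}, \mathbf{u}, \tau)\, d\tau + J^\star(\mathbf{x}(t + \Delta t), t + \Delta t) \right\}$$

• Assuming $J^\star(\mathbf{x}(t + \Delta t), t + \Delta t)$ has bounded second derivatives in both arguments, can expand this cost as a Taylor series about $\mathbf{x}(t), t$

$$J^\star(\mathbf{x}(t + \Delta t), t + \Delta t) \approx J^\star(\mathbf{x}(t), t) + \left[ \frac{\partial J^\star}{\partial t}(\mathbf{x}(t), t) \right] \Delta t + \left[ \frac{\partial J^\star}{\partial \mathbf{x}}(\mathbf{x}(t), t) \right] (\mathbf{x}(t + \Delta t) - \mathbf{x}(t))$$

– Which for small $\Delta t$ can be compactly written as:

$$J^\star(\mathbf{x}(t + \Delta t), t + \Delta t) \approx J^\star(\mathbf{x}(t), t) + J^\star_t(\mathbf{x}(t), t)\, \Delta t + J^\star_{\mathbf{x}}(\mathbf{x}(t), t)\, a(\mathbf{x}(t), \mathbf{u}(t), t)\, \Delta t$$

• Substitute this into the cost calculation with...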
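The first-order expansion of the cost-to-go can be checked numerically. The sketch below is an illustration under stated assumptions, not the lecture's own example: the dynamics `a`, the smooth stand-in function `J`, and its hand-coded partial derivatives `J_t` and `J_x` are all hypothetical choices made so that the exact value at $t + \Delta t$ is available in closed form. It compares the exact value with the compact first-order form $J + J_t \Delta t + J_x\, a\, \Delta t$ for shrinking $\Delta t$.

```python
import math


# Illustrative (assumed) problem data, not taken from the lecture.
def a(x, u, t):
    return -x + u                 # dynamics x_dot = a(x, u, t)


def J(x, t):
    return (2.0 - t) * x**2       # a smooth stand-in for the cost-to-go


def J_t(x, t):
    return -x**2                  # partial derivative of J with respect to t


def J_x(x, t):
    return 2.0 * (2.0 - t) * x    # partial derivative of J with respect to x


def x_after(x, u, dt):
    """Exact solution of x_dot = -x + u over dt with u held constant."""
    return u + (x - u) * math.exp(-dt)


def expansion_error(x, u, t, dt):
    """Gap between J at the propagated state and its first-order expansion."""
    exact = J(x_after(x, u, dt), t + dt)
    first_order = J(x, t) + J_t(x, t) * dt + J_x(x, t) * a(x, u, t) * dt
    return abs(exact - first_order)


x, u, t = 1.0, 0.5, 0.0
errors = {dt: expansion_error(x, u, t, dt) for dt in (0.1, 0.01, 0.001)}
for dt, err in errors.items():
    print(f"dt = {dt:<6} error = {err:.3e}")
```

For this assumed example the error should shrink by roughly a factor of 100 each time $\Delta t$ shrinks by a factor of 10, which is what one expects when only second-order and higher terms are dropped.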

This note was uploaded on 11/07/2011 for the course AERO 16.323 taught by Professor Jonathan How during the Spring '08 term at MIT.
