Even when a system follows nonlinear dynamics, it is often possible to design an effective control strategy by optimizing a linear approximation. For example, contexts involving stabilization of physical systems, as illustrated in our next example, are often amenable to such an approach.

Example 1.6.3. (cart-pole dynamics) Consider a cart-pole system, as illustrated in Figure 1.2. The cart can move only along a horizontal rail, and a force $F$ can be applied to accelerate the cart in either direction.

[Figure 1.2: A cart-pole system.]

Let $x_\tau$ and $\theta_\tau$ denote the position of the cart and the angle of the pole at time $\tau$. The position and angle are measured in meters and radians, while time is in seconds. The time derivatives $\dot{x}_\tau$ and $\dot{\theta}_\tau$ represent velocity and angular velocity. The continuous-time nonlinear dynamics of the system (obtained from OpenAI Gym) are:

\[
\ddot{\theta}_\tau = \frac{(m+M)\, g \sin\theta_\tau - \left(F_\tau + m l \dot{\theta}_\tau^2 \sin\theta_\tau\right)\cos\theta_\tau}{l\left(\tfrac{4}{3}(m+M) - m\cos^2\theta_\tau\right)},
\]

\[
\ddot{x}_\tau = \frac{\tfrac{4}{3}\left(m l \dot{\theta}_\tau^2 \sin\theta_\tau + F_\tau\right) - m g \sin\theta_\tau \cos\theta_\tau}{\tfrac{4}{3}(m+M) - m\cos^2\theta_\tau},
\]

where $m$ is the mass of the pole, $M$ is the mass of the cart, $l$ is the length of the pole, and $g$ is the gravitational acceleration.
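As a rough illustration, the two acceleration formulas above can be evaluated directly. The sketch below is a minimal transcription using only the standard library; the function name `cartpole_accelerations` and the default parameter values are assumptions chosen for illustration and do not come from the text.

```python
import math


def cartpole_accelerations(theta, theta_dot, force, m=0.1, M=1.0, l=0.5, g=9.8):
    """Return (x_ddot, theta_ddot) for the cart-pole equations of Example 1.6.3.

    theta      pole angle in radians
    theta_dot  angular velocity in radians per second
    force      horizontal force F applied to the cart
    m, M, l, g pole mass, cart mass, pole length, gravity
               (default values are illustrative assumptions)
    """
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    # Denominator shared by both expressions: (4/3)(m + M) - m cos^2(theta).
    denom = (4.0 / 3.0) * (m + M) - m * cos_t ** 2
    theta_ddot = (
        (m + M) * g * sin_t - (force + m * l * theta_dot ** 2 * sin_t) * cos_t
    ) / (l * denom)
    x_ddot = (
        (4.0 / 3.0) * (m * l * theta_dot ** 2 * sin_t + force)
        - m * g * sin_t * cos_t
    ) / denom
    return x_ddot, theta_ddot


# Usage: a pole tilted slightly from upright, with no force applied.
x_ddot, theta_ddot = cartpole_accelerations(theta=0.05, theta_dot=0.0, force=0.0)
print(x_ddot, theta_ddot)
```

With a small positive tilt and no applied force, the angular acceleration comes out positive, so the pole falls further from upright. This is the instability that a controller has to counteract.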
© Benjamin Van Roy

Approximating the continuous progression of time over discrete periods of duration $\Delta$, we can write the evolution of state as

\[
\begin{bmatrix} x_{\tau+\Delta} \\ \dot{x}_{\tau+\Delta} \\ \theta_{\tau+\Delta} \\ \dot{\theta}_{\tau+\Delta} \end{bmatrix}
=
\begin{bmatrix} x_\tau \\ \dot{x}_\tau \\ \theta_\tau \\ \dot{\theta}_\tau \end{bmatrix}
+
\begin{bmatrix}
\dot{x}_\tau \\[4pt]
\dfrac{\tfrac{4}{3}\left(m l \dot{\theta}_\tau^2 \sin\theta_\tau + F_\tau\right) - m g \sin\theta_\tau \cos\theta_\tau}{\tfrac{4}{3}(m+M) - m\cos^2\theta_\tau} \\[10pt]
\dot{\theta}_\tau \\[4pt]
\dfrac{(m+M)\, g \sin\theta_\tau - \left(F_\tau + m l \dot{\theta}_\tau^2 \sin\theta_\tau\right)\cos\theta_\tau}{l\left(\tfrac{4}{3}(m+M) - m\cos^2\theta_\tau\right)}
\end{bmatrix} \Delta.
\]

Letting $t = \tau/\Delta$, we can write the evolution as $s_{t+1} = h(s_t, u_t)$, with $u_t = F_{t\Delta}$,

\[
s_t = \begin{bmatrix} x_{t\Delta} \\ \dot{x}_{t\Delta} \\ \theta_{t\Delta} \\ \dot{\theta}_{t\Delta} \end{bmatrix}
\quad \text{and} \quad
h(s_t, u_t) = s_t +
\begin{bmatrix}
s_{t,2} \\[4pt]
\dfrac{\tfrac{4}{3}\left(m l s_{t,4}^2 \sin s_{t,3} + u_t\right) - m g \sin s_{t,3} \cos s_{t,3}}{\tfrac{4}{3}(m+M) - m\cos^2 s_{t,3}} \\[10pt]
s_{t,4} \\[4pt]
\dfrac{(m+M)\, g \sin s_{t,3} - \left(u_t + m l s_{t,4}^2 \sin s_{t,3}\right)\cos s_{t,3}}{l\left(\tfrac{4}{3}(m+M) - m\cos^2 s_{t,3}\right)}
\end{bmatrix} \Delta.
\]

The cart-pole system follows nonlinear dynamics since the system function $h$ depends nonlinearly on the state variables.

When the aim is to maintain stability of a nonlinear system, it is often helpful to work with a linear approximation obtained via a first-order Taylor expansion around the equilibrium. Consider a nonlinear system function $h$ and, for simplicity, a desired equilibrium $(s^*, u^*)$ taking values $(0, 0)$. A first-order Taylor expansion around $(0, 0)$ gives

\[
h(s, u) \approx h(s^*, u^*) + \left(\nabla_{[s\ u]} h(0, 0)\right)^\top \begin{bmatrix} s \\ u \end{bmatrix}
= \left(\nabla_{[s\ u]} h(0, 0)\right)^\top \begin{bmatrix} s \\ u \end{bmatrix}
= As + Bu,
\]

for some matrices $A$ and $B$. The constant term drops out because $h(s^*, u^*) = s^* = 0$ at the equilibrium. This approximation is accurate for sufficiently small $s$ and $u$. Returning to the cart-pole example, we can offer a concrete case and formulate a natural quadratic control problem involving the linear approximation.

Example 1.6.4. (cart-pole linear approximation) Consider the pole balanced at state $s^* = [\,0\ 0\ 0\ 0\,]^\top$. Suppose we wish to keep the pole nearly balanced in the face of disturbances that nudge the state by small amounts. One approach to generating controls involves linearizing around $s^* = 0$ and $u^* = 0$.
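Via a first-order Taylor expansion, we obtain suitable matrices

\[
A = I +
\begin{bmatrix}
0 & 1 & 0 & 0 \\[4pt]
0 & 0 & \dfrac{-mg}{\tfrac{4}{3}(m+M) - m} & 0 \\[10pt]
0 & 0 & 0 & 1 \\[4pt]
0 & 0 & \dfrac{(m+M)\, g}{l\left(\tfrac{4}{3}(m+M) - m\right)} & 0
\end{bmatrix} \Delta
\quad \text{and} \quad
B =
\begin{bmatrix}
0 \\[4pt]
\dfrac{\tfrac{4}{3}}{\tfrac{4}{3}(m+M) - m} \\[10pt]
0 \\[4pt]
\dfrac{-1}{l\left(\tfrac{4}{3}(m+M) - m\right)}
\end{bmatrix} \Delta.
\]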
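One way to sanity-check these matrices is to compare them with finite-difference Jacobians of the Euler-step function $h$ at the origin. The sketch below does this with the standard library only. The parameter values, the step `dt` (standing in for $\Delta$), and the helper names `h`, `analytic_AB`, and `numeric_AB` are all assumptions made for illustration, not part of the text.

```python
import math

# Illustrative parameters (assumed values, not taken from the text).
m, M, l, g, dt = 0.1, 1.0, 0.5, 9.8, 0.02


def h(s, u):
    """One Euler step of duration dt; s = [x, x_dot, theta, theta_dot], u = force."""
    _, x_dot, theta, theta_dot = s
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    denom = (4.0 / 3.0) * (m + M) - m * cos_t ** 2
    x_ddot = ((4.0 / 3.0) * (m * l * theta_dot ** 2 * sin_t + u)
              - m * g * sin_t * cos_t) / denom
    theta_ddot = ((m + M) * g * sin_t
                  - (u + m * l * theta_dot ** 2 * sin_t) * cos_t) / (l * denom)
    rates = [x_dot, x_ddot, theta_dot, theta_ddot]
    return [si + ri * dt for si, ri in zip(s, rates)]


def analytic_AB():
    """A and B as written in Example 1.6.4 (A already includes the identity)."""
    d = (4.0 / 3.0) * (m + M) - m
    A = [[1.0, dt, 0.0, 0.0],
         [0.0, 1.0, -m * g / d * dt, 0.0],
         [0.0, 0.0, 1.0, dt],
         [0.0, 0.0, (m + M) * g / (l * d) * dt, 1.0]]
    B = [0.0, (4.0 / 3.0) / d * dt, 0.0, -1.0 / (l * d) * dt]
    return A, B


def numeric_AB(eps=1e-6):
    """Central finite-difference Jacobians of h at (s, u) = (0, 0)."""
    zero = [0.0] * 4
    A = [[0.0] * 4 for _ in range(4)]
    for j in range(4):
        plus, minus = zero[:], zero[:]
        plus[j], minus[j] = eps, -eps
        hp, hm = h(plus, 0.0), h(minus, 0.0)
        for i in range(4):
            A[i][j] = (hp[i] - hm[i]) / (2 * eps)
    hp, hm = h(zero, eps), h(zero, -eps)
    B = [(hp[i] - hm[i]) / (2 * eps) for i in range(4)]
    return A, B


A, B = analytic_AB()
A_num, B_num = numeric_AB()
max_err = max(
    max(abs(A[i][j] - A_num[i][j]) for i in range(4) for j in range(4)),
    max(abs(B[i] - B_num[i]) for i in range(4)),
)
print("largest analytic-vs-numeric difference:", max_err)

# Compare one nonlinear step with the linear prediction As + Bu near the origin.
s, u = [0.0, 0.0, 0.01, 0.0], 0.1
linear = [sum(A[i][j] * s[j] for j in range(4)) + B[i] * u for i in range(4)]
gap = max(abs(a - b) for a, b in zip(h(s, u), linear))
print("nonlinear-vs-linear gap for a small state:", gap)
```

Both printed numbers should be small. The first indicates that the closed-form entries agree with the numerical derivatives of $h$; the second illustrates that the linear model tracks the nonlinear step closely when the state and control are near zero.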
To encourage balancing over some planning horizon $T$ without expending
