Even when a system follows nonlinear dynamics, it is often possible to design an effective control strategy by optimizing a linear approximation. For example, contexts involving stabilization of physical systems, as illustrated in our next example, are often amenable to such an approach.

Example 1.6.3. (cart-pole dynamics) Consider a cart-pole system, as illustrated in Figure 1.2. The cart can move only along a horizontal rail, and a force can be applied to accelerate the cart in either direction.

Figure 1.2: A cart-pole system.

Let $x_\tau$ and $\theta_\tau$ denote the position of the cart and angle of the pole at time $\tau$. The position and angle are measured in meters and radians, while time is in seconds. The time derivatives $\dot{x}_\tau$ and $\dot{\theta}_\tau$ represent velocity and angular velocity. The continuous-time nonlinear dynamics of the system (obtained from OpenAI Gym) are:

$$\ddot{\theta}_\tau = \frac{(m+M)\,g\sin\theta_\tau - \left(F_\tau + m l \dot{\theta}_\tau^2 \sin\theta_\tau\right)\cos\theta_\tau}{l\left(\frac{4}{3}(m+M) - m\cos^2\theta_\tau\right)},$$

$$\ddot{x}_\tau = \frac{\frac{4}{3}\left(m l \dot{\theta}_\tau^2 \sin\theta_\tau + F_\tau\right) - m g \sin\theta_\tau\cos\theta_\tau}{\frac{4}{3}(m+M) - m\cos^2\theta_\tau},$$

where $F_\tau$ is the force applied to the cart, $m$ is the mass of the pole, $M$ is the mass of the cart, $l$ is the length of the pole, and $g$ is the gravitational acceleration.
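As a minimal sketch, the two acceleration formulas above can be evaluated directly. The function name `cartpole_accelerations` and the default parameter values are illustrative assumptions, not taken from the text.

```python
import math


def cartpole_accelerations(theta, theta_dot, force, m=0.1, M=1.0, l=0.5, g=9.8):
    """Return (x_acc, theta_acc) for the continuous-time cart-pole dynamics.

    The defaults for m, M, l and g are illustrative assumptions.
    """
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    # Term shared by both formulas: F + m * l * theta_dot^2 * sin(theta).
    push = force + m * l * theta_dot ** 2 * sin_t
    # Shared denominator: (4/3)(m + M) - m * cos^2(theta).
    denom = 4.0 / 3.0 * (m + M) - m * cos_t ** 2
    theta_acc = ((m + M) * g * sin_t - push * cos_t) / (l * denom)
    x_acc = (4.0 / 3.0 * push - m * g * sin_t * cos_t) / denom
    return x_acc, theta_acc
```

With the pole upright and at rest, a positive force should accelerate the cart in the positive direction and tip the pole the other way, which is what `cartpole_accelerations(0.0, 0.0, 1.0)` returns under these assumed parameters.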
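© Benjamin Van Roy

Approximating the continuous progression of time over discrete periods of duration $\Delta$, we can write the evolution of state as

$$\begin{bmatrix} x_{\tau+\Delta} \\ \dot{x}_{\tau+\Delta} \\ \theta_{\tau+\Delta} \\ \dot{\theta}_{\tau+\Delta} \end{bmatrix} = \begin{bmatrix} x_\tau \\ \dot{x}_\tau \\ \theta_\tau \\ \dot{\theta}_\tau \end{bmatrix} + \begin{bmatrix} \dot{x}_\tau \\ \dfrac{\frac{4}{3}\left(m l \dot{\theta}_\tau^2 \sin\theta_\tau + F_\tau\right) - m g \sin\theta_\tau\cos\theta_\tau}{\frac{4}{3}(m+M) - m\cos^2\theta_\tau} \\ \dot{\theta}_\tau \\ \dfrac{(m+M)\,g\sin\theta_\tau - \left(F_\tau + m l \dot{\theta}_\tau^2 \sin\theta_\tau\right)\cos\theta_\tau}{l\left(\frac{4}{3}(m+M) - m\cos^2\theta_\tau\right)} \end{bmatrix} \Delta.$$

Letting $t = \tau/\Delta$, we can write the evolution as $s_{t+1} = h(s_t, u_t)$, with $u_t = F_{t\Delta}$,

$$s_t = \begin{bmatrix} x_{t\Delta} \\ \dot{x}_{t\Delta} \\ \theta_{t\Delta} \\ \dot{\theta}_{t\Delta} \end{bmatrix}$$

and

$$h(s_t, u_t) = s_t + \begin{bmatrix} s_{t,2} \\ \dfrac{\frac{4}{3}\left(m l s_{t,4}^2 \sin s_{t,3} + u_t\right) - m g \sin s_{t,3}\cos s_{t,3}}{\frac{4}{3}(m+M) - m\cos^2 s_{t,3}} \\ s_{t,4} \\ \dfrac{(m+M)\,g\sin s_{t,3} - \left(u_t + m l s_{t,4}^2 \sin s_{t,3}\right)\cos s_{t,3}}{l\left(\frac{4}{3}(m+M) - m\cos^2 s_{t,3}\right)} \end{bmatrix} \Delta.$$

The cart-pole system follows nonlinear dynamics since the system function $h$ depends nonlinearly on state variables. When the aim is to maintain stability of a nonlinear system, it is often helpful to work with a linear approximation obtained via a first-order Taylor expansion around the equilibrium. Consider a nonlinear system function $h$, and for simplicity, a desired equilibrium $(s^*, u^*)$ taking values $(0, 0)$, so that $h(s^*, u^*) = s^* = 0$. A first-order Taylor expansion around $(0, 0)$ gives

$$h(s, u) \approx h(s^*, u^*) + \left(\nabla_{\left[\begin{smallmatrix} s \\ u \end{smallmatrix}\right]} h(0,0)\right)^\top \begin{bmatrix} s \\ u \end{bmatrix} = \left(\nabla_{\left[\begin{smallmatrix} s \\ u \end{smallmatrix}\right]} h(0,0)\right)^\top \begin{bmatrix} s \\ u \end{bmatrix} = As + Bu,$$

for some matrices $A$ and $B$. This approximation is accurate for sufficiently small $s$ and $u$. Returning to the cart-pole example, we can offer a concrete case and formulate a natural quadratic control problem involving the linear approximation.

Example 1.6.4. (cart-pole linear approximation) Consider the pole balanced at state $s^* = [\,0\ \ 0\ \ 0\ \ 0\,]^\top$. Suppose we wish to keep the pole nearly balanced in the face of disturbances that nudge the state by small amounts. One approach to generating controls involves linearizing around $s^* = 0$ and $u^* = 0$. Via a first-order Taylor expansion, we obtain suitable matrices

$$A = I + \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & \dfrac{-mg}{\frac{4}{3}(m+M) - m} & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & \dfrac{(m+M)\,g}{l\left(\frac{4}{3}(m+M) - m\right)} & 0 \end{bmatrix} \Delta \qquad \text{and} \qquad B = \begin{bmatrix} 0 \\ \dfrac{\frac{4}{3}}{\frac{4}{3}(m+M) - m} \\ 0 \\ \dfrac{-1}{l\left(\frac{4}{3}(m+M) - m\right)} \end{bmatrix} \Delta.$$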
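The linearization can be checked numerically against the nonlinear system function. The sketch below is one way to do so, not a definitive implementation. The parameter values, the step length `DT` (standing in for $\Delta$), and the helper names `h`, `linearize`, `linear_step` and `max_gap` are all assumptions introduced for illustration.

```python
import math

# Illustrative parameter values (assumptions, not taken from the text).
m, M, l, g, DT = 0.1, 1.0, 0.5, 9.8, 0.02


def h(s, u):
    """One discrete step s_{t+1} = h(s_t, u_t), with s = [x, x_dot, theta, theta_dot]."""
    _, x_dot, th, th_dot = s
    sin_t, cos_t = math.sin(th), math.cos(th)
    push = u + m * l * th_dot ** 2 * sin_t
    denom = 4.0 / 3.0 * (m + M) - m * cos_t ** 2
    x_acc = (4.0 / 3.0 * push - m * g * sin_t * cos_t) / denom
    th_acc = ((m + M) * g * sin_t - push * cos_t) / (l * denom)
    rates = [x_dot, x_acc, th_dot, th_acc]
    return [si + DT * ri for si, ri in zip(s, rates)]


def linearize():
    """Closed-form A (4x4, as nested lists) and B (length 4) around (s, u) = (0, 0)."""
    d = 4.0 / 3.0 * (m + M) - m
    A = [
        [1.0, DT, 0.0, 0.0],
        [0.0, 1.0, -m * g / d * DT, 0.0],
        [0.0, 0.0, 1.0, DT],
        [0.0, 0.0, (m + M) * g / (l * d) * DT, 1.0],
    ]
    B = [0.0, 4.0 / 3.0 / d * DT, 0.0, -1.0 / (l * d) * DT]
    return A, B


def linear_step(A, B, s, u):
    """Evaluate the linear approximation A s + B u."""
    return [sum(a * x for a, x in zip(row, s)) + b * u for row, b in zip(A, B)]


def max_gap(s, u):
    """Largest componentwise difference between h(s, u) and A s + B u."""
    A, B = linearize()
    return max(abs(p - q) for p, q in zip(h(s, u), linear_step(A, B, s, u)))
```

Calling `max_gap` on a state near the origin and on one with the pole tilted by a full radian illustrates the claim that the approximation is accurate only for sufficiently small $s$ and $u$: the gap is negligible in the first case and substantial in the second.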
To encourage balancing over some planning horizon $T$ without expending