Inverted Pendulum Modeling and Control (Åström Ex. 2.2 & 9.6)

June 6, 2026 · Updated June 10, 2026

Tags: inverted pendulum, LQR, pole placement, state feedback, Åström, linearization, PID tuning

Why this tutorial exists

The Inverted Pendulum Simulator ships a one-click preset for Åström & Murray, Feedback Systems (2008), §2.2 Example 2.2 — the simplified pole-only model. Press the button and you’ll see the pole stay roughly upright, but the cart drifts indefinitely to one side. That’s not a bug; it’s the textbook lesson:

The Åström Example 2.2 model has no x-control and no friction. The cart has no equilibrium, so any practical controller needs to close the loop on all four states (cart position, pole angle, cart velocity, pole angular velocity), not just on θ and ω.

This page walks through the whole modeling and control design process, from Newton’s laws on the cart-pole to a state-feedback gain that actually keeps the cart where you put it. The Python code is copy-pastable; the simulator tie-in at the bottom shows you what gains to plug in to reproduce the result in the browser.

Prerequisites

  • Linear algebra: matrix exponential, controllability
  • Basic control: state-space form $\dot{x} = Ax + Bu$, eigenvalues
  • python-control ≥ 0.10 and scipy
  • Helpful but not required: read Getting Started with PID Tuning first

1. The cart-pole, fully assembled

A cart of mass $M$ slides on a horizontal rail. A pole of mass $m$ and length $2l$ (so the CoM is at distance $l$ from the pivot) is hinged on top of the cart. A horizontal force $F$ is applied to the cart. The pole angle $\theta$ is measured from the upright, positive in the fall-right direction. Viscous friction coefficients $b_{\text{cart}}$ and $b_{\text{pivot}}$ resist motion in the cart and at the pivot.

Choosing $q = (p, \theta)$ as the configuration and writing Lagrange’s equations gives the form below. It is Åström & Murray eq. 2.9 written in this page’s sign convention ($\theta$ positive in the fall-right direction), with $J_t = J + ml^2$, $c = b_{\text{cart}}$ and $\gamma = b_{\text{pivot}}$. The rigid-body inertia term is kept here; we’ll take the point-mass limit $J = 0 \Rightarrow J_t = ml^2$ in a moment:

$$\begin{pmatrix} M + m & ml\cos\theta \\ ml\cos\theta & J_t \end{pmatrix} \begin{pmatrix} \ddot{p} \\ \ddot{\theta} \end{pmatrix} + \begin{pmatrix} c\dot{p} - ml\dot{\theta}^2\sin\theta \\ \gamma\dot{\theta} \end{pmatrix} = \begin{pmatrix} F \\ mgl\sin\theta \end{pmatrix}$$

For the rest of this tutorial we set $c = \gamma = 0$ (no friction) to match the textbook’s cleanest form, and use the point-mass moment of inertia $J_t = ml^2$. (A uniform rod of length $2l$ has $J = \tfrac{1}{3}ml^2$ about its centre of mass, so $J_t = \tfrac{4}{3}ml^2$ about the pivot; the simpler point-mass form is what the simulator integrates.) Solving the two coupled equations for $\ddot{p}$ and $\ddot{\theta}$ gives the decoupled form:

$$\ddot{p} = \frac{F + ml\dot{\theta}^2\sin\theta - mg\sin\theta\cos\theta}{M + m\sin^2\theta}, \qquad \ddot{\theta} = \frac{(M + m)\,g\sin\theta - (F + ml\dot{\theta}^2\sin\theta)\cos\theta}{l\,(M + m\sin^2\theta)}$$

(See the simulator’s EOM section for the version with friction and the textbook citation. The two forms are algebraically identical.)
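As a cross-check on the algebra, here is a minimal sketch of the frictionless point-mass equations of motion, solved for the accelerations under this page's sign convention ($\theta$ positive fall-right). The function names `cartpole_rhs` and `numeric_jacobians` are illustrative, not the simulator's code. Differentiating numerically at the upright should recover the $A$ and $B$ matrices of §2.

```python
import numpy as np

def cartpole_rhs(x, F, M=2.0, m=0.2, l=0.5, g=9.81):
    """Frictionless point-mass cart-pole; x = (p, theta, p_dot, theta_dot)."""
    p, th, p_dot, th_dot = x
    s, c = np.sin(th), np.cos(th)
    # Cart acceleration from the two coupled Lagrange equations
    p_ddot = (F + m * l * th_dot**2 * s - m * g * s * c) / (M + m * s**2)
    # Pole acceleration from the pivot equation: l*th_ddot = g*sin - p_ddot*cos
    th_ddot = (g * s - p_ddot * c) / l
    return np.array([p_dot, th_dot, p_ddot, th_ddot])

def numeric_jacobians(f, x0, u0, eps=1e-6):
    """Central-difference A = df/dx and B = df/du at the point (x0, u0)."""
    n = len(x0)
    A = np.zeros((n, n))
    for i in range(n):
        dx = np.zeros(n)
        dx[i] = eps
        A[:, i] = (f(x0 + dx, u0) - f(x0 - dx, u0)) / (2 * eps)
    B = ((f(x0, u0 + eps) - f(x0, u0 - eps)) / (2 * eps)).reshape(n, 1)
    return A, B

# Linearize at the upright equilibrium (all states zero, no force)
A_num, B_num = numeric_jacobians(cartpole_rhs, np.zeros(4), 0.0)
print(np.round(A_num, 3))
print(np.round(B_num, 3))
```

The default parameters are the example values used in §2; pass the preset slider values instead to linearize a different plant.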

2. Linearize at the upright

The upright is an equilibrium: $\theta = 0$, $\dot\theta = 0$, $\dot p = 0$, $F = 0$. We want to know what happens for small perturbations. Substitute $\sin\theta \to \theta$, $\cos\theta \to 1$, and drop the $\dot\theta^2$ terms (quadratic in small quantities):

$$(M + m)\,\ddot p + ml\,\ddot\theta = F, \qquad ml\,\ddot p + ml^2\,\ddot\theta = mgl\,\theta$$

Pick state $x = (p,\;\theta,\;\dot p,\;\dot\theta)^\top$ and input $u = F$. Solving for the accelerations:

$$\ddot p = -\frac{mg}{M}\,\theta + \frac{1}{M}\,F, \qquad \ddot\theta = \frac{g(M + m)}{Ml}\,\theta - \frac{1}{Ml}\,F$$

The linearized state-space form $\dot x = Ax + Bu$:

$$A = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & -\dfrac{mg}{M} & 0 & 0 \\ 0 & \dfrac{g(M + m)}{Ml} & 0 & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 \\ 0 \\ 1/M \\ -1/(Ml) \end{pmatrix}$$

For the example values used in §2 and §3a, $M = 2,\; m = 0.2,\; l = 0.5,\; g = 9.81$ (SI units; note that the simulator’s Ex 2.2 preset in §3½ uses different slider values):

$$A = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & -0.981 & 0 & 0 \\ 0 & 21.582 & 0 & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 \\ 0 \\ 0.5 \\ -1.0 \end{pmatrix}$$

Open-loop eigenvalues

$$\text{eig}(A) = \{0,\; 0,\; +4.646,\; -4.646\}$$

The $+4.646$ rad/s eigenvalue is the falling-pole instability, and $-4.646$ is its stable mirror image. The two zero eigenvalues are the cart’s double integrator: with no spring and no friction, nothing pulls the cart position back to zero or slows its velocity. The reachability matrix $[B,\;AB,\;A^2B,\;A^3B]$ has full rank (4), so the system is controllable and state feedback can move all four eigenvalues, including the unstable one. Good.
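The eigenvalue and reachability claims are quick to confirm numerically. The sketch below rebuilds $A$ and $B$ from the example values and uses only NumPy; the variable name `W_r` for the reachability matrix is an illustrative choice.

```python
import numpy as np

M, m, l, g = 2.0, 0.2, 0.5, 9.81
A = np.array([
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, -m * g / M, 0.0, 0.0],
    [0.0, g * (M + m) / (M * l), 0.0, 0.0],
])
B = np.array([[0.0], [0.0], [1 / M], [-1 / (M * l)]])

eigvals = np.linalg.eigvals(A)

# Reachability matrix [B, AB, A^2 B, A^3 B]
W_r = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(4)])
rank = np.linalg.matrix_rank(W_r)

print("largest real part of eig(A):", eigvals.real.max())
print("reachability rank:", rank)
```

A rank below 4 would mean some mode cannot be influenced by the force at all; here every mode, including the cart's double integrator, is reachable.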

3. State feedback design

The classical fix for a controllable linear system with one unstable mode is state feedback $u = -Kx$ that puts all four closed-loop poles in the open left half plane. Two standard ways to choose $K$ are pole placement (below) and LQR (§8a).

3a. Pole placement (Ackermann)

Pick the four desired closed-loop poles. A reasonable starting point for “moderately damped, a few seconds settling time” is:

import numpy as np
from scipy.signal import place_poles

M, m, l, g = 2.0, 0.2, 0.5, 9.81
A = np.array([
  [0, 0, 1, 0],
  [0, 0, 0, 1],
  [0, -m*g/M, 0, 0],
  [0, g*(M+m)/(M*l), 0, 0],
])
B = np.array([[0], [0], [1/M], [-1/(M*l)]])

# Two slow poles for the (cart-position) integrator-like behavior
# and two faster poles for the (pole-angle) unstable mode.
desired = np.array([-1.5 + 0.5j, -1.5 - 0.5j, -3.0 + 1.5j, -3.0 - 1.5j])
K = place_poles(A, B, desired).gain_matrix   # convention: u = -K x
print(np.round(K, 2))
# K ≈ [[ -2.87, -54.77,  -4.97, -11.48 ]]

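The sign of each gain matters. With the convention $u = -Kx$, the four entries of $K$ multiply $(p, \theta, \dot p, \dot\theta)^\top$. $K_1 < 0$ means “if the pole is leaning right, push the cart right to get under it” (regulate $\theta \to 0$). $K_0 < 0$ is less intuitive: a cart that sits too far right is first pushed further right, which tips the pole left, and the larger $\theta$ term then brings cart and pole back together toward $p = 0$. This initial wrong-way motion is characteristic of the cart-pole; flipping the sign of $K_0$ would make the constant term of the closed-loop characteristic polynomial negative and destabilize the loop.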
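Because the plant has a single input, the placing gain is unique, so it can also be found by matching characteristic-polynomial coefficients by hand. The sketch below does that for the §2 model and checks the result against the desired poles. The coefficient formulas in the comments are derived for this specific $A$, $B$ and are not a general-purpose routine.

```python
import numpy as np

M, m, l, g = 2.0, 0.2, 0.5, 9.81
a42 = g * (M + m) / (M * l)
A = np.array([
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, -m * g / M, 0.0, 0.0],
    [0.0, a42, 0.0, 0.0],
])
B = np.array([[0.0], [0.0], [1 / M], [-1 / (M * l)]])

desired = np.array([-1.5 + 0.5j, -1.5 - 0.5j, -3.0 + 1.5j, -3.0 - 1.5j])
# Desired polynomial s^4 + c3 s^3 + c2 s^2 + c1 s + c0
_, c3, c2, c1, c0 = np.poly(desired).real

# det(sI - A + BK) for this plant, with K = [k0, k1, k2, k3]:
#   s^3: k2/M - k3/(M l)        s^2: k0/M - k1/(M l) - a42
#   s^1: -g k2/(M l)            s^0: -g k0/(M l)
k0 = -c0 * M * l / g
k2 = -c1 * M * l / g
k3 = (k2 / M - c3) * M * l
k1 = (k0 / M - a42 - c2) * M * l
K = np.array([[k0, k1, k2, k3]])

closed_loop = np.linalg.eigvals(A - B @ K)
print(np.round(K, 2))
print(np.round(closed_loop, 3))
```

The constant-term formula makes the sign rule explicit: a stable design needs $-g\,k_0/(Ml) > 0$, so $k_0$ must be negative.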

3½. Same physics, different coordinates: Ex 2.2 vs Ex 9.6

The two presets on the simulator’s preset bar, Ex 2.2 and Ex 9.6, look very different (different masses, different default gains, very different time scale) but they are the same cart-pole equation in two different unit systems. The simulator’s 4-state nonlinear EOM (Spong eq. 5.6–5.7, point-mass pole with $I = ml^2$) is integrated in both cases. The presets only change the slider values of $(M, m, l, b_{\text{cart}}, b_{\text{pivot}})$.

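Linearized about the upright and frictionless, the theta-channel of either preset reduces to a single-input single-output plant. It is written here up to an overall sign: with the §2 convention ($F$ positive to the right, $\theta$ positive fall-right) the right-hand side carries a minus sign, which the controller’s sign absorbs and which does not move the poles:

$$\frac{\Theta(s)}{F(s)} = \frac{1}{Ml\,s^2 - (M+m)g}$$

with one open-loop pole in the right-half plane at

$$s = +\sqrt{\frac{(M+m)\,g}{M\,l}} \quad \text{(rad/s)}$$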

Plug in the two presets’ slider values and the open-loop pole is at very different rad/s:

| Preset | $M$, $m$ (kg), $l$ (m) | $Ml$ | $(M+m)g$ | Open-loop pole |
|---|---|---|---|---|
| Ex 2.2 | 0.5, 0.2, 0.3 | 0.15 | 6.867 | ±6.77 rad/s (≈ 1.1 Hz) |
| Ex 9.6 | 0.001, 1, 1 | 0.001 | 9.820 | ±99.1 rad/s (≈ 15.8 Hz) |

The Ex 9.6 plant is ~15× faster than Ex 2.2 in raw time. That’s the only intrinsic difference between the two as far as the linearized theta-channel goes.
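The two pole locations follow directly from the formula above. A small sketch, with the helper name `unstable_pole` chosen here for illustration:

```python
import math

def unstable_pole(M, m, l, g=9.81):
    """Open-loop right-half-plane pole of the linearized theta-channel, in rad/s."""
    return math.sqrt((M + m) * g / (M * l))

# Preset slider values (M, m, l) as listed on this page
presets = {"Ex 2.2": (0.5, 0.2, 0.3), "Ex 9.6": (0.001, 1.0, 1.0)}
poles = {name: unstable_pole(*values) for name, values in presets.items()}

for name, w in poles.items():
    print(f"{name}: {w:.2f} rad/s = {w / (2 * math.pi):.2f} Hz")
print(f"speed ratio: {poles['Ex 9.6'] / poles['Ex 2.2']:.1f}x")
```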

Where does the 15× come from? It is a choice of time unit. Define a normalized time

$$\tau = t \cdot \sqrt{\frac{(M+m)\,g}{M\,l}} = t \cdot \omega_{\text{unstable}}$$

i.e. measure time in units of $1/\omega_{\text{unstable}}$, the time constant of the unstable mode. Writing $s_\tau = s/\omega_{\text{unstable}}$ for the Laplace variable that goes with $\tau$, the plant becomes

$$\frac{\Theta}{F} = \frac{1}{(M+m)\,g\,(s_\tau^2 - 1)}$$

so the pole in the $\tau$-frame is at $\pm 1$, independent of $M, m, l$. This is the unit system Åström uses in §8.3 and §9.6 when he writes $P(s) = 1/(s^2-1)$ with the cart on a massless pivot. It is not different physics; it is the same physics with the time axis rescaled.

Consequence for tuning. In the $\tau$-frame, the closed-loop poles depend on three dimensionless gain ratios:

$$\tilde K_p = \frac{K_p}{(M+m)\,g}, \qquad \tilde K_d = \frac{K_d}{\sqrt{(M+m)\,g\,M l}}, \qquad \tilde N = \frac{N}{\omega_{\text{unstable}}}$$

Two plants give the same closed-loop pole locations (in $\tau$-units) iff these three ratios are equal. Concretely, if you have a working PID on one preset, the matching PID on the other preset is:

$$K_p^{\text{(B)}} = K_p^{\text{(A)}} \cdot \frac{M_{\text{B}} + m_{\text{B}}}{M_{\text{A}} + m_{\text{A}}}$$

$$K_d^{\text{(B)}} = K_d^{\text{(A)}} \cdot \sqrt{\frac{(M_{\text{B}} + m_{\text{B}})\, M_{\text{B}}\, l_{\text{B}}}{(M_{\text{A}} + m_{\text{A}})\, M_{\text{A}}\, l_{\text{A}}}}$$

$$N^{\text{(B)}} = N^{\text{(A)}} \cdot \frac{\omega_{\text{unstable, B}}}{\omega_{\text{unstable, A}}}$$

Numerically, mapping the current Ex 9.6 gains $K_p = 25,\ K_d = 1,\ N = 50$ over to the Ex 2.2 plant ($M{=}0.5$, $m{=}0.2$, $l{=}0.3$) gives

$$\boxed{K_p \approx 17.5,\quad K_d \approx 10.2,\quad N \approx 3.4}$$

Plug those into the Ex 2.2 preset and the closed-loop behavior (in real time) is exactly what you see on the Ex 9.6 preset — just ~15× slower, which is what “same physics in different units” means.
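The mapping is mechanical enough to script. The sketch below implements the three ratio formulas; `map_gains` and its argument layout are illustrative names, and gravity cancels out of every ratio so it does not appear.

```python
import math

def map_gains(Kp, Kd, N, src, dst):
    """Translate (Kp, Kd, N) from plant src=(M, m, l) to plant dst=(M, m, l)
    so that the three dimensionless tau-frame ratios stay the same."""
    Ma, ma, la = src
    Mb, mb, lb = dst
    Kp_b = Kp * (Mb + mb) / (Ma + ma)
    Kd_b = Kd * math.sqrt((Mb + mb) * Mb * lb / ((Ma + ma) * Ma * la))
    # omega_unstable / sqrt(g) for each plant; g cancels in the ratio
    w_a = math.sqrt((Ma + ma) / (Ma * la))
    w_b = math.sqrt((Mb + mb) / (Mb * lb))
    N_b = N * w_b / w_a
    return Kp_b, Kd_b, N_b

ex96 = (0.001, 1.0, 1.0)
ex22 = (0.5, 0.2, 0.3)
Kp_22, Kd_22, N_22 = map_gains(25.0, 1.0, 50.0, src=ex96, dst=ex22)
print(f"Kp = {Kp_22:.1f}, Kd = {Kd_22:.1f}, N = {N_22:.1f}")
```

Swapping `src` and `dst` maps in the other direction, which is a convenient round-trip check.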

Why the same PID feels different on the two presets. The presets’ default gains are not $\tau$-frame equivalents of each other. Evaluated on its own plant, the Ex 2.2 default ($K_p = 100$, $K_d = 30$, $N = 20$) has $\tilde K_p \approx 14.6$ and $\tilde K_d \approx 29.6$, about 5.7× and 2.9× the Ex 9.6 default’s $\tilde K_p \approx 2.55,\ \tilde K_d \approx 10.1$. If you copy the raw Ex 2.2 numbers onto the Ex 9.6 plant without rescaling, the mismatch is larger still: $\tilde K_p \approx 10.2$, $\tilde K_d \approx 303$, and $\tilde N \approx 0.2$, which puts the derivative-filter corner well below the unstable pole. Gains that far from the design point are why the Ex 2.2 values can destabilize the Ex 9.6 plant even though the physical setup is “the same.”

Where the two presets are physically different:

  1. Cart friction $b_{\text{cart}}$. This is the only physical difference that the $\tau$-frame mapping above does not absorb. Ex 9.6 has $b_{\text{cart}} = 1$ (a strong damper on the cart’s runaway mode). Ex 2.2 has $b_{\text{cart}} = 0$, so the cart position $x$ is a free integrator and the cart will drift indefinitely under any pole-only feedback. The scalar PD on $(\theta, \omega)$ cannot regulate the cart, only the pole: this is the lesson from the introduction.

    The reason this matters in the simulator and not in the textbook $\tau$-frame analysis: the textbook models the 2-state plant (theta, omega only). The simulator integrates the 4-state plant (cart x, cart v, pole theta, pole omega). With $b_{\text{cart}} = 0$ the linearized 4-state has eigenvalues $\{0, 0, +6.77, -6.77\}$ rad/s, two of which sit at the origin (the cart-position double integrator). The theta-only feedback closes the loop on the $\pm 6.77$ pole pair but does not move the two eigenvalues at the origin at all. Any tiny force from numerical noise eventually excites the double-integrator mode and the cart runs off toward $\pm\infty$.

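    Adding $b_{\text{cart}} > 0$ breaks the double-integrator degeneracy, but in the linearized model it does not stabilize the cart on its own. One eigenvalue stays at zero, because the cart position never enters the dynamics. The other leaves the origin, but the lowest-order coefficient of the remaining characteristic polynomial is $-b_{\text{cart}}\,g/(Ml)$, which PD feedback on $(\theta, \omega)$ alone cannot change, so a slow unstable real mode remains. Friction changes how fast the cart runs away under pole-only PD; it does not remove the runaway. The proper fix is a 4-state controller (see §8).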

  2. What the page’s plant-TF box shows. When Ex 9.6 is active, the displayed plant is switched to the textbook $1/(s^2-1)$ form in normalized units (poles at $\pm 1$ per normalized time unit, where one normalized time unit is $\sqrt{L/g} \approx 0.32$ s for $L = 1$ m). The simulator below the math is still integrating the 4-state plant in raw SI units, so the two views are in different unit systems. The page has a long note under the plant-TF box spelling this out.
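The double-integrator statement in item 1 can be checked on the linearized, frictionless Ex 2.2 plant. The sketch below assumes an ideal (unfiltered) PD, written in the §2 sign convention as $F = k_p\theta + k_d\omega$, which is the same as $u = -Kx$ with zeros in the cart columns of $K$. Both the sign convention and the omission of the derivative filter are simplifying assumptions, not the simulator's exact law.

```python
import numpy as np

M, m, l, g = 0.5, 0.2, 0.3, 9.81      # Ex 2.2 preset sliders, b_cart = 0
A = np.array([
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, -m * g / M, 0.0, 0.0],
    [0.0, g * (M + m) / (M * l), 0.0, 0.0],
])
B = np.array([[0.0], [0.0], [1 / M], [-1 / (M * l)]])

kp, kd = 17.5, 10.2                   # tau-mapped gains from above
# Pole-only feedback: no entries for p or p_dot
K = np.array([[0.0, -kp, 0.0, -kd]])

eigs = np.linalg.eigvals(A - B @ K)
n_at_origin = int(np.sum(np.abs(eigs) < 1e-4))
print(np.round(eigs, 3))
print("eigenvalues left at the origin:", n_at_origin)
```

The pole pair moves into the left half plane, while the two cart eigenvalues stay exactly where they were, whatever values of `kp` and `kd` are chosen.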

Worked example: Ex 2.2 with the $\tau$-mapped gains. Plug $K_p = 18,\ K_d = 10,\ N = 3$ into the Ex 2.2 preset and press Start. Hold $b_{\text{cart}} = 0$ (the default) and watch for 20 s, sampling every second (selected samples shown):

| t (s) | cart $x$ (m) | pole $\theta$ (deg) | peak $|\theta|$ (deg) |
|---|---|---|---|
| 0 | 0.000 | 5.7 | 5.7 |
| 1 | 1.290 | −326.1 | 326.1 |
| 6 | −464.710 | −310.3 | 338.2 |
| 15 | −3632.080 | −337.9 | 338.2 |
| 20 | (off-screen) | | 338.2 |

The cart runs away (peak $|x| = 3632$ m at $t = 15$ s) and the pole has rotated almost a full turn within the first second. This is not a gain-mapping bug; it is the unregulated cart mode described above. The mapping only promises matching behavior for the linearized theta-channel: the same $\zeta \approx 0.07$ lightly-damped response as Ex 9.6, on a 15× slower clock. A 338° peak angle is far outside the small-angle regime where that prediction holds, and the cart is in runaway because $b_{\text{cart}} = 0$.

Now repeat the run for several values of $b_{\text{cart}}$ ($b_{\text{cart}} = 1$ matches Ex 9.6):

| $b_{\text{cart}}$ | peak $|x|$ (m) at $t = 20$ s | peak $|\theta|$ (deg) |
|---|---|---|
| 0 (Ex 2.2 default) | 3632 (runaway) | 338 |
| 0.1 | 3452 (runaway) | 382 |
| 0.5 | 399 (bounded, large) | 382 |
| 1.0 (matches Ex 9.6) | 506 (bounded) | 363 |
| 2.0 (slider max) | 283 (bounded) | 523 |

In this sweep the cart excursion is labeled runaway for $b_{\text{cart}} \le 0.1$ and bounded from $b_{\text{cart}} = 0.5$ upward, although neither peak shrinks monotonically with friction; $b_{\text{cart}} = 1$ is simply the value that matches Ex 9.6. The response is still $\zeta \approx 0.07$ lightly-damped (visible ringing), but it is the same $\tau$-frame behavior as Ex 9.6 stretched out 15× in real time. To get a better-damped response, retune the $\tau$-frame gains for the desired pole locations (see the recipe below).

Recipe: use the simulator with matched gains, step by step.

  1. Click Åström Ex. 2.2 (loads the finite-mass cart plant with $M{=}0.5$, $m{=}0.2$, $l{=}0.3$, $b_{\text{cart}}{=}0$, $K_p{=}100$, $K_d{=}30$, $N{=}20$).

  2. Set $b_{\text{cart}} = 1$ (cart friction slider). The default is 0 and the cart will run away regardless of PID gains (see the worked example above).

  3. Set $K_p = 18$, $K_d = 10$, $N = 3$ (the $\tau$-mapped gains that match Ex 9.6’s $K_p{=}25$, $K_d{=}1$, $N{=}50$). The pole stays near upright; the cart oscillates $\pm 0.5$ m with a $\approx 1.5$ s period (lightly damped, $\zeta \approx 0.07$).

  4. For a more conservative response, design the $\tau$-frame pole locations first. Place 3 $\tau$-frame poles at $s_\tau = -1,\ -1 + j,\ -1 - j$ (well damped, 1 $\tau$-rad/s bandwidth). The required $\tau$-frame gains are $\tilde K_p = 5/3,\ \tilde K_d = 10/9,\ \tilde N = 3$, which map to the Ex 2.2 plant as $\boxed{K_p \approx 11.4,\ K_d \approx 1.13,\ N \approx 20.3}$. Plug those in and the pole settles in about 1.5 s with very little ringing; the cart reaches a steady offset of a few cm and stays there.

    The same $\tau$-frame poles map to $K_p \approx 16.4$, $K_d \approx 0.11$, $N \approx 297$ on the Ex 9.6 plant. That $K_d$ is below the Ex 9.6 slider’s step size, so this recipe is really only practical on the Ex 2.2 plant.
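The $\tau$-frame design in step 4 can be reproduced by coefficient matching. The sketch below assumes the controller has the filtered-PD form $\tilde K_p + \tilde K_d \tilde N s/(s + \tilde N)$ acting on the normalized plant $1/(s^2 - 1)$, which gives a third-order closed loop. The helper `to_physical` is an illustrative name.

```python
import math
import numpy as np

# Desired tau-frame poles -> s^3 + c2 s^2 + c1 s + c0
_, c2, c1, c0 = np.poly([-1.0, -1.0 + 1.0j, -1.0 - 1.0j]).real

# Closed loop of 1/(s^2 - 1) with Kp + Kd*N*s/(s + N):
#   s^3 + N s^2 + (Kp - 1 + Kd*N) s + N (Kp - 1)
N_t = c2
Kp_t = 1.0 + c0 / N_t
Kd_t = (c1 - (Kp_t - 1.0)) / N_t

def to_physical(Kp_t, Kd_t, N_t, M, m, l, g=9.81):
    """Undo the tau-frame normalization for a plant with parameters (M, m, l)."""
    w_unstable = math.sqrt((M + m) * g / (M * l))
    Kp = Kp_t * (M + m) * g
    Kd = Kd_t * math.sqrt((M + m) * g * M * l)
    return Kp, Kd, N_t * w_unstable

gains_22 = to_physical(Kp_t, Kd_t, N_t, M=0.5, m=0.2, l=0.3)
gains_96 = to_physical(Kp_t, Kd_t, N_t, M=0.001, m=1.0, l=1.0)
print("tau-frame:", Kp_t, Kd_t, N_t)
print("Ex 2.2:", np.round(gains_22, 2))
print("Ex 9.6:", np.round(gains_96, 2))
```

Changing the pole list at the top is all it takes to try a different damping or bandwidth.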

Why “well-damped” still does not mean “cart stays at zero.” Even with $K_d = 1.13$ and $b_{\text{cart}} = 1$, the cart reaches a finite steady-state offset. The simulator’s PD on $(\theta, \omega)$ has no integral action on cart position, so any steady force on the cart (from a tilt, say) leaves a steady position error. The proper fix is a 4-state controller (see §8) or an integral term on cart position; neither is in the current simulator build. The page’s callout under the canvas flags this when you press Ex 2.2.

3b. The transfer-function form (Åström Ex. 9.6)

Chapters 2 and 5 of Åström work in state space; Chapter 8 introduces the transfer function as an alternative representation. The linearized cart-pole is a single-input, two-output system in general (force in; cart position and pole angle out), but if you make the cart massless ($M \to 0$) and take the cart’s acceleration as the input, the cart becomes a pure double integrator driven directly by the commanded acceleration, and the pole dynamics collapse to a single-input single-output plant whose only states are the pole angle and its rate.

This is the normalized form Åström uses in §8.3 and §9.6: $M = 0$, $m = 1$, $l = 1$, $g = 1$, and the input is an acceleration (not a force). Linearizing the pole EOM $\ddot\theta = (g/l)\sin\theta - (1/l)\,\ddot p \cos\theta$ at the upright gives $\ddot\theta = \theta - \ddot p$. Taking the input as $u = -\ddot p$, so that the loop is in standard negative-feedback form, this is $\ddot\theta = \theta + u$, i.e. the transfer function

$$P(s) = \frac{\Theta(s)}{U(s)} = \frac{1}{s^2 - 1}$$

This is the Example 9.6 plant. The unstable pole at $s = +1$ corresponds to the falling-pole mode (the same physical mode as the $+4.65$ rad/s eigenvalue in §2, just with a different time scaling).

Designing a PD controller. A PD compensator with zero at $s = -2$ has the form $C(s) = k(s + 2)$. The loop transfer function is

$$L(s) = C(s)P(s) = \frac{k(s + 2)}{s^2 - 1}$$

The closed-loop characteristic polynomial is the numerator of $1 + L(s)$:

$$s^2 - 1 + k(s + 2) = s^2 + k\,s + (2k - 1) = 0$$

By Routh–Hurwitz, this is stable iff $2k - 1 > 0$ and $k > 0$, i.e. $k > 0.5$. (The textbook quotes $k > 1$ as the practical bound; the formal stability boundary is $k > 0.5$.) For the textbook’s figure value $k = 2$ the polynomial is $s^2 + 2s + 3$, so the closed-loop poles are at $s = -1 \pm \sqrt{2}\,j \approx -1 \pm 1.414\,j$, giving a natural frequency of $\omega_n = \sqrt{3} \approx 1.73$ rad/s and a damping ratio of $\zeta = 1/\sqrt{3} \approx 0.577$.

Verify with python-control:

import control as ct
P = ct.tf([1], [1, 0, -1])         # P(s) = 1/(s^2 - 1)
C = ct.tf([2, 4], [1])             # C(s) = 2(s + 2)  (i.e. k = 2)
L = C * P
T = ct.feedback(L, 1)              # closed-loop
print(T.poles())                   # [-1+1.414j, -1-1.414j]

# Gain and phase margin from the open loop.
# ct.margin returns (gm, pm, wcg, wcp); pm is already in degrees.
gm, pm, wcg, wcp = ct.margin(L)
print(f'gain margin = {gm:.2f}, phase margin = {pm:.1f} deg')
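If python-control is not installed, the pole locations, natural frequency and damping ratio can be cross-checked with NumPy alone, straight from the characteristic polynomial $s^2 + ks + (2k - 1)$. This is a minimal sketch; the variable names are illustrative.

```python
import numpy as np

k = 2.0
# Characteristic polynomial s^2 + k s + (2k - 1)
poles = np.roots([1.0, k, 2 * k - 1])

# Standard second-order form s^2 + 2*zeta*wn*s + wn^2
wn = np.sqrt(2 * k - 1)
zeta = k / (2 * wn)

print("poles:", poles)
print(f"wn = {wn:.3f} rad/s, zeta = {zeta:.3f}")
# wn = 1.732 rad/s, zeta = 0.577
```

Sweeping `k` downward toward 0.5 shows the constant term shrinking to zero, which is the stability boundary found by Routh–Hurwitz.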

Bode / Nyquist interpretation. The Nyquist plot of $L(s) = 2(s+2)/(s^2-1)$ crosses the negative real axis at $L(0) = -2k = -4$ (for $k = 2$), and $L(\infty) = 0$. The curve makes one counterclockwise encirclement of the critical point $s = -1$, so $N = -1$. With $P = 1$ (one open-loop RHP pole), the Nyquist criterion gives $Z = N + P = 0$ closed-loop RHP poles. Same conclusion as Routh–Hurwitz.

The page’s PD form $F = K_p\theta - K_d\,d_f$ corresponds to $C(s) = K_d s + K_p$ in the Laplace domain, so to get $C(s) = k(s+2)$ in the simulator you set $K_p = 2k$ and $K_d = k$. For the textbook’s $k = 2$: $K_p = 4$, $K_d = 2$, $K_i = 0$. With $M = 0$ (massless cart) the simulator’s “force” $F$ is numerically equal, up to the sign convention, to the pivot acceleration, so the normalized form drops out automatically.

4. The control law in the simulator

The current simulator implements a scalar PID on θ and ω only (a PD when Ki = 0):

F = clamp(Kp · θ + Ki · ∫θ dt - Kd · d_f, -F_max, F_max)

with $\dot d_f = N(\omega - d_f)$ for the derivative filter. This is a scalar control law: it only sees the pole, not the cart. That’s why clicking Åström Ex. 2.2 leaves the cart drifting.
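In discrete time the law above might look like the following sketch. It copies the formula term by term, including its signs. The forward-Euler update of the filter state, the class name `FilteredPID`, and the `F_max` value are assumptions made for illustration; the simulator's actual integration scheme is not documented here.

```python
from dataclasses import dataclass

@dataclass
class FilteredPID:
    Kp: float
    Ki: float
    Kd: float
    N: float                # derivative-filter bandwidth, rad/s
    F_max: float            # force saturation limit
    integral: float = 0.0   # running integral of theta
    d_f: float = 0.0        # filtered angular rate

    def step(self, theta: float, omega: float, dt: float) -> float:
        """Advance the controller by dt and return the clamped force."""
        self.integral += theta * dt
        # Forward-Euler step of d_f' = N (omega - d_f)
        self.d_f += dt * self.N * (omega - self.d_f)
        F = self.Kp * theta + self.Ki * self.integral - self.Kd * self.d_f
        return max(-self.F_max, min(self.F_max, F))

pid = FilteredPID(Kp=4.0, Ki=0.0, Kd=2.0, N=50.0, F_max=10.0)
print(pid.step(theta=0.1, omega=0.0, dt=0.001))
```

For the Euler filter update to stay stable, `dt * N` needs to be well below 2; with `N = 50` a millisecond step is comfortable.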

To use the full state-feedback gain from §3, the simulator needs a four-state control law:

$$F = -(K_0\,p + K_1\,\theta + K_2\,\dot p + K_3\,\dot\theta)$$

with the cart position $p$ actively regulated to zero. This extension is planned for the next simulator update; in the meantime, you can reproduce the behavior with the LQR gains from §8a by running the Python code below against the linearized version of the equations the simulator uses.

5. Verify with python-control

import numpy as np
import control as ct

# Reuses A, B from §3a and K_lqr from §8a
A_cl = A - B @ K_lqr
sys_cl = ct.ss(A_cl, np.zeros((4, 1)), np.eye(4), np.zeros((4, 1)))
T = np.linspace(0, 5, 1000)
X0 = np.array([0, 0.1, 0, 0])   # (p, θ, p_dot, θ_dot): pole at 0.1 rad, cart at rest
t, y = ct.initial_response(sys_cl, T, X0)
# Outputs are indexed first, time second: y[0] = p(t), y[1] = θ(t), ...

You should see $p$ and $\theta$ both decay to zero after a brief transient, with a bounded cart excursion (not the runaway drift of the PD-only law).
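The linear check can be extended to the nonlinear equations of motion using SciPy alone. The sketch below recomputes the §8a LQR gain so that it runs on its own, then integrates the frictionless point-mass cart-pole from the same initial condition. The force is left unclamped and the 15 s horizon is an arbitrary choice, so treat it as an illustration rather than a reproduction of the simulator.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import solve_continuous_are

M, m, l, g = 2.0, 0.2, 0.5, 9.81
A = np.array([
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, -m * g / M, 0.0, 0.0],
    [0.0, g * (M + m) / (M * l), 0.0, 0.0],
])
B = np.array([[0.0], [0.0], [1 / M], [-1 / (M * l)]])

# Same weights as §8a
Q = np.diag([1.0, 50.0, 1.0, 1.0])
R = np.array([[0.05]])
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)

def rhs(t, x):
    """Nonlinear cart-pole under full state feedback F = -K x."""
    p, th, p_dot, th_dot = x
    F = -(K @ x).item()
    s, c = np.sin(th), np.cos(th)
    p_ddot = (F + m * l * th_dot**2 * s - m * g * s * c) / (M + m * s**2)
    th_ddot = (g * s - p_ddot * c) / l
    return [p_dot, th_dot, p_ddot, th_ddot]

sol = solve_ivp(rhs, (0.0, 15.0), [0.0, 0.1, 0.0, 0.0], rtol=1e-8, atol=1e-10)
print("peak |p| (m):", np.abs(sol.y[0]).max())
print("final state:", sol.y[:, -1])
```

Increasing the initial angle is a quick way to probe how far the linear design remains valid on the nonlinear plant.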

6. Take it to the simulator

The simulator now ships two Åström textbook presets:

  • Åström Ex. 2.2: the full state-space form with scalar PD on θ, ω only. Pole stays upright, cart drifts (see the introduction).
  • Åström Ex. 9.6: the normalized transfer-function form with the massless-cart assumption. The controller is $C(s) = k(s+2)$ with $k = 2$, which gives the textbook’s closed-loop poles at $s = -1 \pm 1.414\,j$ (damping ratio $1/\sqrt{3} \approx 0.577$). The controller is still a PD on $\theta$ only: the preset’s $b_{\text{cart}} = 1$ damps the cart’s motion (see §3½), but there is no feedback on cart position.

To reproduce the LQR result from §8a in the browser, the simulator needs the 4-state gain (Kp_x, Kp_θ, Kd_x, Kd_ω) = (−4.5, −82.5, −10.1, −16.2). The current PID interface only exposes scalar $K_p$, $K_d$ on $\theta$, so the closest approximation is:

  1. Open the simulator
  2. Set the cart mass, pole mass, half-length, and friction to the values the gain was designed for (M=2, m=0.2, L=0.5, b_cart=b_pivot=0). If the sliders do not reach those values, recompute K_lqr in §8a for the values you can set, e.g. the Ex 2.2 preset’s M=0.5, m=0.2, L=0.3
  3. Plug in $K_p = 82.5$, $K_d = 16.2$ and set $K_i = 0$. The cart-position and cart-velocity gains have nowhere to go (the simulator currently has no $p$ term in the PID), so they are simply dropped
  4. Click Start. The pole will be pulled back to upright, but expect the cart to drift: that deviation from true LQR is the missing $p$ and $\dot p$ terms.

For the Ex. 9.6 form, just click the button: the gains $K_p = 4$, $K_d = 2$ are loaded and the normalized masses $M = 0.001$, $m = 1$, $L = 1$ are set so the simulator’s “force” becomes the pivot acceleration.

7. What to take away

  • Åström Example 2.2 is a teaching model, not a working controller. Its value is in showing the structure of the inverted-pendulum EOM, not in producing a deployable design.
  • A real cart-pole controller closes the loop on all four states. Pole-only PD is enough to keep the pole upright for a while but cannot hold the cart still.
  • The simulator is intentionally minimal. It exposes the scalar PID knobs (Kp, Ki, Kd, N) so you can build intuition. The state-feedback extension above is the next planned step.
  • Ex 2.2 and Ex 9.6 are the same physics in different unit systems. A gain that works on one preset does not, in general, work on the other; see §3½ for the $\tau$-frame mapping that tells you how to translate PID gains between presets.

8. Other methods (not in the simulator)

The methods below are conceptually important and are the standard tooling for cart-pole control design, but the current simulator implements only the scalar PID. They are documented here for reference and to keep this tutorial complete; running them requires either python-control in a notebook or a future simulator build.

8a. LQR (more robust than manual pole placement)

LQR finds the gain $K$ that minimizes $\int_0^\infty (x^\top Q x + u^\top R\, u)\, dt$ for a chosen weighting. Pick $Q$ to penalize what you care about (pole angle is usually the priority) and $R$ to penalize control effort:

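import numpy as np
from scipy.linalg import solve_continuous_are

# A, B as defined in §3a
Q = np.diag([1, 50, 1, 1])   # weights on (p, θ, p_dot, θ_dot); θ weighted heavily
R = np.array([[0.05]])       # small R = cheap control, so larger forces are tolerated
P = solve_continuous_are(A, B, Q, R)
K_lqr = np.linalg.solve(R, B.T @ P)   # K = R^{-1} B^T P, applied as u = -K x
print(K_lqr)
# K_lqr = [[ -4.47, -82.55, -10.10, -16.21 ]]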

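For a controllable plant with $Q > 0$ and $R > 0$, as here, LQR guarantees closed-loop stability, and single-input full-state LQR also comes with guaranteed margins: at least 60° of phase margin and a gain margin from 1/2 to infinity. It’s usually a better starting point than hand-picked poles. The simulator would need a 4-state input gain $(K_p, K_\theta, K_{d,x}, K_{d,\theta})$ to use LQR; see §4 for the gap.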
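Two properties of the LQR gain can be checked in a few lines: the closed loop is stable, and for this plant the cart-position gain satisfies $|K_0| = \sqrt{Q_{11}/R}$, which follows from evaluating the LQR spectral-factorization identity at $s = 0$. The sketch below rebuilds everything so it runs standalone; the variable names are illustrative.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

M, m, l, g = 2.0, 0.2, 0.5, 9.81
A = np.array([
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, -m * g / M, 0.0, 0.0],
    [0.0, g * (M + m) / (M * l), 0.0, 0.0],
])
B = np.array([[0.0], [0.0], [1 / M], [-1 / (M * l)]])

Q = np.diag([1.0, 50.0, 1.0, 1.0])
R = np.array([[0.05]])
P = solve_continuous_are(A, B, Q, R)
K_lqr = np.linalg.solve(R, B.T @ P)

cl_eigs = np.linalg.eigvals(A - B @ K_lqr)
print("K_lqr:", np.round(K_lqr, 2))
print("closed-loop eigenvalues:", np.round(cl_eigs, 3))
print("sqrt(Q11/R):", np.sqrt(Q[0, 0] / R[0, 0]))
```

The second property gives a direct tuning handle: raising `Q[0, 0]` or lowering `R` stiffens the cart-position loop by a predictable amount.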

8b. Energy-based / swing-up control

For large initial angles $\theta_0$ (e.g. the pole hanging down at $\pi$ rad), a linear controller cannot stabilize: the linearized model is only valid near the upright. The classical fix is a two-mode controller: an energy-based swing-up law, chosen so that the energy error $V = \tfrac{1}{2}(E - E_{\text{up}})^2$ never increases, which drives the pole’s total mechanical energy to its upright value, and a switching rule that hands over to the linear balancing controller once $|\theta|$ is small enough that the linearization is valid (typically $|\theta| < 20°$).

The swing-up law for the cart-pole is

$$u = k \cdot \operatorname{sign}(\dot\theta \cos\theta) \cdot \big( E - E_{\text{up}} \big)$$

with $u$ the commanded cart acceleration, $E = \tfrac{1}{2} m l^2 \dot\theta^2 + m g l \cos\theta$ the pole’s total energy (the potential energy is highest at the upright because $\theta$ is measured from there), and $E_{\text{up}} = mgl$. The sign term drives the cart back and forth to inject energy into the pole, building up amplitude. Once near upright the linear controller takes over.
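To see the energy pumping at work, the sketch below simulates the pole alone with the cart acceleration as the input, $\ddot\theta = (g/l)\sin\theta - (u/l)\cos\theta$, starting just off the hanging position. It takes $\theta$ measured from the upright, so the potential energy is $+mgl\cos\theta$ and $E_{\text{up}} = mgl$. The gain `k`, the acceleration limit `u_max`, the fixed-step RK4 integrator and the 10 s horizon are all illustrative assumptions; no balancing controller is switched in.

```python
import numpy as np

m, l, g = 0.2, 0.5, 9.81
E_up = m * g * l                 # energy of the pole at rest in the upright position
k, u_max = 20.0, 10.0            # assumed swing-up gain and acceleration limit

def energy(th, th_dot):
    return 0.5 * m * l**2 * th_dot**2 + m * g * l * np.cos(th)

def control(th, th_dot):
    u = k * np.sign(th_dot * np.cos(th)) * (energy(th, th_dot) - E_up)
    return float(np.clip(u, -u_max, u_max))

def f(x):
    th, th_dot = x
    u = control(th, th_dot)
    return np.array([th_dot, (g * np.sin(th) - u * np.cos(th)) / l])

def rk4_step(x, dt):
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    return x + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6

x = np.array([np.pi - 0.05, 0.0])   # almost hanging straight down, at rest
E_start = energy(*x)
dt = 1e-3
for _ in range(10_000):             # 10 s of simulated time
    x = rk4_step(x, dt)
E_final = energy(*x)

print(f"E_start = {E_start:.3f} J, E_final = {E_final:.3f} J, E_up = {E_up:.3f} J")
```

In a full two-mode controller, the loop would also test $|\theta|$ (wrapped to $\pm\pi$) on each step and hand over to the linear state feedback once it drops below the switching threshold.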

8c. LQG: LQR with partial state observation

If only the pole angle $\theta$ and the cart position $p$ are measured (no rate sensors), you need an observer. The standard choice is a Kalman filter (LQE) running on the linearized $A, B, C$ to produce state estimates, which are then fed to an LQR controller. This is the LQG (Linear-Quadratic-Gaussian) design. It filters sensor noise and handles the situation where you can only measure position, not velocity. Unlike full-state LQR, though, it carries no guaranteed stability margins, so check the margins of the final loop.

The simulator currently assumes $\theta$ and $\omega$ are both available (the slider values for $K_d$ multiply $\omega$ directly), so an observer is not strictly needed for the experiments on the page. If you wanted to hook up a real sensor (e.g. an accelerometer on the cart, an encoder on the pivot), an observer would be the next step.
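A steady-state Kalman gain for the position-only measurement case can be sketched with the same Riccati solver used for LQR, applied to the dual problem. The noise covariances `W` and `V` below are placeholder assumptions, not measured sensor data; in practice they are tuning knobs, much like $Q$ and $R$.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

M, m, l, g = 2.0, 0.2, 0.5, 9.81
A = np.array([
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, -m * g / M, 0.0, 0.0],
    [0.0, g * (M + m) / (M * l), 0.0, 0.0],
])
# Measure cart position p and pole angle theta only
C = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
])

W = np.diag([1e-3, 1e-3, 1e-1, 1e-1])   # assumed process-noise covariance
V = np.diag([1e-4, 1e-4])               # assumed measurement-noise covariance

# Dual Riccati equation: A S + S A^T - S C^T V^-1 C S + W = 0
S = solve_continuous_are(A.T, C.T, W, V)
L_obs = S @ C.T @ np.linalg.inv(V)      # observer gain, shape (4, 2)

obs_eigs = np.linalg.eigvals(A - L_obs @ C)
print("observer gain:\n", np.round(L_obs, 2))
print("observer eigenvalues:", np.round(obs_eigs, 3))
```

The estimate then evolves as $\dot{\hat x} = A\hat x + Bu + L(y - C\hat x)$, and the controller uses $u = -K\hat x$ in place of the true state.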

References

  • Åström, K. J., & Murray, R. M. (2008). Feedback Systems: An Introduction for Scientists and Engineers. Princeton University Press. §2.2 Example 2.2, §6.3 State Feedback, Exercise 8.3 (linearized model), Exercise 8.13 (PD design).
  • Spong, M. W., Hutchinson, S., & Vidyasagar, M. (2006). Robot Modeling and Control. Wiley. Ch. 5 (decoupled EOM derivation).
  • Åström, K. J., & Hägglund, T. (2006). Advanced PID Control. ISA. (PID form with derivative filter.)
