Inverted Pendulum Modeling and Control (Åström Ex. 2.2 & 9.6)

Why this tutorial exists

The Inverted Pendulum Simulator ships a one-click preset for Åström & Murray, Feedback Systems (2008), §2.2 Example 2.2 — the simplified pole-only model. Press the button and you’ll see the pole stay roughly upright, but the cart drifts indefinitely to one side. That’s not a bug; it’s the textbook lesson:

The Åström Example 2.2 model has no x-control and no friction. The cart has no equilibrium, so any practical controller needs to close the loop on all four states (cart position, pole angle, cart velocity, pole angular velocity), not just on θ and ω.

This page walks through the whole modeling and control design process, from Newton’s laws on the cart-pole to a state-feedback gain that actually keeps the cart where you put it. The Python code is copy-pastable; the simulator tie-in at the bottom shows you what gains to plug in to reproduce the result in the browser.

Prerequisites

Linear algebra: matrix exponential, controllability
Basic control: state-space form $\dot{x} = Ax + Bu$ , eigenvalues
python-control ≥ 0.10 and scipy
Helpful but not required: read Getting Started with PID Tuning first

1. The cart-pole, fully assembled

A cart of mass $M$ slides on a horizontal rail. A pole of mass $m$ and length $2l$ (so the CoM is at distance $l$ from the pivot) is hinged on top of the cart. A horizontal force $F$ is applied to the cart. The pole angle $\theta$ is measured from the upright, positive in the fall-right direction. Viscous friction coefficients $b_{\text{cart}}$ and $b_{\text{pivot}}$ resist motion in the cart and at the pivot.

Choosing $q = (p, \theta)$ as the configuration and writing Lagrange’s equations gives the equations of motion below (Åström & Murray eq. 2.9, written here in the sign convention used for the linearization in §2, with $J_t = J + ml^2$ and the rigid-body inertia term kept; we’ll take the point-mass limit $J = 0 \Rightarrow J_t = ml^2$ in a moment):

\begin{pmatrix} M + m & ml\cos\theta \\ ml\cos\theta & J_t \end{pmatrix} \begin{pmatrix} \ddot{p} \\ \ddot{\theta} \end{pmatrix} + \begin{pmatrix} c\dot{p} - ml\dot{\theta}^2\sin\theta \\ \gamma\dot{\theta} \end{pmatrix} = \begin{pmatrix} F \\ mgl\sin\theta \end{pmatrix}

For the rest of this tutorial we set $c = \gamma = 0$ (no friction) to match the textbook’s cleanest form, and use the point-mass moment of inertia $J_t = ml^2$ (a uniform rod of length $2l$ about its end would give $J_t = \tfrac{4}{3}ml^2$ , but the simpler form is what the simulator integrates). Solving the two equations for $\ddot{p}$ and $\ddot{\theta}$ gives the explicit form used in the simulator:

\ddot{p} = \frac{F + ml\dot{\theta}^2\sin\theta - mg\sin\theta\cos\theta}{M + m\sin^2\theta}

\ddot{\theta} = \frac{(M + m)\,g\sin\theta - (F + ml\dot{\theta}^2\sin\theta)\cos\theta}{l\,(M + m\sin^2\theta)}

(See the simulator’s EOM section for the version with friction and the textbook citation. The two forms are algebraically identical.)

2. Linearize at the upright

The upright is an equilibrium: $\theta = 0$ , $\dot\theta = 0$ , $\dot p = 0$ , $F = 0$ . We want to know what happens for small perturbations. Substitute $\sin\theta \to \theta$ , $\cos\theta \to 1$ , and drop the $\dot\theta^2$ (quadratic in small quantities) terms:

(M + m)\ddot p + ml\,\ddot\theta = F

ml\,\ddot p + ml^2\,\ddot\theta = mgl\,\theta

Pick state $x = (p,\;\theta,\;\dot p,\;\dot\theta)^\top$ and input $u = F$ . Solving for the accelerations:

\ddot p = -\frac{mg}{M}\,\theta + \frac{1}{M}\,F

\ddot\theta = \frac{g(M + m)}{Ml}\,\theta - \frac{1}{Ml}\,F

The linearized state-space form $\dot x = Ax + Bu$ :

A = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & -\dfrac{mg}{M} & 0 & 0 \\ 0 & \dfrac{g(M + m)}{Ml} & 0 & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 \\ 0 \\ 1/M \\ -1/(Ml) \end{pmatrix}

For the canonical Åström values $M = 2,\; m = 0.2,\; l = 0.5,\; g = 9.81$ :

A = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & -0.981 & 0 & 0 \\ 0 & 21.582 & 0 & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 \\ 0 \\ 0.5 \\ -1.0 \end{pmatrix}
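The numbers in $A$ and $B$ can be cross-checked by linearizing the nonlinear model numerically. The sketch below is one way to do that, assuming the frictionless point-mass model in the sign convention of the linearized equations above; the function names `cartpole_rhs` and `linearize` are illustrative and not part of the simulator.

```python
import numpy as np

M, m, l, g = 2.0, 0.2, 0.5, 9.81

def cartpole_rhs(x, F):
    """Nonlinear frictionless cart-pole; state x = (p, theta, p_dot, theta_dot)."""
    _, theta, p_dot, theta_dot = x
    s, c = np.sin(theta), np.cos(theta)
    # Mass matrix and right-hand side of the two coupled equations of motion
    mass = np.array([[M + m, m * l * c],
                     [m * l * c, m * l**2]])
    rhs = np.array([F + m * l * theta_dot**2 * s,
                    m * g * l * s])
    p_ddot, theta_ddot = np.linalg.solve(mass, rhs)
    return np.array([p_dot, theta_dot, p_ddot, theta_ddot])

def linearize(f, x0, u0, eps=1e-6):
    """Central-difference Jacobians of f with respect to state and input."""
    n = len(x0)
    A = np.zeros((n, n))
    for i in range(n):
        dx = np.zeros(n)
        dx[i] = eps
        A[:, i] = (f(x0 + dx, u0) - f(x0 - dx, u0)) / (2 * eps)
    B = ((f(x0, u0 + eps) - f(x0, u0 - eps)) / (2 * eps)).reshape(n, 1)
    return A, B

A_num, B_num = linearize(cartpole_rhs, np.zeros(4), 0.0)

# Closed-form entries from the linearization in this section
A_ref = np.array([[0, 0, 1, 0],
                  [0, 0, 0, 1],
                  [0, -m * g / M, 0, 0],
                  [0, g * (M + m) / (M * l), 0, 0]])
B_ref = np.array([[0], [0], [1 / M], [-1 / (M * l)]])

print(np.round(A_num, 3))
print(np.round(B_num, 3))
```

If the two pairs of matrices agree to within the finite-difference error, the small-angle substitution and the sign conventions are consistent with each other.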

Open-loop eigenvalues

\text{eig}(A) = \{0,\; 0,\; +4.646,\; -4.646\}

The $+4.646$ rad/s eigenvalue is the falling-pole instability, and $-4.646$ is its stable mirror image. The two zero eigenvalues are the cart’s double integrator: with no spring or friction acting on the cart, neither its position nor its velocity is pulled back to zero. The reachability matrix $[B,\;AB,\;A^2B,\;A^3B]$ has full rank 4, so the system is controllable and state feedback can move all four eigenvalues. Good.
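Both statements are easy to confirm numerically. The following is a minimal check with NumPy, using the parameter values from this section; the variable names are arbitrary.

```python
import numpy as np

M, m, l, g = 2.0, 0.2, 0.5, 9.81
A = np.array([[0, 0, 1, 0],
              [0, 0, 0, 1],
              [0, -m * g / M, 0, 0],
              [0, g * (M + m) / (M * l), 0, 0]])
B = np.array([[0], [0], [1 / M], [-1 / (M * l)]])

# Open-loop eigenvalues: a double zero plus a +/- real pair
eigvals = np.linalg.eigvals(A)

# Reachability matrix [B, AB, A^2 B, A^3 B] and its rank
reach = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(4)])
rank = np.linalg.matrix_rank(reach)

print("eigenvalues:", np.round(eigvals, 3))
print("reachability rank:", rank)
```

A rank of 4 is what licenses the pole-placement and LQR designs later on the page.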

3. State feedback design

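The classical fix for a controllable linear system with an unstable mode is state feedback $u = -Kx$ , with $K$ chosen so that all four closed-loop poles lie in the open left half-plane. Two standard ways to choose $K$ are pole placement (below) and LQR (§8a):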

3a. Pole placement (Ackermann)

Pick the four desired closed-loop poles. A reasonable starting point for “moderately damped, a few seconds settling time” is:

import numpy as np
from scipy.signal import place_poles

M, m, l, g = 2.0, 0.2, 0.5, 9.81
A = np.array([
  [0, 0, 1, 0],
  [0, 0, 0, 1],
  [0, -m*g/M, 0, 0],
  [0, g*(M+m)/(M*l), 0, 0],
])
B = np.array([[0], [0], [1/M], [-1/(M*l)]])

# Two slow poles for the (cart-position) integrator-like behavior
# and two faster poles for the (pole-angle) unstable mode.
desired = np.array([-1.5 + 0.5j, -1.5 - 0.5j, -3.0 + 1.5j, -3.0 - 1.5j])
K = place_poles(A, B, desired).gain_matrix
print(K)
# K ≈ [[ -2.87, -54.77, -4.97, -11.48 ]]   (feedback law u = -K x)

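The sign of each gain matters. The control law is $F = -Kx$ , and the four entries of $K$ multiply $(p, \theta, \dot p, \dot\theta)^\top$ . $K_1 < 0$ means “if the pole is leaning right, push the cart right to get under it” (regulate $\theta \to 0$ ). $K_0 < 0$ is less intuitive: a cart that sits too far right is first pushed slightly further right, which tips the pole to the left, and the much larger angle term then drives the cart back toward $p = 0$ . The rate gains $K_2$ and $K_3$ add damping to the same two actions.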
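For a single-input system the placed gain is unique, so it can be cross-checked by matching coefficients of the closed-loop characteristic polynomial by hand. The sketch below does that for this particular $A$, $B$ structure and compares the result with `scipy.signal.place_poles`; the closed-form expressions for `k0` to `k3` are derived for this model only and are not a general formula.

```python
import numpy as np
from scipy.signal import place_poles

M, m, l, g = 2.0, 0.2, 0.5, 9.81
A = np.array([[0, 0, 1, 0],
              [0, 0, 0, 1],
              [0, -m * g / M, 0, 0],
              [0, g * (M + m) / (M * l), 0, 0]])
B = np.array([[0], [0], [1 / M], [-1 / (M * l)]])
desired = np.array([-1.5 + 0.5j, -1.5 - 0.5j, -3.0 + 1.5j, -3.0 - 1.5j])

# Desired polynomial s^4 + a3 s^3 + a2 s^2 + a1 s + a0
a3, a2, a1, a0 = np.real(np.poly(desired))[1:]

# Coefficient matching for det(sI - A + BK) with K = [k0, k1, k2, k3]
c = g * (M + m) / (M * l)          # open-loop "stiffness" of the pole mode
k0 = -a0 * M * l / g               # constant term
k2 = -a1 * M * l / g               # s^1 term
k1 = l * k0 - M * l * (c + a2)     # s^2 term
k3 = l * k2 - M * l * a3           # s^3 term
K_manual = np.array([[k0, k1, k2, k3]])

K_scipy = place_poles(A, B, desired).gain_matrix
closed_loop = np.linalg.eigvals(A - B @ K_manual)

print("K (coefficient matching):", np.round(K_manual, 2))
print("K (place_poles):         ", np.round(K_scipy, 2))
print("closed-loop poles:", np.round(closed_loop, 3))
```

All four entries come out negative, which is the sign pattern discussed above.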

3½. Same physics, different coordinates: Ex 2.2 vs Ex 9.6

The two presets on the simulator’s preset bar — Ex 2.2 and Ex 9.6 — look very different (different masses, different default gains, very different time scale) but they are the same cart-pole equation in two different unit systems. The simulator’s 4-state nonlinear EOM (Spong eq. 5.6–5.7, point-mass pole with $I = ml^2$ ) is integrated in both cases. The presets only change the slider values of $(M, m, l, b_{\text{cart}}, b_{\text{pivot}})$ .

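Linearized about the upright, frictionless, the theta-channel of either preset reduces to a single-input single-output plant (up to an overall sign: with $F$ counted positive to the right as in §2, $\Theta/F$ carries a minus sign, which the feedback sign convention absorbs):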

\frac{\Theta(s)}{F(s)} \;=\; \frac{1}{Ml\,s^2 - (M+m)g}

with one open-loop pole in the right-half plane at

s \;=\; +\sqrt{\frac{(M+m)\,g}{M\,l}} \quad \text{(rad/s)}

Plug in the two presets’ slider values and the open-loop pole is at very different rad/s:

Preset	$M$ , $m$ , $l$	$Ml$	$(M+m)g$	Open-loop pole
Ex 2.2	0.5, 0.2, 0.3	0.15	6.867	±6.77 rad/s (≈ 1.1 Hz)
Ex 9.6	0.001, 1, 1	0.001	9.820	±99.1 rad/s (≈ 15.8 Hz)

The Ex 9.6 plant is ~15× faster than Ex 2.2 in raw time. That’s the only intrinsic difference between the two as far as the linearized theta-channel goes.

Where does the 15× come from? It is a choice of time unit. Define a normalized time

\tau \;=\; t \cdot \sqrt{\frac{(M+m)\,g}{M\,l}} \;=\; t \cdot \omega_{\text{unstable}}

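i.e. measure time in units of the unstable pole’s time constant $1/\omega_{\text{unstable}}$ . Writing $s_\tau = s/\omega_{\text{unstable}}$ for the Laplace variable in the $\tau$ -frame, the plant becomes

\frac{\Theta}{F} \;=\; \frac{1}{(M+m)\,g\,(s_\tau^2 - 1)}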

so the pole in the $\tau$ -frame is at $\pm 1$ , independent of $M, m, l$ . This is the unit system Åström uses in §8.3 and §9.6 when he writes $P(s) = 1/(s^2-1)$ with the cart on a massless pivot. It is not a different physics — it is the same physics with the time axis rescaled.

Consequence for tuning. In the $\tau$ -frame, the closed-loop poles depend on three dimensionless gain ratios:

\tilde K_p \;=\; \frac{K_p}{(M+m)\,g}, \qquad \tilde K_d \;=\; \frac{K_d}{\sqrt{(M+m)\,g\,M l}}, \qquad \tilde N \;=\; \frac{N}{\omega_{\text{unstable}}}

Two plants give the same closed-loop pole locations (in $\tau$ -units) iff these three ratios are equal. Concretely, if you have a working PID on one preset, the matching PID on the other preset is:

K_p^{\text{(B)}} \;=\; K_p^{\text{(A)}} \cdot \frac{(M_{\text{B}} + m_{\text{B}})}{(M_{\text{A}} + m_{\text{A}})}

K_d^{\text{(B)}} \;=\; K_d^{\text{(A)}} \cdot \sqrt{\frac{(M_{\text{B}} + m_{\text{B}})\, M_{\text{B}}\, l_{\text{B}}} {(M_{\text{A}} + m_{\text{A}})\, M_{\text{A}}\, l_{\text{A}}}}

N^{\text{(B)}} \;=\; N^{\text{(A)}} \cdot \frac{\omega_{\text{unstable, B}}}{\omega_{\text{unstable, A}}}

Numerically, mapping the current Ex 9.6 gains $K_p = 25,\ K_d = 1,\ N = 50$ over to the Ex 2.2 plant ( $M{=}0.5, m{=}0.2, l{=}0.3$ ) gives

\boxed{K_p \approx 17.5,\quad K_d \approx 10.2,\quad N \approx 3.4}

Plug those into the Ex 2.2 preset and the pole’s closed-loop response has the same shape as on the Ex 9.6 preset, just ~15× slower in real time, which is what “same physics in different units” means. (The cart behaves differently, because the two presets differ in $b_{\text{cart}}$ ; see below.)
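The mapping is short enough to script. The helper below is a sketch of the three scaling rules above; `unstable_pole` and `map_pid` are illustrative names, and plants are passed as `(M, m, l)` tuples with the preset values quoted in the table.

```python
import math

G = 9.81  # m/s^2

def unstable_pole(M, m, l, g=G):
    """Open-loop unstable pole of the linearized theta-channel, in rad/s."""
    return math.sqrt((M + m) * g / (M * l))

def map_pid(gains, plant_a, plant_b):
    """Translate (Kp, Kd, N) from plant A to plant B at equal tau-frame ratios."""
    Kp, Kd, N = gains
    Ma, ma, la = plant_a
    Mb, mb, lb = plant_b
    Kp_b = Kp * (Mb + mb) / (Ma + ma)
    Kd_b = Kd * math.sqrt((Mb + mb) * Mb * lb / ((Ma + ma) * Ma * la))
    N_b = N * unstable_pole(*plant_b) / unstable_pole(*plant_a)
    return Kp_b, Kd_b, N_b

ex22 = (0.5, 0.2, 0.3)     # M, m, l for the Ex 2.2 preset
ex96 = (0.001, 1.0, 1.0)   # M, m, l for the Ex 9.6 preset

print("open-loop poles:", unstable_pole(*ex22), unstable_pole(*ex96))
mapped = map_pid((25.0, 1.0, 50.0), ex96, ex22)
print("Ex 9.6 gains mapped onto Ex 2.2:", [round(v, 2) for v in mapped])
```

Swapping the two plant arguments maps gains in the other direction.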

Why the same PID feels different on the two presets. On its own plant, the current Ex 2.2 default ( $K_p = 100$ , $K_d = 30$ , $N = 20$ ) corresponds to $\tilde K_p \approx 14.6$ and $\tilde K_d \approx 29.6$ , versus $\tilde K_p \approx 2.55,\ \tilde K_d \approx 10.1$ for the Ex 9.6 default: about 5.7× and 2.9× larger. Loaded unchanged onto the Ex 9.6 plant, the same slider values become $\tilde K_p \approx 10.2$ , $\tilde K_d \approx 303$ and $\tilde N \approx 0.20$ , so the derivative filter is now slower than the unstable pole. Identical slider values therefore describe very different $\tau$ -frame controllers, which is why the Ex 2.2 gains can destabilize the Ex 9.6 plant even though the physical setup is “the same.”

Where the two presets are physically different:

Cart friction $b_{\text{cart}}$ . This is the only physical difference that the $\tau$ -frame mapping above does not absorb. Ex 9.6 has $b_{\text{cart}} = 1$ (a strong damper on the cart’s runaway mode). Ex 2.2 has $b_{\text{cart}} = 0$ , so the cart position $x$ is a free integrator and the cart will drift indefinitely under any pole-only feedback. The scalar PD on $(\theta, \omega)$ cannot regulate the cart, only the pole; this is the lesson from “Why this tutorial exists” at the top of the page.

The reason this matters in the simulator and not in the textbook $\tau$ -frame analysis: the textbook models the 2-state plant (theta, omega only). The simulator integrates the 4-state plant (cart x, cart v, pole theta, pole omega). With $b_{\text{cart}} = 0$ the linearized 4-state has eigenvalues $\{0, 0, +6.77, -6.77\}$ rad/s — two of them are on the imaginary axis (the cart-position double integrator). The theta-only feedback closes the loop on the unstable pole pair but does not move the two imaginary-axis eigenvalues at all. Any tiny force from numerical noise eventually excites the double-integrator mode and the cart runs off to $\pm\infty$ at growing speed.
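The eigenvalue structure described here, and what changes once cart friction is switched on, can be checked on the linearized 4-state model. The sketch below assumes friction enters as a term $b_{\text{cart}}\dot p$ in the cart equation, which adds the entries $-b/M$ and $b/(Ml)$ to the velocity column of $A$; `linear_A` and `count_zero` are illustrative helpers, not simulator code.

```python
import numpy as np

def linear_A(M, m, l, b_cart, g=9.81):
    """Linearized open-loop A for state (p, theta, p_dot, theta_dot)."""
    return np.array([
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
        [0.0, -m * g / M, -b_cart / M, 0.0],
        [0.0, (M + m) * g / (M * l), b_cart / (M * l), 0.0],
    ])

def count_zero(eigs, tol=1e-4):
    """Number of eigenvalues that are numerically at the origin."""
    return int(np.sum(np.abs(eigs) < tol))

eig_no_friction = np.linalg.eigvals(linear_A(0.5, 0.2, 0.3, b_cart=0.0))
eig_friction = np.linalg.eigvals(linear_A(0.5, 0.2, 0.3, b_cart=1.0))

print("b_cart = 0:", np.round(np.sort(eig_no_friction.real), 3))
print("b_cart = 1:", np.round(np.sort(eig_friction.real), 3))
print("zero eigenvalues:", count_zero(eig_no_friction), count_zero(eig_friction))
```

With friction, one eigenvalue remains at the origin because nothing in the model depends on the cart position itself.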

Adding $b_{\text{cart}} > 0$ breaks the double-integrator degeneracy: in the open-loop plant, one of the two zero eigenvalues moves to a real, negative value, so the cart velocity now decays instead of persisting. The other stays at zero, because nothing in the plant depends on the cart position itself. A stabilized loop then leaves the cart at a finite offset instead of running away, which makes the system at least practical to work with under pole-only PD. The proper fix is a 4-state controller (see §8).

What the page’s plant-TF box shows. When Ex 9.6 is active, the displayed plant is switched to the textbook $1/(s^2-1)$ form in normalized units (poles at $\pm 1$ in normalized time, where one normalized time unit is $\sqrt{L/g} \approx 0.32$ s for $L = 1$ m). The simulator below the math is still integrating the 4-state plant in raw SI units, so the two views are in different unit systems. The page has a long note under the plant-TF box spelling this out.

Worked example: Ex 2.2 with the $\tau$ -mapped gains. Plug $K_p = 18,\ K_d = 10,\ N = 3$ into the Ex 2.2 preset and press Start. Hold $b_{\text{cart}} = 0$ (the default) and watch for 20 s, sampling every second:

t (s)	cart $x$ (m)	pole $\theta$ (deg)	peak $|\theta|$ (deg)
0	0.000	5.7	5.7
1	1.290	−326.1	326.1
6	−464.710	−310.3	338.2
15	−3632.080	−337.9	338.2
20	(off-screen)	—	338.2

The cart accelerates monotonically (peak $|x| = 3632$ m at $t = 15$ s) and the pole has fully rotated within the first second. This is not a gain-mapping bug — it’s the unobservable cart mode I described above. The matching gains give the $\tau$ -frame behavior the mapping promised: a 338° peak angle is the same $\zeta \approx 0.07$ lightly-damped response as Ex 9.6, just on a 15× slower clock. The cart, however, is runaway because $b_{\text{cart}} = 0$ .

Now do the same with $b_{\text{cart}} = 1$ (matching Ex 9.6):

$b_{\text{cart}}$	peak $|x|$ (m) at $t = 20$ s	peak $|\theta|$ (deg)
0 (Ex 2.2 default)	3632 (runaway)	338
0.1	3452 (runaway)	382
0.5	399 (bounded, large)	382
1.0 (matches Ex 9.6)	506 (bounded)	363
2.0 (slider max)	283 (bounded)	523

$b_{\text{cart}} = 1$ is the value that matches Ex 9.6; in the sweep above the cart runs away for $b_{\text{cart}} \le 0.1$ and is marked bounded from $b_{\text{cart}} = 0.5$ upward. The response is still $\zeta \approx 0.07$ lightly-damped (visible ringing), but it is the same $\tau$ -frame behavior as Ex 9.6 stretched out 15× in real time. To get a well-damped response, retune the $\tau$ -frame gains for the desired pole locations; see the recipe below.

Recipe: use the simulator with matched gains, step by step.

Click Åström Ex. 2.2 (loads the finite-mass cart plant with $M{=}0.5, m{=}0.2, l{=}0.3, b_{\text{cart}}{=}0$ , $K_p{=}100, K_d{=}30, N{=}20$ ).
Set $b_{\text{cart}} = 1$ (cart friction slider). The default is 0 and the cart will run away regardless of PID gains — see the worked example above.
Set $K_p = 18, K_d = 10, N = 3$ (the $\tau$ -mapped gains that match Ex 9.6’s $K_p{=}25, K_d{=}1, N{=}50$ ). The pole stays near upright; the cart oscillates $\pm 0.5$ m with a $\approx 1.5$ s period (lightly damped, $\zeta \approx 0.07$ ).
For a more conservative response, design the $\tau$ -frame pole locations first. Place 3 $\tau$ -frame poles at $s_\tau = -1,\ -1 + j,\ -1 - j$ (well damped, 1 $\tau$ -rad/s bandwidth). The required $\tau$ -frame gains are $\tilde K_p = 5/3,\ \tilde K_d = 10/9,\ \tilde N = 3$ , which map to the Ex 2.2 plant as $\boxed{K_p \approx 11.4,\ K_d \approx 1.13,\ N \approx 20.3}$ . Plug those in and the pole settles in about 1.5 s with very little ringing; the cart reaches a steady offset of a few cm and stays there.

The same $\tau$ -frame poles map to $K_p \approx 16.4, K_d \approx 0.11, N \approx 297$ on the Ex 9.6 plant — the $K_d$ is below the Ex 9.6 slider’s step size, so this recipe is really only practical on the Ex 2.2 plant.
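The $\tau$ -frame design in the last recipe step can be reproduced in a few lines. The sketch below assumes the controller is a PD with a first-order derivative filter, $\tilde C(s_\tau) = \tilde K_p + \tilde K_d \tilde N s_\tau/(s_\tau + \tilde N)$ , in negative feedback around $1/(s_\tau^2 - 1)$ , which gives the characteristic polynomial $s_\tau^3 + \tilde N s_\tau^2 + (\tilde K_p - 1 + \tilde K_d \tilde N)\,s_\tau + \tilde N(\tilde K_p - 1)$ . The function names are illustrative.

```python
import numpy as np

def tau_frame_gains(poles):
    """Match the filtered-PD characteristic polynomial to three desired poles."""
    c2, c1, c0 = np.real(np.poly(poles))[1:]   # s^3 + c2 s^2 + c1 s + c0
    N_t = c2
    Kp_t = 1.0 + c0 / N_t
    Kd_t = (c1 - (Kp_t - 1.0)) / N_t
    return Kp_t, Kd_t, N_t

def to_physical(Kp_t, Kd_t, N_t, M, m, l, g=9.81):
    """Undo the tau-frame normalization for a given plant."""
    stiffness = (M + m) * g
    omega = np.sqrt(stiffness / (M * l))
    return Kp_t * stiffness, Kd_t * np.sqrt(stiffness * M * l), N_t * omega

Kp_t, Kd_t, N_t = tau_frame_gains([-1.0, -1.0 + 1.0j, -1.0 - 1.0j])
gains_ex22 = to_physical(Kp_t, Kd_t, N_t, M=0.5, m=0.2, l=0.3)
gains_ex96 = to_physical(Kp_t, Kd_t, N_t, M=0.001, m=1.0, l=1.0)

print("tau-frame gains:", Kp_t, Kd_t, N_t)
print("Ex 2.2 gains:", np.round(gains_ex22, 2))
print("Ex 9.6 gains:", np.round(gains_ex96, 2))
```

Changing the pole list is the only edit needed to try a different damping or bandwidth.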

Why “well-damped” still does not mean “cart stays at zero.” Even with $K_d = 1.13$ and $b_{\text{cart}} = 1$ , the cart reaches a finite steady-state offset. The simulator’s PD on $(\theta, \omega)$ has no integral action on cart position, so any steady force on the cart (from a tilt, say) leaves a steady position error. The proper fix is a 4-state controller (see §8) or an integral term on cart position — neither is in the current simulator build. The page’s callout under the canvas flags this when you press Ex 2.2.

3b. The transfer-function form (Åström Ex. 9.6)

Chapters 2 and 5 of Åström work in state space; Chapter 8 introduces the transfer function as an alternative representation. The linearized cart-pole is a 1-input 2-output system in general (force in, cart position and pole angle out), but if you make the cart massless ( $M \to 0$ ) and take the cart acceleration as the input, the cart becomes a pure double integrator $\ddot p = u$ and the pole dynamics collapse to a single-input single-output plant whose only states are the pole angle and its rate.

This is the normalized form Åström uses in §8.3 and §9.6: $M = 0$ , $m = 1$ , $l = 1$ , $g = 1$ , and the input is the cart’s acceleration $u$ (not force). Linearizing the pole EOM $\ddot\theta = (g/l)\sin\theta - (1/l)\ddot p \cos\theta$ at the upright gives $\ddot\theta = \theta - u$ . Absorbing the minus sign into the direction in which the input is counted positive (the feedback sign convention takes care of it) gives the transfer function

P(s) = \frac{\Theta(s)}{U(s)} = \frac{1}{s^2 - 1}

This is the Example 9.6 plant. The unstable pole at $s = +1$ corresponds to the falling-pole mode (same physical mode as the +4.65 rad/s eigenvalue in §2, just with a different time scaling).

Designing a PD controller. A PD compensator with zero at $s = -2$ has the form $C(s) = k(s + 2)$ . The loop transfer function is

L(s) = C(s)P(s) = \frac{k(s + 2)}{s^2 - 1}

The closed-loop characteristic polynomial is the denominator of $1/(1 + L(s))$ :

s^2 - 1 + k(s + 2) = s^2 + k\,s + (2k - 1) = 0

By Routh–Hurwitz, this is stable iff $2k - 1 > 0$ and $k > 0$ , i.e. $k > 0.5$ . (The textbook quotes $k > 1$ as the practical bound; the formal stability boundary is $k > 0.5$ .) For the textbook’s figure value $k = 2$ , the polynomial is $s^2 + 2s + 3$ and the closed-loop poles are at $s = -1 \pm 1.414\,j$ , giving a natural frequency of $\sqrt{3} \approx 1.73\;\text{rad/s}$ , a damped frequency of $\sqrt{2}\;\text{rad/s}$ and a damping ratio of $1/\sqrt{3} \approx 0.577$ .
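The stability boundary and the $k = 2$ pole locations can be read directly off the characteristic polynomial. The following sketch uses only NumPy; `closed_loop_summary` is an illustrative helper that returns the roots, a stability flag, and the natural frequency and damping ratio taken from the coefficients of $s^2 + ks + (2k - 1)$ .

```python
import numpy as np

def closed_loop_summary(k):
    """Roots, stability flag, natural frequency and damping ratio for gain k."""
    roots = np.roots([1.0, k, 2.0 * k - 1.0])
    stable = bool(np.all(roots.real < 0))
    if 2.0 * k - 1.0 > 0:
        wn = float(np.sqrt(2.0 * k - 1.0))   # s^2 + 2 zeta wn s + wn^2
        zeta = k / (2.0 * wn)
    else:
        wn, zeta = float("nan"), float("nan")
    return roots, stable, wn, zeta

for k in (0.4, 0.6, 1.0, 2.0):
    roots, stable, wn, zeta = closed_loop_summary(k)
    print(f"k = {k}: roots = {np.round(roots, 3)}, stable = {stable}, "
          f"wn = {wn:.3f}, zeta = {zeta:.3f}")
```

Gains just below and just above $k = 0.5$ land on opposite sides of the stability boundary, as the Routh–Hurwitz condition predicts.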

Verify with python-control:

import control as ct
P = ct.tf([1], [1, 0, -1])         # P(s) = 1/(s^2 - 1)
C = ct.tf([2, 4], [1])             # C(s) = 2(s + 2)  (i.e. k = 2)
L = C * P
T = ct.feedback(L, 1)              # closed-loop
print(T.poles())                   # [-1+1.414j, -1-1.414j]

# Gain and phase margin from the open loop.
# ct.margin() already returns the phase margin in degrees, so no conversion is needed.
gm, pm, wcg, wcp = ct.margin(L)
print(f'gain margin = {gm:.2f}, phase margin = {pm:.1f} deg')

Bode / Nyquist interpretation. The Nyquist plot of $L(s) = 2(s+2)/(s^2-1)$ crosses the negative real axis at $L(0) = -2k = -4$ (for $k = 2$ ), and $L(\infty) = 0$ . The curve makes one counterclockwise encirclement of the critical point $s = -1$ , so $N = -1$ . With $P = 1$ (one open-loop RHP pole), the Nyquist criterion gives $Z = N + P = 0$ closed-loop RHP poles. Same conclusion as Routh-Hurwitz.
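The encirclement count can also be obtained numerically by tracking the phase of $1 + L(j\omega)$ along the imaginary axis. The sketch below assumes a finite frequency sweep is a good enough stand-in for the full Nyquist contour, which holds here because $L(j\omega) \to 0$ at high frequency; the sweep limits are arbitrary choices.

```python
import numpy as np

k = 2.0
w = np.linspace(-1e3, 1e3, 400001)        # frequency sweep, rad/s
s = 1j * w
L_jw = k * (s + 2.0) / (s**2 - 1.0)       # loop transfer function on the jw axis

# Net phase change of 1 + L(jw) counts encirclements of the point -1
phase = np.unwrap(np.angle(1.0 + L_jw))
ccw_encirclements = (phase[-1] - phase[0]) / (2.0 * np.pi)

P_rhp = 1                                 # open-loop poles in the right half-plane
N_clockwise = -round(ccw_encirclements)   # Nyquist convention: clockwise is positive
Z = N_clockwise + P_rhp                   # closed-loop poles in the right half-plane

print("counterclockwise encirclements:", round(ccw_encirclements, 3))
print("closed-loop RHP poles Z =", Z)
```

The count comes out just under one because the sweep stops at a finite frequency; rounding recovers the integer.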

The page’s PD form $F = K_p\theta - K_d\,d_f$ corresponds to $C(s) = K_d s + K_p$ in the Laplace domain, so to get $C(s) = k(s+2)$ in the simulator you set $K_p = 2k$ and $K_d = k$ . For the textbook’s $k = 2$ : $K_p = 4$ , $K_d = 2$ , $K_i = 0$ . With $M = 0$ (massless cart) the simulator’s “force” $F$ is numerically equal to the pivot acceleration $u$ , so the normalized form drops out automatically.

4. The control law in the simulator

The current simulator implements a scalar PID on θ only, with the derivative term taken from a filtered ω:

F = clamp(Kp · θ + Ki · ∫θ dt - Kd · d_f, -F_max, F_max)

with $\dot d_f = N(\omega - d_f)$ for the derivative filter. This is a scalar control law — it only sees the pole, not the cart. That’s why clicking Åström Ex. 2.2 leaves the cart drifting.
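One discrete-time reading of this control law is sketched below. It assumes forward-Euler updates for the integral and for the derivative filter, and that the filter is updated before the force is computed; the simulator’s actual integration scheme is not specified here, and `PidState` and `pid_force` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class PidState:
    integral: float = 0.0   # running integral of theta
    d_f: float = 0.0        # filtered angular velocity

def pid_force(theta, omega, state, Kp, Ki, Kd, N, F_max, dt):
    """One control step: update the filter and integral, return the clamped force."""
    state.integral += theta * dt
    state.d_f += dt * N * (omega - state.d_f)      # d(d_f)/dt = N (omega - d_f)
    force = Kp * theta + Ki * state.integral - Kd * state.d_f
    return max(-F_max, min(F_max, force))

# Example: pole leaning 0.1 rad, momentarily at rest, Ex 9.6-style gains
state = PidState()
F = pid_force(theta=0.1, omega=0.0, state=state,
              Kp=4.0, Ki=0.0, Kd=2.0, N=20.0, F_max=10.0, dt=0.001)
print("force:", F)
```

Nothing in the function signature refers to the cart position or velocity, which is the limitation discussed next.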

To use the full state-feedback gain from §3, the simulator needs a control law that feeds back all four states:

F = -(K_0 p + K_1 \theta + K_2 \dot p + K_3 \dot\theta)

with the cart position $p$ actively regulated to zero. This extension is planned for the next simulator update; in the meantime, you can reproduce the behavior with the LQR gains from §8a by running the Python code below against the same equations the simulator uses.

5. Verify with python-control

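import numpy as np
import control as ct

# Reuses A, B from §3a and K_lqr from §8a
A_cl = A - B @ K_lqr
# No external input, so B and D are zero; C = I exposes all four states as outputs
sys_cl = ct.ss(A_cl, np.zeros((4, 1)), np.eye(4), np.zeros((4, 1)))
T = np.linspace(0, 5, 1000)
X0 = np.array([0, 0.1, 0, 0])   # pole at 0.1 rad, cart at rest
t, y = ct.initial_response(sys_cl, T, X0)
# y has shape (4, len(T)): y[0] = p(t), y[1] = θ(t), y[2] = ṗ(t), y[3] = θ̇(t)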

You should see $p$ and $\theta$ both decay to zero with a couple of oscillations, with the cart excursion on the order of centimetres (not the runaway drift of the PD-only law).
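The same check can be run against the nonlinear equations of motion without python-control. The sketch below is self-contained: it recomputes an LQR gain with the weights from §8a, then integrates the frictionless point-mass cart-pole with `scipy.integrate.solve_ivp`. The nonlinear model and the tolerances are assumptions of this sketch, not the simulator’s own integrator.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import solve_continuous_are

M, m, l, g = 2.0, 0.2, 0.5, 9.81
A = np.array([[0, 0, 1, 0],
              [0, 0, 0, 1],
              [0, -m * g / M, 0, 0],
              [0, g * (M + m) / (M * l), 0, 0]])
B = np.array([[0], [0], [1 / M], [-1 / (M * l)]])

# LQR gain with the weights used in §8a
Q = np.diag([1.0, 50.0, 1.0, 1.0])
R = np.array([[0.05]])
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)

def closed_loop_rhs(t, x):
    """Nonlinear cart-pole driven by the linear state feedback F = -K x."""
    _, theta, p_dot, theta_dot = x
    F = (-K @ x).item()
    s, c = np.sin(theta), np.cos(theta)
    mass = np.array([[M + m, m * l * c],
                     [m * l * c, m * l**2]])
    rhs = np.array([F + m * l * theta_dot**2 * s, m * g * l * s])
    p_ddot, theta_ddot = np.linalg.solve(mass, rhs)
    return [p_dot, theta_dot, p_ddot, theta_ddot]

x0 = [0.0, 0.1, 0.0, 0.0]   # pole at 0.1 rad, cart at rest
sol = solve_ivp(closed_loop_rhs, (0.0, 15.0), x0,
                rtol=1e-8, atol=1e-10, max_step=0.01)

x_final = sol.y[:, -1]
max_cart_excursion = np.max(np.abs(sol.y[0]))
print("final state:", np.round(x_final, 6))
print("largest |p| along the way (m):", round(float(max_cart_excursion), 3))
```

Because the feedback includes the $p$ and $\dot p$ terms, the cart returns to the origin instead of drifting.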

6. Take it to the simulator

The simulator now ships two Åström textbook presets:

Åström Ex. 2.2: the full state-space form with scalar PD on θ, ω only. Pole stays upright, cart drifts (see “Why this tutorial exists”).
Åström Ex. 9.6: the normalized transfer-function form with the massless-cart assumption. The controller is $C(s) = k(s+2)$ with $k = 2$ , which gives the textbook’s closed-loop poles at $s = -1 \pm 1.414\,j$ (damping ratio $\approx 0.577$ ). The cart does not run away on this preset because it loads $b_{\text{cart}} = 1$ (see §3½); the PD itself still acts on $\theta$ only.

To reproduce the LQR result from §8a in the browser, the simulator would need the full 4-state gain $K_{\text{lqr}} = (K_0, K_1, K_2, K_3)$ acting on $(p, \theta, \dot p, \dot\theta)$ . The current PID interface only exposes scalar $K_p$ , $K_d$ on $\theta$ , so the closest approximation is:

Open the simulator
Set the cart mass, pole mass, half-length, and friction to the Åström values (M=0.5, m=0.2, L=0.3, b_cart=b_pivot=0), and recompute $K_{\text{lqr}}$ from §8a for these values (the gain printed there is for $M = 2,\ m = 0.2,\ l = 0.5$ )
Plug in $K_p = |K_1|$ and $K_d = |K_3|$ (the magnitudes of the $\theta$ and $\dot\theta$ entries) and set $K_i = 0$ . The $p$ and $\dot p$ entries cannot be entered, because the simulator currently has no $p$ term in the PID
Click Start. The pole balances, but expect the cart to drift as it does on the Ex 2.2 preset: the deviation from true LQR is exactly the missing $p$ and $\dot p$ terms.

For the Ex. 9.6 form, just click the button — the gains $K_p = 4$ , $K_d = 2$ are loaded and the normalized masses $M = 0.001$ , $m = 1$ , $L = 1$ are set so the simulator’s “force” becomes the pivot acceleration.

7. What to take away

Åström Example 2.2 is a teaching model, not a working controller. Its value is in showing the structure of the inverted-pendulum EOM, not in producing a deployable design.
A real cart-pole controller closes the loop on all four states. Pole-only PD is enough to keep the pole upright for a while but cannot hold the cart still.
The simulator is intentionally minimal. It exposes the scalar PID knobs (Kp, Ki, Kd, N) so you can build intuition. The state-feedback extension above is the next planned step.
Ex 2.2 and Ex 9.6 are the same physics in different unit systems. A gain that works on one preset does not, in general, work on the other — see §3½ for the $\tau$ -frame mapping that tells you how to translate PID gains between presets.

8. Other methods (not in the simulator)

The methods below are conceptually important and are the standard tooling for cart-pole control design, but the current simulator implements only the scalar PID. They are documented here for reference and to keep this tutorial complete; running them requires either python-control in a notebook or a future simulator build.

8a. LQR (more robust than manual pole placement)

LQR finds the gain $K$ that minimizes $\int_0^\infty (x^\top Q x + u^\top R\, u)\, dt$ for a chosen weighting. Pick $Q$ to penalize what you care about (pole angle is usually the priority) and $R$ to penalize control effort:

import numpy as np
from scipy.linalg import solve_continuous_are

# Reuses A, B from §3a (M = 2, m = 0.2, l = 0.5)
Q = np.diag([1, 50, 1, 1])   # weight θ heavily
R = np.array([[0.05]])        # small control effort
P = solve_continuous_are(A, B, Q, R)   # solves the continuous-time Riccati equation
K_lqr = np.linalg.solve(R, B.T @ P)    # K = R^{-1} B^T P
print(K_lqr)
# K_lqr ≈ [[ -4.47, -87.3, -8.94, -18.4 ]]

LQR guarantees closed-loop stability and, for full state feedback on a single input, classical robustness margins (at least 60° of phase margin and unlimited upward gain margin). It’s almost always a better starting point than pole placement. The simulator would need a 4-state input gain $(K_p, K_\theta, K_{d,x}, K_{d,\theta})$ to use LQR; see §4 for the gap.
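To get a feel for the trade-off, it helps to sweep the control weight $R$ and watch the gain and the closed-loop poles move. The sketch below does that for the §2 plant; `lqr_gain` is an illustrative helper. For this particular model the cart-position gain works out to $-\sqrt{Q_{pp}/R}$ , which gives a convenient sanity check on the Riccati solution.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

M, m, l, g = 2.0, 0.2, 0.5, 9.81
A = np.array([[0, 0, 1, 0],
              [0, 0, 0, 1],
              [0, -m * g / M, 0, 0],
              [0, g * (M + m) / (M * l), 0, 0]])
B = np.array([[0], [0], [1 / M], [-1 / (M * l)]])
Q = np.diag([1.0, 50.0, 1.0, 1.0])

def lqr_gain(A, B, Q, R):
    """Continuous-time LQR gain K = R^{-1} B^T P."""
    P = solve_continuous_are(A, B, Q, R)
    return np.linalg.solve(R, B.T @ P)

results = {}
for r in (0.01, 0.05, 0.5):
    K = lqr_gain(A, B, Q, np.array([[r]]))
    poles = np.linalg.eigvals(A - B @ K)
    results[r] = (K, poles)
    print(f"R = {r}: K = {np.round(K, 2)}, "
          f"slowest pole real part = {np.max(poles.real):.3f}")
```

Smaller $R$ buys faster poles at the price of larger gains, and therefore larger forces.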

8b. Energy-based / swing-up control

For large initial angles $\theta_0$ (e.g. the pole hanging down at $\pi$ rad), a linear controller cannot stabilize — the linearized model is only valid near the upright. The classical fix is a two-mode controller: an energy-shaping swing-up law $\dot V = -k (E - E_{\text{up}})$ that drives the pole’s total mechanical energy to the upright value, and a switching rule to the linear balancing controller once $|\theta|$ is small enough that the linearization is valid (typically $|\theta| < 20°$ ).

The swing-up law for the cart-pole is

u \;=\; k \cdot \operatorname{sign}(\dot\theta \cos\theta) \cdot \big( E - E_{\text{up}} \big)

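with $E = \tfrac{1}{2} m l^2 \dot\theta^2 + m g l \cos\theta$ the pole’s total energy ( $\theta$ is measured from the upright, so $E_{\text{up}} = mgl$ and the hanging position has $E = -mgl$ ) and $u$ read as the cart acceleration, as in §3b. The sign term drives the cart back and forth to inject energy into the pole, building up amplitude. Once near upright the linear controller takes over.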
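A quick way to see that the sign term pumps energy in the right direction is to evaluate the energy rate for a sample state. The sketch below takes $\theta$ from the upright, so the potential term is $+mgl\cos\theta$ and $E_{\text{up}} = mgl$ , and treats $u$ as the cart acceleration, for which the pole equation $\ddot\theta = (g\sin\theta - u\cos\theta)/l$ gives $\dot E = -ml\,u\,\dot\theta\cos\theta$ . The gain `k` and the sample state are arbitrary.

```python
import numpy as np

m, l, g = 0.2, 0.5, 9.81
E_UP = m * g * l   # pole energy at rest in the upright position

def pole_energy(theta, theta_dot):
    """Kinetic plus potential energy of the point-mass pole."""
    return 0.5 * m * l**2 * theta_dot**2 + m * g * l * np.cos(theta)

def swing_up_accel(theta, theta_dot, k=5.0):
    """Energy-pumping cart acceleration u = k sign(theta_dot cos theta) (E - E_up)."""
    return k * np.sign(theta_dot * np.cos(theta)) * (pole_energy(theta, theta_dot) - E_UP)

def energy_rate(theta, theta_dot, u):
    """dE/dt along the pole dynamics for cart acceleration u."""
    return -m * l * u * theta_dot * np.cos(theta)

# Pole swinging slowly near the hanging position
theta, theta_dot = np.pi - 0.3, 0.5
u = swing_up_accel(theta, theta_dot)
print("E - E_up:", pole_energy(theta, theta_dot) - E_UP)
print("u:", u, " dE/dt:", energy_rate(theta, theta_dot, u))
```

As long as the pole energy is below the upright value, the computed rate is positive, so the law adds energy on every swing.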

8c. LQG: LQR with partial state observation

If only the pole angle $\theta$ and the cart position $p$ are measured (no rate sensors), you need an observer. The standard choice is a Kalman filter (LQE) running on the linearized $A, B, C$ to produce state estimates, which are then fed to an LQR controller. This is the LQG (Linear-Quadratic-Gaussian) design. It is robust to sensor noise and handles the situation where you can only measure position, not velocity.

The simulator currently assumes $\theta$ and $\omega$ are both available (the slider values for $K_d$ multiply $\omega$ directly), so an observer is not strictly needed for the experiments on the page. If you wanted to hook up a real sensor (e.g. an accelerometer on the cart, an encoder on the pivot), an observer would be the next step.
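As a rough illustration of the observer half of LQG, the sketch below computes a steady-state Kalman gain for the §2 plant when only $p$ and $\theta$ are measured. The noise covariances `W` and `V` are placeholder values chosen for the example, not properties of any real sensor, and the gain is obtained from the dual Riccati equation with `scipy.linalg.solve_continuous_are`.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

M, m, l, g = 2.0, 0.2, 0.5, 9.81
A = np.array([[0, 0, 1, 0],
              [0, 0, 0, 1],
              [0, -m * g / M, 0, 0],
              [0, g * (M + m) / (M * l), 0, 0]])

# Measure cart position and pole angle only (no rate sensors)
C = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])

W = 1e-3 * np.eye(4)          # assumed process-noise covariance (placeholder)
V = np.diag([1e-4, 1e-4])     # assumed measurement-noise covariance (placeholder)

# Dual of the LQR problem: solve the filter Riccati equation for P
P = solve_continuous_are(A.T, C.T, W, V)
L_gain = P @ C.T @ np.linalg.inv(V)   # Kalman gain, shape (4, 2)

# Estimator dynamics: x_hat_dot = A x_hat + B u + L (y - C x_hat)
observer_poles = np.linalg.eigvals(A - L_gain @ C)
print("Kalman gain:\n", np.round(L_gain, 2))
print("observer poles:", np.round(observer_poles, 2))
```

The state estimate from this observer would then replace the true state in the feedback law $F = -K\hat x$ .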

References

Åström, K. J., & Murray, R. M. (2008). Feedback Systems: An Introduction for Scientists and Engineers. Princeton University Press. §2.2 Example 2.2, §6.3 State Feedback, Exercise 8.3 (linearized model), Exercise 8.13 (PD design).
Spong, M. W., Hutchinson, S., & Vidyasagar, M. (2006). Robot Modeling and Control. Wiley. Ch. 5 (decoupled EOM derivation).
Åström, K. J., & Hägglund, T. (2006). Advanced PID Control. ISA. (PID form with derivative filter.)