
Lagrangian and Hamiltonian mechanics

Newton's laws and conservation of energy are two approaches to solving for the equations of motion of an object. We can make Newtonian mechanics more elegant by extending it to fields and potentials. But ultimately, Newtonian mechanics is still cumbersome to use. Here is an alternative, more beautiful approach - Lagrangian mechanics.

Lagrangian Mechanics

The Lagrangian is the difference of an object's kinetic and potential energies, and is defined as:

$$
\mathcal{L}(x, \dot x) = K(\dot x) - U(x)
$$

Note that the dots are used for the time derivatives - that is, $\dot x = \frac{dx}{dt}$. The action is a fundamental quantity of all physical systems and is given by the time integral of the Lagrangian:

$$
S = \int_{t_1}^{t_2} \mathcal{L}(x, \dot x)\, dt
$$

The principle of stationary action states that for any given system, the action is stationary along the true trajectory. What does stationary mean? Recall the idea of stationary points in calculus, which include minima and maxima. For the action, the analogue of a stationary point is a stationary function: a trajectory around which small perturbations produce no first-order change in the action. This makes sense because the action is a function of functions - what we call a functional, which we'll go more in-depth with later.

But what form does that Lagrangian have to take to obey the principle of stationary action? The short answer is that it must obey the following equation, known as the Euler-Lagrange equation:

$$
\frac{\partial \mathcal{L}}{\partial x} - \frac{d}{dt} \frac{\partial \mathcal{L}}{\partial \dot x} = 0
$$

where again, $\dot x = \dfrac{dx}{dt}$ is the velocity. This is one of the most fundamental and profound equations of physics, and it works for any particle's Lagrangian (remember that "particle" is a generic term in physics - a particle can be as big as a planet or star). Once you write down the Euler-Lagrange equation, you just need to take the derivatives of the Lagrangian and substitute to get the equations of motion (the differential equations you solve for the trajectory of the particle). Applying it, at least conceptually, is fairly simple.

But to gain a deeper understanding of why this equation works, we must first dive into the theory of functionals and variational calculus. If this material is too math-heavy, feel free to skip ahead - it's not required for applying the Euler-Lagrange equation. But for those who want the step-by-step derivation - let's dive in!
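Before the derivation, here is a small numerical sketch of our own to make "stationary" concrete (the setup - a unit-mass harmonic oscillator with $\mathcal{L} = \frac{1}{2}\dot x^2 - \frac{1}{2}x^2$ - and all names are illustrative, not from the text). The true path $x(t) = \cos t$ satisfies the equation of motion $\ddot x = -x$, so nudging it by $\varepsilon\,\eta(t)$, with $\eta$ vanishing at the endpoints, changes the action only at second order in $\varepsilon$:

```python
import numpy as np

def action(x, t):
    """Approximate S = ∫ (x'^2 / 2 - x^2 / 2) dt with finite differences."""
    v = np.gradient(x, t)                    # numerical velocity dx/dt
    lagrangian = 0.5 * v**2 - 0.5 * x**2     # L = K - U for a unit oscillator
    return np.sum(0.5 * (lagrangian[1:] + lagrangian[:-1]) * np.diff(t))  # trapezoid rule

t = np.linspace(0, np.pi / 2, 4001)
x_true = np.cos(t)       # solves the equation of motion x'' = -x
eta = np.sin(2 * t)      # perturbation shape; vanishes at t = 0 and t = pi/2

eps = 1e-3
S0 = action(x_true, t)
S_perturbed = action(x_true + eps * eta, t)

# The first-order change vanishes: the difference scales like eps^2,
# not like eps - this is what "the action is stationary" means.
print(S_perturbed - S0)
```

Repeating this with a path that does not solve the equation of motion gives a change of order $\varepsilon$ instead.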

Functionals

A functional is a function that takes in other functions as input. In contrast to a function $f$, which takes in a real number $x$ and outputs $f(x)$, a functional $S$ takes in a function $y(x)$ (along with its derivative $y'(x)$ and input $x$) and outputs a number $S(y, y', x)$.

The derivative appearing in $S(y, y', x)$ and the mention of the word "calculus" suggest that functionals are built from the operations of calculus, such as derivatives and integrals. Indeed, this is the case - a great number of functionals are in fact integrals.

Consider, for instance, a functional that appears - under a different name - in a first introduction to calculus. This is the functional expression for the arc length:

$$
S = \int_a^b \sqrt{1 + y'^2}\, dx
$$

While an introductory treatment of calculus may simply present this formula for a given function $y$ and its derivative $y'$, the calculus of variations would consider this formula a functional of an arbitrary function $y$ in the form:

$$
S(y, y', x) = \int_a^b \sqrt{1 + y'^2}\, dx
$$

The calculus of variations is concerned with optimizing functionals to find their stationary points. In many cases, we want to obtain the minimum or maximum of a functional, but remember that stationary points are more general and can include saddle points and other points of inflection (i.e. points around which the second derivative changes sign).

In our case, we want to figure out which path $y(x)$ is the shortest distance between the points $x = a$ and $x = b$. Translated into mathematical terms, we want to optimize $S(y, y', x)$ - that is, find the function $y(x)$ that minimizes $S$. But how do we do so? The answer requires a fair bit of explaining, so this is a section to be read through slowly.

The general functional optimization problem

Consider a general functional $S(f, f', q)$, where the functional $S$ is a function of $f(q)$, $f'(q)$, and $q$. Here, $f(q)$ is a parametric function of one parameter $q$ - we will explore specific cases of $f(q)$ later (hint: one of these will be the position function $x(t)$, a parametric function where the parameter is $t$). Our functional $S(f, f', q)$ is given by:

$$
S(f, f', q) = \int_{q_1}^{q_2} \mathcal{L}(f, f', q)\, dq
$$

All of this is certainly very abstract, so let us examine what it all means. $S$ is a functional, meaning that it takes some function $f(q)$ and outputs a number. The precise thing it does, in this case, is to integrate a composite function of $f$, its derivative $f'(q)$, and its input $q$ between two points in the domain of $f$. For notational clarity, we call this composite function $\mathcal{L}(f, f', q)$. Since definite integrals return a number, taking the integral results in a number. So to sum it all up: $S$ is a functional that, given any function $f(q)$ - whatever the function may be - returns the definite integral of a composite function of $f$, its derivative $f'$, and its input $q$.

We want to find the function $f$ that minimizes or maximizes $S$. This means we want to find a function for which $S$ does not change with respect to $f$ (similar to how the derivative is zero at a critical point in ordinary calculus). To find this optimal function, let us vary $S$ by adding a function $\eta(q)$, multiplied by a tiny number $\varepsilon$, to $f$ between $q_1$ and $q_2$ - this represents adding a tiny shift, also called a variation, to $S$. Our particular shift is such that $\eta(q_1) = \eta(q_2) = 0$, meaning that $\eta(q)$ vanishes at the endpoints, since we want this variation to act only between $q_1$ and $q_2$ (and nowhere outside that range). We then have:

$$
S(f + \varepsilon \eta, f' + \varepsilon \eta', q) = \int_{q_1}^{q_2} \mathcal{L}(f + \varepsilon \eta, f' + \varepsilon \eta', q)\, dq
$$

Our next step is to find the amount of change $\delta S$ between $S(f, f', q)$ and $S(f + \varepsilon \eta, f' + \varepsilon \eta', q)$. As a first step, we want to compute $\mathcal{L}(f + \varepsilon \eta, f' + \varepsilon \eta', q)$, as that will allow us to compute $S(f + \varepsilon \eta, f' + \varepsilon \eta', q)$, which we need in order to calculate $\delta S$.

Recall how, in single-variable calculus, we can approximate a small shift $y(x + h)$ in a function $y(x)$, for some tiny number $h$, to first order by:

$$
y(x + h) \approx y(x) + y'(x)\, h
$$

This idea can be generalized to multivariable calculus, where a small shift $f(x + \sigma, y + \lambda)$ in a function $f(x, y)$ is given to first order by:

$$
f(x + \sigma, y + \lambda) \approx f(x, y) + \frac{\partial f}{\partial x} \sigma + \frac{\partial f}{\partial y} \lambda
$$

In a similar way, in the calculus of variations, we can write, to first order in $\varepsilon$:

$$
\begin{align}
\mathcal{L}(f + \varepsilon \eta, f' + \varepsilon \eta', q) &= \mathcal{L}(f, f', q) + \frac{\partial \mathcal{L}}{\partial f}\eta\,\varepsilon + \frac{\partial \mathcal{L}}{\partial f'}\eta'\,\varepsilon \\
&= \mathcal{L}(f, f', q) + \left(\frac{\partial \mathcal{L}}{\partial f}\eta + \frac{\partial \mathcal{L}}{\partial f'}\frac{d\eta}{dq}\right)\varepsilon
\end{align}
$$

Now, by substitution, we can find $S(f + \varepsilon \eta, f' + \varepsilon \eta', q)$:

$$
\begin{align}
S(f + \varepsilon \eta, f' + \varepsilon \eta', q) &= \int_{q_1}^{q_2} \mathcal{L}(f + \varepsilon \eta, f' + \varepsilon \eta', q)\, dq \\
&= \int_{q_1}^{q_2} \left[\mathcal{L}(f, f', q) + \left(\frac{\partial \mathcal{L}}{\partial f}\eta + \frac{\partial \mathcal{L}}{\partial f'}\frac{d\eta}{dq}\right)\varepsilon\right] dq
\end{align}
$$

We may find the change in $S$, which we will call $\delta S$, by subtracting $S(f, f', q)$:

$$
\begin{align}
\delta S &= S(f + \varepsilon \eta, f' + \varepsilon \eta', q) - S(f, f', q) \\
&= \int_{q_1}^{q_2} \left[\mathcal{L}(f, f', q) + \left(\frac{\partial \mathcal{L}}{\partial f}\eta + \frac{\partial \mathcal{L}}{\partial f'}\frac{d\eta}{dq}\right)\varepsilon\right] dq - \int_{q_1}^{q_2} \mathcal{L}(f, f', q)\, dq \\
&= \int_{q_1}^{q_2}\left(\frac{\partial \mathcal{L}}{\partial f}\eta + \frac{\partial \mathcal{L}}{\partial f'}\frac{d\eta}{dq}\right)\varepsilon\, dq
\end{align}
$$

In the limit as $\varepsilon \to 0$, we would expect that $\delta S = 0$, as the function that maximizes (or minimizes) $S$ is precisely the function for which $S$ does not change with respect to $f$. In formal language, this is the process of varying $S$ by a variation $\varepsilon \eta$ and then demanding that $\displaystyle \lim_{\varepsilon \to 0} \delta S = 0$. This is why this form of calculus is called the calculus of variations, or variational calculus. Setting $\delta S = 0$, we have:

$$
\int_{q_1}^{q_2}\left(\frac{\partial \mathcal{L}}{\partial f}\eta + \frac{\partial \mathcal{L}}{\partial f'}\frac{d\eta}{dq}\right)\varepsilon\, dq = 0
$$

We would, however, prefer some way to get rid of the added function $\eta(q)$, to obtain an equation that doesn't depend on $\eta$. We can do this by manipulating the above integral. First, we split it into two parts, for convenience in the following steps:

$$
\int_{q_1}^{q_2}\left(\frac{\partial \mathcal{L}}{\partial f}\eta + \frac{\partial \mathcal{L}}{\partial f'}\frac{d\eta}{dq}\right)\varepsilon\, dq = \int_{q_1}^{q_2}\frac{\partial \mathcal{L}}{\partial f}\eta\,\varepsilon\, dq + \int_{q_1}^{q_2} \frac{\partial \mathcal{L}}{\partial f'}\frac{d\eta}{dq}\varepsilon\, dq
$$

We now simplify the second term by performing integration by parts. Recall that the integration by parts formula is as follows:

$$
\int_a^b u\, dv = uv \bigg|_a^b - \int_a^b v\, du
$$

If we let $u = \dfrac{\partial \mathcal{L}}{\partial f'}\varepsilon$ and $dv = \dfrac{d\eta}{dq}\, dq$, then $v = \displaystyle \int \dfrac{d\eta}{dq}\, dq = \eta(q)$ and $du = \dfrac{d}{dq} \left(\dfrac{\partial \mathcal{L}}{\partial f'}\right)\varepsilon\, dq$. By substituting these in (we keep the first term as-is and only perform integration by parts on the second term), we have:

$$
\begin{align}
\underbrace{\int_{q_1}^{q_2} \frac{\partial \mathcal{L}}{\partial f} \eta\,\varepsilon\, dq}_\text{no need to evaluate} &+ \underbrace{\int_{q_1}^{q_2} \frac{\partial \mathcal{L}}{\partial f'}\frac{d\eta}{dq}\varepsilon\, dq}_\text{integrate by parts} \\
&= \int_{q_1}^{q_2}\frac{\partial \mathcal{L}}{\partial f}\eta\,\varepsilon\, dq + \underbrace{\left[\frac{\partial \mathcal{L}}{\partial f'} \eta\,\varepsilon\bigg|_{q_1}^{q_2} - \int_{q_1}^{q_2}\eta\,\frac{d}{dq} \left(\frac{\partial \mathcal{L}}{\partial f'}\right)\varepsilon\, dq\right]}_\text{result of integration by parts}
\end{align}
$$

But recall from earlier that we defined $\eta(q)$ such that $\eta(q_1) = \eta(q_2) = 0$, meaning that the $\dfrac{\partial \mathcal{L}}{\partial f'} \eta\,\varepsilon\bigg|_{q_1}^{q_2}$ term goes to zero. Therefore, we are only left with:

$$
\begin{align}
\int_{q_1}^{q_2}\frac{\partial \mathcal{L}}{\partial f} \eta\,\varepsilon\, dq &+ \cancel{\frac{\partial \mathcal{L}}{\partial f'} \eta\,\varepsilon\bigg|_{q_1}^{q_2}} - \int_{q_1}^{q_2}\eta\,\frac{d}{dq} \left(\frac{\partial \mathcal{L}}{\partial f'}\right)\varepsilon\, dq \\
&= \int_{q_1}^{q_2}\frac{\partial \mathcal{L}}{\partial f}\eta\,\varepsilon\, dq - \int_{q_1}^{q_2}\eta\,\frac{d}{dq} \left(\frac{\partial \mathcal{L}}{\partial f'}\right)\varepsilon\, dq \\
&= \int_{q_1}^{q_2}\left[\frac{\partial \mathcal{L}}{\partial f} \eta\,\varepsilon - \frac{d}{dq}\left(\frac{\partial \mathcal{L}}{\partial f'}\right)\eta\,\varepsilon\right] dq
\end{align}
$$

where in the last step we re-joined the two integrals into one (which will make the next steps much easier). We know that the integral we derived in the last step must be equal to zero, given that $\delta S = 0$ is our fundamental requirement for finding the stationary points (minima, maxima, etc.) of functionals. Therefore we have:

$$
\begin{align}
\int_{q_1}^{q_2}&\left[\frac{\partial \mathcal{L}}{\partial f} \eta\,\varepsilon - \frac{d}{dq}\left(\frac{\partial \mathcal{L}}{\partial f'}\right)\eta\,\varepsilon\right] dq \\
&= \int_{q_1}^{q_2}\left[\frac{\partial \mathcal{L}}{\partial f} - \frac{d}{dq}\left(\frac{\partial \mathcal{L}}{\partial f'}\right)\right] \eta\,\varepsilon\, dq \\
&= 0
\end{align}
$$

where we factored the common terms out of the integrand in the last step. But since this integral is zero for any choice of $\eta$ satisfying the endpoint conditions, the fundamental lemma of the calculus of variations tells us that the integrand - and hence the quantity in the square brackets - must be zero. That is:

$$
\int_{q_1}^{q_2}\left[\frac{\partial \mathcal{L}}{\partial f} - \frac{d}{dq}\left(\frac{\partial \mathcal{L}}{\partial f'}\right)\right] \eta\,\varepsilon\, dq = 0 \quad\Rightarrow\quad \frac{\partial \mathcal{L}}{\partial f} - \frac{d}{dq}\left(\frac{\partial \mathcal{L}}{\partial f'}\right) = 0
$$

The last result is the general form of the Euler-Lagrange equation for our functional $S$. Since our functional is a very general functional, the Euler-Lagrange equation applies to a huge set of functionals - indeed, all functionals of the form $S[f(q), f'(q), q]$ (it is customary to use square brackets when writing out the functional in its full form, but for our short form $S(f, f', q)$ it is permissible to simply use parentheses). Thus, it is an extremely crucial and useful equation, so let us write it down one more time:

$$
\frac{\partial \mathcal{L}}{\partial f} - \frac{d}{dq}\left(\frac{\partial \mathcal{L}}{\partial f'}\right) = 0
$$

Application of the Euler-Lagrange equation to arc length

Now, let us return to our example of the arc length functional:

$$
S(y, y', x) = \int_a^b \sqrt{1 + y'^2}\, dx
$$

We want to find the $y(x)$ that minimizes this functional, and for this we can use the Euler-Lagrange equation. In this case, $f = y(x)$, $f' = y'$, and $\mathcal{L} = \sqrt{1 + y'^2}$, so the Euler-Lagrange equation for this particular functional reads:

$$
\frac{\partial \mathcal{L}}{\partial y} - \frac{d}{dx}\left(\frac{\partial \mathcal{L}}{\partial y'}\right) = 0
$$

We may now compute the derivatives (which is much simplified by the fact that $\mathcal{L} = \sqrt{1 + y'^2}$ does not depend on $y$, but we must be careful to remember that $\dfrac{d}{dx} f(y') = f'(y')\,y''$ due to the chain rule):

$$
\begin{align}
\frac{\partial \mathcal{L}}{\partial y} &= 0 \\
\frac{\partial \mathcal{L}}{\partial y'} &= \frac{1}{2} \frac{2y'}{\sqrt{1 + y'^2}} = \frac{y'}{\sqrt{1 + y'^2}} = y'(1 + y'^2)^{-1/2} \\
\frac{d}{dx} \frac{\partial \mathcal{L}}{\partial y'} &= y''(1 + y'^2)^{-1/2} + y'\left(-\frac{1}{2}\right)(1 + y'^2)^{-3/2}(2y')\,y'' \\
&= y''(1 + y'^2)^{-1/2} - y'^2 y''(1 + y'^2)^{-3/2} \\
&= y''(1 + y'^2)^{-3/2}\left[(1 + y'^2) - y'^2\right] \\
&= y''(1 + y'^2)^{-3/2}
\end{align}
$$

Thus, substituting into the Euler-Lagrange equation, we have:

$$
\frac{\partial \mathcal{L}}{\partial y} - \frac{d}{dx}\left(\frac{\partial \mathcal{L}}{\partial y'}\right) = -y''(1 + y'^2)^{-3/2} = 0
$$

While this may look complicated, note that the factor $(1 + y'^2)^{-3/2}$ is never zero, so the only way the product can vanish is if:

$$
y'' = 0
$$

This is a differential equation that is straightforward to integrate:

$$
\begin{align}
&y'' = 0 \\
&y' = \int y''\, dx = \int 0\, dx = m = \text{const.} \\
&y = \int y'\, dx = \int m\, dx = mx + b
\end{align}
$$

where $m, b$ are constants. This is simply the equation of a straight line! By applying the calculus of variations, we have therefore shown that the shortest path between two points $a, b$ - in functional terms, the path that minimizes the arc length - is a straight line. It may seem to be an obvious result, but proving it required quite a bit of calculus!
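As a quick numerical sanity check of our own on this result (the function names and values are arbitrary), we can compare the length of the straight line between two points with the lengths of slightly bowed paths through the same endpoints:

```python
import numpy as np

def arc_length(y, x):
    """Approximate S = ∫ sqrt(1 + y'^2) dx with finite differences."""
    dy_dx = np.gradient(y, x)
    integrand = np.sqrt(1 + dy_dx**2)
    return np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(x))  # trapezoid rule

x = np.linspace(0, 1, 2001)
straight = 2 * x                    # y = mx + b with m = 2, b = 0
lengths = []
for amp in [0.0, 0.1, 0.2]:
    bump = amp * np.sin(np.pi * x)  # perturbation vanishing at both endpoints
    lengths.append(arc_length(straight + bump, x))

# The straight line (amp = 0) gives the smallest length, sqrt(1 + 2^2) = sqrt(5)
print(lengths)
```

The larger the bowing amplitude, the longer the path, exactly as the variational argument predicts.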

Note that the one restriction we must place on this result is that we assume $S = \displaystyle \int \sqrt{1 + y'^2}\, dx$ is the right expression for the arc length. For regular Euclidean space, this is always the right expression, and Euclidean space is what we'll work with 99% of the time. But in non-Euclidean geometries especially, this formula no longer gives the arc length, and we must use differential geometry to construct the right expression. That is a topic we will cover in Chapter 3.

The Euler-Lagrange equation in physics

In physics, we consider a specific case of the Euler-Lagrange equation, where (as mentioned at the beginning) $q = t$ is the time, $f = x(t)$ is the position, and $f' = \dot x = \dfrac{dx}{dt}$ is the velocity. Therefore, the Euler-Lagrange equation, in its common form used in physics (specifically, Lagrangian mechanics), becomes:

$$
\frac{\partial \mathcal{L}}{\partial x} - \frac{d}{dt} \frac{\partial \mathcal{L}}{\partial \dot x} = 0
$$

Again, the Euler-Lagrange equation can be used to solve for the equations of motion as long as the Lagrangian is known. Note that for a more general set of coordinate systems, where the system is not one-dimensional motion along the $x$ axis, there is an Euler-Lagrange equation for each coordinate, each of which takes the following form:

$$
\frac{d}{dt} \frac{\partial \mathcal{L}}{\partial \dot q_i} - \frac{\partial \mathcal{L}}{\partial q_i} = 0
$$

where $q_i$ stands in for the particular coordinate, so $q_i$ can be any one of $x, y, z$ when working in Cartesian coordinates, or any one of $r, \theta, \phi$ when working in spherical coordinates. And in the specific case when we are interested in solving for the motion of a system of objects, not just one individual object, note that the kinetic and potential energies are those of the system - that is, the sums of the kinetic and potential energies of every object in the system:

$$
\begin{align}
K &= \sum_i K_i = K_1 + K_2 + K_3 + \dots + K_n \\
U &= \sum_i U_i = U_1 + U_2 + U_3 + \dots + U_n
\end{align}
$$

Note that the Euler-Lagrange equations in the above form apply to closed systems, i.e. systems with no external force acting on them. If there is an external applied force on the system that does work $W$, then the Euler-Lagrange equations become:

$$
\frac{d}{dt} \frac{\partial \mathcal{L}}{\partial \dot q_i} - \frac{\partial \mathcal{L}}{\partial q_i} = \frac{\partial W}{\partial q_i}
$$

Applying Lagrangian mechanics

Having examined the fundamental theory behind Lagrangian mechanics, we will now look at a few examples of increasing difficulty, to illustrate its usefulness and mathematical elegance.

Using Lagrangian mechanics to solve the simple pendulum

We consider the classical problem of the simple pendulum, an idealized model of a pendulum that is frequently introduced as an example of a harmonic oscillator. A diagram of the simple pendulum configuration is shown below:

Simple oscillator

For the simple pendulum problem, we first find the equations $x(t)$ and $y(t)$ given our coordinate system. Our coordinate system places the origin $(0, 0)$ at the point where the pendulum is attached to the ceiling. Using basic trigonometry, we find that:

$$
\begin{align}
x(t) &= \ell \sin \theta(t) \\
y(t) &= -\ell \cos \theta(t)
\end{align}
$$

where $y(t)$ is negative because the pendulum is at a negative height relative to our origin. Using our expressions for $x(t)$ and $y(t)$, we want to find the expression for the kinetic energy $K$. We know that:

$$
K = \frac{1}{2} mv^2
$$

And that:

$$
v = \sqrt{v_x^2 + v_y^2} = \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}
$$

To do this, we solve for $\frac{dx}{dt}$ and $\frac{dy}{dt}$. This takes a bit of care, because we need to differentiate $x(t)$ and $y(t)$ with respect to $t$ via the chain rule, where:

$$
\begin{align}
\frac{dx}{dt} &= \frac{dx}{d\theta} \frac{d\theta}{dt} \\
\frac{dy}{dt} &= \frac{dy}{d\theta} \frac{d\theta}{dt}
\end{align}
$$

Carrying out the differentiation, we have:

$$
\begin{align}
v_x &= \frac{dx}{dt} = \ell \cos \theta \frac{d\theta}{dt} \\
v_y &= \frac{dy}{dt} = \ell \sin \theta \frac{d\theta}{dt}
\end{align}
$$

Now, we plug our values for $v_x$ and $v_y$ into the kinetic energy equation, which gives us:

$$
K = \frac{1}{2} m \left[\ell^2 \cos^2 \theta \left(\frac{d\theta}{dt}\right)^2 + \ell^2 \sin^2 \theta \left(\frac{d\theta}{dt}\right)^2\right]
$$

If we do a little simplification by factoring and remembering that $\cos^2 \theta + \sin^2 \theta = 1$, we get:

$$
K = \frac{1}{2} m \ell^2 \left(\frac{d\theta}{dt}\right)^2
$$

Now we find the potential energy. Remember that close to Earth's surface, the potential energy is determined by (and only by) the vertical distance between the reference height - here, the origin, at height zero - and the measured point. This means that:

$$
U = mgh
$$

The height in this case is negative (because the mass hangs below the origin), and we take only the vertical component of the displacement (hence the $\cos \theta$), so:

$$
U = -mg\ell \cos \theta
$$

Putting it all together, our Lagrangian is:

$$
\mathcal{L} = \frac{1}{2} m \ell^2 \left(\frac{d\theta}{dt}\right)^2 + mg\ell \cos \theta
$$

Here, our coordinates are expressed in terms of the angle $\theta$ only, so the Euler-Lagrange equation takes the form:

$$
\frac{d}{dt} \frac{\partial \mathcal{L}}{\partial \dot \theta} - \frac{\partial \mathcal{L}}{\partial \theta} = 0
$$

If we plug this into the Euler-Lagrange equation, we get:

$$
m \ell^2 \frac{d^2 \theta}{dt^2} + mg\ell \sin \theta = 0
$$

We can rewrite this as:

$$
m \ell^2 \frac{d^2 \theta}{dt^2} = -mg\ell \sin \theta
$$

And cancel out the common factors to yield:

$$
\frac{d^2 \theta}{dt^2} = -\frac{g}{\ell} \sin \theta
$$

We have arrived at our answer. This is the differential equation of the simple pendulum. While this equation cannot be solved in terms of elementary functions, we can use the small-angle approximation $\sin \theta \approx \theta$ to get:

$$
\frac{d^2 \theta}{dt^2} = -\frac{g}{\ell} \theta
$$

whose solution is:

$$
\theta(t) = \theta_{max} \sin \left(\sqrt{\frac{g}{\ell}}\,t + \phi_0 \right)
$$
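To see how well the small-angle solution approximates the true motion, here is a short numerical comparison of our own (the parameter values are arbitrary): we integrate the full equation $\ddot\theta = -(g/\ell)\sin\theta$ with `solve_ivp` and compare against the small-angle formula, releasing the pendulum from rest at $\theta_{max}$ (so that $\phi_0 = \pi/2$, i.e. $\theta(t) = \theta_{max}\cos(\sqrt{g/\ell}\,t)$):

```python
import numpy as np
from scipy.integrate import solve_ivp

g, ell = 9.81, 1.0    # gravity (m/s^2) and pendulum length (m)
theta_max = 0.1       # small release angle (radians)

def pendulum(t, X):
    theta, omega = X
    return [omega, -(g / ell) * np.sin(theta)]  # full (not small-angle) equation

t = np.linspace(0, 10, 1000)
sol = solve_ivp(pendulum, (0, 10), [theta_max, 0.0],
                dense_output=True, rtol=1e-9, atol=1e-12)
theta_numeric = sol.sol(t)[0]
theta_small_angle = theta_max * np.cos(np.sqrt(g / ell) * t)

# For a 0.1 rad release the two curves stay within a few percent of
# theta_max over 10 seconds
print(np.max(np.abs(theta_numeric - theta_small_angle)))
```

For larger release angles the full pendulum oscillates noticeably more slowly than the small-angle formula predicts, and the two curves drift apart.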

Using Lagrangian mechanics to solve the orbit equation

We want to derive the orbit of the Earth around the Sun. To do so, we again first write the expressions for $x(t)$ and $y(t)$, this time in polar coordinates centered on the Sun:

$$
\begin{align}
x(t) &= r(t) \cos \theta(t) \\
y(t) &= r(t) \sin \theta(t)
\end{align}
$$

Differentiating both (and remembering to use the product rule), we find that:

$$
\begin{align}
v_x(t) &= \cos \theta(t) \frac{dr}{dt} - r(t) \sin \theta(t) \frac{d\theta}{dt} \\
v_y(t) &= \sin \theta(t) \frac{dr}{dt} + r(t) \cos \theta(t) \frac{d\theta}{dt}
\end{align}
$$

which we can use to find the kinetic energy (after a fair amount of algebra and several uses of the identity $\sin^2 \theta + \cos^2 \theta = 1$):

$$
K = \frac{1}{2} m \left[\left(\frac{dr}{dt}\right)^2 + r^2 \left(\frac{d\theta}{dt}\right)^2\right]
$$

The potential energy is given by $U = -\frac{GMm}{r}$, so the Lagrangian is:

$$
\mathcal{L} = \frac{1}{2} m (\dot r^2 + r^2 \dot \theta^2) + \frac{GMm}{r}
$$

Applying the Euler-Lagrange equation to each coordinate present in the Lagrangian, $r$ and $\theta$, we have:

$$
\begin{gather}
\frac{d}{dt} \frac{\partial \mathcal{L}}{\partial \dot r} - \frac{\partial \mathcal{L}}{\partial r} = 0 \\
\frac{d}{dt} \frac{\partial \mathcal{L}}{\partial \dot \theta} - \frac{\partial \mathcal{L}}{\partial \theta} = 0
\end{gather}
$$

Working out both equations yields the equations of motion for the Earth:

$$
\begin{align}
\frac{d^2 r}{dt^2} &= r \left(\frac{d\theta}{dt}\right)^2 - \frac{GM}{r^2} \\
\frac{d^2 \theta}{dt^2} &= -\frac{2}{r} \frac{dr}{dt} \frac{d\theta}{dt}
\end{align}
$$
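As an aside, here is a quick consistency check of our own on the second equation: multiplying it through by $r^2$ shows that it is exactly the statement that the angular momentum per unit mass, $r^2\, d\theta/dt$, is conserved:

$$
r^2 \frac{d^2 \theta}{dt^2} + 2r \frac{dr}{dt} \frac{d\theta}{dt} = \frac{d}{dt}\left(r^2 \frac{d\theta}{dt}\right) = 0
$$

This is Kepler's second law (equal areas swept in equal times) in differential form.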

These equations of motion can in fact be solved analytically, but for the sake of simplicity, here we will solve them using a numerical differential equation solver:

import matplotlib.pyplot as plt
import numpy as np
from scipy.integrate import solve_ivp

def newtonian_d_dt(t, X, G=6.67e-11, M=2e30):
    # State vector: r, theta, u = dr/dt, v = dtheta/dt
    r, theta, u, v = X
    dr_dt = u
    dtheta_dt = v
    du_dt = r * v ** 2 - (G * M) / (r ** 2)
    dv_dt = -(2 * u * v) / r
    return dr_dt, dtheta_dt, du_dt, dv_dt

# Initial conditions, approximately those of Earth
r0 = 1.5e11       # m
theta0 = np.pi / 2
u0 = 0
v0 = 1.99e-7      # rad/s: 2*pi radians per year (365 days)
newtonian_initial = [r0, theta0, u0, v0]

tmax = 365 * 24 * 60 * 60  # 1 year in seconds
samples = 5000
t = np.linspace(0, tmax, samples)
newtonian = solve_ivp(newtonian_d_dt, (0, tmax), y0=newtonian_initial, dense_output=True)
sol = newtonian.sol(t)

fig = plt.figure()
ax = plt.axes()

r = sol[0]
theta = sol[1]

# Convert from polar to cartesian
x1 = r * np.cos(theta)
y1 = r * np.sin(theta)

ax.plot(x1, y1)
ax.set_title('Plot of Newtonian orbit')
plt.show()

As can be seen, the orbit is an ellipse, and we have arrived at this result using Lagrangian mechanics!

Using Lagrangians to solve the double pendulum problem

We will now tackle a problem that would be very difficult to solve using Newton's laws, but is much easier with Lagrangian mechanics. Here we have a system as follows:

Double pendulum diagram

Here, the notable difference is that we have a system, as opposed to a single object, and we need to find the kinetic and potential energies of the entire system. To do this, we divide the kinetic and potential energies into two parts:

$$
\begin{align}
K &= K_1 + K_2 \\
U &= U_1 + U_2
\end{align}
$$

where $K_1$ and $K_2$ are respectively the kinetic energies of the first and second pendulum masses, and likewise $U_1$ and $U_2$ their potential energies.

We will first derive the kinetic energies, because they are harder. As before, we set up a coordinate system where the origin $(0, 0)$ is the point at which the double pendulum is attached to the ceiling. Then, we write the position functions of the first pendulum:

$$
\begin{align}
x_1(t) &= \ell_1 \sin \theta_1(t) \\
y_1(t) &= -\ell_1 \cos \theta_1(t)
\end{align}
$$

We obtain these from basic trigonometry and the fact that $y_1(t)$ is negative, as it is below the origin. We then take the derivatives to find the $x$ and $y$ components of the velocity:

$$
\begin{align}
\frac{dx_1}{dt} &= \ell_1 \cos \theta_1(t) \frac{d\theta_1}{dt} \\
\frac{dy_1}{dt} &= \ell_1 \sin \theta_1(t) \frac{d\theta_1}{dt}
\end{align}
$$

Using this, we can find $K_1$:

$$
\begin{align}
K_1 &= \frac{1}{2} m_1 \left(\dot x_1^2 + \dot y_1^2\right) \\
&= \frac{1}{2} m_1 \left[ \left(\ell_1 \cos \theta_1(t) \frac{d\theta_1}{dt} \right)^2 + \left(\ell_1 \sin \theta_1(t) \frac{d\theta_1}{dt}\right)^2 \right]
\end{align}
$$

Using the trig identity $\sin^2 \theta + \cos^2 \theta = 1$, this is straightforward to simplify into:

$$
K_1 = \frac{1}{2} m_1 \ell_1^2 \left(\frac{d\theta_1}{dt}\right)^2
$$

Then, we write the position functions of the second pendulum:

$$
\begin{align}
x_2(t) &= x_1(t) + \ell_2 \sin \theta_2(t) \\
y_2(t) &= y_1(t) - \ell_2 \cos \theta_2(t)
\end{align}
$$

Here, we add the $x$ and $y$ displacements of the second pendulum to those of the first to find the total displacement from the origin - remember, we're using the same coordinate system for both pendulums. If we substitute in the values of $x_1(t)$ and $y_1(t)$, we have:

$$
\begin{align}
x_2(t) &= \ell_1 \sin \theta_1(t) + \ell_2 \sin \theta_2(t) \\
y_2(t) &= -\ell_1 \cos \theta_1(t) - \ell_2 \cos \theta_2(t)
\end{align}
$$

We compute their derivatives:

$$
\begin{align}
\frac{dx_2}{dt} &= \ell_1 \cos \theta_1(t) \frac{d\theta_1}{dt} + \ell_2 \cos \theta_2(t) \frac{d\theta_2}{dt} \\
\frac{dy_2}{dt} &= \ell_1 \sin \theta_1(t) \frac{d\theta_1}{dt} + \ell_2 \sin \theta_2(t) \frac{d\theta_2}{dt}
\end{align}
$$

And then plug them into the kinetic energy formula to find:

$$
K_2 = \frac{1}{2} m_2 \left[ \left(\ell_1 \cos \theta_1(t) \frac{d\theta_1}{dt} + \ell_2 \cos \theta_2(t) \frac{d\theta_2}{dt}\right)^2 + \left(\ell_1 \sin \theta_1(t) \frac{d\theta_1}{dt} + \ell_2 \sin \theta_2(t) \frac{d\theta_2}{dt}\right)^2 \right]
$$

Expanding that out, and using $\sin^2 \theta + \cos^2 \theta = 1$ to simplify, we have:

$$
K_2 = \frac{1}{2} m_2 \left[ \ell_1^2 \left(\frac{d\theta_1}{dt}\right)^2 + 2\ell_1 \ell_2 \frac{d\theta_1}{dt} \frac{d\theta_2}{dt} (\cos \theta_1 \cos \theta_2 + \sin \theta_1 \sin \theta_2) + \ell_2^2 \left(\frac{d\theta_2}{dt}\right)^2 \right]
$$

Here, we can use the identity $\cos x \cos y + \sin x \sin y = \cos(x - y)$ to simplify to:

$$
K_2 = \frac{1}{2} m_2 \left[ \ell_1^2 \left(\frac{d\theta_1}{dt}\right)^2 + 2\ell_1 \ell_2 \frac{d\theta_1}{dt} \frac{d\theta_2}{dt} \cos(\theta_1 - \theta_2) + \ell_2^2 \left(\frac{d\theta_2}{dt}\right)^2 \right]
$$
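Trigonometric simplifications like this one are easy to get wrong, so here is a quick numerical spot check of our own that the direct and simplified forms of $K_2$ agree (all values are chosen arbitrarily):

```python
import numpy as np

# Arbitrary test values for the double pendulum parameters and state
m2, l1, l2 = 1.3, 0.7, 1.1
th1, th2 = 0.6, -0.4    # theta_1, theta_2
w1, w2 = 1.5, -2.2      # dtheta_1/dt, dtheta_2/dt

# K_2 computed directly from the velocity components
vx2 = l1 * np.cos(th1) * w1 + l2 * np.cos(th2) * w2
vy2 = l1 * np.sin(th1) * w1 + l2 * np.sin(th2) * w2
K2_direct = 0.5 * m2 * (vx2**2 + vy2**2)

# K_2 from the simplified formula above
K2_simplified = 0.5 * m2 * (l1**2 * w1**2
                            + 2 * l1 * l2 * w1 * w2 * np.cos(th1 - th2)
                            + l2**2 * w2**2)

print(abs(K2_direct - K2_simplified))  # agrees to floating-point precision
```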

Now, combining the two kinetic energies together, we end up with:

$$
K = \frac{1}{2} m_1 \ell_1^2 \left(\frac{d\theta_1}{dt}\right)^2 + \frac{1}{2} m_2 \left[ \ell_1^2 \left(\frac{d\theta_1}{dt}\right)^2 + 2\ell_1 \ell_2 \frac{d\theta_1}{dt} \frac{d\theta_2}{dt} \cos(\theta_1 - \theta_2) + \ell_2^2 \left(\frac{d\theta_2}{dt}\right)^2 \right]
$$

We use a similar approach for the potential energies - we add the potential energy of the first pendulum and the second to find the total system’s potential energy:

$$
\begin{align} U &= U_1 + U_2 \\ &= m_1 g (-\ell_1 \cos \theta_1(t)) + m_2 g (-\ell_1 \cos \theta_1(t) - \ell_2 \cos \theta_2(t)) \end{align}
$$

After factoring, we have:

$$
U = -(m_1 + m_2) g \ell_1 \cos \theta_1(t) - m_2 g \ell_2 \cos \theta_2(t)
$$

Using $\mathcal{L} = K - U$, we substitute into the two Euler-Lagrange equations (one for $\theta_1$ and one for $\theta_2$):

$$
\begin{align} \frac{d}{dt} \frac{\partial \mathcal{L}}{\partial \dot \theta_1} - \frac{\partial \mathcal{L}}{\partial \theta_1} &= 0 \\ \frac{d}{dt} \frac{\partial \mathcal{L}}{\partial \dot \theta_2} - \frac{\partial \mathcal{L}}{\partial \theta_2} &= 0 \end{align}
$$

With the Lagrangian:

$$
\begin{align} \mathcal{L} &= \frac{1}{2} m_1 \ell_1^2 \bigg(\frac{d\theta_1}{dt}\bigg)^2 + \frac{1}{2} m_2 \bigg[ \ell_1^2 \bigg(\frac{d\theta_1}{dt}\bigg)^2 \\ &\qquad+ 2\ell_1 \ell_2 \frac{d\theta_1}{dt} \frac{d\theta_2}{dt} \cos (\theta_1 - \theta_2) + \ell_2^2 \bigg(\frac{d\theta_2}{dt}\bigg)^2 \bigg] \\ &\qquad+ (m_1 + m_2) g \ell_1 \cos \theta_1(t) + m_2 g \ell_2 \cos \theta_2(t) \end{align}
$$

Taking the derivatives and simplifying, we find the equations of motion, which are given by (source):

$$
\begin{align} (m_1 + m_2) \ell_1 \ddot{\theta}_1 + m_2 \ell_2 \ddot{\theta}_2 \cos(\theta_1 - \theta_2) + m_2 \ell_2 \dot{\theta}_2^2 \sin(\theta_1 - \theta_2) + (m_1 + m_2) g \sin\theta_1 &= 0 \\ \ell_2 \ddot{\theta}_2 + \ell_1 \ddot{\theta}_1 \cos(\theta_1 - \theta_2) - \ell_1 \dot{\theta}_1^2 \sin(\theta_1 - \theta_2) + g \sin\theta_2 &= 0 \end{align}
$$
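If you would rather not grind through the derivatives by hand, the Euler-Lagrange equations can be computed symbolically. A sketch with sympy (variable names are our own; each result comes out with an overall factor of $\ell_1$ or $m_2 \ell_2$ relative to the equations above):

```python
import sympy as sp

t = sp.symbols('t')
m1, m2, l1, l2, g = sp.symbols('m_1 m_2 ell_1 ell_2 g', positive=True)
th1 = sp.Function('theta_1')(t)
th2 = sp.Function('theta_2')(t)
w1, w2 = sp.diff(th1, t), sp.diff(th2, t)

# Lagrangian L = K - U, using the expressions derived above
K = (sp.Rational(1, 2) * m1 * l1**2 * w1**2
     + sp.Rational(1, 2) * m2 * (l1**2 * w1**2
                                 + 2 * l1 * l2 * w1 * w2 * sp.cos(th1 - th2)
                                 + l2**2 * w2**2))
U = -(m1 + m2) * g * l1 * sp.cos(th1) - m2 * g * l2 * sp.cos(th2)
L = K - U

def euler_lagrange(L, q):
    """Return d/dt (dL/d(qdot)) - dL/dq for a generalized coordinate q(t)."""
    return sp.diff(sp.diff(L, sp.diff(q, t)), t) - sp.diff(L, q)

eom1 = sp.simplify(euler_lagrange(L, th1))  # proportional to the first equation
eom2 = sp.simplify(euler_lagrange(L, th2))  # proportional to the second equation
```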

These equations have no known analytical solution, but they can be solved numerically to yield the position of a double pendulum over time.
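Here is one way to do this numerically in Python with scipy (the parameter values, initial conditions, and function names are illustrative choices of our own). Since the two equations of motion are linear in the angular accelerations, we solve a 2x2 linear system for them at every step:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative parameters: masses (kg), rod lengths (m), gravity (m/s^2)
m1, m2 = 1.0, 1.0
l1, l2 = 1.0, 1.0
g = 9.81

def rhs(t, y):
    """Right-hand side of the double-pendulum ODEs.

    State y = (theta1, theta2, omega1, omega2). The equations of motion
    are linear in (theta1'', theta2''), so we solve for them directly.
    """
    th1, th2, w1, w2 = y
    d = th1 - th2
    A = np.array([[(m1 + m2) * l1, m2 * l2 * np.cos(d)],
                  [l1 * np.cos(d), l2]])
    b = np.array([-m2 * l2 * w2**2 * np.sin(d) - (m1 + m2) * g * np.sin(th1),
                  l1 * w1**2 * np.sin(d) - g * np.sin(th2)])
    acc = np.linalg.solve(A, b)
    return [w1, w2, acc[0], acc[1]]

def energy(y):
    """Total energy K + U; it is conserved, so it checks the integration."""
    th1, th2, w1, w2 = y
    K = 0.5 * m1 * l1**2 * w1**2 + 0.5 * m2 * (
        l1**2 * w1**2 + 2 * l1 * l2 * w1 * w2 * np.cos(th1 - th2)
        + l2**2 * w2**2)
    U = -(m1 + m2) * g * l1 * np.cos(th1) - m2 * g * l2 * np.cos(th2)
    return K + U

# Release both arms from horizontal, at rest, and integrate for 10 seconds
y0 = [np.pi / 2, np.pi / 2, 0.0, 0.0]
sol = solve_ivp(rhs, (0, 10), y0, rtol=1e-10, atol=1e-10)
drift = abs(energy(sol.y[:, -1]) - energy(np.array(y0)))
print(f"energy drift after 10 s: {drift:.2e}")
```

Because the system is chaotic, trajectories from nearby initial conditions diverge quickly, but the energy drift stays tiny, which is a good sanity check on the integrator.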

Lagrangian to Newtonian mechanics

Let’s see how we can recover Newton’s 2nd law from the Euler-Lagrange equation. Remember that the equation (in the case of one-dimensional motion along the $x$ axis) is given by:

$$
\frac{d}{dt}\frac{\partial \mathcal{L}}{\partial \dot x} = \frac{\partial \mathcal{L}}{\partial x}
$$

And remember that the Lagrangian is given by:

$$
\mathcal{L} = \frac{1}{2} m \dot x^2 - U(x)
$$

Applying the Euler-Lagrange equation to this general Lagrangian, we have $\frac{\partial \mathcal{L}}{\partial \dot x} = m \dot x$ and $\frac{\partial \mathcal{L}}{\partial x} = -\frac{dU}{dx}$, so we get:

$$
m \ddot x = -\frac{dU}{dx}
$$

Recalling that $F = -\frac{dU}{dx}$ and $\ddot x = a$, we have recovered Newton’s 2nd law!

$$
F = ma
$$

We can even use Lagrangian mechanics on simple problems and check that it matches with Newtonian mechanics. Let’s do our freefall example from earlier. With $K = \frac{1}{2} m \dot y^2$ and $U = mgy$, we use the Euler-Lagrange equations to find:

$$
m \frac{d^2 y}{dt^2} = -mg
$$

Which we can simplify to:

$$
a_y = -g
$$

Which reproduces the Newtonian result!
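This check can even be automated. A short sketch with sympy (our own naming) that applies the Euler-Lagrange equation to the free-fall Lagrangian:

```python
import sympy as sp

t, m, g = sp.symbols('t m g', positive=True)
y = sp.Function('y')(t)
ydot = sp.diff(y, t)

# Free-fall Lagrangian: L = K - U = (1/2) m ydot^2 - m g y
L = sp.Rational(1, 2) * m * ydot**2 - m * g * y

# Euler-Lagrange equation: d/dt (dL/dydot) - dL/dy = 0
eom = sp.diff(sp.diff(L, ydot), t) - sp.diff(L, y)

# Solving eom == 0 for the acceleration recovers a_y = -g
accel = sp.solve(sp.Eq(eom, 0), sp.diff(y, t, 2))[0]
print(accel)  # -g
```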

Hamiltonian mechanics

Noether’s theorem

Field Lagrangians

Besides working with particles and their trajectories, we are also often interested in fields in physics, such as the electromagnetic or gravitational fields. To find the differential equations that describe these fields, we need an Euler-Lagrange equation for fields rather than particles.

Let us consider a generic field $\varphi(\mathbf{r}, t)$. For reasons that will be elaborated in more detail in the special and general relativity sections (we will give a rough outline of why in a note further down this section), it is conventional to group the space and time components into one vector $\mathbf{X}$, which has four components, one of time and three of space, where $c$ is the speed of light:

$$
\mathbf{X} = (ct, \mathbf{r}) = \begin{bmatrix} ct \\ x \\ y \\ z \end{bmatrix}
$$

It is also convention to group the time derivative and gradient operators together, which we call the four-gradient and denote $\partial_\mathbf{X}$:

$$
\partial_\mathbf{X} = \left(-\dfrac{1}{c}\dfrac{\partial}{\partial t}, \nabla\right) = \left\langle -\dfrac{1}{c}\dfrac{\partial}{\partial t}, \dfrac{\partial}{\partial x}, \dfrac{\partial}{\partial y}, \dfrac{\partial}{\partial z}\right\rangle
$$

With these new vectors we may write the field as $\varphi(\mathbf{X})$. The action for the field is given by:

$$
S[\varphi, \partial_\mathbf{X} \varphi, \mathbf{X}] = \int_\Omega \mathcal{L}(\varphi, \partial_\mathbf{X}\varphi, \mathbf{X})\, d^4 x
$$

where $\mathcal{L}$ is our field Lagrangian, $d^4 x = dV\,dt$ is an infinitesimal portion of space and time, and $\Omega$ is the domain of all space and all times. We will not repeat the full derivation of the Euler-Lagrange equation, but the steps are very similar to what we have already seen with the single-object Lagrangian case. The result is the Euler-Lagrange equation for fields, which takes the form:

$$
\dfrac{\partial \mathcal{L}}{\partial \varphi} - \dfrac{\partial}{\partial\mathbf{X}} \left(\dfrac{\partial \mathcal{L}}{\partial(\partial_\mathbf{X} \varphi)}\right) = 0
$$
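To see how this equation is used, consider a simple illustrative choice of our own (not one discussed above): a Lagrangian that depends only on the field's derivatives, $\mathcal{L} = \frac{1}{2} \partial_\mathbf{X}\varphi \cdot \partial_\mathbf{X}\varphi$, where the dot is the relativistic inner product (its sign conventions are deferred to the relativity sections). Then:

$$
\dfrac{\partial \mathcal{L}}{\partial \varphi} = 0, \qquad \dfrac{\partial \mathcal{L}}{\partial(\partial_\mathbf{X} \varphi)} = \partial_\mathbf{X}\varphi
$$

so the Euler-Lagrange equation for fields reduces to the wave equation:

$$
\dfrac{1}{c^2}\dfrac{\partial^2 \varphi}{\partial t^2} - \nabla^2 \varphi = 0
$$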

Lagrangians to the stars

The Lagrangian formulation of classical mechanics is so powerful precisely because it relies on a differential equation that can be generalized. Beyond classical mechanics, the Lagrangian isn’t always $\mathcal{L} = K - U$, but the Euler-Lagrange equations still hold true, and so does the principle of stationary action. Thus, a theory - including one that involves fields - can be written as a Lagrangian, and the Euler-Lagrange equations yield the equations of motion on which the rest of the theory is built. This is the reason behind learning Lagrangian mechanics.

We will end with one final thought - one of the most successful theories in all of physics, the Standard Model of particle physics (which is a quantum field theory), is encapsulated in one compact Lagrangian:

$$
\mathcal{L}_\mathrm{SM} = -\frac{1}{4} F_{\mu \nu} F^{\mu \nu} + i \overline{\psi} \gamma_\beta D^\beta \psi + \mathrm{h.c.} + \psi_i y_{ij} \psi_j \phi + \mathrm{h.c.} + |D_\mu \phi|^2 - V(\phi)
$$

And one of the most mathematically beautiful theories, in fact one we will see very soon, General Relativity (which is a classical field theory for gravity), is described by another compact Lagrangian:

$$
\mathcal{L}_\mathrm{GR} = \frac{1}{2\kappa} R \sqrt{-g} + \mathcal{L}_\mathrm{matter}
$$

Who knew that Lagrangians could take us deep into the hearts of atoms, and to the furthest stars...? :)