
Special Relativity

“Henceforth space by itself, and time by itself, are doomed to fade away into mere shadows, and only a kind of union of the two will preserve an independent reality...”

Hermann Minkowski (1908)

For over 200 years, the basic premises of Newtonian mechanics were treated as absolute. Space and time were separate entities, and the laws of physics followed Newton’s postulates. That is, until 1905, when Einstein overturned both of these assumed truths and set out to create a new theory. This groundbreaking theory was the theory of Special Relativity, a description of the motion of objects that supersedes Newtonian mechanics and completely revolutionized our understanding of the world.

import matplotlib.pyplot as plt
import numpy as np

Events

An event is anything that happens that can be measured. This can be a rocket flying through your window, a book that falls on your head, a crater that opens in your room, or all three at once! We can describe events with coordinates - perhaps your rocket-falling-book-crater event happened at position 3 meters to your right, 2 meters to your front, and 0 meters above your head, at time 2:55pm. Physicists obsess over events, because everything in physics is composed of a sequence of events, and so using the laws of physics, physicists can predict what events will happen.

Galilean Relativity

Imagine you were seated aboard a train, and the train was moving with constant velocity. Are you moving and the earth underneath you stationary? Or is the train stationary and the earth is moving under you? In physics, both of these interpretations can be true - your understanding of your motion must be considered relative to some other object. For instance, you can pick the stationary object to be the earth, in which case you would be considered moving, or you could pick the stationary object to be yourself on the train, in which case the Earth would be moving. In either case, the laws of physics remain the same.

Reference frames

A reference frame is a coordinate system with an origin centered at a chosen location. For example, you can choose your origin to be a house on the surface of the Earth, a moving train, a rocket, even a random point in interstellar space. You would then have a reference frame at that origin.

Typically speaking, however, when we refer to the reference frame of an observer, the origin is located at wherever the observer is located. So the reference frame of an astronaut in a rocket would have an origin centered at that rocket. In the astronaut’s reference frame, they themselves are located at the point $(0, 0, 0)$, and everything else (such as the motion of the Earth) is measured relative to them.

We use reference frames to measure everything around us, everything from the position and velocities of objects to forces between objects. In fact, without reference frames, we wouldn’t be able to measure any motion at all.

Transformations

In physics, it is sometimes easier to do calculations in one reference frame than another - no one would like to compute the trajectory of a baseball on Earth from the reference frame of another galaxy! So, we want to be able to convert measurements from one reference frame to another.

The reference frame of an observer is known as the unprimed frame. The observer is at rest with respect to the unprimed frame, and everything is moving relative to them. Any measurement in the unprimed frame is described by coordinates $(t, x, y, z)$.

Everything moving relative to the observer has a reference frame of its own. For example, our astronaut could be observing Earth, the Sun, the Moon, a space station - and each of those objects has its own reference frame. The observer measures the velocity $v$ at which each reference frame moves with respect to the observer. These other moving reference frames are known as primed frames because they are denoted with the prime (') symbol. Any measurement in a primed frame is described by the coordinates $(t', x', y', z')$.

These sets of different measurements are related as follows:

\begin{align}
x' &= x - vt \\
y' &= y \\
z' &= z \\
t' &= t
\end{align}

One thing we specifically notice is that here, time is absolute - the same in every reference frame. As we will see, this will no longer hold true in special relativity.
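As a minimal sketch of the Galilean transformations above, the function below maps an event from the unprimed frame to a primed frame moving along the x-axis. The function name and the example numbers (a train moving at 30 m/s) are illustrative assumptions, not part of the original text.

```python
def galilean_transform(t, x, y, z, v):
    """Map an event (t, x, y, z) in the unprimed frame to a primed frame
    moving at constant velocity v along the x-axis."""
    # Time is absolute in Galilean relativity, so t is returned unchanged.
    return t, x - v * t, y, z

# Hypothetical example: an event 100 m down the track at t = 2 s,
# as seen from a train moving at 30 m/s.
event_primed = galilean_transform(2.0, 100.0, 0.0, 0.0, 30.0)
print(event_primed)  # (2.0, 40.0, 0.0, 0.0)
```

Note that the time coordinate passes through untouched, which is exactly the assumption that special relativity will later drop.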

The constant speed of light

In the late 19th century, physicists finally arrived at one unified theory of electromagnetism, expressed in Maxwell’s equations, which we saw earlier. Recall that the equations are given by:

\begin{align}
\nabla \cdot \vec E &= \frac{\rho}{\epsilon_0} \\
\nabla \times \vec E &= -\frac{\partial \vec B}{\partial t} \\
\nabla \cdot \vec B &= 0 \\
\nabla \times \vec B &= \mu_0 \vec J + \mu_0 \epsilon_0 \frac{\partial \vec E}{\partial t}
\end{align}

Let’s take Faraday’s law:

$$\nabla \times \vec E = -\frac{\partial \vec B}{\partial t}$$

Suppose we take the curl of both sides:

$$\nabla \times (\nabla \times \vec E) = \nabla \times \left(-\frac{\partial \vec B}{\partial t}\right)$$

The right hand side can be rearranged to be:

$$\nabla \times (\nabla \times \vec E) = -\frac{\partial}{\partial t} (\nabla \times \vec B)$$

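In a vacuum there are no currents ($\vec J = 0$), so Ampère’s law reduces to $\nabla \times \vec B = \mu_0 \epsilon_0 \frac{\partial \vec E}{\partial t}$. Substituting this in, the equation simplifies to:

$$\nabla \times (\nabla \times \vec E) = -\mu_0 \epsilon_0 \frac{\partial^2 \vec E}{\partial t^2}$$

We can use the vector identity $\nabla \times (\nabla \times \vec E) = \nabla (\nabla \cdot \vec E) - \nabla^2 \vec E$, together with $\nabla \cdot \vec E = 0$ in a vacuum with no charges, to simplify further to:

$$\nabla^2 \vec E = \mu_0 \epsilon_0 \frac{\partial^2 \vec E}{\partial t^2}$$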

Using the same technique, this time starting from Ampère’s law and then substituting in Faraday’s law, yields the same result for magnetic fields:

$$\nabla^2 \vec B = \mu_0 \epsilon_0 \frac{\partial^2 \vec B}{\partial t^2}$$

Note that this looks very much like the wave equation:

$$\nabla^2 f = \frac{1}{v^2} \frac{\partial^2 f}{\partial t^2}$$

Which means that oscillating electric and magnetic fields produce electromagnetic waves that move through space at a speed $v$. We can solve for $v$ by noting that:

$$\frac{1}{v^2} = \mu_0 \epsilon_0$$

This yields:

$$v = \frac{1}{\sqrt{\mu_0 \epsilon_0}} = 299792458 \mathrm{\ m/s} = c$$
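The result above can be checked numerically. This is a rough sketch: the values used for $\mu_0$ and $\epsilon_0$ are approximate measured values typed in by hand (an assumption, not taken from the text), so the result should only be expected to agree with $c$ to within rounding.

```python
import math

# Approximate values of the two constants (assumed, rounded).
MU_0 = 1.25663706212e-6       # vacuum permeability, in N/A^2
EPSILON_0 = 8.8541878128e-12  # vacuum permittivity, in F/m

# Wave speed predicted by the electromagnetic wave equation.
wave_speed = 1 / math.sqrt(MU_0 * EPSILON_0)
print(f"{wave_speed:.4e} m/s")  # 2.9979e+08 m/s
```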

Now notice something special. The speed $c$ of electromagnetic waves (light waves) is a constant, because it is built from two other constants, $\mu_0$ and $\epsilon_0$. Nothing in the derivation refers to a particular reference frame, so if Maxwell’s equations hold in every reference frame, light must travel at the same speed in every reference frame.

However, remember that in Galilean relativity, velocities simply add, as $\vec v + \vec u$. So we’d expect observers moving relative to one another to measure different speeds of light. Through numerous experiments, this was shown not to be the case: the speed of light is the same regardless of the reference frame.

Therefore, the Galilean transformations must be wrong, and a new set of transformations, the Lorentz transformations, must supersede them.

The Lorentz Transformations

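The Lorentz transformations are Einstein’s revision to Galilean relativity, derived from two postulates: first, that the laws of physics are the same in every inertial reference frame, and second, that the speed of light in a vacuum is the same in every inertial reference frame.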

To derive the Lorentz transformations, let’s start with the Galilean transformations for $x \rightarrow x'$ and $x' \rightarrow x$:

\begin{align}
x' &= x - vt \\
x &= x' + vt'
\end{align}

To correct Galilean coordinate transformations, we intuitively need to multiply the Galilean transformations by a factor $\gamma$, which we can think of as the “correcting factor” to make sure that the Galilean transforms preserve the speed of light in every reference frame:

\begin{align}
x' &= \gamma (x - vt) \\
x &= \gamma (x' + vt')
\end{align}

Now, we can multiply the left-hand sides together and the right-hand sides together, to combine the two equations into one equation:

\begin{align}
x' x &= \gamma (x - vt) \gamma (x' + vt') \\
x' x &= \gamma^2 (xx' + xvt' - x'vt - v^2 t t')
\end{align}

Remember the second postulate of special relativity is that the speed of light is an invariant in every reference frame - that is, for a ray of light, $c = \frac{x}{t} = \frac{x'}{t'}$. Rearranging, we can say that $x = ct$ and $x' = ct'$. Substituting that in, we have:

$$c^2 tt' = \gamma^2 (c^2 tt' + ct \, vt' - ct' \, vt - v^2 tt')$$

The two middle terms cancel each other out, so we have:

$$c^2 tt' = \gamma^2 (c^2 tt' - v^2 tt')$$

We isolate $\gamma^2$ by dividing both sides of the equation by the factor in parentheses on the right-hand side, to obtain:

$$\gamma^2 = \frac{c^2 tt'}{c^2 tt' - v^2 tt'}$$

We can factor out the common factor of $tt'$, to get:

\begin{align}
\gamma^2 &= \frac{c^2 tt'}{tt'(c^2 - v^2)} \\
&= \frac{c^2}{(c^2 - v^2)}
\end{align}

We can then simplify by dividing both the numerator and denominator by $c^2$, which gives us:

\begin{align}
\gamma^2 &= \frac{1}{1 - \frac{v^2}{c^2}} \\
&= \frac{1}{1 - \left(\frac{v}{c}\right)^2}
\end{align}

Finally, taking the square root, we have:

$$\gamma = \frac{1}{\sqrt{1 - \left(\frac{v}{c}\right)^2}}$$
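To get a feel for the size of this factor, here is a small sketch that evaluates $\gamma$ at a few speeds. The function name `lorentz_factor` and the chosen sample speeds are illustrative assumptions.

```python
import math

C = 299792458.0  # speed of light in m/s

def lorentz_factor(v):
    """Return gamma = 1 / sqrt(1 - (v/c)^2) for a speed v in m/s (|v| < c)."""
    return 1 / math.sqrt(1 - (v / C) ** 2)

# Evaluate gamma at a few fractions of the speed of light.
for fraction in (0.0, 0.1, 0.5, 0.8, 0.99):
    print(f"v = {fraction:.2f}c -> gamma = {lorentz_factor(fraction * C):.4f}")
```

At 10% of the speed of light the factor is still within about half a percent of 1, which is why Galilean relativity works so well at everyday speeds.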

We can use this to derive the Lorentz transformations for $x$, $y$, and $z$, but we need a little more algebra to figure out the Lorentz transform for $t$. To do this, we first write out the Lorentz transformations from $x \rightarrow x'$ and $x' \rightarrow x$:

\begin{align}
x' &= \gamma(x - vt) \\
x &= \gamma (x' + vt')
\end{align}

We take the second equation to solve for $t'$:

$$t' = \frac{x}{\gamma v} - \frac{x'}{v}$$

And we can now plug in the first Lorentz transformation equation into $x'$:

$$t' = \frac{x}{\gamma v} - \frac{\gamma(x - vt)}{v}$$

We can now distribute the second term and factor out $\gamma$ to find:

$$t' = \gamma \left(\frac{x}{\gamma^2 v} - \frac{x}{v} + t\right)$$

We can further factor out the first two terms as:

$$t' = \gamma \left(x\left[\frac{1}{\gamma^2 v} - \frac{1}{v}\right]+ t\right)$$

This simplifies to:

$$t' = \gamma \left(x\left[\frac{1- \gamma^2}{\gamma^2 v}\right] + t\right)$$

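Since $\frac{1}{\gamma^2} = 1 - \frac{v^2}{c^2}$, the term in square brackets is equal to $\frac{1}{v}\left(\frac{1}{\gamma^2} - 1\right) = -\frac{v}{c^2}$, so this then simplifies to:

$$t' = \gamma \left(-\frac{xv}{c^2} + t\right)$$

Or:

$$t' = \gamma \left(t - \frac{vx}{c^2}\right)$$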

So, we now have the complete set of Lorentz transformations, which obey the laws of special relativity:

\begin{align}
t' &= \gamma \left(t - \frac{vx}{c^2}\right) \\
x' &= \gamma (x - vt) \\
y' &= y \\
z' &= z
\end{align}

Where the Lorentz factor $\gamma$ is approximately 1 at speeds of everyday life, but rises to infinity as you approach $c$:

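c = 299792458  # speed of light in m/s
v = np.linspace(0, 0.999 * c, 1000)  # speeds from rest up to 99.9% of c
gamma = 1 / np.sqrt(1 - (v / c) ** 2)  # Lorentz factor at each speed
one = np.ones(len(v))  # reference line at gamma = 1

plt.plot(v, gamma, label="Gamma")
plt.plot(v, one, label="y = 1")
plt.xlabel("Speed (m/s)")
plt.ylabel("Gamma")
plt.legend()
plt.title("Gamma factor as a function of speed (in m/s)")
plt.show()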
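The complete set of Lorentz transformations can also be sketched as a short function. The example below is only an illustration under assumed values (a frame moving at 60% of the speed of light, and a light pulse emitted from the origin); it checks that a pulse obeying $x = ct$ in the unprimed frame also obeys $x' = ct'$ in the primed frame. The function name is hypothetical.

```python
import math

C = 299792458.0  # speed of light in m/s

def lorentz_transform(t, x, y, z, v):
    """Map an event (t, x, y, z) to a primed frame moving at velocity v
    along the x-axis, using the Lorentz transformations."""
    gamma = 1 / math.sqrt(1 - (v / C) ** 2)
    t_prime = gamma * (t - v * x / C**2)
    x_prime = gamma * (x - v * t)
    return t_prime, x_prime, y, z

# A light pulse emitted from the origin reaches x = c * t at time t.
t = 1.0
t_p, x_p, _, _ = lorentz_transform(t, C * t, 0.0, 0.0, 0.6 * C)

# The speed of the pulse in the primed frame, as a fraction of c.
print(x_p / t_p / C)  # very close to 1
```

Replacing `lorentz_transform` with a Galilean version would give a ratio of 0.4 instead, which is the disagreement with experiment that motivated the new transformations.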

The idea of spacetime

In classical physics, time had always been thought of as a steady feature in the background of the universe, something that was universal, and crucially, experienced the same way by everyone. But now, with the Lorentz transforms, it was clear that time was a dimension, like any other, and it couldn’t be separated from the dimensions of space. Hence, the new idea of spacetime was born - a 4-dimensional space that contained space and time.

But how would this new spacetime be described? One way to describe spacetime is by defining a metric, which we saw back in tensor calculus. A metric can be used to define how distances are measured in space. For instance, in 2D Euclidean space, we can measure distances with:

$$ds^2 = dx^2 + dy^2$$

which is just the Pythagorean theorem. Hermann Minkowski, Einstein’s former professor, recognized that this metric would not work in special relativity when applied to spacetime. He instead proposed a different metric, which we today know as the Minkowski metric.

We start from the Euclidean metric in three dimensions:

$$ds^2 = dx^2 + dy^2 + dz^2$$

Now, if we add a time dimension, and still consider Euclidean space, we have:

$$ds^2 = dt^2 + dx^2 + dy^2 + dz^2$$

We want to use the same units for time as well as space (units of meters). Otherwise, we would have incompatible units in our metric. Thus, we multiply $dt^2$ by a factor of $c^2$ to get:

$$ds^2 = c^2 dt^2 + dx^2 + dy^2 + dz^2$$

Now, note the observation that in special relativity, as an object moves faster, it experiences less time, instead of more time, as in Euclidean space. Therefore, the time component of the metric must be negative:

$$ds^2 = -c^2 dt^2 + dx^2 + dy^2 + dz^2$$

We have arrived at the Minkowski metric.
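One way to see why this metric is the right one is that the interval $ds^2$ between two events comes out the same in every frame related by a Lorentz transformation. The sketch below checks this numerically for one arbitrarily chosen pair of events and one boost speed; the function names and numbers are illustrative assumptions.

```python
import math

C = 299792458.0  # speed of light in m/s

def interval_squared(dt, dx, dy, dz):
    """Minkowski interval: ds^2 = -c^2 dt^2 + dx^2 + dy^2 + dz^2."""
    return -(C * dt) ** 2 + dx**2 + dy**2 + dz**2

def boost_x(dt, dx, v):
    """Lorentz-transform a time and x separation to a frame moving at v."""
    gamma = 1 / math.sqrt(1 - (v / C) ** 2)
    return gamma * (dt - v * dx / C**2), gamma * (dx - v * dt)

# Hypothetical separation between two events, measured in the unprimed frame.
dt, dx, dy, dz = 2.0, 1.0e8, 3.0e7, 0.0

# The same separation measured in a frame moving at half the speed of light.
dt_p, dx_p = boost_x(dt, dx, 0.5 * C)

s2 = interval_squared(dt, dx, dy, dz)
s2_p = interval_squared(dt_p, dx_p, dy, dz)
print(s2, s2_p)  # the two values agree up to floating-point rounding
```

The individual separations `dt` and `dx` change under the boost, but the combination defined by the metric does not.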

Consequences of special relativity

Relativity of simultaneity

Let’s go back to the Lorentz transform for time coordinates:

$$t' = \gamma \left(t - \frac{vx}{c^2}\right)$$

We can rewrite this equation equivalently in terms of changes in time:

$$\Delta t' = \gamma \left(\Delta t - \frac{v \Delta x}{c^2}\right)$$

From this equation, it is clear that two events with $\Delta t = 0$ (happening simultaneously in one frame) do not necessarily have $\Delta t' = 0$ (happening simultaneously in the other frame). In fact, the actual case is that:

$$\Delta t = 0 \Rightarrow \Delta t' = -\frac{\gamma v\Delta x}{c^2}$$

Which implies that events that are simultaneous in frame $S$ are separated by a time of $-\frac{\gamma v\Delta x}{c^2}$ in frame $S'$. This is the relativity of simultaneity.
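As a rough numerical illustration, the sketch below evaluates this time offset for two events that are simultaneous in $S$ and one light-second apart, seen from a frame moving at 60% of the speed of light. The scenario and the function name are assumptions chosen to give round numbers.

```python
import math

C = 299792458.0  # speed of light in m/s

def simultaneity_offset(dx, v):
    """Time separation in S' of two events that are simultaneous in S
    and separated by dx along the x-axis."""
    gamma = 1 / math.sqrt(1 - (v / C) ** 2)
    return -gamma * v * dx / C**2

# Two simultaneous events one light-second apart, frame S' moving at 0.6c.
offset = simultaneity_offset(C * 1.0, 0.6 * C)
print(f"{offset:.3f} s")  # -0.750 s
```

The negative sign means that, in $S'$, the event farther along the direction of motion happens first.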

Time dilation

Again, we return to the equation:

$$\Delta t' = \gamma \left(\Delta t - \frac{v \Delta x}{c^2}\right)$$

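In this case, suppose we have two clocks, one at rest aboard a spaceship and one at rest on Earth. Take the spaceship’s rest frame to be frame $S$ and the Earth’s rest frame to be frame $S'$, so that the two frames move relative to each other at speed $v$. Two successive ticks of the spaceship’s clock happen at the same place in $S$, so we can say that $\Delta x = 0$. But this leads us to find something strange:

$$\Delta t' = \gamma \Delta t$$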

That is, more time passes on Earth than aboard the spaceship during the same measured time interval. Or, put another way, a moving clock is measured to tick more slowly. For instance, a 3-second time interval aboard a spaceship going at 80% of the speed of light would be measured as 5 seconds by the earthbound clock, since $\gamma = 1/\sqrt{1 - 0.8^2} \approx 1.67$. In practice, this effect stays small until speeds become a sizable fraction of the speed of light (even at 50% of the speed of light, $\gamma$ is only about 1.15), but in interstellar spaceflight, the effects can be dramatic - at 99.999% of the speed of light, one year aboard a spacecraft would be about 224 years on Earth! In fact, this is one mode of time travel into the future - passengers aboard a very fast craft would experience little time, while a lot more time passes for stationary observers, allowing passengers to seemingly magically travel into the future.
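Figures like these are easy to check with a few lines of code. The sketch below assumes speeds are given as a fraction of the speed of light (called `beta` here, a name not used in the text) and multiplies the time elapsed aboard the ship by $\gamma$.

```python
import math

def dilated_time(ship_time, beta):
    """Time elapsed on Earth for a given time elapsed aboard a ship
    moving at speed beta = v / c."""
    gamma = 1 / math.sqrt(1 - beta**2)
    return gamma * ship_time

# 3 seconds aboard a ship moving at 80% of the speed of light.
print(f"{dilated_time(3.0, 0.8):.2f} s")      # 5.00 s

# 1 year aboard a ship moving at 99.999% of the speed of light.
print(f"{dilated_time(1.0, 0.99999):.1f} years")  # 223.6 years
```

Because the function only multiplies by $\gamma$, the units of the result are whatever units the ship time was given in.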

Length contraction

We first take the x-coordinate Lorentz transform equation, and we write it in terms of the change in $x$:

$$\Delta x' = \gamma (\Delta x - v \Delta t)$$

We set $\Delta t = 0$ as we want a snapshot of one moment in time - therefore, we have:

$$\Delta x' = \gamma \Delta x$$

We can solve for $\Delta x$ with:

$$\Delta x = \frac{\Delta x'}{\gamma}$$

This means that in the stationary frame, a moving object is contracted along its direction of motion. This leads to the seemingly paradoxical result that a meterstick travelling at 90% of the speed of light, which is measured to be only about 0.44 meters long, would be able to fit into a barn less than a meter long.
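The contraction formula can be sketched in code as follows, again taking the speed as a fraction `beta` of the speed of light (an assumed convention) and using a one-meter stick as the example object.

```python
import math

def contracted_length(rest_length, beta):
    """Length of a moving object as measured in the stationary frame,
    given its rest length and its speed beta = v / c."""
    gamma = 1 / math.sqrt(1 - beta**2)
    return rest_length / gamma

# A meterstick moving at 90% of the speed of light.
print(f"{contracted_length(1.0, 0.9):.3f} m")  # 0.436 m
```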

Relativistic addition of velocities

If an object moves with velocity $u = \frac{dx}{dt}$ along the x-axis in frame $S$, its velocity $u'$ in frame $S'$ follows from dividing the Lorentz transformations for $dx'$ and $dt'$:

$$u' = \frac{dx'}{dt'} = \frac{\gamma (dx - v\,dt)}{\gamma \left(dt - \frac{v\,dx}{c^2}\right)} = \frac{\frac{dx}{dt} - v}{1 - \left(\frac{v}{c^2}\right)\left(\frac{dx}{dt}\right)} = \frac{u - v}{1 - uv/c^2}$$
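A short sketch of this formula shows two of its key properties: light still moves at $c$ in every frame, and combining two speeds below $c$ never gives a result above $c$. The function name and the sample speeds are illustrative assumptions.

```python
C = 299792458.0  # speed of light in m/s

def transform_velocity(u, v):
    """Velocity u' in frame S' of an object moving at u in frame S,
    where S' moves at velocity v relative to S (all along the x-axis)."""
    return (u - v) / (1 - u * v / C**2)

# A light ray (u = c) seen from a frame moving at 0.5c still moves at c.
light_speed_primed = transform_velocity(C, 0.5 * C)
print(light_speed_primed / C)  # very close to 1

# An object moving at 0.9c, seen from a frame moving at 0.9c the other way.
# Galilean addition would give 1.8c; the relativistic result stays below c.
closing_speed = transform_velocity(0.9 * C, -0.9 * C)
print(f"{closing_speed / C:.4f}c")  # 0.9945c
```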

Proper length and proper time:

Proper time $\tau$ is the time measured by an observer’s own clock as the observer moves through spacetime. It is related to the coordinate time $t$, which is the time measured by the clock of an external observer, by:

$$\Delta t = \gamma \Delta \tau$$

Proper length $\ell$ is the length measured by an observer’s own ruler as the observer moves through spacetime. It is related to the coordinate length $L$, which is the length measured by the ruler of an external observer, by:

$$\Delta L = \frac{\Delta \ell}{\gamma}$$

Generalizing Newtonian mechanics to special relativity

Consider a particle moving along a path through spacetime $x^\mu (\tau)$. The four-velocity of that particle is given by:

$$U^\mu = \frac{dx^\mu}{d\tau} = \gamma \frac{dx^\mu}{dt} = \left(c\frac{dt}{d\tau}, \frac{dx}{d\tau}, \frac{dy}{d\tau}, \frac{dz}{d\tau} \right)$$

It can also be written in terms of the ordinary velocity $\vec v$ as:

$$U^\mu = (c\gamma, \gamma \vec v)$$

Relativistic four-momentum is given by:

$$P^\mu = m U^\mu = (mc\gamma, m\gamma \vec v)$$

Relativistic four-force is given by:

$$F^\mu = \frac{dP^\mu}{d\tau}$$

Relativistic kinetic energy is given by:

$$K = (\gamma - 1)mc^2 = \gamma mc^2 - mc^2$$

Total relativistic energy is given by:

$$E = \gamma mc^2 = K + mc^2$$

When the object is stationary, $\gamma = 1$, so the equation simplifies to:

$$E = mc^2$$

Since $E/c = mc\gamma$, this provides another way to write relativistic four-momentum, where $\vec p = m\gamma \vec v$ is the relativistic momentum:

$$P^\mu = (E/c, \vec p)$$
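The quantities in this section can be tied together with a short sketch. The example below builds the four-momentum of a hypothetical 1 kg object moving at 60% of the speed of light along the x-axis, then checks that its time component is $E/c$ and that the kinetic energy plus $mc^2$ gives the total energy. All names and numbers are illustrative assumptions.

```python
import math

C = 299792458.0  # speed of light in m/s

def lorentz_factor(speed):
    return 1 / math.sqrt(1 - (speed / C) ** 2)

def four_momentum(m, vx, vy, vz):
    """Return (m c gamma, m gamma vx, m gamma vy, m gamma vz)."""
    gamma = lorentz_factor(math.sqrt(vx**2 + vy**2 + vz**2))
    return (m * C * gamma, m * gamma * vx, m * gamma * vy, m * gamma * vz)

def kinetic_energy(m, speed):
    """Relativistic kinetic energy K = (gamma - 1) m c^2."""
    return (lorentz_factor(speed) - 1) * m * C**2

m = 1.0        # mass in kg (hypothetical)
v = 0.6 * C    # speed along the x-axis

P = four_momentum(m, v, 0.0, 0.0)
total_energy = P[0] * C  # time component is E / c, so E = P[0] * c
rest_energy = m * C**2

print(total_energy / rest_energy)           # about 1.25, i.e. gamma
print(kinetic_energy(m, v) / rest_energy)   # about 0.25, i.e. gamma - 1
```

Dividing by the rest energy $mc^2$ keeps the printed numbers readable; the raw energies are of order $10^{17}$ joules.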