Previously, we developed the idea that acceleration is locally indistinguishable from gravity, and we found the geodesic equation, which governs how objects move in curved spacetime. To describe General Relativity accurately, though, we also need a mathematical description of what spacetime curvature is. That is what we’ll explore in this section.
As we’ve seen by this point, the laws of physics are primarily written as partial differential equations, so it would be natural to expect that General Relativity can be written that way too. The issue is this: consider a vector $V = V^a e_a$. If we take its partial derivative, the product rule gives:
$$\frac{\partial V}{\partial x^b} = \frac{\partial V^a}{\partial x^b} e_a + \frac{\partial e_a}{\partial x^b} V^a$$
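But remember that tensors should transform like tensors: under a change of coordinates, each component is multiplied by at most first partial derivatives of the coordinates, with nothing added on. The additional $\frac{\partial e_a}{\partial x^b} V^a$ term means that the regular partial derivative of the components doesn’t transform like a tensor. So instead of partial derivatives, we need to define a new type of derivative, the covariant derivative, which compensates for the additional term. The covariant derivative takes the form:
$$\nabla_b V^\gamma = \partial_b V^\gamma + \Gamma^\gamma_{\alpha b} V^\alpha$$
for vectors (those with an upper index), and
$$\nabla_b V_\gamma = \partial_b V_\gamma - \Gamma^\alpha_{\gamma b} V_\alpha$$
for covectors (those with a lower index), where the $\Gamma$ are the Christoffel symbols. To take the covariant derivative of a tensor with several indices, such as the metric tensor $g_{\mu\nu}$, we add a correction term for each upper index the tensor has and subtract a correction term for each lower index the tensor has (you’ll see how this works in just a moment). For example, for the metric tensor, we first write out the covariant derivative as a partial derivative plus correction terms that are still to be determined:
$$\nabla_b g_{\mu\nu} = \partial_b g_{\mu\nu} + (\text{correction terms})$$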
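Then, we notice that the metric tensor has two lower indices, so we need two correction terms, both subtracted. The first correction term is for the index $\mu$, and the second is for the index $\nu$. To emphasize which index each correction term is for, we put a little hat on that index, and we write the still-unknown coefficients as $A$ and $B$:
$$\nabla_b g_{\mu\nu} = \partial_b g_{\mu\nu} - A\, g_{\hat{\mu}\nu} - B\, g_{\mu\hat{\nu}}$$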
Now comes the slightly bizarre part. We’re going to replace whichever index we’re interested in (the one with the hat on) with a dummy index $\alpha$, which will be summed over. This is what makes the index bookkeeping work out so that the covariant derivative transforms like a tensor. So:
$$\nabla_b g_{\mu\nu} = \partial_b g_{\mu\nu} - A\, g_{\alpha\nu} - B\, g_{\mu\alpha}$$
To figure out the correct index convention for $A$ and $B$, we use the rule that we multiply by $\Gamma^\alpha_{\gamma b}$ for each lower index correction term, and multiply by $\Gamma^\gamma_{\alpha b}$ for each upper index correction term. Here:
α is the dummy index we’re using
γ is the index of the term we’re interested in
b is the index we take the covariant derivative with respect to.
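As a sketch of how the rule plays out, take $\gamma = \mu$ for the first correction term and $\gamma = \nu$ for the second, so that the coefficients called $A$ and $B$ become $\Gamma^\alpha_{\mu b}$ and $\Gamma^\alpha_{\nu b}$. The mixed tensor $T^\mu{}_\nu$ in the second line is a hypothetical example, not something defined in this text; it is there only to show one upper and one lower index handled together.
$$\begin{aligned}
\nabla_b g_{\mu\nu} &= \partial_b g_{\mu\nu} - \Gamma^\alpha_{\mu b}\, g_{\alpha\nu} - \Gamma^\alpha_{\nu b}\, g_{\mu\alpha} \\
\nabla_b T^\mu{}_\nu &= \partial_b T^\mu{}_\nu + \Gamma^\mu_{\alpha b}\, T^\alpha{}_\nu - \Gamma^\alpha_{\nu b}\, T^\mu{}_\alpha
\end{aligned}$$
In each correction term the index being corrected moves onto the Christoffel symbol, and the dummy index $\alpha$ takes its place on the tensor.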
It should also be noted that “covariant derivative” is a bit of a misnomer. Here, the word “covariant” pre-dates the idea of contra- and covariant tensors (tensors with upper/lower indices), and refers to an earlier meaning close to “invariant”. Thus, “covariant derivative” is really just a fancy way of saying a derivative of a tensor that does not depend on the coordinates used.
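Lastly, when the covariant derivative of a vector field is taken with respect to the same index as the field’s own index, so that the repeated index is summed over, the result is simply the divergence:
$$\nabla_\mu V^\mu = \partial_\mu V^\mu + \Gamma^\mu_{\lambda\mu} V^\lambda$$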
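As a quick check of the divergence statement, here is a worked example in flat plane polar coordinates $(r, \theta)$. This is a sketch under assumptions not made in the text: the coordinate system is chosen for illustration, the components $V^r, V^\theta$ are taken in the coordinate basis, and the only nonzero Christoffel symbols are the standard ones for these coordinates, $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$.
$$\begin{aligned}
\nabla_\mu V^\mu &= \partial_\mu V^\mu + \Gamma^\mu_{\lambda\mu} V^\lambda \\
&= \partial_r V^r + \partial_\theta V^\theta + \Gamma^\theta_{r\theta} V^r \\
&= \partial_r V^r + \frac{1}{r} V^r + \partial_\theta V^\theta \\
&= \frac{1}{r}\,\partial_r\!\left(r V^r\right) + \partial_\theta V^\theta
\end{aligned}$$
Rewriting the angular component in a unit-length basis, $V^{\hat{\theta}} = r V^\theta$, turns the last line into the familiar polar divergence $\frac{1}{r}\partial_r(r V^r) + \frac{1}{r}\partial_\theta V^{\hat{\theta}}$.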
What can we use to measure the curvature of spacetime? We already know that the covariant derivative lets us take coordinate-independent derivatives in spacetime. If spacetime is curved, then taking a derivative of a vector along direction $\mu$ and then another along direction $\nu$ should give a different result from taking the derivative along $\nu$ first and then along $\mu$. We can qualitatively describe this as:
$$\nabla_\mu \nabla_\nu V^\alpha \neq \nabla_\nu \nabla_\mu V^\alpha$$
The difference between the two orderings of derivatives tells us how strongly spacetime is curved at that point. So we simply need to compute:
$$\nabla_\mu \nabla_\nu V^\alpha - \nabla_\nu \nabla_\mu V^\alpha$$
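Start with the left term. The object $\nabla_\nu V^\alpha$ has one upper index $\alpha$ and one lower index $\nu$, so its covariant derivative is a partial derivative plus two correction terms. We can write out the correction terms as the tensor multiplied by coefficients $A$ and $B$ (add the correction term for an upper index, subtract it for a lower index), with hats indicating which index each term corrects:
$$\nabla_\mu (\nabla_\nu V^\alpha) = \partial_\mu (\nabla_\nu V^\alpha) + A\, \nabla_\nu V^{\hat{\alpha}} - B\, \nabla_{\hat{\nu}} V^\alpha$$
Replacing each hatted index with a dummy index $\lambda$ and filling in the Christoffel symbols gives:
$$\nabla_\mu (\nabla_\nu V^\alpha) = \partial_\mu (\nabla_\nu V^\alpha) + \Gamma^\alpha_{\lambda\mu} \nabla_\nu V^\lambda - \Gamma^\lambda_{\nu\mu} \nabla_\lambda V^\alpha$$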
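Be careful! The second and third terms are just products, but the first term is a derivative of a sum, so we have to expand it term by term: $\partial_\mu(\partial_\nu V^\alpha + V^\sigma \Gamma^\alpha_{\sigma\nu}) = \partial_\mu \partial_\nu V^\alpha + \partial_\mu(V^\sigma \Gamma^\alpha_{\sigma\nu})$. Using that, and expanding the rest of the terms out, we get:
$$\nabla_\mu \nabla_\nu V^\alpha = \partial_\mu \partial_\nu V^\alpha + \partial_\mu(V^\sigma \Gamma^\alpha_{\sigma\nu}) + \Gamma^\alpha_{\lambda\mu} \partial_\nu V^\lambda + \Gamma^\alpha_{\lambda\mu} \Gamma^\lambda_{\sigma\nu} V^\sigma - \Gamma^\lambda_{\nu\mu} \partial_\lambda V^\alpha - \Gamma^\lambda_{\nu\mu} \Gamma^\alpha_{\sigma\lambda} V^\sigma$$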
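Phew! We’re almost there, so hang in there for the rest of the derivation. The good news is that things are going to look simpler from this point on. We’ve already worked out the left double covariant derivative, $\nabla_\mu \nabla_\nu V^\alpha$. The right double covariant derivative, $\nabla_\nu \nabla_\mu V^\alpha$, is just the left one with the index swap $\mu \leftrightarrow \nu$ (that means every time we see a $\mu$, we replace it with a $\nu$, and every time we see a $\nu$, we replace it with a $\mu$). So it is:
$$\nabla_\nu \nabla_\mu V^\alpha = \partial_\nu \partial_\mu V^\alpha + \partial_\nu(V^\sigma \Gamma^\alpha_{\sigma\mu}) + \Gamma^\alpha_{\lambda\nu} \partial_\mu V^\lambda + \Gamma^\alpha_{\lambda\nu} \Gamma^\lambda_{\sigma\mu} V^\sigma - \Gamma^\lambda_{\mu\nu} \partial_\lambda V^\alpha - \Gamma^\lambda_{\mu\nu} \Gamma^\alpha_{\sigma\lambda} V^\sigma$$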
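Now for the glorious part: when we subtract one from the other, several terms cancel. Second partial derivatives are equal no matter what order you take them in, $\partial_\mu \partial_\nu = \partial_\nu \partial_\mu$, so those cancel. The last two terms are identical in both expressions (given the symmetry of the Christoffel symbols in their lower indices, $\Gamma^\lambda_{\nu\mu} = \Gamma^\lambda_{\mu\nu}$), so they cancel as well. We’re left with:
$$\nabla_\mu \nabla_\nu V^\alpha - \nabla_\nu \nabla_\mu V^\alpha = \partial_\mu(V^\sigma \Gamma^\alpha_{\sigma\nu}) - \partial_\nu(V^\sigma \Gamma^\alpha_{\sigma\mu}) + \Gamma^\alpha_{\lambda\mu} \partial_\nu V^\lambda - \Gamma^\alpha_{\lambda\nu} \partial_\mu V^\lambda + \Gamma^\alpha_{\lambda\mu} \Gamma^\lambda_{\sigma\nu} V^\sigma - \Gamma^\alpha_{\lambda\nu} \Gamma^\lambda_{\sigma\mu} V^\sigma$$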
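Notice a third, less obvious cancellation. By the product rule, $\partial_\mu(V^\sigma \Gamma^\alpha_{\sigma\nu}) = (\partial_\mu V^\sigma)\Gamma^\alpha_{\sigma\nu} + (\partial_\mu \Gamma^\alpha_{\sigma\nu}) V^\sigma$, and the first piece cancels the $\Gamma^\alpha_{\lambda\nu} \partial_\mu V^\lambda$ coming from the right double derivative (because the names of dummy indices don’t matter). The same cancellation happens with $\mu$ and $\nu$ swapped. This simplifies the expression to:
$$\nabla_\mu \nabla_\nu V^\alpha - \nabla_\nu \nabla_\mu V^\alpha = \left(\partial_\mu \Gamma^\alpha_{\sigma\nu} - \partial_\nu \Gamma^\alpha_{\sigma\mu} + \Gamma^\alpha_{\lambda\mu} \Gamma^\lambda_{\sigma\nu} - \Gamma^\alpha_{\lambda\nu} \Gamma^\lambda_{\sigma\mu}\right) V^\sigma = R^\alpha{}_{\sigma\mu\nu} V^\sigma$$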
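The quantity $R^\alpha{}_{\sigma\mu\nu}$ is the Riemann curvature tensor. It measures how much a vector changes when the order of two covariant derivatives is swapped, which is a direct measure of the curvature of spacetime. It is a monster of a tensor: in 4D spacetime it has $4^4 = 256$ components, arranged as a $4 \times 4 \times 4 \times 4$ array (although its symmetries leave only 20 of them independent).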
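To make the component formula concrete, here is a minimal numerical sketch that evaluates one Riemann component on the unit 2-sphere and compares it with the known closed form $R^\theta{}_{\phi\theta\phi} = \sin^2\theta$. Everything in it is an assumption made for illustration: the choice of the 2-sphere, the helper names (`d`, `g`, `g_inv`, `christoffel`, `riemann`), the finite-difference step, and the standard formula for the Christoffel symbols in terms of the metric, which this section does not derive.

```python
import math

DIM = 2      # coordinates x = [theta, phi] on the unit 2-sphere
STEP = 1e-4  # finite-difference step; an arbitrary choice for this sketch


def d(f, x, k, h=STEP):
    """Central-difference estimate of the partial derivative of f along x[k]."""
    forward, backward = list(x), list(x)
    forward[k] += h
    backward[k] -= h
    return (f(forward) - f(backward)) / (2 * h)


def g(a, b, x):
    """Metric g_{ab} of the unit 2-sphere: diag(1, sin^2(theta))."""
    if a != b:
        return 0.0
    return 1.0 if a == 0 else math.sin(x[0]) ** 2


def g_inv(a, b, x):
    """Inverse metric g^{ab}; this shortcut only works for a diagonal metric."""
    return 0.0 if a != b else 1.0 / g(a, a, x)


def christoffel(a, b, c, x):
    """Gamma^a_{bc} = (1/2) g^{al} (d_b g_{lc} + d_c g_{lb} - d_l g_{bc})."""
    total = 0.0
    for l in range(DIM):
        total += 0.5 * g_inv(a, l, x) * (
            d(lambda y: g(l, c, y), x, b)
            + d(lambda y: g(l, b, y), x, c)
            - d(lambda y: g(b, c, y), x, l)
        )
    return total


def riemann(a, s, m, n, x):
    """R^a_{s m n} = d_m Gamma^a_{s n} - d_n Gamma^a_{s m}
    + Gamma^a_{l m} Gamma^l_{s n} - Gamma^a_{l n} Gamma^l_{s m}."""
    value = d(lambda y: christoffel(a, s, n, y), x, m)
    value -= d(lambda y: christoffel(a, s, m, y), x, n)
    for l in range(DIM):
        value += christoffel(a, l, m, x) * christoffel(l, s, n, x)
        value -= christoffel(a, l, n, x) * christoffel(l, s, m, x)
    return value


point = [1.0, 0.5]  # theta = 1.0 rad, phi = 0.5 rad (away from the poles)
numeric = riemann(0, 1, 0, 1, point)  # R^theta_{phi theta phi}
closed_form = math.sin(point[0]) ** 2
print(numeric, closed_form)
```

The two printed numbers should agree up to finite-difference error. Even in this 2D case there are $2^4 = 16$ components to evaluate, which gives a feel for why the 256 components in 4D are rarely written out by hand.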
To make this tensor easier to work with, we often contract it by setting the 1st and 3rd indices equal and summing over them, which gives the Ricci tensor, defined by:
$$R_{\sigma\nu} = R^\alpha{}_{\sigma\alpha\nu}$$