The speed of a massive object can never equal or exceed the speed of light; most of us know this in some form. Thanks to popular science content, even curious readers with no background in special relativity have come across it. Naturally, many of them wonder why an object’s speed cannot exceed the speed of light. When they look for an answer, they mostly arrive at one conventional explanation and are satisfied with it: “The mass of an object is relative. As its velocity increases, its mass also increases. So if one applies a force to push the speed closer to the speed of light, one finds that the mass increases instead of the velocity.” How logical and appropriate is the concept of ‘relative mass’ behind this interpretation? Does mass really change with speed?

Special relativity was first published in 1905, more than a hundred years ago. Alongside time dilation and length contraction, relativistic mass is explicitly mentioned in popular science writing, in A-level physics books, and in many undergraduate textbooks. So why, and how, can the appropriateness of relativistic mass still be in question?

In fact, in the beginning, several concepts of relativity were not properly understood by contemporary physicists, including Einstein himself. Relativistic mass was one of them. It took a while to understand the idea clearly, but by then it had found a place in the textbooks. The convention of rest mass and relativistic mass in many old books was passed down over generations, and it is still imitated in many current textbooks because, according to their authors, it makes relativity easier to explain at the introductory level. From generation to generation, the idea has become so entrenched among students, teachers, and writers that it has been termed a pedagogical virus. However, physicists who work professionally in relativity no longer use the terms rest mass and relativistic mass. In their convention, there is only one kind of mass: the one known as rest mass in the earlier convention. You will find this convention in modern relativity textbooks. By the way, don’t think that this convention is recent. Physicists have been using it for the last half-century.

In today’s article, we will see mathematically how mass remains unchanged, and why the idea of relativistic mass was introduced. I will also discuss some drawbacks and inconsistencies of this concept. Let’s get started. I have divided the article into three sections:

- Vectors and Coordinate Change
- Mass in Special Relativity
- Emergence of relativistic mass

Readers are requested to read the whole article without skipping any section. Some readers may lose interest early because of the mathematics. But if you have a real interest in physics, reluctance towards mathematics does not help. In popular science culture, there is a tendency to engage with these subjects without understanding the maths, so many people feel an illusory superiority (the Dunning-Kruger effect) even with vague knowledge. Hopefully, including the necessary maths here reduces this effect somewhat.

### Vectors and Coordinate Change

Before entering into the mathematics of special relativity, let us first review vectors and how they behave under coordinate change.

Let’s consider a position vector $\vec{X}$. We can write it in 2D Cartesian coordinates as: $$\vec{X} = (x,y) = x\hat{\imath} + y \hat{\jmath}$$

Here, instead of $\hat{\imath}, \hat{\jmath}$, the basis vectors can also be represented as $\hat{x}, \hat{y}$ or $\vec{e}_x, \vec{e}_y$.

$$\vec{X} = x\hat{x} + y \hat{y}= x \vec{e}_x + y \vec{e}_y$$

Now, let’s take another coordinate system, rotated anti-clockwise by an angle $\theta$ from our previous one. In the figure, the initial coordinate system is shown in green (unprimed) and the new one in blue (primed). In the new coordinate system, the same vector $\vec{X}$ appears as: $$\vec{X}' = x' {\vec{e}'_x} + y'{\vec{e}'_y}$$

The following two equations show the relation between the components of the vector, $(x, y)$ and $(x',y')$, in the X-Y and X'-Y' coordinates.

\begin{align*} x' &= x\cos\theta + y\sin\theta \\ y' &= -x\sin\theta + y\cos\theta \end{align*}

The situation is equivalent to rotating the vector itself clockwise by the angle $\theta$. That is, an anti-clockwise rotation of the coordinates is equivalent to a clockwise rotation of the vector. The two transformation equations above can be written in matrix form like this:

\begin{align*} \begin{pmatrix} x' \\ y' \end{pmatrix} &= \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \\ \implies X' &= R X \end{align*} where $ R = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} $ is the 2D rotation matrix.

That covers how the vector components change. The basis vectors change along with the components.

If you look at the picture, you will see that the change of coordinates has changed both the calculated components of the vector and the basis vectors. But has the vector itself become smaller or larger? It hasn’t, right? Let’s see whether we can express mathematically that the vector’s own size remains the same.

We know that the norm, or length, of a vector is a scalar quantity.

In the X-Y coordinates, the length of our position vector is: \begin{align*} ||\vec{X}||&= \sqrt{x^2+y^2} \\ \implies ||\vec{X}||^2 &= x^2+y^2 \end{align*} And in the X'-Y' coordinates, the length is: \begin{align*} ||\vec{X}'||&= \sqrt{{x'}^2+{y'}^2} \\ \implies ||\vec{X}'||^2 &= (x\cos\theta+y\sin\theta)^2+(-x\sin\theta+y\cos\theta)^2 \\ \implies ||\vec{X}'||^2 &= x^2+y^2 \end{align*}
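The last step of the primed-length calculation skips the algebra. Written out, the cross terms cancel and only the identity $\cos^2\theta+\sin^2\theta=1$ is needed:

\begin{align*} {x'}^2 + {y'}^2 &= (x\cos\theta + y\sin\theta)^2 + (-x\sin\theta + y\cos\theta)^2 \\ &= x^2\cos^2\theta + 2xy\sin\theta\cos\theta + y^2\sin^2\theta + x^2\sin^2\theta - 2xy\sin\theta\cos\theta + y^2\cos^2\theta \\ &= x^2(\cos^2\theta + \sin^2\theta) + y^2(\sin^2\theta + \cos^2\theta) \\ &= x^2 + y^2 \end{align*}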

This means that the length of the vector does not change when the coordinates are rotated. That is, the length of the vector is invariant under rotation. The length can be expressed in another way: the scalar (dot) product of the vector with itself, which equals the length squared.

$||\vec{X}||^2 = ||\vec{X}'||^2 = \vec{X}\cdot \vec{X}=$ Invariant scalar product
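For readers who like to check such statements numerically, here is a minimal Python sketch of the rotation formulas. The function names and the sample values (the vector $(3, 4)$ and the angle of 30 degrees) are my own illustrative choices, not part of the derivation.

```python
import math

def rotate_coordinates(x, y, theta):
    """Components of the same vector in axes rotated anti-clockwise by theta."""
    x_new = x * math.cos(theta) + y * math.sin(theta)
    y_new = -x * math.sin(theta) + y * math.cos(theta)
    return x_new, y_new

def dot(u, v):
    """Euclidean dot product of two 2-vectors given as tuples."""
    return u[0] * v[0] + u[1] * v[1]

X = (3.0, 4.0)              # illustrative vector (assumed values)
theta = math.radians(30)    # illustrative rotation angle (assumed value)
X_prime = rotate_coordinates(X[0], X[1], theta)

print("components before:", X)
print("components after: ", X_prime)
print("X.X before:", dot(X, X))
print("X.X after: ", dot(X_prime, X_prime))
```

The components come out different in the two coordinate systems, while the two dot products agree up to floating-point rounding.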

This scalar product form is going to be more useful for us than the length.

So far, we have discussed two-dimensional vectors, or 2-vectors. The same properties hold for three-dimensional vectors, or 3-vectors. The 3-position vector can be represented as follows: \begin{align*} \vec{X} &= (x,y,z)\\ &= x \vec{e}_x + y \vec{e}_y + z \vec{e}_z \\ &= X^1 \vec{e}_1 + X^2 \vec{e}_2 + X^3 \vec{e}_3 \\ &= ( X^1, X^2 , X^3) \end{align*} In relativity, alongside $(x,y,z)$, the components are also written with the indices $1, 2, 3$ as $( X^1, X^2 , X^3)$.

So, the essence of our coordinate-rotation discussion, applied to the 3-position vector, can be summarized as:

Vector component: $X^i \leftarrow $ (Change or Transform) $ \rightarrow X'^j$, where $i,j=1,2,3$.

Invariant under rotation: $\vec{X}\cdot \vec{X} = x^2 + y^2 + z^2$.

### Mass in Special Relativity

Special relativity deals with four-dimensional spacetime. In spacetime, the path along which an object travels is called its world-line. To describe position and motion here, we need four-dimensional vectors, also called 4-vectors or Minkowski vectors. They are similar to the 3-vectors we know, with an extra component along the time axis, because time is also a dimension like space.

A 3-vector has 3 components $X^i$, where $i=1,2,3$. So $ (X^1, X^2, X^3) = (x, y, z)$.

A 4-vector has 4 components $X^\mu$, where $\mu=0,1,2,3$. The $\mu=0$ component lies along the time axis. The remaining three ($\mu=1,2,3$) are the spatial components, denoted earlier by $i=1,2,3$ for 3-vectors.

That is, $(X^0, X^1, X^2, X^3) = (t, x, y, z)$.

Note that I am using natural units. In these units ($c=1$), we can write the 4-position vector as: \begin{align*} \mathbb{X} &= (X^0, X^1, X^2, X^3) \\ &= (t, x, y, z) \\ &= t \vec{e}_t + x \vec{e}_x + y \vec{e}_y + z \vec{e}_z \end{align*}

In the vector review above, we saw how vector components change under a two-dimensional coordinate rotation. Here, we will see how the components of a 4-vector change under a Lorentz transformation. It is a kind of four-dimensional coordinate rotation, but not exactly the same.

The 4-position vector under a Lorentz transformation can be written as:

Vector component: $X^\mu \leftarrow $ (Lorentz Transform) $ \rightarrow X'^\nu $, where $\mu,\nu=0,1,2,3$

In matrix form: $X' = \Lambda X$

Here $\Lambda$ is the Lorentz transformation matrix. $$\Lambda = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0\\ -\beta\gamma & \gamma & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}$$ This matrix expresses the coordinate transformation between two reference frames in relative motion along the x axis.

Here, $v$ is the relative speed and $\gamma = \frac{1}{\sqrt{1-\beta^2}}$ is the Lorentz factor, where $\beta = \frac{v}{c}$. Since I am working in natural units, inserting $c=1$ gives $\beta=v$ and $\gamma = \frac{1}{\sqrt{1-v^2}}$.

As we discussed earlier, the dot product of 2-vectors does not change under 2D coordinate rotation. Similarly, the dot product of 4-vectors does not change under a Lorentz transformation. Since this product is a scalar quantity and invariant under Lorentz transformations, it is also called a Lorentz scalar. This means that the squared length of any 4-vector is a Lorentz scalar.

Now, let’s see how the 4-vector dot product can be written. For the 4-position: \begin{align*} X_\mu X^\mu &= (X^0)^2 - (X^1)^2 - (X^2)^2 - (X^3)^2 \\ &= t^2 - x^2 - y^2 - z^2 \\ &= t^2 - r^2 \end{align*}

This product has a name: the spacetime interval. Concepts like proper time and time dilation can be derived from this interval, but they are not the topic of today’s discussion.

To understand how the negative sign has appeared in the dot product of 4-vectors, you need to understand how the dot product is performed. You need to know the metric matrix used here. It has two conventions, called metric signatures. One is $(-,+,+,+)$, another is $(+,-,-,-)$.

Simply put, for 3-vectors we use $\hat \imath \cdot \hat \imath = \hat \jmath \cdot \hat \jmath = \hat k \cdot \hat k = 1$ or, written another way, $\vec{e}_x \cdot \vec{e}_x = \vec{e}_y \cdot \vec{e}_y = \vec{e}_z \cdot \vec{e}_z = 1$. For 4-vectors with the signature $(-,+,+,+)$, one extra rule is added: $\vec{e}_t \cdot \vec{e}_t = -1$.

In the signature $(+,-,-,-)$, on the other hand, the convention is the opposite. In this case $\vec{e}_t \cdot \vec{e}_t = 1$ and $\vec{e}_x \cdot \vec{e}_x = \vec{e}_y \cdot \vec{e}_y = \vec{e}_z \cdot \vec{e}_z = -1$. The opposite sign distinguishes the spatial axes from the temporal axis. Each of the two signatures is preferred in different branches of physics. In today’s discussion, I am using the $ (+, -, -, -)$ signature.
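The invariance of the 4-vector dot product can also be checked numerically. The following is a small sketch in plain Python, assuming natural units ($c=1$), the $(+,-,-,-)$ signature, and a boost along the x axis as in the matrix $\Lambda$ above. The helper names and the sample event are hypothetical choices made only for illustration.

```python
import math

def gamma(v):
    """Lorentz factor in natural units (c = 1)."""
    return 1.0 / math.sqrt(1.0 - v * v)

def boost_x(four_vector, v):
    """Apply the boost matrix Lambda (relative speed v along x) to (t, x, y, z)."""
    t, x, y, z = four_vector
    g = gamma(v)
    return (g * (t - v * x), g * (x - v * t), y, z)

def minkowski_dot(a, b):
    """Scalar product of two 4-vectors with the (+, -, -, -) signature."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

event = (5.0, 3.0, 1.0, 2.0)   # illustrative (t, x, y, z), assumed values
v = 0.6                        # illustrative relative speed, assumed value

event_prime = boost_x(event, v)

print("components in S: ", event)
print("components in S':", event_prime)
print("interval in S: ", minkowski_dot(event, event))
print("interval in S':", minkowski_dot(event_prime, event_prime))
```

The time and x components change between the two frames, but the interval $t^2 - r^2$ is the same in both, up to rounding.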

The mathematical chatter may seem to be getting a little long, but it was needed. Readers interested in learning more about these mathematical tools can consult the books in the references. Now let’s get back to the main discussion.

We can write the relativistic energy-momentum relation, the general form of Einstein’s famous mass-energy equivalence, in natural units: \begin{align*} E^2 &= m^2+{p}^2 \\ \implies m^2 &= E^2 - {p}^2 \label{1} \tag{1} \end{align*} where $p = |\vec{p}|$ is the magnitude of the 3-momentum. In special relativity, displacement, velocity, and acceleration all have 4-vector forms, and so does momentum. The four-dimensional momentum, i.e. the 4-momentum, can be expressed as: \begin{align*} \mathbb{P} &= E\vec{e}_t + P^x \vec{e}_x + P^y \vec{e}_y + P^z \vec{e}_z \\&= E\vec{e}_t + \vec{p} \end{align*} The components of this momentum vector in matrix form are $$P^\mu = \begin{pmatrix} P^0 \\ P^1 \\ P^2 \\ P^3 \end{pmatrix} = \begin{pmatrix} P^t \\ P^x \\ P^y \\ P^z \end{pmatrix} = \begin{pmatrix} E \\ P^x \\ P^y \\ P^z \end{pmatrix} = \begin{pmatrix} E \\ \vec{p} \end{pmatrix} $$ Earlier, we saw the form of the scalar (inner) product of the 4-position with itself. Similarly, we get the scalar product of the 4-momentum: \begin{align*} P_\mu P^\mu &= (P^0)^2 - (P^1)^2 - (P^2)^2 - (P^3)^2 \\ &= E^2 - {p}^2\label{2} \tag{2} \end{align*} Comparing equations \eqref{1} and \eqref{2}, we see that \begin{align*} m^2= P_\mu P^\mu \label{m^2} \tag{3} \end{align*}

So what is happening here? As you can see, the scalar product $P_\mu P^\mu$ remains unchanged, or invariant, under the Lorentz transformation. And that is the mass squared. Strictly speaking, mass is the length of the 4-momentum.

If mass changed under a Lorentz transformation, it could not be expressed as such a scalar product. So we see that just as the length of a 3-vector is invariant under three-dimensional coordinate rotation, the mass of an object (a Lorentz scalar) is invariant under a four-dimensional Lorentz transformation.
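To make the invariance of mass concrete, here is a short Python sketch that boosts one 4-momentum into several frames and computes $\sqrt{P_\mu P^\mu}$ in each. It assumes natural units and the $(+,-,-,-)$ signature. The function names and the sample 4-momentum $(E, P^x, P^y, P^z) = (5, 3, 0, 0)$ are illustrative assumptions.

```python
import math

def gamma(v):
    """Lorentz factor in natural units (c = 1)."""
    return 1.0 / math.sqrt(1.0 - v * v)

def boost_x(P, v):
    """Transform the 4-momentum (E, Px, Py, Pz) to a frame moving with speed v along x."""
    E, px, py, pz = P
    g = gamma(v)
    return (g * (E - v * px), g * (px - v * E), py, pz)

def invariant_mass(P):
    """Length of the 4-momentum: sqrt(E^2 - p^2) with the (+, -, -, -) signature."""
    E, px, py, pz = P
    return math.sqrt(E * E - px * px - py * py - pz * pz)

P = (5.0, 3.0, 0.0, 0.0)          # illustrative 4-momentum, assumed values
frame_speeds = (0.1, 0.5, 0.6, 0.9)

for v in frame_speeds:
    E_new, px_new, _, _ = boost_x(P, v)
    print(f"v = {v}: E' = {E_new:.4f}, P'^x = {px_new:.4f}, "
          f"mass = {invariant_mass(boost_x(P, v)):.4f}")
```

The energy and momentum differ from frame to frame, while the computed mass stays the same. With these sample numbers, the frame with $v = 0.6$ is the object's rest frame, so there $P'^x$ vanishes and the energy equals the mass.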

### Emergence of relativistic mass

At this stage you may ask: if mass is invariant, a Lorentz scalar, then why does your book show the mass as relative, like this:

$$m_{\text{rel}} = \frac{m}{\sqrt{1- \frac{v^2}{c^2}}} = \gamma m$$ Where did it come from? Let’s dig into it.

We will look at the matter with the previously discussed Lorentz transformation matrix. Suppose two frames S and S’ are moving relative to each other at a constant speed $v$ along the x-axis. So, we can write the relation of an object’s 4-momentum calculated from the two frames in matrix form:

\begin{align*} P' &= \Lambda P\\ \implies \begin{pmatrix} E' \\ P'^x \\ P'^y \\ P'^z \end{pmatrix} &= \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0\\ -\beta\gamma & \gamma & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} E \\ P^x \\ P^y \\ P^z \end{pmatrix} \end{align*} Splitting it into equations: \begin{align*} E' &= \gamma (E - \beta P^x) \label{E’} \tag{4} \\ P'^x &= \gamma (- \beta E + P^x ) \label{P’^x} \tag{5} \\ P'^y &= P^y \\ P'^z &= P^z \end{align*} Since we take the object’s motion, like the relative motion of the frames, to be along the x axis only, $P'^y = P^y=0$ and $P'^z = P^z = 0$.

Now let’s take one of the frames to be at rest with respect to the object. Suppose S' is this rest frame. In this frame $P'^x = 0$, and according to \eqref{1} the measured energy is: \begin{align*} (E')^2 &= m^2 + (P'^x)^2 \\ \implies (E')^2 &= m^2 + 0 \\ \implies E' &= m \label{m} \tag{6} \end{align*}

Using $E' = m$ from \eqref{m} in \eqref{E’}, we get: \begin{align*} m &= \gamma (E - \beta P^x) \\ \implies E &= \frac{m}{\gamma} + \beta P^x \end{align*} From \eqref{P’^x} with $P'^x = 0$, we get: \begin{align*} & 0 = \gamma (- \beta E + P^x ) \\ \implies & P^x = \beta E \\ \implies & P^x = \beta \left( \frac{m}{\gamma} + \beta P^x \right) \\ \implies & (1 - \beta^2)P^x = \beta \frac{m}{\gamma} \\ \implies & \frac{P^x}{\gamma^2} = v \frac{m}{\gamma} \\ \implies & P^x = \gamma m v \\ \therefore \quad & \vec{p} = \gamma m \vec{v} \label{p} \tag{7} \end{align*} Here we used $1-\beta^2 = 1/\gamma^2$ and $\beta = v$ in natural units.
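The same result can be reproduced numerically by starting in the rest frame, where the 4-momentum is simply $(m, 0, 0, 0)$, and transforming to the frame in which the object moves with speed $v$. The sketch below assumes natural units and one-dimensional motion. The function names and the sample speeds are hypothetical, chosen for illustration.

```python
import math

def gamma(v):
    """Lorentz factor in natural units (c = 1)."""
    return 1.0 / math.sqrt(1.0 - v * v)

def boost_x(E, px, v):
    """Energy and x-momentum seen from a frame moving with speed v along x."""
    g = gamma(v)
    return g * (E - v * px), g * (px - v * E)

def lab_momentum(m, v):
    """Start from the rest frame (E', P'^x) = (m, 0) and go to the frame
    where the object moves with speed v, i.e. boost with -v."""
    return boost_x(m, 0.0, -v)

m = 1.0                               # illustrative mass, assumed value
sample_speeds = (0.1, 0.5, 0.9, 0.99)

for v in sample_speeds:
    E, px = lab_momentum(m, v)
    print(f"v = {v}: P^x = {px:.4f}, gamma*m*v = {gamma(v) * m * v:.4f}, "
          f"Newtonian m*v = {m * v:.4f}")
```

At low speed the relativistic and Newtonian momenta nearly coincide. As $v$ approaches 1 they separate, even though the same fixed $m$ is used throughout.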

We have come close to the main point. As we can see, the Lorentz factor $\gamma$ appears (multiplying $m\vec{v}$) in the formula for the 3-momentum. But in Newtonian mechanics, momentum is defined as \begin{align*} \vec{p} = m \vec{v} \label{c_p} \tag{8} \end{align*} So, to match the form of the familiar Newtonian momentum, physicists of the time preferred to write equation \eqref{p} as: \begin{align*} \vec{p} = m_{\text{rel}} \vec{v} \label{r_p} \tag{9} \end{align*} In doing so, a new definition of mass was introduced, the variable or relativistic mass: \begin{align*} m_{\text{rel}} := \gamma m \label{def_m_rel} \tag{10} \end{align*} This new definition brought one advantage: with it, the relativistic 3-momentum ‘looks’ the same as the Newtonian momentum.

But there are disadvantages and difficulties in using it. Special relativity differs significantly from Newtonian mechanics, so it is not a good idea to introduce a new definition of mass just to match the old formula for momentum. $\vec{p}=m\vec{v}$ only works for massive objects moving at low speeds. The definition that applies everywhere is related to inertia: momentum is the quantity whose rate of change gives the force. By using $m_{\text{rel}}$, the two forms of momentum \eqref{p} and \eqref{c_p} can be forced to match each other. But for other quantities, it fails to produce a proper match between special relativity and Newtonian mechanics; even with $m_{\text{rel}}$, they do not ‘look’ the same. For example, consider the force $\vec{F}$: \begin{align*} \vec{F} &= \frac{d \vec{p}}{dt} \\ &= \frac{d}{dt}(m_{\text{rel}} \vec{v}) \\ &= \frac{d}{dt}(\gamma m \vec{v}) \\ &= \gamma m \frac{d \vec{v}}{dt} + m\vec{v} \frac{d\gamma}{dt} \\ &= m_{\text{rel}} \vec{a} + m\gamma^3 (\vec{v} \cdot \vec{a}) \vec{v} \end{align*} where the last step uses $\frac{d\gamma}{dt} = \gamma^3 (\vec{v} \cdot \vec{a})$ in natural units. Here, an extra term has arisen alongside $m_{\text{rel}} \vec{a}$. So, unless the special condition $\vec{v} \cdot \vec{a}=0$ holds, the Newtonian form $\vec{F}= m_{\text{rel}} \vec{a}$ does not hold for the relativistic 3-force. You can also check that for kinetic energy, $E_k = \frac{1}{2} m_{\text{rel}} v^2$ is not true. That is why physicists who are professional relativists find the use of $m_{\text{rel}}$ unreasonable and illogical, and have discouraged its use for many years.
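Both mismatches can be seen in numbers. The sketch below assumes one-dimensional motion in natural units, so that $\vec{v}\cdot\vec{a} = va$, and estimates $dp/dt$ with a central finite difference. For the kinetic energy it uses the standard relativistic expression $E_k = E - m = (\gamma - 1)m$. The variable names and sample values are illustrative assumptions.

```python
import math

def gamma(v):
    """Lorentz factor in natural units (c = 1)."""
    return 1.0 / math.sqrt(1.0 - v * v)

def momentum(m, v):
    """Relativistic 3-momentum for motion along one axis: p = gamma * m * v."""
    return gamma(v) * m * v

m = 1.0              # illustrative mass, assumed value
v0, a = 0.5, 0.1     # illustrative speed and acceleration, both along x
dt = 1e-6            # small time step for the finite difference

# F = dp/dt, estimated numerically with v(t) = v0 + a*t around t = 0
force_numeric = (momentum(m, v0 + a * dt) - momentum(m, v0 - a * dt)) / (2 * dt)

m_rel = gamma(v0) * m
newton_like = m_rel * a                              # the form F = m_rel * a
extra_term = m * gamma(v0) ** 3 * (v0 * a) * v0      # m * gamma^3 * (v.a) * v

kinetic_exact = (gamma(v0) - 1.0) * m                # E_k = E - m
kinetic_guess = 0.5 * m_rel * v0 ** 2                # the form (1/2) m_rel v^2

print("dp/dt (numeric):     ", force_numeric)
print("m_rel * a:           ", newton_like)
print("m_rel * a + extra:   ", newton_like + extra_term)
print("E_k = (gamma - 1) m: ", kinetic_exact)
print("(1/2) m_rel v^2:     ", kinetic_guess)
```

The numerical force agrees with the full two-term expression and not with $m_{\text{rel}} a$ alone, and the two kinetic energy expressions disagree as well.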

In essence, special relativity recognizes only one mass. The mass you used to call the ‘rest mass’ is the only mass. Relative motion changes the measured energy and momentum of an object, not its mass. This is because the mass of an object is defined here as an invariant (scalar) quantity that gives the length of the object’s 4-momentum \eqref{m^2} and its total energy in its own rest frame \eqref{m}.

Back to the question I started with today. Why can’t the speed of an object equal the speed of light? If mass does not increase, then why? Indeed, the mass does not increase, but the inertia does as the speed increases under an applied force, because inertia is not just about mass; it also depends on the momentum and kinetic energy of the object. As the object approaches the speed of light, its growing inertia limits any further increase in speed, so the speed of light is never reached.
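A simple way to see this limit with the formulas already derived: from $P^x = \beta E$ and $E^2 = m^2 + p^2$, the speed is $v = p/\sqrt{m^2 + p^2}$. The sketch below assumes a constant force $F$ acting on an object that starts from rest, so that $p = Ft$ (from $\vec{F} = d\vec{p}/dt$), in natural units. The function name and the sample values are illustrative assumptions.

```python
import math

def speed_from_momentum(m, p):
    """v = p / E with E = sqrt(m^2 + p^2), in natural units (c = 1)."""
    return p / math.sqrt(m * m + p * p)

m = 1.0                                   # illustrative mass, assumed value
F = 1.0                                   # illustrative constant force
times = (0.1, 1.0, 10.0, 100.0, 1000.0)

# Momentum p = F * t grows without bound, but the speed does not.
speeds = [speed_from_momentum(m, F * t) for t in times]

for t, v in zip(times, speeds):
    print(f"t = {t}: p = {F * t}, relativistic v = {v:.7f}, "
          f"Newtonian p/m = {F * t / m}")
```

The momentum keeps growing in proportion to time, yet the speed only creeps towards 1 and never reaches it, all with a fixed mass $m$.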

I am ending today’s writing here with a relevant quote from Einstein.

“It is not good to introduce the concept of the mass $M= m/ \sqrt{1-\frac{v^2}{c^2}}$ of a moving body for which no clear definition can be given. It is better to introduce no other mass concept than the ‘rest mass’ $m$. Instead of introducing $M$, it is better to mention the expression for the momentum and energy of a body in motion.”

— Albert Einstein’s letter to Lincoln Barnett, 19 June 1948.

##### References:

- L. Susskind, A. Friedman. Special Relativity and Classical Field Theory: The Theoretical Minimum. 2017.
- D. Fleisch. A Student’s Guide to Vectors and Tensors. 2011.
- J. R. Taylor. Classical mechanics. 2005.
- S. Thornton, J. Marion. Classical Dynamics of Particles and Systems. 2004.
- A. Momen. Relativistic Mass: A Lost Cause. 2013 – Facebook Note.
- Why is relativistic mass considered a bad concept? – Quora.
- Z. K. Silagadze. Relativistic Mass and Modern Physics. 2014. arXiv: 1103.6281.
- E. Hecht. How Einstein Confirmed $E_0=mc^2$. 2011. DOI: 10.1119/1.3549223.
- E. Hecht. Einstein Never Approved of Relativistic Mass. 2009. DOI: 10.1119/1.3204111.
- L. B. Okun. Mass versus Relativistic and Rest Masses. 2009. DOI: 10.1119/1.3056168.
- P. M. Brown. On the Concept of Relativistic Mass. 2007. arXiv: 0709.0687.
- L. B. Okun. The Concept of Mass in the Einstein Year. 2006. arXiv: hep-ph/0602037.
- G. Oas. On the Abuse and Use of Relativistic Mass. 2005. arXiv: physics/0504110.
- L. B. Okun. The Concept of Mass. 1989. DOI: 10.1063/1.881171.
- C. G. Adler. Does mass really depend on velocity, dad? 1987. DOI: 10.1119/1.15314.
