
In conversation with Jens Marklof

On a rare sunny afternoon, Jens Marklof welcomed us at De Morgan House, in the south-eastern corner of Russell Square in the heart of London. It is the home of the London Mathematical Society (LMS), of which Jens has been president since last November. We sat down with Jens to chat all things quantum, life as a mathematician and diversity within academia.

A somewhat rocky start

Jens grew up in Munich before moving to Hanover at 11 years old. Although he is now an accomplished mathematician, his first interaction with the subject was not as enjoyable as one might expect. He told us: “My very first maths lesson in primary school was set theory, where we had to group triangles into one family, squares into another and circles into the third.” Instead of drawing lines around each family of shapes, Jens had a different approach: “I was putting everything in one set and then just marking the boundary lines between the different families. My teacher was quite angry that I could do such a bad job. I had to stay behind at break time and redo the whole booklet, which I was pretty upset about. So that wasn’t a good start for my mathematical career.” He raises a faint smile. “I think I actually got it right… just my solution was a bit unconventional maybe.”



Markov unchained

Diophantine equations are polynomial equations with integer coefficients for which we are interested in finding integer solutions. These equations are named after Diophantus of Alexandria, a third century Greek mathematician, who was interested in finding out whether these equations were soluble in integers.

These equations have held mathematicians’ interest ever since. In 1900, David Hilbert outlined 23 problems to shape the future of mathematics. The tenth of these was to provide an algorithm to decide whether a given Diophantine equation is solvable in integers.

In 1970, Matiyasevich proved that a general algorithm was not possible, and now researchers try to solve specific classes of equations. There are also a number of questions researchers like to ask in this area, many of which are shown in the flowchart below.

Flowchart starting "Does the equation have an integer solution?"

The question you are trying to answer can dramatically affect the difficulty of the problem. For example, you may easily check that the equation
$$
xy-zt=1
$$
has an integer solution $(x,y,z,t)=(2,5,3,3)$. The eagle-eyed among you may spot that this equation has infinitely many integer solutions. For example, for any integer $u$,
\[(u+1)\times 1- u \times 1 = 1,\]
hence we have an infinite family, namely $(x,y,z,t)=(u+1,1,u,1)$. However, describing all integer solutions to this equation in parametrised form is a much more difficult task. In fact, as Vaserstein proved in 2010, a parametrisation of the solutions requires 46 parameters; due to the complex nature of these polynomials, the solution has never been written down explicitly, although it is possible in theory.

Let us now look at a well-known example, the Pythagoras equation
$$
a^2+b^2=c^2.
$$
In school, you may have been tasked with finding three positive integers that ‘work’ in the above equation. You may instantly think $3^2+4^2=5^2.$ This quickly answers ‘yes’ to the question: does the Pythagoras equation have an integer solution?

A 3-4-5 and a 6-8-10 right-angle triangle

It is easy to see that if we enlarge the triangle by any scale, we obtain a similar triangle. We can apply this same idea to the solution $(a,b,c)=(3,4,5)$: if we multiply this solution by any integer $u$, we obtain another integer solution to the Pythagoras equation, ie $(a,b,c)=(3u,4u,5u)$. We can confirm that this really is a solution:
\begin{align*}
a^2+b^2&=(3u)^2+(4u)^2=9u^2+16u^2 \\ &=25u^2=(5u)^2=c^2.
\end{align*}
By finding the infinite family $(a,b,c)=(3u,4u,5u)$ for all integers $u$, we have now also positively answered the question: ‘Does the equation have infinitely many integer solutions?’ However, describing every Pythagorean triple is not as easy a task. Thankfully it is doable. In fact, we can write all integer solutions to the Pythagoras equation as either
$$
(a,b,c)=(2uvw,u(w^2-v^2),u(w^2+v^2)),
$$
for integers $u,v,w$, or in the same way but with the expressions for $a$ and $b$ swapped.
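One direction of this claim is easy to spot-check on a computer: every choice of $u,v,w$ gives a genuine solution. A minimal Python sketch (checking that the parametrisation covers all solutions is the challenge below):

```python
# Spot check: every (u, v, w) should give a solution of a^2 + b^2 = c^2
# via the parametrisation (a, b, c) = (2uvw, u(w^2 - v^2), u(w^2 + v^2)).
for u in range(-3, 4):
    for v in range(-3, 4):
        for w in range(-3, 4):
            a, b, c = 2 * u * v * w, u * (w * w - v * v), u * (w * w + v * v)
            assert a * a + b * b == c * c
print("all checks passed")
```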

Challenge: Prove that all integer solutions to the Pythagoras equation are covered by the above description.

The cuboid equation

Diophantine equations are not all about Pythagoras though. We could also consider problems involving products and squares. We’ve already considered triangles, so now let’s look at cuboids. For example, we could ask: For what integer dimensions of a cuboid is its volume equal to the square of its diagonal? Mathematically we can write this as:

Find all positive integer solutions $(x,y,z)$ to
$$
x^2+y^2+z^2=xyz.
$$

We will refer to this equation as the cuboid equation.

As we are considering the problem geometrically, $x,y,z$ must all be positive. In fact, if we find all positive integer solutions to the cuboid equation, we can use this to describe all integer solutions to the equation, by taking all combinations of signs that make the right hand side positive. That is, if $(x,y,z)$ is a solution, then $(x,-y,-z)$, $(-x,y,-z)$, and $(-x,-y,z)$ are also solutions.

Where are the solutions?

So, how can we find all positive integer solutions to the cuboid equation?

One strategy could be to try values of $n\geq 0$ in increasing order, set $n=xyz$, and check whether any way of writing $n$ as a product $xyz$ gives a solution to the equation. We can do these searches on a computer, for example using the Reduce function in Mathematica. If we try all values of $n \leq 50$, we obtain the integer solutions $(0,0,0)$ and $(3,3,3)$. This strategy seems inefficient: we have factorised every number up to $50$ and only found two integer solutions.

Instead of searching for factors of $n=xyz$, another method could be to try to find integer solutions to the cuboid equation satisfying $\max(x,y,z)\leq m$. In this case, if we let $m=6$, we obtain the solutions found previously, as well as $(3,3,6),$ $(3,6,3)$ and $(6,3,3)$. This seems like a more efficient way of finding integer solutions to the cuboid equation, as we obtain five solutions when $m=6$, as opposed to the two solutions we found when $n=50$.

An observant reader may notice that the three extra solutions we have obtained are equivalent up to the permutation of $x,y,z$, so you could argue that we have only found one extra solution, which is still better, but it seems less impressive. These permutations are possible because the cuboid equation is symmetric: if we permute the variables in the equation, we obtain the same equation. Therefore to reduce the necessary computation power and increase the efficiency of finding all positive integer solutions up to a bound, we may only consider the solutions with $x \geq y \geq z \geq 0$, and then take all permutations.

We can then try using Mathematica to find all the solutions satisfying $x \geq y \geq z \geq 0$ with $\max(|x|,|y|,|z|)\leq m$. Taking $m=100$ gives $6$ solutions. Increasing $m$ to 1000, we obtain $11$ solutions. Interestingly, increasing $m$ to 10,000 only finds $17$ integer solutions.
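If you would rather experiment without Mathematica, here is a minimal brute-force sketch in Python of the same bounded search; restricting to $x \geq y \geq z \geq 0$ keeps the triple loop manageable for $m=100$:

```python
# Brute-force search for solutions to the cuboid equation
# x^2 + y^2 + z^2 = xyz with x >= y >= z >= 0 and x <= m.
def cuboid_solutions(m):
    sols = []
    for z in range(m + 1):
        for y in range(z, m + 1):
            for x in range(y, m + 1):
                if x * x + y * y + z * z == x * y * z:
                    sols.append((x, y, z))
    return sols

print(cuboid_solutions(100))
# [(0, 0, 0), (3, 3, 3), (6, 3, 3), (15, 6, 3), (39, 15, 3), (87, 15, 6)]
```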

This experimentation suggests that the equation may have infinitely many integer solutions, but that the gaps between consecutive solutions grow significantly. It is clear that a ‘brute force’ style method is inefficient and can never find all of them.
So, we need another strategy.

Looking at the solutions found earlier, we could conjecture that all integer solutions are divisible by $3$. In fact, if $x$ is divisible by $3$, then $y^2+z^2$ must also be divisible by $3$. Perfect squares modulo $3$ are either equal to $0$ or $1$, so for the sum of two squares to be equal to $0$ modulo $3$, they must both be equal to $0$ modulo $3$. This proves that if one of $x,y$ and $z$ is divisible by $3$, then they must all be divisible by $3$. Further analysis modulo $3$ shows that this is the only possibility: if none of $x,y$ and $z$ were divisible by $3$, then the left hand side would be $1+1+1\equiv 0$ modulo $3$, while the right hand side $xyz$ would be nonzero modulo $3$, which is impossible. So when seeking a solution, $x,y$ and $z$ must all be multiples of $3$.

If we make the substitutions $x=3X$, $y=3Y$ and $z=3Z$ for integers $X,Y,Z$, and cancel $9$, we obtain the equation
$$
X^2+Y^2+Z^2=3XYZ.
$$
This is a famous equation, known as the Markov equation, first studied by Andrey Markov in around 1880. We can prove that for every integer solution to the Markov equation, the solutions obtained by the transformations
\begin{equation}\label{markov:these_three}\left.\begin{split}
(X,Y,Z) &\to (3YZ-X,Y,Z), \\
(X,Y,Z) &\to (X,3XZ-Y,Z), \\
(X,Y,Z) &\to (X,Y,3XY-Z),
\end{split}\hspace{6mm} \right\} \hspace{-10mm} \tag{1}
\end{equation}
are also integer solutions. We will see where these transformations come from later. For now, we can show that this is true by substituting these transformed solutions into the Markov equation. For example, for $(3YZ-X,Y,Z)$, we see that
\begin{align*}
\text{LHS}
&=(3YZ-X)^2+Y^2+Z^2 \\
&=9Y^2Z^2-6XYZ + X^2+Y^2+Z^2\\
&=9Y^2Z^2-6XYZ + 3XYZ\\
&=3(3YZ-X)YZ\\
&=\text{RHS}.
\end{align*}
Hence, if we find one non-trivial integer solution—ie a solution $(X,Y,Z)\neq (0,0,0)$—then we can make the three transformations and obtain new integer solutions. Similar analysis can be applied to the new $Y$ and $Z$ values, and we can show that these are genuinely new solutions, not ones we have found before.

You may have noticed that $(X,Y,Z)=(1,1,1)$ is a non-trivial integer solution to the Markov equation. In fact, excluding $(0,0,0)$, we can obtain all integer solutions to this equation from $(X,Y,Z)=(1,1,1)$ by applying the transformation $(X,Y,Z) \to (3YZ-X,Y,Z)$, permuting the variables, and swapping between positive and negative numbers. Therefore, the equation has infinitely many integer solutions, which implies that the cuboid equation also has infinitely many integer solutions.
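A short Python sketch makes this generating process concrete: starting from $(1,1,1)$ and repeatedly applying the three jumps in \eqref{markov:these_three}, we can list every positive Markov triple up to a chosen bound (triples are stored sorted, so permutations are counted once):

```python
# Generating positive solutions to X^2 + Y^2 + Z^2 = 3XYZ from (1, 1, 1)
# via the three Vieta jumps; the bound just limits the search.
def markov_triples(bound):
    seen = set()
    stack = [(1, 1, 1)]
    while stack:
        x, y, z = stack.pop()
        triple = tuple(sorted((x, y, z)))
        if triple in seen or max(triple) > bound:
            continue
        seen.add(triple)
        stack.append((3 * y * z - x, y, z))
        stack.append((x, 3 * x * z - y, z))
        stack.append((x, y, 3 * x * y - z))
    return sorted(seen, key=max)

print(markov_triples(300))
# [(1, 1, 1), (1, 1, 2), (1, 2, 5), (1, 5, 13), (2, 5, 29), (1, 13, 34),
#  (1, 34, 89), (2, 29, 169), (5, 13, 194), (1, 89, 233)]
```

Multiplying each of these triples by $3$ gives positive integer solutions to the cuboid equation, matching the bounded search earlier.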

Now, we just need to figure out how to describe them all.

Happy descriptions

Describing all integer solutions to equations can be done in many different ways. My personal preference is to list one solution together with the transformations needed to obtain all other solutions to the equation. Some people may not like this method of describing solutions, and may instead prefer to present them differently, such as by using a recursive description. It is a matter of taste how solutions are presented and whether you think a description is ‘acceptable’. If you aren’t keen on the descriptions of solutions presented in this article, you are more than welcome to describe them in an alternative manner.

Let’s now return to the Markov equation. First, we will determine the necessary transformations to find every ‘similar’ solution: we can change the signs of a solution as long as the right hand side is positive, just like we did for the cuboid equation, and we can permute the variables. We can summarise these transformations more concisely as
\begin{equation}\label{markov:this_one}\left.\begin{split}
(X,Y,Z) &\to (-X,-Y,Z), \\
(X,Y,Z) &\to (Y,X,Z), \\
(X,Y,Z) &\to (X,Z,Y).
\end{split}\hspace{6mm} \right\} \hspace{-10mm} \tag{2}
\end{equation}
Note that applying any of these transformations twice returns us to the solution we started with. However, by applying these transformations in different orders, we can obtain all the ‘similar’ solutions to $(X,Y,Z)$.

It can be proven that all non-trivial integer solutions to the Markov equation can be obtained by a sequence of the three transformations in \eqref{markov:this_one} and
$$
(X,Y,Z) \to (3YZ-X,Y,Z)
$$
to the solution $(X,Y,Z)=(1,1,1)$.

If we want to solve the cuboid equation, then we can either take every integer solution to the Markov equation and multiply it by $3$; or we can describe all integer solutions to the cuboid equation from the solution $(x,y,z)=(3,3,3)$ and apply a sequence of the transformations in \eqref{markov:this_one} and $$(x,y,z) \to (yz-x,y,z).$$

This method of finding solutions can be drawn as a tree, as shown below. The tree diagram shows the rapid growth of integer solutions. Here, we only show integer solutions satisfying $x \geq 0$, $y \geq 0$ and $ z \geq 0$ to the cuboid equation: if we wanted to show all the ‘similar’ solutions too, we would need much more space!


The positive integer solutions to the cuboid equation

The method we have used to describe all integer solutions to the Markov equation
is called Vieta jumping and it is very powerful.

Vieta jumping

If we start with a one-variable quadratic equation $a x^2 + b x + c = 0$ with two solutions, say $x_1$ and $x_2$, then $x_1 + x_2 = -b/a$. So, for any given equation, if we find one solution $x_1$, we can immediately obtain the other: $x_2 = -b/a - x_1$.

Let’s see this on a two-variable equation,
\[x^2 + x y + y^2 = 1.\]
We can easily see that $(x,y)=(1,0)$ is a solution. If we substitute $x=1$ into the equation, we obtain $1+y+y^2=1$, which is a quadratic equation in $y$ with the solution $y=0$. Applying Vieta jumping to it, it must also have the solution $y=-b/a-0=-1/1-0=-1$, and we obtain $(x,y)=(1,-1)$, which is a solution to the original two-variable equation. If we do the same trick with the variables $x$ and $y$ swapped, we find another solution $(x,y)=(0,-1)$ to our two-variable equation.

So, taking any integer solution $(x,y)=(x_0,y_0)$, we can substitute $y=y_0$ into the equation and obtain an equation that is quadratic in $x$, which has a solution $x=x_0$. Then, by Vieta jumping, it must also have a solution $x=-y_0-x_0$.
Similarly, by substituting $x=x_0$ in the equation and doing Vieta jumping in $y$, we can conclude that $(x,y)=(x_0,-x_0-y_0)$ is also a solution. By starting with any integer solution to the equation, we can obtain a chain of solutions (for this particular equation, the chain eventually cycles back on itself, since the equation describes an ellipse and so has only finitely many integer solutions).
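Here is a tiny Python sketch of this chain for the equation above, alternating jumps in $x$ and in $y$ starting from $(1,0)$:

```python
# Vieta jumping on x^2 + x*y + y^2 = 1, starting from the solution (1, 0).
# Jump in x: keep y fixed and replace x by the quadratic's other root, -y - x.
# Jump in y: keep x fixed and replace y by -x - y. We alternate the two.
x, y = 1, 0
for step in range(6):
    if step % 2 == 0:
        x = -y - x
    else:
        y = -x - y
    assert x * x + x * y + y * y == 1
    print((x, y))
# (-1, 0), (-1, 1), (0, 1), (0, -1), (1, -1), (1, 0): back where we started,
# having visited all six integer solutions.
```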

The Markov equation can be considered as a quadratic in any of the variables. If we apply this trick to the equation separately for each of the variables, we have three jumps, and these are the transformations in \eqref{markov:these_three}.

While in this article we have used the Vieta jumping method to describe all infinitely many integer solutions to equations, for some equations, we can also use this method to show that the equation has finitely many or no integer solutions!


Significant figures: Sofya Kovalevskaya

As Chalkdust readers, we know of more mathematicians than the average person, but if someone were to say “name a female mathematician”, the first answer would still usually be Ada Lovelace. This is why I want to talk about someone you might not have heard of who has opened many doors for women in academia: Sofya Kovalevskaya.

Sofya Vasilyevna Kovalevskaya was born in Moscow on 15 January 1850. Her father, a member of the minor nobility, served in the Imperial Russian army before retiring when she was eight years old. She had scientists and mathematicians on both sides of her family, and because her family was relatively wealthy, she had a better education than most, being tutored in science, maths and languages. At age 11, she had pages of old lecture notes on differential and integral analysis from her father’s student days on her bedroom walls. She wrote in her autobiography that these “acted on my imagination, instilling in me a reverence for mathematics as an exalted and mysterious science which opens up to its initiates a new world of wonders, inaccessible to ordinary mortals.”

When she was 16, Nicholas Tyrtov—a neighbour who was a physics professor—accidentally left one of his textbooks at her home. When he came back to pick it up, she explained to him how she understood some of the trigonometric functions in the book, none of which she had seen in her studies before. This led him to convince her father to allow her to study further in mathematics. Hence, the family spent their winters in St Petersburg so that she could be tutored by AN Strannoliubsky, who was considered the best Russian mathematics teacher of the time and a well-known supporter of higher education for women. His time with Sofya just solidified his beliefs.

Unfortunately, after finishing her studies with Strannoliubsky, she encountered a big problem: women weren’t allowed to attend university in Russia. Studying abroad wasn’t an option either, as women couldn’t live outside their family home without written permission from their father or husband, and her father wouldn’t give her this. Meanwhile, her younger sister Anna wanted to leave for Europe as well to study writing, as she had been exchanging letters with Fyodor Dostoevsky, who had published one of her stories in his literary journal. So, Sofya and her sister came up with a plan: when she was 18, Sofya contracted a marriage with Vladimir Kovalevskij, a young palaeontology student and publisher who was the first to translate Darwin’s work into Russian. A year later they left Russia, with Anna allowed to leave under the guardianship of her now-married sister, Sofya acting as her chaperone. Once out of Russia, they parted ways, with Sofya and Vladimir first going to Vienna, where Sofya was told she could attend physics lectures but not mathematics. Soon, Sofya moved to Heidelberg to study mathematics and natural sciences, only to discover that women couldn’t matriculate at the University of Heidelberg. After much persuasion, the university allowed her to attend lectures provided she could get permission from each lecturer. Here, she was able to attend physics lectures with Gustav Kirchhoff and mathematics lectures with Paul du Bois-Reymond and Leo Königsberger. Her resilience set a precedent, and another female friend of hers was also able to attend these lectures.

A year later, she moved to Berlin hoping to become a student of Weierstrass, who was then considered one of the most noted mathematicians in the world, even though the University of Berlin wouldn’t admit her. She decided to appeal to Weierstrass personally. Although he was against the idea of women studying at university, after recommendation letters from Königsberger (a former student of his) and Du Bois-Reymond he decided to give her tests to assess her abilities. He was so impressed by her mathematical skills that he took her on as a private student since, unlike Heidelberg, the University of Berlin didn’t allow her to attend classes even unofficially. He taught her the exact lectures he gave at the university, but he also discussed his latest work and theories with her.

While Weierstrass in general didn’t believe that married women needed university degrees or careers, Sofya convinced him to help her work towards a doctoral degree. To achieve this Sofya wrote three papers, one on partial differential equations (PDEs), one on elliptic integrals and one on the dynamics of Saturn’s rings.


The introduction to Sofya’s paper on PDEs

Her paper on PDEs came to be her most important work, as it included what is known today as the Cauchy–Kovalevskaya theorem. Cauchy had already proven that, for certain types of first order PDE, there exist unique analytical solutions. Sofya’s achievement was to generalise his result to higher order PDEs.

In her third paper on the dynamics of Saturn’s rings, she used the assumptions of the time that Saturn’s rings were made of a continuous liquid to prove that the rings were egg-shaped ovals, symmetric about a single point. She wasn’t inclined to do too much precise calculation, as she believed that new research would disprove the assumptions she was working with. Indeed, it was shown later that Saturn’s rings were made of discrete particles and not a continuous liquid.


Saturn and its rings, which are not a liquid

In 1874 and due to Weierstrass’s efforts, she was granted a doctorate from the University of Göttingen in absentia without an oral defence, thus becoming the first woman to be awarded a doctorate in mathematics.

After this, she and her husband Vladimir returned to Russia, where she wanted to teach mathematics but found she wasn’t allowed to attain the certificate she needed for this. She helped her old tutor Strannoliubsky to set up higher education courses in St Petersburg, but wasn’t allowed to teach these either. This, together with her rejection by most of the mathematical community in Russia, led to her taking a mathematical hiatus. A few years later, the hiatus ended when she was asked to present a paper at the Congress of Russian Naturalists and Physicians. One of those attending her talk was Gösta Mittag-Leffler, one of Weierstrass’s old students. Impressed by her talk and by the high regard Weierstrass had for her, he offered to help her find a teaching position in Europe.

During their time in Russia, she and her husband had decided to live as an actual married couple and have children. Sadly, after the birth of their daughter, they started growing apart and ran into financial difficulties. They separated, and Vladimir later took his own life while facing arrest for financial crimes.

In 1883, Sofya was finally able to find a position at Stockholm University with the help of Gösta Mittag-Leffler, who had become the head of mathematics there. She was offered a temporary position as a sub-professor without an official affiliation to the university, with her payment coming via private arrangements with her students rather than as a salary. She hoped that by accepting this position she would become a role model for other women hoping to be part of academia. Her appointment even made front page news.

A year later, despite the prejudice against her gender, nationality and political beliefs, she was given a five-year position as an assistant professor and became an editor of the scientific journal Acta Mathematica, making her the first woman on the board of a major scientific journal. It was during this time that she won the Prix Bordin of the French Academy of Sciences, the academy’s second most prestigious award. Her submission for the prize was her discovery of the Kovalevskaya top. Spinning tops are rigid bodies that are fixed at one point and otherwise free to move and rotate under the influence of gravity. Before Kovalevskaya, two types of top were known for which the equations that describe their motion could be solved analytically; Sofya discovered a third such type of top.


An example Kovalevskaya top, built using three spheres and a ring. If this top’s position is fixed at the orange point but is allowed to rotate, the equations that describe its motion can be solved analytically thanks to Sofya.

Her winning the prize and the high acclaim of her work meant that the university didn’t want to lose her and therefore she was offered a permanent position. This meant that in 1889, Sofya became the first woman to be appointed to a full professorship in modern times. Just two years later, aged 41 and at the height of her academic career, she passed away due to pneumonia.

Despite her recognition in Sweden and France, Sofya Kovalevskaya never won her dream professorship in Russia and her accomplishments weren’t valued there until after her death. Her influence on maths and the precedents she set for women in academia will never be forgotten.


A daredevil’s disaster

The loop-the-loop is by now a standard stunt—a vehicle completes a vertical circle without losing contact with the track in a seemingly gravity-defying feat. One of the earliest mentions of the stunt is in the early 1900s when the American cyclist Conn Baker, stage name Diavolo (pictured above in action), travelled the country performing the feat on a track about 20 feet in diameter. His crowds, naturally, feared for his life.

The anxiety is understandable: most people grasp intuitively that if the speed is too low, the vehicle will come off the track at some point beyond where the track becomes vertical. But have you ever wondered about the maths behind it all? What, exactly, would happen to a dawdling Diavolo?

In order to understand how this trick works, we will need to understand something about the mechanics of the motion. In what follows we will make some modelling assumptions:

  1. That the track is circular.
  2. That there are no forces directed along the track, such as driving forces (ie Diavolo isn’t pedalling during the feat), or forces of resistance, such as friction.
  3. That the vehicle is point-like.

Under these simplifying assumptions we can make progress using Newton’s laws. These laws are all about the concept of force. His first law says that, in the absence of a net force, an object will carry on moving in the same direction and at the same speed (which might be zero) for ever. His second law, $F=ma$, tells us that an object’s acceleration $a$ (the rate of change of its speed and direction) is determined by the forces acting upon it. His third law introduces the concept of a reaction force. This force is present between any two objects in contact. For example, when you stand barefoot on the ground, your weight (the force of gravity acting on your body) is exactly balanced by the reaction forces between your feet and the ground. Newton’s first law then says that you will stay in that state indefinitely—or until a new force is applied to your body.
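To get a taste of the kind of conclusion these laws lead to, consider the top of the loop, where both gravity and the reaction force $N$ point towards the centre of the circle, and together must supply the centripetal acceleration $v^2/r$:
\[
mg + N = \frac{mv^2}{r}, \qquad\text{so}\qquad N = m\left(\frac{v^2}{r}-g\right).
\]
The track can push but not pull, so staying in contact requires $N\geq 0$, ie $v\geq\sqrt{gr}$ at the top. For Diavolo’s 20-foot loop, with a radius of roughly $3$ metres, that is a speed of around $5.5$ metres per second.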


We don’t talk about the integers

Looking for structure comes naturally for human beings. Gazing over a landscape and identifying patterns; reading poems and feeling where the beat lies; staring at paintings and figuring out the stories being told. Structure both guides us and reassures us.

Structure lies at the core of mathematics. We look for abstract patterns and scaffolding in all sorts of mathematical phenomena. Counting is no exception in this hunt. When we are in elementary school, we learn rules for computing, for example, $3 + 10$, or $10 - 4$, or $1284 \times 12$. We learn that if we want to compute $(3+10)\times 4$, we can just compute $3 \times 4 + 10 \times 4$. Later, in maths at university, these kinds of properties are baked into the abstract definition of a ring. A ring is a set $R$ together with two operations: $+$ and $\times$, with special properties (which we’ll get to later) and two distinguished elements: $0$ and $1$, which are the identities of these operations. A simple example of a ring is the integers $\mathbb{Z}$, where $+$ and $\times$ are precisely the elementary school operations we’ve known for years.

To the eyes of a certain type of mathematician, however, the integers are not the wholesome and friendly object that they might seem at first.

Are we secretly linguists?

Model theorists, who live and work in the realms of mathematical logic, see mathematics as a language to be understood through the lenses of mathematics itself. While this might seem strange at first, it is not any stranger than linguists analysing English through the lenses of English itself. The bread and butter of mathematical research is proofs, which can be modelled mathematically in the same way that a fluid flowing can be modelled, the flight of a bird can be modelled, the magnetic field emanating from my Christmas lights can be modelled.


The bread and butter of mathematical research is proofs, so what is the bread and butter pudding? Image: Wikimedia Commons user codepo8, CC BY 2.0

When we say that a theorem is true, the model theorists argue, what we are really doing is playing a game. We have some rules—which mathematicians call ‘axioms’—and we reason to deduce which moves are allowed. If you have ever played Dungeons & Dragons, you may have realised that the rulebooks don’t cover all possibilities. Some reasoning is left to the players’ deduction (often wreaking havoc on friendships and destroying years-long relationships, not unlike mathematical research). Starting from the rules that are written down (the axioms), those who approach the game can try and argue that certain moves (the theorems) are possible and allowed.

What does linguistics, and logic by extension, have to do with this? Any language needs an alphabet: a set of symbols with which we can write down our phrases and statements. If we want to talk about apples, we better have the letters necessary to build the word apple in our alphabet. So, what do we want to talk about in mathematics?

Let’s consider rings again. We have already seen the ring of integers $\mathbb{Z}$, but one can also be a bit more creative, and come up with all sorts of rings. For example, $2 \times 2$ matrices with real entries form a ring $M_2(\mathbb{R})$, which is radically different from $\mathbb{Z}$: the multiplication of matrices is not commutative! For example,
\begin{align*}
\begin{pmatrix}
2 & 1 \\
1 & 1
\end{pmatrix}
\times
\begin{pmatrix}
1 & 2 \\
1 & 1
\end{pmatrix}
=
\begin{pmatrix}
3 & 5 \\
2 & 3
\end{pmatrix},
\\
\begin{pmatrix}
1 & 2 \\
1 & 1 \\
\end{pmatrix}
\times
\begin{pmatrix}
2 & 1 \\
1 & 1 \\
\end{pmatrix}
=
\begin{pmatrix}
4 & 3 \\
3 & 2 \\
\end{pmatrix}.
\end{align*}
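These two products are easy to check on a computer; a quick NumPy sketch:

```python
# Verifying the non-commutativity example above with NumPy.
import numpy as np

A = np.array([[2, 1], [1, 1]])
B = np.array([[1, 2], [1, 1]])
print(A @ B)  # [[3 5], [2 3]]
print(B @ A)  # [[4 3], [3 2]]
```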
If we want to understand rings like model theorists do, then, we need to have symbols to talk about its operations and identities: a symbol $+$ for addition, a symbol $\times$ for multiplication, a symbol $0$ and a symbol $1$. To express mathematical statements, we will need to expand our alphabet a little bit. Model theorists speak a language called first-order logic, where alphabets are assumed to also contain variables ($x,y,z,\dots$), connectives ($\land$, which represents and; $\lor$, which represents or; $\Rightarrow$, which represents then; $\neg$, which represents not), and quantifiers ($\forall$, which represents for all; $\exists$, which represents there exists) and a symbol for equality, $=$.

We now have a rather large alphabet. Let’s try to express a mathematical fact. Suppose we want to state that multiplication $\times$ is a commutative operation. What this means is that whenever I pick two elements $a$ and $b$, $a\times b$ is the same as $b\times a$. In first-order logic, this looks like this:
\[
\forall a \,\, \forall b \, (a\times b = b\times a).
\]
As another example, suppose we want to state that all polynomials of degree $2$ have a root. This means that no matter how we choose coefficients $a, b, c$, there is a root of the polynomial $ax^2+bx+c$. In first-order logic, this looks like this:
\[
\forall a \,\, \forall b \,\, \forall c \,\, \exists x \, (ax^2+bx+c=0).
\]
First-order logic, together with the language for rings that we have chosen above, allows us to express many interesting properties of rings. Certain properties will be true in certain rings, and not in others: for example, the operation $\times$ is going to be commutative in the integers $\mathbb Z$, but not in $M_2(\mathbb R)$. We can collect all the statements that are true in a certain ring $R$ in the theory of that ring, denoted $\operatorname{Th}(R)$. The sentence $\forall x \,\, \forall y \, (x\times y = y\times x)$ is an element of the set $\operatorname{Th}(\mathbb Z)$, but not of the set $\operatorname{Th}(M_2(\mathbb R))$.

While first-order statements can capture many of the important facts about a mathematical object, in this case a ring, they do not typically capture everything. We think of two rings $(R_1,+_1,\times_1)$ and $(R_2,+_2,\times_2)$ as being the same if they are isomorphic, in other words if there exists a bijection $f\colon R_1 \to R_2$ which behaves well with operations, ie

  1. $\forall x, y \in R_1$, $f(x+_1y) = f(x)+_2f(y)$,
  2. $\forall x,y \in R_1$, $f(x\times_1 y) = f(x)\times_2 f(y)$.

Sometimes rings with the same theory are isomorphic: this is the case if we look at the complex numbers $\mathbb{C}$ and their theory $\operatorname{Th}(\mathbb{C})$. If another ring has the same theory (and the same cardinality, since we need a bijection), then it is isomorphic to $\mathbb{C}$. In a way, the theory knows everything there is to know. This does not hold true for all theories across different kinds of mathematical object. For example, there are fields that share the same theory as $\mathbb R$, but are not isomorphic to it, since they admit infinite elements (namely, the so-called Robinson hyperreals).

Robinson hyperreals

Hyperreal numbers are a rigorous formulation of infinitesimal numbers. The set of hyperreals is denoted by $\mathbb{R}^*$: it extends the real numbers with infinitesimal and infinite elements. A number $\varepsilon \in \mathbb{R}^*$ is infinitesimal if it is smaller than every positive real number and larger than every negative real number. An infinite number is any element of $\mathbb{R}^*$ that is either greater than every real number or less than every real number. Robinson showed that the hyperreals could be rigorously defined using model theory, and they are the foundation of non-standard analysis.

There are many reasons why $\operatorname{Th}(\mathbb{R})$ and $\operatorname{Th}(\mathbb{C})$ are very different in their behaviour, but there is a deep underlying one: $\mathbb R$ can define an ordering (ie, an element is non-negative if and only if it is a square), while $\mathbb C$ cannot. In this sense, $\mathbb C$ is less ‘complicated’, as it exhibits fewer structures and patterns than $\mathbb R$.

The upshot is: the more complicated a ring $R$ is, the less its theory $\operatorname{Th}(R)$ will know about it. This leads us to a situation where rings with the same theory might not be isomorphic. Especially if the rings are complicated, containing many patterns and structures.

The hunt for forbidden patterns

Enter Shelah. Saharon Shelah was born in 1945, and graduated from Tel Aviv University with a maths degree in 1964. He was awarded his PhD at the Hebrew University of Jerusalem in 1969, for research focusing on stable theories.


Saharon Shelah in front of what might be his favourite office paper. Image: Andrzej Roslanowski, CC BY-SA 2.5

The story of modern model theory begins in the 1960s, when Shelah proposed a method to map out theories based on how complicated they are, building a tentative map of the universe of mathematical theories.

In the case of rings, we think of $\mathbb{C}$ as being less complicated than $\mathbb{R}$, since the theory of $\mathbb{C}$ describes it very well (even up to isomorphism), whereas the theory of $\mathbb{R}$ does not.

Among all rings, some more complicated, some less, there is one that sits at the borders of this mapping, model theorists’ own hic sunt leones: the ring of integers $\mathbb Z$.

Long-time fans of true crime know that, whenever some deeply unsettling truth about somebody is unearthed, there will always be some neighbour commenting “Oh, but they were so kind… so polite.” No matter if the person in question has committed several gruesome murders, there is a high chance that those who knew them on a daily basis will say that they would have never expected it. While the community might disagree on which objects are easy and which ones aren’t, there is a structure that nobody would expect to create trouble: $\mathbb{Z}$, the ring of integers. After all, Kronecker wrote “God created the integers.” Model theorists, however, are able to see beyond the facade of this seemingly harmless mathematical structure. To understand why, we have to go back some 40 years before Shelah’s work.

The breaking point of the idyllic picture of the integers is a procedure known as Gödelisation. The representation was introduced, as the name suggests, by Kurt Gödel in an effort to prove his famous incompleteness theorems. If you think back to the games metaphor, Gödel wanted to argue that if a game was complicated enough, then there would always be a move that is neither allowed nor prohibited by the rules of the game.

A set of axioms is considered ‘effective’ if it can be listed by an algorithm. In the 1930s, Gödel proved that if an effective set of axioms was able to perform the arithmetic of the integers $\mathbb{Z}$, then it would not be able to prove or disprove all possible statements about $\mathbb{Z}$. By ‘able to perform the arithmetic’, we mean that it must be able to reconstruct within itself the operations $+$ and $\times$ of $\mathbb{Z}$. While his result is impressive in itself, our focus today is rather on his technique. To prove his theorems, he created a translation procedure that transformed first-order statements in the language of rings (like $2=2$, or $n+1 = m+1 \Rightarrow n = m$, or ‘every polynomial of degree $2$ has a root’) into (very big) natural numbers. For a given first-order statement, this big number is known as its Gödel number. For instance, the first-order statement $x = y \Rightarrow y = x$ is encoded with the natural number $120061121032061062032121061120$.
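The number quoted above is consistent with a particularly simple encoding: write the statement as the plain string x=y => y=x, replace each character by its three-digit ASCII code, and concatenate. A minimal Python sketch of this toy Gödelisation:

```python
# A toy Gödel numbering: encode a statement by concatenating the
# three-digit ASCII code of each of its characters into one big number.
def godel_number(statement: str) -> int:
    return int("".join(f"{ord(ch):03d}" for ch in statement))

print(godel_number("x=y => y=x"))
# 120061121032061062032121061120  (the number quoted above)
```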

The encoding was done in a way such that whether a statement was true or not (whether there was a proof of it or not) was a matter of the elementary arithmetic properties of its Gödel number. This means that the truth of a certain statement can be checked by verifying arithmetical identities—just like we used to do in elementary school. Think of questions like ‘is $5 \times 3 = 15$?’, or ‘is $1289$ divisible by $29$?’. This seems easy, perhaps; but try it with dozens and dozens of digits, and soon enough you’ll either grow tired or ask a computer to do it for you (or both).


Shelah’s thesis is that it is worthwhile to classify theories. In doing so, mathematicians suggest dividing lines between different theories based on good test problems, and hence are able to place ‘similar’ theories in shared regions, creating a map. Each dot is a theory. (Interactive map on forkinganddividing.com)

Gödelisation allows us to encode entire mathematical objects and theories inside the integers. Complicated theorems, emerging from all over mathematics, reduce to simply checking whether two integers divide each other—a procedure that a computer can do, given enough time. The mathematical fact that the polynomial $X^7+X^2+3$ has a solution in $\mathbb R$ can be translated into an equality between (very big) integers, which we can compute in $\mathbb Z$.

This is a remarkable fact. Many mathematical statements could be checked by a computer performing basic arithmetic, only with numbers which have billions and billions of digits. It is also a massive problem for the theory of $\mathbb{Z}$. Gödel’s encoding means that the seemingly elementary operations of $\mathbb Z$ can understand patterns which might very well be infinitely complex, even beyond our imagination and comprehension. No matter how hard we try, $\mathbb{Z}$ will always be too complicated for us to understand with first-order logic.

There is a fine line between complex enough to be interesting and too complex to be dealt with. The integers sport an impossible, almost cosmic level of expressivity. To many mathematicians, $\mathbb Z$ is the simplest object we can think of; they are, after all, not too far from just putting one pebble after the other. They provide the base to many of our grandiose castles of ideas and theories, the foundations of many explorations into unknown mathematical lands. To model theorists, however, they reveal a different face. A twisted expression, a creepy smile. The source of all darkness, the original Pandora’s box. We don’t talk about them, hoping that in our cautious adventures we will never find ourselves alone with the ring of integers in a dimly lit alley.


It’s a particle… It’s a wave… No, it’s a soliton!

On a bright August day in 1834, John Scott Russell, a Scottish engineer, became acquainted with a wave which would forever be associated with him within the field of fluid dynamics and beyond. He was following a boat which only just fit into the canal it was travelling down.

Suddenly the boat stopped. From the bow of the ship, a lump of water swelled and burst forward. On horseback, Scott Russell continued on, and became increasingly intrigued as the lump seemed to travel without changing shape, sustaining itself down the channel.

In this way they travelled, man and wave, at a pace of about nine miles per hour. They travelled for almost two miles before the wave disappeared among the twists and turns of the canal. This wave, standing about 40cm in height and extending about nine metres along the canal, never left Scott Russell’s thoughts. In September 1844 he gave a report on his scientific investigations into what we now call the solitary wave.

Just smile and wave boys, smile and wave…

The solitary wave is a surface wave: one that occurs at the fluid–air interface. Rudimentary surface wave theory, and usual intuition, predicts that surface waves disperse, rather than keeping their form as the solitary wave did.

Think of the last time you splashed around in a swimming pool. If you put your hands at the water level and push, a lump of water would roll off your hands. But unlike the solitary wave, it would disperse into a bunch of ripples rather than retaining its shape, stopping you from launching long-ranged splash attacks.

Mathematically, dispersion describes how the velocity of a sinusoidal wave depends on its wavelength. For simplicity, consider a fluid interface where the surface height only varies in one direction, say the $x$-direction. Typically water waves aren’t perfectly sinusoidal, but we can build up any surface profile from sine waves. The system then evolves by the sine waves of varying wavelengths travelling independently at the velocity prescribed by the dispersion relation.
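We can watch this happen numerically. The sketch below builds a lump out of sine waves and advances each one at the speed set by a dispersion relation; the deep-water relation $\omega=\sqrt{g|k|}$ used here is an assumed standard example, not something fixed by the text. The peak flattens as the constituent waves drift apart:

```python
# A lump of water evolving under the dispersion relation w(k) = sqrt(g|k|):
# each Fourier mode travels at its own phase speed, so the lump spreads out.
import numpy as np

g = 9.8
x = np.linspace(-50, 50, 1024)
k = 2 * np.pi * np.fft.fftfreq(x.size, d=x[1] - x[0])
omega = np.sqrt(g * np.abs(k))
eta0_hat = np.fft.fft(np.exp(-x**2))  # initial lump, built from sine waves

for t in [0.0, 2.0, 4.0, 8.0]:
    eta = np.fft.ifft(eta0_hat * np.exp(-1j * omega * t)).real
    print(f"t = {t:3.1f}  peak height = {eta.max():.3f}")  # the peak decays
```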

A brass plaque saying "Near this spot JOHN SCOTT RUSSELL discovered the SOLITARY WAVE in August 1834"

Scott Russell’s discovery commemorated on the canal. Image: Jim Barton CC BY-SA 2.0


Scott Russell smiling but not waving

The times, and space, they are a-changin’

Dynamical systems where the time evolution depends on spatial variation are described by partial differential equations. These are equations involving both time derivatives and spatial derivatives. When there are multiple variables in play, a derivative with respect to only one of the variables is known as a partial derivative. For example, $\partial u/\partial x$, also written $u_{x}$, denotes the spatial derivative of the function $u(x,t)$.

Many physical phenomena are described by such equations, such as diffusion (by the diffusion equation), motion in fluids (by the Navier–Stokes equations), and gravity in the setting of general relativity (by Einstein’s field equations).

Linear PDEs, where linear means that the unknown function and its derivatives are never multiplied together, tend to be easier to solve, as you can add two known solutions to get another. Complicated starting configurations can be built out of the sum of many simpler solutions, like when we built an arbitrary wave profile from many sine waves.

On the other hand, nonlinear PDEs cannot be solved with this strategy, and are solved mostly on an ad-hoc basis. Solitons, as they were historically understood, are particular solutions to some special nonlinear PDEs.

KdV solitons

The KdV equation is
\[u_t-6uu_x + u_{xxx} = 0.\]
If we look for solutions which are travelling waves, that is to say we try the form $u(x,t) = f(x-ct)$, we can find a family of lump solutions, which are close to zero outside a small region and travel at speed $c$.

These are the one-soliton solutions
\[u(x,t) = -\frac{c}{2}\operatorname{sech}^2
\left[\frac{\sqrt{c}}{2}(x-ct)\right],\]
one of which is plotted below.
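If you would rather not grind through the derivatives by hand, a short SymPy sketch can confirm that this profile satisfies the KdV equation:

```python
# Symbolic check that u = -(c/2) sech^2(sqrt(c)/2 (x - ct)) solves
# u_t - 6 u u_x + u_xxx = 0.
import sympy as sp

x, t = sp.symbols("x t")
c = sp.symbols("c", positive=True)
u = -c / 2 * sp.sech(sp.sqrt(c) / 2 * (x - c * t)) ** 2
kdv = sp.diff(u, t) - 6 * u * sp.diff(u, x) + sp.diff(u, x, 3)
print(sp.simplify(kdv.rewrite(sp.exp)))  # prints 0
```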

The more particle-like qualities of solitons can be seen more clearly when there are multiple solitons, most simply illustrated with the two-soliton solution. At early and late times it looks like a superposition of one-solitons.

There are also non-soliton solutions, such as the periodic and difficult to pronounce cnoidal wave solution, which contains the Jacobi elliptic function $\operatorname{cn}$. This is like a ‘lattice’ of one-solitons placed next to each other.

For a wave to keep its shape, all the constituent sine waves must travel at the same velocity. In other words, there cannot be dispersion. This holds for many waves in nature: red light and blue light differ in wavelength, but travel at the same speed. The same is true of sound at different pitches. But it’s not so for typical surface waves, where waves with longer wavelength travel faster than those with shorter wavelength. Any lump of water built from many sine waves quickly falls apart.

What saves the solitary wave is that dispersion is delicately balanced by a nonlinear effect. This is neglected in rudimentary surface wave theory, which is a linearised theory. Linearity is what allowed us to think of the wave profile as being the sum of many independent sine waves, and the nonlinearity causes the sine waves to interact in potentially complicated ways.

Linearising is valid when the variations in height are small compared to the depth of the water. Our hands are small, and the lumps of water we can make aren’t big enough to bump us into the regime where the nonlinear effect becomes important. But boats are quite big, and can make swells of water large enough (relative to the depth of the canal) for nonlinearity to come into play.

The equation describing water waves which includes the nonlinear effect is known as the KdV equation, named after the Dutch duo Korteweg and De Vries. They weren’t the first to derive the equation but they found a solution that travelled with constant speed: the one-soliton solution to KdV.

The KdV one-soliton

Scott Russell’s solitary wave had been vindicated mathematically. He was sure the self-sustaining wave was hugely important, but it hadn’t caught the attention or imagination of many of his contemporaries. It wasn’t until the next century that the study of solitons really took off.

A two soliton solution to the KdV equation evolving in time. It starts off with two separated lumps, which combine, then pass through each other and continue on unchanged.

Solitons

The next chapter of our soliton story picks up at the Los Alamos national laboratory in 1953. Four physicists (Fermi, Pasta, Ulam and Tsingou) were puzzling over what they were seeing: an animation simulating a vibrating string with nonlinear terms in its dynamics. The simulation was programmed on the Maniac computer (great name) by the most computer-savvy among them, Mary Tsingou.

They were interested in a hypothesis—called the ergodic hypothesis—that, roughly speaking, for systems which were mostly linear but perturbed by a nonlinear term, the initial energy would eventually (potentially after a long, long time) be evenly distributed over different degrees-of-freedom of the system. For their string, this would mean the amplitudes of all different frequencies would eventually be comparable.

The Maniac had also only recently been finished. With the Maniac at their door, the physicists tried what few could do before: study a problem computationally. They were looking in particular for a problem that would be easy to formulate, but which was intractable by hand or even by mechanical computer.

In order to simulate the string on a computer, it had to be modelled by a discrete, finite number of points: in their case, 64 points. The simplest model for the force between neighbouring points is Hooke’s law, $F=k\delta$, where $\delta$ is the strain between them. This is linear in the strain, and assuming that the strains are small, the next force to consider would be a quadratic one, and this was precisely the nonlinear dynamics considered, with neighbouring forces $F = k(\delta + \alpha\delta^2)$. They started the simulation with only an excitation in the fundamental frequency of the string. Early on, they saw the behaviour they expected, with successively higher frequencies of the string starting to receive small excitations. To their surprise, only one of the frequencies would have large excitations at any one time, starting from the fundamental frequency, then passing to the second mode, and then the third. This only continued among the first few modes, then the large excitation returned to the fundamental frequency, showing quasi-periodic behaviour.
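Here is a minimal sketch of that experiment in Python; the spring constant, nonlinearity and step sizes are illustrative choices, not the original values:

```python
# Fermi-Pasta-Ulam-Tsingou: 64 points with nearest-neighbour forces
# F = k(d + alpha d^2) (k = 1 here), fixed ends, started in the
# fundamental mode.
import numpy as np

N, alpha, dt = 64, 0.25, 0.05
x = np.sin(np.pi * np.arange(1, N + 1) / (N + 1))  # fundamental mode shape
v = np.zeros(N)

def accel(x):
    y = np.concatenate(([0.0], x, [0.0]))  # clamp the two ends at zero
    d_right = y[2:] - y[1:-1]              # strain to the right neighbour
    d_left = y[1:-1] - y[:-2]              # strain to the left neighbour
    return (d_right - d_left) + alpha * (d_right**2 - d_left**2)

for _ in range(20000):  # velocity-Verlet time stepping
    v += 0.5 * dt * accel(x)
    x += dt * v
    v += 0.5 * dt * accel(x)

# The energy in each mode can be tracked with a discrete sine transform of x.
```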

This was unresolved until around twelve years later when two Americans, Kruskal and Zabusky, began to investigate the KdV equation computationally. They noticed that the continuum limit of the system studied by the Los Alamos four was described by the KdV equation.

By simulating the KdV equation using a more direct discretisation scheme, they found something amazing. No matter what initial configuration they prescribed for the water displacement height, the surface would break into a train of solitary waves with a profile like a one-soliton, each with their own amplitude and speed. Moreover, when these solitary waves collided, they would interact, maybe raising or dipping their peaks slightly as they passed one another, but then recover the shape and speed they had before the collision.

They knew they were onto something important, and they christened these resilient lumps solitons: ‘solit-’ for solitary, ‘-on’ to denote a particle, the suffix which appears at the end of a zoo of particle physics terms: electron, proton, hadron, baryon, and so on…

The particle-like behaviour of these self-sustaining lumps of energy is their ability to enter into, then emerge from, collisions with a well-defined shape and speed. The presence of both wave- and particle-like behaviour suggests quantum shenanigans, but there is nothing quantum here, just water waves.

In 1967 a team of four scientists, including Kruskal, found a construction of exact solutions to the KdV equation which consist of precisely $n$ solitons, which start at large separation, intersect and mingle, then regain their original size and shape, and drift apart.

The details of the construction of the explicit solutions are near magical. It borrows ideas from the scattering theory of quantum particles, then requires the application of a twisted functional transform to the scattering data. This transform is something like a nonlinear version of the Fourier transform, which is used extensively in applied maths due to its effectiveness in analysing signals.

The forward scattering problem in quantum mechanics is to determine the reflection and transmission of different frequency particles off a given potential. It’s like a mathematical description of how we see things: the object we look at is the potential, and the amount of light of different wavelengths that gets reflected into our eyes allows us to construct an image of that object.

In quantum mechanics, the potential is a function of space, while the reflection and transmission data is packaged into a function of frequency called the spectral data. The KdV equation has an associated scattering problem, and the complicated dynamics of KdV turns into a very simple time dependence for the spectral data.

On the KdV side, the wave profile $u$ becomes the potential in the scattering problem. In the scattering picture, we have a grasp on how the spectral data evolves, so to recover a solution $u(x,t)$ for KdV, we need to reconstruct the potential from its spectral data, which is an inverse scattering problem, and where the twisted transform comes in.

This brilliant but byzantine technique, known as the inverse scattering method, allowed the team to explicitly write down $n$-soliton solutions to KdV. Kruskal and Zabusky believed it was the presence of these solitons that meant the physicists’ string didn’t obey the ergodic hypothesis. The discovery of this inverse scattering method began a flurry of research into solitons.

Beyond water waves

Shortly after the inverse scattering method was found for KdV, it was adapted to two other PDEs. Alongside the KdV equation, these PDEs have since taken on a celebrity status within the study of solitons.

Of the three, the sine-Gordon equation is the one which has been studied for the longest time. It began its life in the field of classical differential geometry. Surprisingly, solutions to the sine-Gordon equation correspond one-to-one with pseudospheres, which are surfaces of constant negative curvature immersed in three-dimensional space. Here, immersed means the surfaces might do funky things like intersect themselves or have sharp cusps. But that’s an article for another day.

The sine-Gordon equation has also drawn attention in several parts of physics, from particle physics and statistical physics to material science, where it was used to study screw dislocations in crystals. The one-soliton solution to sine-Gordon has the interesting property that it limits to $2\pi$ instead of zero as $x$ tends to infinity. That is, if we plot the sine-Gordon soliton as a function of position, it does not look like a lump, like the KdV soliton, but has an S shape.

Soliton solution to the sine-Gordon equation

A PDE zoo

Soliton solutions are present for many PDEs. Here’s a brief who’s who of the most famous ones. Subscripts denote partial derivatives.

  • The sine-Gordon equation:
    \[ \varphi_{tt} - \varphi_{xx} + \sin\varphi = 0.\]
    For this soliton, the limiting value at positive infinity in position is $2\pi$.

  • The nonlinear Schrödinger equation:
    \[ \mathrm{i}\psi_t = -\frac{1}{2}\psi_{xx} + \kappa |\psi|^2 \psi.\]

  • Everyone knows that real canals are not one dimensional. A more realistic description of surface waves in a canal is the KP equation, a 2D generalisation of KdV:
    \[
    (u_t + uu_x + \varepsilon^2 u_{xxx})_x + \lambda u_{yy} = 0.
    \]
    As with the KdV equation there is a lattice of soliton solutions to the KP equation. This is what you can see at the Ile de Ré at the top of the article!

The sine-Gordon soliton is an example of a topological soliton. In fact, all solutions of the sine-Gordon equation have the property that the difference between their limiting values as $x$ tends to positive and negative infinity is an integer multiple of $2\pi$. This integer is called the winding number of the solution, and offers a glimpse of the interplay between topology and soliton theory.

The last of the three gold standard soliton PDEs is the nonlinear Schrödinger equation. The solitons of this PDE were the closest to having technological applications. Light signals in optical fibres are well modelled by solitons of this equation. Their ability to keep their shape was desirable for sending signals over vast distances, and solitons made it to lab trials before other technological advances in the field made them obsolete. In an alternate universe, the online edition of this magazine may have been brought to you courtesy of solitons.

There is a dizzyingly rich array of PDEs with soliton solutions. The majority of known PDEs admitting solitons are defined with only one spatial dimension, although of course more realistic systems incorporate two or three spatial dimensions. One such PDE is the Kadomtsev–Petviashvili equation, which describes a two-dimensional model of shallow water waves and generalises the KdV equation.

From humble beginnings in the Union canal, our soliton story has taken us around the world: solitons were among the first phenomena to be explored using mathematical computing, and they even reached the materials and optics labs. The rich theory of solitons has sustained itself now for almost 200 years, and solitons are ubiquitous in mathematical physics, fulfilling Scott Russell’s vision. And there is still so much left to explore.


Twist and shout! …and we’re fresh out of shout*

Having turned 18, I have now swapped the impromptu school parties with their questionable music, eccentric concoctions which come about after a few drinks—Shloer, vodka and salad dressing, anyone?—and soon-to-be sticky carpets, for city centre institutions with equally interesting music, drinks and floor-al adhesion.

A group of close friends (who will be referred to as A, B, C, D throughout, leaving me as E) and I decided to paint the town red and celebrate the arrival of my newfound rights to drink, drive a forklift truck, and go bungee jumping (though not simultaneously). We decided unanimously on a venue whose name I will refrain from mentioning (as I am certain I saw the bouncer doing the Chalkdust prize crossnumber, and hence may be liable to read this article).

We approached the bar, and the bartender—thrilled for what must have been their first clientele of the night—took our orders with great gusto. Although the bartender was blessed with many things, short-term memory was not among them: each order was perfectly procured, but much like the pianist who could play all the right notes but not necessarily in the right order, the hapless bartender had no clue which order was whose. In fact, by some combinatorial miracle, he managed to give everyone the wrong drink: whereas it ought to have gone ABCDE, instead, he went DAECB.

“Just our luck,” said C, exasperatedly.

“A derangement no less,” said D.

“Who are you calling deranged?” responded B indignantly.

“No—a derangement—a kind of permutation,” I explained to defuse the situation.

“What’s a permutation?” pondered B thoughtfully.

“It’s simply a bijection $\pi: \{1, \dots, n\} \rightarrow \{1, \dots, n\}$: a map that shuffles the elements of $\{1, \dots, n\}$, sending no two elements to the same place and missing none out,” I explained.

“Right-o. But a derangement?” enquired B.

“A derangement is a permutation without fixed points: so $\pi(i) \neq i$ for all $1 \leq i \leq n$,” I answered.
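(For readers following along at home, here is a minimal Python sketch of our own, not part of the night’s proceedings, checking the bartender’s handiwork and counting quite how unlucky we were:)

```python
from itertools import permutations

def is_derangement(perm):
    """True if no drink ends up in its rightful (sorted) position."""
    return all(drink != owner for drink, owner in zip(perm, sorted(perm)))

print(is_derangement("DAECB"))  # True: nobody received their own drink

# Of the 120 ways to serve five drinks, only 44 are derangements.
print(sum(is_derangement(p) for p in permutations("ABCDE")))  # 44
```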

Deranged or otherwise, we needed to work out how to return the beverages to their rightful owners. Our attempt to slide them across, wild west style, was scuppered by one inadvertent collision and the stern eye of the bouncer (who was getting started on the across clues), so we decided to work it out rigorously. While most people would get off their seats and swap drinks, we mathematicians know that no night out is complete until it becomes a maths problem. We may make things difficult for ourselves—but at least we’ll get a paper out of a night out.

“Let’s set some ground rules,” said C, enjoying his newfound role as leader. “Rule one: we must return our drinks in some series of swaps. Rule two: people may only swap with their immediate neighbour to prevent any collisions. Rule three: people may not engage in two swaps at once or have more than one drink at once. That constitutes cheating, and will result in the aforementioned cheater being required to buy everyone a round of drinks. And rule four: we will carry this out with a minimal number of swaps to prevent anyone from ‘sampling’ someone else’s drink. I know who you are,” he said, eyeing A suspiciously.

With this, we took our seats and set to work, drawing frenzied diagrams on cocktail napkins with whatever we had to hand.

Ticket to ride

I stared quizzically at my sketches, resisting the urge to sip the Guinness which had been placed before me. “We can guarantee we will need at most 10 swaps, as long as no pair of drinks ever has to swap more than once.”

“Actually, only six pairs of drinks are in the wrong order,” added B.

“What do you mean, ‘the wrong order’?” I asked.

“I mean that for our permutation, there are only six pairs of people X and Y such that X sits to the left of Y but X’s drink is to the right of Y’s—like A and D. So, at some point, the drinks will have to cross past each other.”

As I struggled to make sense of this, A made a suggestion. “Let’s use some shorthand to make it slightly easier. How about we use $\sigma_i$ to denote the swap between the $i$th and $(i + 1)$th person, counting from A to E.”

“So a total of $n-1$ kinds of swap for $n$ people,” I added.

“Right,” replied A. “And when we do more than one swap, we work from left to right—because we’re indexing positions of drinks, not the drinks themselves.” We agreed this was a splendid idea.

“We can actually draw out an exact map of how to do this, and which exchanges to do… like a tube map!” said D, drawing the following diagram:

“All you need to do is draw lines between the drink and its destination. All the points at which the drinks should ‘collide’ give you the exact order in which we should exchange our drinks. So we can actually trace which drinks go where and when—like a tube map for cocktails!” said C, brimming with excitement.

“And lo and behold… only six exchanges, like you said!” chimed in B.

“Lower bound and upper bound: two lines cross precisely when their drinks start out on the wrong side of each other, so six swaps is optimal!” A surmised.

“And we can read off the orders of drinks as follows: DAECB, ADECB, ADCEB, ADCBE, ADBCE, ABDCE, ABCDE. Hey presto—six swaps, like you said!” exclaimed C.

“Or, with our newfound notation, simply $\sigma_1 \sigma_3 \sigma_4 \sigma_3 \sigma_2 \sigma_3$,” B added.

We tried it, and lo and behold, our drinks were returned to the right order.

I continued: “So we can write a formula for the number of swaps in general: for any permutation $\sigma$, where the $i$th element is notated $\sigma(i)$, the number of exchanges we need to use is
\[|\{i < j: \sigma(i) > \sigma(j)\}|\text{.”}\]

“And so for five drinks, we’ll never need more than 10 swaps—because we only have $\left(\begin{smallmatrix}5\\2\end{smallmatrix}\right) = 10$ pairs $\{i, j\}$ where $1 \leq i < j \leq 5$,” parried B. “Which could happen if we started out with EDCBA,” added A.
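(Napkins are optional: here is a short Python sketch, our addition rather than the friends’, which counts the out-of-order pairs and finds one minimal sequence of neighbourly swaps by bubble sort. The swap order it finds differs from the friends’ $\sigma_1 \sigma_3 \sigma_4 \sigma_3 \sigma_2 \sigma_3$, but the count of six agrees.)

```python
def inversions(perm):
    """Count the pairs (i, j) with i < j that stand in the wrong order."""
    return sum(perm[i] > perm[j]
               for i in range(len(perm))
               for j in range(i + 1, len(perm)))

def adjacent_swaps(perm):
    """Bubble sort: swap neighbours until sorted, recording each sigma_i used."""
    drinks, swaps = list(perm), []
    changed = True
    while changed:
        changed = False
        for i in range(len(drinks) - 1):
            if drinks[i] > drinks[i + 1]:
                drinks[i], drinks[i + 1] = drinks[i + 1], drinks[i]
                swaps.append(i + 1)  # this is sigma_{i+1} in our notation
                changed = True
    return swaps

print(inversions("DAECB"))      # 6 pairs in the wrong order
print(adjacent_swaps("DAECB"))  # [1, 3, 4, 2, 3, 2]: six swaps, as promised
```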

They’re the same picture

While we supped our drinks, C opined the following.

“I think the diagram is rather splendid, because we can actually prove things with the diagrams themselves.”

“How so?” I asked.

“Well, say we wanted to multiply out two permutations. We can just stack them up like so:”

“But then how would you undo it?” asked A, unsatisfied.

“By reflecting the diagram, top-to-bottom,” C countered.

Yet D looked suspicious. “But surely your diagrams aren’t unique—you can change the diagram, but the permutation remains the same. How else could we have done this?”

“If you do a swap twice, it undoes itself: that checks out from the diagram!” said B, drawing the following:

“For some pairs of swaps, like you mentioned, it doesn’t matter which order we do them—they may as well have been in either order, or at the same time,” said A.

“So we could say $\sigma_i \sigma_j = \sigma_j \sigma_i$,” I surmised. “And that would look a little like this:” I added, displaying my proof:

“But that only happens if $i$ and $j$ are at least 2 apart. So what if they are next to each other?” A asked trenchantly.

“There’s an even cleverer one. If we have a permutation like $\sigma_i \sigma_{i + 1} \sigma_i$, we can ‘pick up’ the middle strand and place it over the other two without changing the result of the permutation, and get $\sigma_{i + 1} \sigma_i \sigma_{i + 1}$ back,” intoned C. “Like this:”

“Or equivalently, $(\sigma_i \sigma_{i + 1})^3$ should return everything to where it started, since both $\sigma_i \sigma_{i + 1} \sigma_i$ and $\sigma_{i + 1} \sigma_i \sigma_{i + 1}$ swap drinks $i$ and $i + 2$,” I added.

In the heat of excitement, C urged A and B to act it out and demonstrate this. The drinks returned like clockwork.

“So we now have a set of rules for our permutations: we have permutations $\sigma_i$ for $1 \leq i \leq n-1$ such that:

  • $\sigma_i^2 = 1$ (where $1$ is just the identity permutation);
  • $\sigma_i \sigma_j = \sigma_j \sigma_i$ if $|i-j| > 1$; and
  • $\sigma_i \sigma_{i + 1} \sigma_i = \sigma_{i + 1} \sigma_i \sigma_{i + 1}$,”

summed up C. “Or, alternatively, keeping $\sigma_i^2 = 1$, the last two rules become $(\sigma_i \sigma_{i + 1})^3 = 1$ and $(\sigma_i \sigma_j)^2 = 1$ when $|i-j| > 1$.”

“And we can get to any diagram of any permutation we want this way?” B enquired.

“Using C’s diagram idea,” I offered.

“So we’ve given a full description of the collection of permutations!” D said, eagerly awaiting their drink.

We sat in awe of our realisation.
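(Sceptics can check the three rules directly. Here is a quick Python sketch of our own, composing permutations from left to right as the friends agreed:)

```python
def sigma(i, n=5):
    """The swap of positions i and i+1 (1-indexed), as a tuple p with p[k] the image of k."""
    p = list(range(n))
    p[i - 1], p[i] = p[i], p[i - 1]
    return tuple(p)

def compose(p, q):
    """Do p first, then q: reading left to right."""
    return tuple(q[p[k]] for k in range(len(p)))

identity = tuple(range(5))
s1, s2, s3 = sigma(1), sigma(2), sigma(3)

assert compose(s1, s1) == identity                                    # rule one
assert compose(s1, s3) == compose(s3, s1)                             # rule two: |i - j| > 1
assert compose(compose(s1, s2), s1) == compose(compose(s2, s1), s2)   # rule three
print("All three rules hold.")
```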

You’ve really made the braid

Then C once again saw fit to throw the cat among the pigeons. “Hold on. Let’s go back to the original diagrams. What if they were actual pieces of string?”

“So the strands would go over and under each other?” asked D.

“Right. You’d multiply and take inverses just like we did earlier—so it’s as well-defined as our diagrams,” replied C.

“If you did a swap twice, you wouldn’t get the same thing back—it would be all tangled!” I said, drawing a version of our original map:


“Braided!” suggested C.

“Ah—but we’d still have our previous rules, that $\sigma_i \sigma_j = \sigma_j \sigma_i$ if $|i-j| > 1$, and $\sigma_i \sigma_{i + 1} \sigma_i = \sigma_{i + 1} \sigma_i \sigma_{i + 1}$,” A observed.

“In fact, I don’t think we can say anything more—these are the only ways we can move any of the strands without tangling our strings. So that’s it—that’s our new algebra,” I concluded.

To be or knot to be

Then D picked up his bar napkin and made it into a loop: “What if we joined the ends together?” he asked.

“We get some kind of knot—or maybe a link if there happens to be more than one connected component,” said B:

“Could we get any kind of knot like that? And which braids give us the same thing?” I asked.

“Well, I suppose if you did braids $\sigma \tau$, it would be the same as doing $\tau \sigma$—we would just loop round and get the same thing back, because it doesn’t matter where you start,” said C:

“And if we add an extra strand, $\tau \sigma_n$ on $n + 1$ strands should look the same as $\tau$ on $n$ strands,” I added:

“Wait—this all feels an awful lot like physics.” D said. “Imagine we watch how our drinks proceed at some point in time. Your original diagram, C, shows us the worldlines of the individual drinks as they pass through time from top to bottom, like frames in a movie:”

“But originally, if we swapped them twice, we would get the same thing back.” A replied. “Now, if we swap them twice, the particles would remember what they were swapped with! In fact, you could make a computer with these—they would store the way in which they were exchanged!”

“No more—you’re giving me a headache!” implored B.

Hexagonator the almighty

“But how could we do this?” I asked.

“Say we have five particles: A, B, C, D and E. We denote their states by $|A\rangle$ and so on,” said C. “We’d write $|A\rangle \otimes |B\rangle$ for the combined state of the two particles.”

“And this would be different from $|B\rangle \otimes |A\rangle$?” I enquired.

“Yes,” replied C, “but we’d need some kind of operator $b_{A, B}$. One which would do \[|A\rangle \otimes |B\rangle \rightarrow |B\rangle \otimes |A\rangle.\text{”}\]

“And if you did it twice with our braided particles, you wouldn’t get the same configuration back,” added A.

“What about if we have three particles? Wouldn’t \[(|A\rangle \otimes |B\rangle) \otimes |C\rangle\] give us the same thing as \[|A\rangle \otimes (|B\rangle \otimes |C\rangle)\text{?”}\] I asked.

“Ah, but they’re not—be careful,” said D teasingly. “We’d need some kind of map, $a_{A, B, C}$, which would carry out \[(|A\rangle \otimes |B\rangle) \otimes |C\rangle \rightarrow |A\rangle \otimes (|B\rangle \otimes |C\rangle)\] for us.”

“Call it the associator! I like the sound of that—like a superhero,” I said.

“There’s another way of seeing this,” added A. “This all sounds an awful lot like a category—a collection of objects (like our particles) and maps, or morphisms (like our $a$ and $b$ maps) between them.

“And since this tensor product malarkey is associative, we have a monoid.”

“So our particles form a braided monoidal category!” I surmised.

“I remember reading about this—there are a couple of results it should always satisfy… Ah, here’s one:” said D, scrawling across two napkins and a beer mat.

“A hexagon diagram!” I added.

“What about the three equivalences C found earlier?” raised B, gesturing to the discarded diagram. “Couldn’t we encode this in our new categorical form?”

“Just considering the strands, swapping the first pair and fixing the third is simply $b_{A, B} \otimes 1_C$. That’s like our $\sigma_1$,” said C.

“Right… and similarly if we swap the second two, we get $1_A \otimes b_{B, C}$ —like our $\sigma_2$,” returned B.

“So,” said D, “our relation looks a little like this, up to association:
\begin{align*}
&(1_A\otimes b_{B, C})(b_{A, C} \otimes 1_B)(1_C \otimes b_{A, B})\\
&\qquad = (b_{A, B} \otimes 1_C)(1_B \otimes b_{A, C})(b_{B, C} \otimes 1_A)\text{.”}
\end{align*}

“And that’s just the Yang–Baxter relation from theoretical physics!” A added.

“But what does that mean?” B asked.

“Having this relation in 2 dimensions (1 spacelike plus 1 timelike) means that a system will have certain quantities fixed—such as the momentum remaining the same in each particle—which is called being integrable,” answered A.

“Moreover, if we have a system in which 3 or more particles are scattering we can always reduce it to an equivalent system of 2 particles,” added C. “In fact, one can do a similar trick in higher dimensions—but unfortunately this beer mat is too narrow to contain such an explanation.”
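(In the simplest case, where $b$ just swaps two particles and each particle has a two-dimensional state space, the relation can be checked numerically. Here is a toy numpy sketch of ours; genuinely braided particles would need a fancier $b$ than the plain swap:)

```python
import numpy as np

# The 4x4 swap of two two-dimensional particles: S(x tensor y) = y tensor x.
S = np.array([[1, 0, 0, 0],
              [0, 0, 1, 0],
              [0, 1, 0, 0],
              [0, 0, 0, 1]])
I = np.eye(2)

# b tensor 1 and 1 tensor b, acting on three particles, composed both ways round.
lhs = np.kron(I, S) @ np.kron(S, I) @ np.kron(I, S)
rhs = np.kron(S, I) @ np.kron(I, S) @ np.kron(S, I)
print(np.allclose(lhs, rhs))  # True: the Yang-Baxter relation holds
```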

“Isn’t it just remarkable how many different ways there are of seeing the same thing, just from considering our drinks?” I remarked, satisfied with the night’s work.

“Isn’t it just,” said B. “Now, whose round is it next?”


* I was planning on naming this “Merry Twist-mas” after the oft-overlooked masterpiece by The Marcels, which is objectively the best Christmas song of all time. Alas, the timing was not to be.

post

Forest fires

As a maths teacher, I often find that setting seemingly ordinary tasks for my students leads to surprisingly interesting mathematics. Recently, an investigation that on the surface seemed quite unremarkable ended with me taking a deep dive into areas of mathematics I had never seen before. It started with us thinking about the spread of a forest fire.

The trees in the forest in question are represented by squares on a sheet of squared paper. At the start, a random tree catches fire and the grid looks like this:

After a unit of time, all trees (squares) that touch a tree already on fire catch fire too. This leads to nine trees being on fire:

The spread of the fire repeats. After a second unit of time, there are 25 trees on fire:

If you continue, then the next numbers in the sequence are 49, 81, and 121. From immediate inspection, you can spot that these are the odd square numbers. Part of the task was then to take this and write a quadratic for the $n$th term in the sequence. Provided you know that $2n + 1$ is an algebraic way of writing odd numbers, the formula jumps out immediately:

\begin{align*} a(n)&=(2n+1)^2\\ &=4n^2 + 4n + 1. \end{align*}
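A quick simulation confirms the count (a minimal Python sketch of our own; a burning square ignites all eight squares touching it, edges and corners alike):

```python
def burning_trees(n):
    """Number of trees on fire after n time steps, starting from one tree."""
    on_fire = {(0, 0)}
    for _ in range(n):
        on_fire |= {(x + dx, y + dy)
                    for (x, y) in on_fire
                    for dx in (-1, 0, 1) for dy in (-1, 0, 1)}
    return len(on_fire)

print([burning_trees(n) for n in range(6)])  # [1, 9, 25, 49, 81, 121]
print([(2*n + 1)**2 for n in range(6)])      # the formula gives the same list
```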

A forest of hexagons

Following the same process as for squared paper, we can look at the fire spreading but using differently tessellated paper. Squared paper produced a quadratic sequence, but it is a little harder to predict what would happen when the tiling changes.

Looking first at the hexagonal paper, we see that, as it did for squared paper, each iteration looks like an enlarged version of the previous one:

Listing out the sequence we have 1, 7, 19, 37, 61, 91, 127, and so on. Perhaps surprisingly, this too turns out to be a quadratic sequence. Quadratic sequences are sequences that have a general rule of $$a(n)=c_1n^2+c_2n+c_3,$$ where $c_1$, $c_2$, and $c_3$ are real numbers. To calculate $c_1$, we look at the second difference between terms of the growing sequence:

The second difference is a constant, and can be halved to obtain $c_1$. Next, you can subtract the sequence $c_1n^2$ from the original sequence. If your original sequence is quadratic and you’ve done it right so far, this should leave you with a linear sequence that you can then work out the rule for. For our sequence this ends up being:

\begin{align*} a(n) &= 3n(n+1) + 1\\ &= 3n^2 + 3n + 1. \end{align*}
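The same bookkeeping can be automated. Here is a small Python sketch of the second-difference method just described (assuming, as here, that the sequence starts at $a(0)$):

```python
def quadratic_rule(seq):
    """Recover (c1, c2, c3) with a(n) = c1*n^2 + c2*n + c3 from a(0), a(1), ..."""
    first = [b - a for a, b in zip(seq, seq[1:])]
    second = [b - a for a, b in zip(first, first[1:])]
    c1 = second[0] // 2                                   # halve the constant second difference
    linear = [a - c1 * n * n for n, a in enumerate(seq)]  # subtract c1*n^2
    c2 = linear[1] - linear[0]                            # slope of the leftover linear sequence
    c3 = linear[0]
    return c1, c2, c3

print(quadratic_rule([1, 7, 19, 37, 61, 91]))  # (3, 3, 1): a(n) = 3n^2 + 3n + 1
```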

A forest of triangles

Finally, the students looked at the growth on triangular paper. The spread here ends up being a little different to the previous two. On both the square and hexagonal paper, the burning region grows as an enlarging copy of the original shape. On the triangular paper, however, the triangle shape gets smoothed at the edges as the fire spreads.

Continuing the sequence we get 1, 13, 37, 73, 121, 181, and so on. Once again, this is a quadratic sequence. Using a similar method as we did for hexagons, we obtain:

\begin{align*} a(n) &= 6n(n-1)+1\\ &= 6n^2-6n + 1. \end{align*}

Polygonal numbers

These results may look like just three quadratic sequences, but there’s more that connects them: they’re all examples of a type of number sequence called the centred polygonal numbers (also called the centred $k$-gonal numbers where the choice of $k$ gives a specific sequence). These sequences are created by drawing increasingly larger polygons made of dots around a central dot, hence the name.

The centred polygonal numbers are closely related to, but not the same as, another type of number sequence: the polygonal numbers. These include the well known triangular and square numbers, and are also created by drawing increasingly large polygons of dots, but in this case the polygons all share a vertex rather than all being around a central point. You can see the difference by looking at the two sequences for a square.

The first four square numbers: 1, 4, 9, and 16.

The first three centred square numbers: 1, 5, and 13.

The centred polygonal numbers can also be produced using the triangular numbers: you can start with the central dot, then make the centred $k$-gonal numbers by putting $k$ copies of the triangular numbers around the point. For the centred square and hexagonal numbers, that looks like this:

The centred square numbers can be built from a central dot and four copies of a triangular number.

The centred hexagonal numbers can be built from a central dot and six copies of a triangular number.

If we call the central dot the 0th centred $k$-gonal number, then we see that the $n$th centred $k$-gonal number is built from $k$ copies of the $n$th triangular number. This means that the formula for the $n$th centred $k$-gonal number is:

\begin{align*} a_k(n) &= k\times(\text{$n$th triangular number}) + 1\\ &= \frac{kn(n+1)}2 + 1. \end{align*}

Despite our sequence of fire on squared paper being the odd square numbers, the sequence is actually the centred octagonal numbers (ie $k = 8$). This is because we can rearrange an odd square number of dots to make increasingly larger octagons around a central point. It is also interesting to note that the sum of the reciprocals of this sequence is convergent: \[ \sum_{n=0}^\infty \frac1{(2n+1)^2} = \frac{\pi^2}8. \]

The sequence we observed on hexagonal paper is much more aptly named: the centred hexagonal numbers (ie $k = 6$). Hexagons are notable shapes and can be seen in the natural world in such oddities as bee honeycombs, or in popular media such as the board game Settlers of Catan.

Settlers of Catan’s board is based on hexagons.
Image: Catan Wikimedia Commons user Yonghokim, CC BY-SA 4.0

The final sequence was created from the triangular paper. This produces a sequence called the centred dodecagonal numbers (ie $k = 12$). These dots that make up the dodecagonal numbers can also be rearranged to produce ever-growing stars, and so they are also called star numbers. Like the centred hexagonal numbers, star numbers have been used in board games, for example in the Chinese checkers board.

A Chinese checkers board set up for a three player game.
Image: Chinese-checkers Wikimedia Commons user Splattne, CC BY-SA 3.0

Centred dodecagonal numbers can be rearranged to make stars.
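All three classroom sequences drop out of this one formula. A short Python check (our own sketch, indexing each sequence from $n=0$):

```python
def centred_polygonal(k, n):
    """The nth centred k-gonal number: k triangular numbers around a central dot."""
    return k * n * (n + 1) // 2 + 1

for k, paper in [(8, "squares"), (6, "hexagons"), (12, "triangles")]:
    print(paper, [centred_polygonal(k, n) for n in range(5)])
# squares   [1, 9, 25, 49, 81]
# hexagons  [1, 7, 19, 37, 61]
# triangles [1, 13, 37, 73, 121]
```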

The centred polygonal numbers are interesting in their own ways for many other values of $k$, and we could talk about them for hours, but I’m going to return to my students and to something really interesting that they noticed.

The three choices of paper we used are the only three examples of regular tessellation. This is because squares, triangles and hexagons are the only three regular shapes whose interior angles are factors of 360°: six equilateral triangles, four squares, or three regular hexagons can fit around a single point. We’ve seen these numbers before: they are the coefficients of $n^2$ (and, up to sign, of $n$) in the general rules for the three sequences! This shows a remarkable link between the geometric property of tessellation and the algebraic representation of the sequences we produced.

This is what I love about mathematics. A task that on the surface seemed like a nice little investigation for students ended with varied classroom discussions as well as my own deep dive down a mathematical rabbit hole. Despite spending years learning and teaching, I still discover new ways of looking at things that I thought I had previously mastered. I hope that as I get older, I still get the joy of discoveries like this and can continue to share these with the students I teach.

post

On the cover: skyrmions

Let’s face it, physics is hard. Dirac may have opined that his eponymous equation “explains all of chemistry and most of physics”. However, even the most enthusiastic acolyte would admit that this is impractical at best. In particular, first-principles calculations are only really possible for the simplest systems. If we want to understand anything more complicated than a harmonic oscillator or the electron energy levels of the hydrogen atom we need another approach.

This is where effective theories and simplified models come in. An effective field theory is essentially a model where you admit that you cannot describe a system in complete detail; instead, you try to describe what is effectively happening. A good example of this is how we study a fluid like water. We do not try and understand it at the level of the individual atoms; instead, we zoom out to a scale where we can describe it as a continuous substance.

An area where these effective models are particularly useful is when trying to understand the nuclei of atoms. You may remember from physics and chemistry lessons that atoms are like mini solar systems with a central structure, known as the nucleus, which contains most of the mass, and much lighter particles called electrons orbiting around it. We can go one step further and talk about the objects that make up the nucleus, the protons and neutrons. They are bound together by an incredibly strong force that acts over very short distances, and is creatively called the strong force. At this level the nucleus is a collection of protons and neutrons exchanging a triplet of other particles known as pions, $\pi$, which keep them stuck together. This is already starting to sound pretty complicated, but there is more. Protons, neutrons, and pions are not fundamental objects: they are formed from smaller particles known as quarks, bound together via the strong force. If you think that this is starting to sound ridiculous then you are not alone. We have started from a mini solar system and ended up with a churning sea of energy and mass.

A model that can sidestep some of these complications is the Skyrme model, introduced in the 1960s by Tony Skyrme to describe nucleons, a catch-all term used in atomic physics for protons and neutrons. It has been extended to a model for the nuclei of atoms where we ignore the messy internal structure of protons and neutrons. Instead we zoom out to a scale where what we care about are pions, with nuclei appearing as static lumps of energy in the pion field.

Mathematically the Skyrme model is a field theory, a continuous model of a physical system, described by a non-linear partial differential equation. In other words, the equation describing how the pion fields evolve in time depends on how they change spatially and how the fields interact with themselves. In contrast to the integrable solitons met in Ricky’s solitons article, we can only solve the equations in the Skyrme model numerically. Another difference is that the Skyrme model involves a matrix field $𝙐$, which is given by
\[𝙐(x)=\begin{pmatrix} \sigma(x)+\mathrm{i}\pi^{0}(x)&\mathrm{i}\pi^{1}(x)+\pi^{2}(x)\\ \mathrm{i}\pi^{1}(x)-\pi^{2}(x)&\sigma(x)-\mathrm{i}\pi^{0}(x) \end{pmatrix},\] subject to the constraint that the determinant of $𝙐(x)$ is $1$. The field $\sigma(x)$ is called the sigma field and is completely determined by physical constraints, so that we only care about finding the pion fields. We also need to include boundary conditions that specify how the pions behave asymptotically; to avoid describing a scenario with infinite energy, we set $\pi(\infty)=0$. Surprisingly, the skyrmion configurations found in this way have a conserved quantity: the baryon number, or the number of nucleons that the solution is equivalent to.
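For concreteness, expanding the determinant (suppressing the argument $x$) turns this constraint into a statement about the four fields:
\begin{align*}
\det 𝙐 &= (\sigma+\mathrm{i}\pi^{0})(\sigma-\mathrm{i}\pi^{0})-(\mathrm{i}\pi^{1}+\pi^{2})(\mathrm{i}\pi^{1}-\pi^{2})\\
&= \sigma^2+(\pi^{0})^2+(\pi^{1})^2+(\pi^{2})^2=1,
\end{align*}
a fact we will meet again shortly.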

The nucleon

If we’re thinking about the classical model of atoms, the baryon number of a skyrmion is the same thing as an atomic mass number. However, this does not mean that skyrmions look like collections of well defined billiard-ball-like nucleons. On the contrary, there is a weird and wonderful world of skyrmion configurations so appetising that one of the authors referred to this as ‘a smörgåsbord of skyrmions’ in the title of a recent paper.

Skyrmions are examples of topological solitons: particle-like lumps of energy that are stabilised by the topology of the fields. In other words, the configurations described by $\pi^0, \pi^1$ and $\pi^2$ do not dissipate over time; all the energy has to stay in there somewhere. There is no continuous transformation of the field configurations to the one representing the vacuum: we say that they cannot be continuously deformed to it. This is where topology comes into play: it’s all about looking at structures, and asking whether they’re preserved by continuous transformations.

Now, where is the topology of the skyrmions coming from? It comes from the constraint that the determinant of $𝙐$ must be equal to $1$, which we can write as $\sigma^2+(\pi^0)^2+(\pi^1)^2+(\pi^2)^2=1$: the constraint tells us that the sum of the squared real numbers equals one. You might recall the formula for a circle of radius $r$: $x^2+y^2=r^2$. If we want to describe a (2-dimensional) sphere, we just extend it to $x^2+y^2+z^2=r^2$. We can keep going, building a formula with $n+1$ terms on the left to describe an $n$-sphere. So our constraint on the matrix field $𝙐$ is that the elements are coordinates for the 3-sphere of radius 1: skyrmion configurations start off with points $x,y,z$ in 3-dimensional Cartesian space, and map them to points on the 3-sphere.

We can even think of this as a map from one 3-sphere to another. To make it work, we use the clever trick of treating infinity as a single point, exactly as in stereographic projection:

Stereographic projection

This lets us use a cool branch of mathematics known as homotopy theory, which tells us that maps from a circle to a circle are characterised by the number of times one circle is wrapped around the other circle. Think of wrapping a shoelace around your ankle lots of times before you tie it: to work out what’s happened to your leg, you just need to count the number of wraps.

Map between circles. When the first circle is mapped $N$ times to the second circle, we can visualise it as $N$ circles

This works in higher dimensions too: to understand how our maps $\pi^0, \pi^1, \pi^2$ behave, we just need to look at the surface they describe, and work out how many times it covers the 3-sphere. This is the baryon number we mentioned earlier, and it is a conserved quantity. What it cannot tell us is whether the skyrmions are clustered or widely separated: we can only predict their total number throughout 3-dimensional space.

How do we understand what skyrmions look like? Well, we go back to the Skyrme model and look at the expression for the energy of these pion field configurations. We know that they correspond to physical atoms when the energy is minimised, so we minimise the energy and see what resulting configuration looks like.

We can think of the space of field configurations with a given baryon number as being a mountainous landscape. High energy configurations sit at the peaks of mountains and are very unstable, able to roll down the side of the hill to reach a valley where the low energy configurations live. Understanding this energy landscape is a challenging problem, particularly since we want to find the lowest valleys where our classical picture of an atomic nucleus lives.

Over hills and through the vales

Starting with a single skyrmion, its solution can be constructed by minimising the energy of a map that resembles a hedgehog. We draw the map of the surface described by the pions: usually by letting $\pi^0\propto z$ and $\pi^\pm=\pi^{1}\pm \mathrm{i}\pi^{2}\propto x\pm \mathrm{i}y$ (but any other permutation or rotation would do equally well). Why do we call it a hedgehog? If we plot all the vectors, we get a sphere covered with arrows, all of which point outwards, like a hedgehog rolled up into a ball.

Skyrmions are best visualised with a colour that tells us the directions in which the pions are pointing. That is, positive $\pi^0$ is white, negative $\pi^0$ is black, and $\pi^\pm$ is usually mapped to a standard choice of the colour wheel, going around from red to green to blue and back to red. The entire sphere of colours is also known as Runge’s colour sphere. This is what leads to the jazzy colours in our pictures of the nucleon.

What do we need to know about skyrmion interactions? Same colours attract. Opposite colours repel. It’s more or less that simple. If we place two skyrmions with the same colours close to each other, they will attract in a finite time and form a bound state of some sort. The first few shapes that will be formed are these: two skyrmions will attract and form a torus, three skyrmions will attract and form a tetrahedron, and finally four skyrmions in the attractive channel will form a cube:

Skyrmion solutions with baryon numbers 2 to 4

How do we find a multi-skyrmion? This topic has fascinated researchers for several decades, and several very nice mathematical concepts have been invoked to cook up good approximations for skyrmions with a given baryon number. If we are happy to turn to a numerical algorithm and let the computer do the work, there is a nice simple recipe, which was in fact used to find over 400 skyrmion solutions in the aforementioned smörgåsbord of skyrmions.

It goes like this: start with a single skyrmion $𝙐$. Rotate it by three random angles. Place it at a random position.

Add in another skyrmion, which you’ve rotated by random angles and placed at a random position (but not too far from the previous one). This ensures that the skyrmions interact and form a genuine multi-skyrmion, rather than several far-separated single skyrmions. Repeat $B-1$ times, until we reach baryon number $B$.

How do we combine the maps of these skyrmions? A simple way, which is not unique, is to use the fact that the determinant of a product of matrices is the product of the determinants of the individual matrices. This guarantees that $𝙐=𝙐_1𝙐_2\cdots 𝙐_B$ has determinant $1$ if each $𝙐_i$ has determinant $1$. This gives us what is known as an initial condition.
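As a toy check of the key fact, here is a numpy sketch of ours, with random $2\times2$ special unitary matrices standing in for the field values at a single point:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_su2():
    """A random 2x2 unitary matrix of the form [[a, b], [-conj(b), conj(a)]], det = 1."""
    a, b = rng.normal(size=2) + 1j * rng.normal(size=2)
    norm = np.sqrt(abs(a)**2 + abs(b)**2)
    a, b = a / norm, b / norm
    return np.array([[a, b], [-b.conjugate(), a.conjugate()]])

# Multiply B = 5 single-skyrmion fields together at one point.
U = np.linalg.multi_dot([random_su2() for _ in range(5)])
print(np.linalg.det(U))  # approximately 1: the product keeps determinant 1
```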

An intuitive numerical method is to simulate the motion of a ball in a potential. If we think of the initial condition as the ball sitting on a hillside of the energy landscape, letting it go corresponds to evolving it in time. The ball accelerates towards the valley of the landscape.

Being interested in finding the valley, we need to measure the potential energy, ie the height of our position on the hill, as we go. Once the height starts to increase, we have either crossed the minimum or a mountain pass—a saddle point—and the ball is now climbing up another hillside.

We alter the dynamics by removing all the kinetic energy as soon as we discover that the ball is climbing another hillside. Then we start over and let the ball go from the new position. This numerical algorithm does not guarantee that we find the global minimum, that is, the lowest valley of the landscape, but only that we have found a local minimum: the bottom of the nearest valley.
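The same idea fits in a few lines for a toy one-dimensional landscape. This is a hedged Python sketch of the flow just described, not the authors’ production code:

```python
def arrested_newton_flow(grad, energy, x, dt=0.01, steps=20000):
    """Roll a ball downhill; zero its velocity whenever the energy starts rising."""
    v = 0.0
    e_old = energy(x)
    for _ in range(steps):
        v -= dt * grad(x)   # accelerate down the slope
        x += dt * v         # move the ball
        e_new = energy(x)
        if e_new > e_old:   # the ball has started climbing: arrest it
            v = 0.0
        e_old = e_new
    return x

# A toy double-well landscape with valleys at x = -1 and x = 1.
energy = lambda x: (x**2 - 1)**2
grad = lambda x: 4 * x * (x**2 - 1)
print(arrested_newton_flow(grad, energy, x=0.5))  # close to 1.0, the nearest valley
```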

For skyrmions with baryon numbers 1–4, 6, and 7, there is only one known skyrmion solution. So whatever we start with for these baryon numbers, we will end up in the known lowest valley of the energy landscape. It becomes more complicated for larger baryon numbers. In fact, there are 16 skyrmions with $B=11$, including the one in the header picture, and more than 140 skyrmions with $B=16$ in the smörgåsbord.

$B=16$ skyrmion

Atoms and beyond

We know that atoms exist in the realm of quantum mechanics, so there is another chapter to this story. We have also not distinguished between protons and neutrons. However, it is well known that they are not exactly the same. The way to tell them apart comes from quantising the Skyrme model. However, that is a story for another day. Research into skyrmions and their role in understanding atoms is very much an active field. If you want to know more, then a great place to start is the Solitons at Work network. This is an online community of soliton researchers, giving talks about their work, organising conferences and workshops, and sharing pictures of skyrmions. As members of the committee we may be biased, but it is a great place to see more about the wonderful world of solitons.