
In conversation with Alison Etheridge

During a red weather warning in January, we hunkered down, made a cup of tea, and had a chat with the soon-to-be Dame Alison Etheridge. We talked all things statistical modelling, her recent damehood, and the divides between different mathematical factions. Having dabbled in statistics at university, I was vaguely aware that we often make assumptions about our data before we apply a statistical model. But it wasn’t until talking to Alison that I realised how problematic this can be and how important Alison’s work on this is.

The beauty of cyclic quadrilaterals

Alison grew up in Wolverhampton and never imagined she would become a mathematician. At school, her favourite subject was chemistry, but she does remember being taught about inscribed quadrilaterals (you may know them as cyclic quadrilaterals). “I just sat there, thinking they are really beautiful,” she recalls—not that she would tell her fellow classmates this! Her chemistry teacher pushed her towards a maths degree, saying “you can do anything with mathematics”—as our A day in the life feature demonstrates. Thanks to huge support from the maths department, she was able to leave school a year early and take a gap year before starting a maths degree at Oxford.



The human mug

To topologists, mugs and doughnuts are the same. You may be wondering: how? “Are they mad?” I hear you cry. “I can drink coffee out of a mug but I certainly cannot drink coffee out of a doughnut.” By considering the number of holes in objects, we will explore whether humans are also doughnuts—or, if not, we will figure out which object humans are the same as.

For numerous words in mathematics, the mathematical meaning does not align with our everyday usage; examples include ring and hole. The topological definition of a hole is one such case. For example, if you had dug a grave, you would probably say “there is a hole in the ground,” but to a topologist, this does not count as a satisfactory hole in the Earth’s surface.

Two overlapping circles with the intersection containing an image of a mug and a doughnut

A topologist’s view

First, let’s figure out why topologists can’t tell the difference between a mug and a doughnut. Topology is concerned with properties of geometric objects; these objects can exist in any number of dimensions. The more precise version of our statement ‘to topologists, mugs and doughnuts are the same’ is ‘a mug and a doughnut are topologically equivalent’. (If we wanted to take this one step further, topologists would use the mathematical term solid torus rather than doughnut.) One ingredient of topological equivalence is homotopy equivalence. This is when one object can be ‘deformed’ into the other, by a series of stretching, squashing, twisting and bending. Things we are not allowed to do to deform our object include cutting it, opening/closing holes, and gluing parts together.

Homotopy equivalence is not exactly the same as topological equivalence, but it is easier to work with. One of the reasons why mugs and doughnuts are homotopy equivalent is that they each have a single one-dimensional hole. The centre hole of the doughnut aligns with the hole made by the handle of a mug. To see this more clearly, imagine that you have a stretchy and inflatable mug—if you find this difficult to imagine, have a look at the cartoon at the start of this article to help you along.



How to write a crossnumber

For ten years now, I’ve been setting the Chalkdust crossnumber. Over this time, I’ve developed a lot of tricks and tools for setting good puzzles.

Although puzzles involving words in grids of squares have been around since at least the 1800s, the first crossword puzzle (or word-cross, as its author called it) was published in the New York World newspaper in 1913. Since then, crosswords have become a staple in newspapers and magazines around the world.

In the UK, cryptic crosswords were invented and grew in popularity in the 1900s, and now appear in most newspapers and magazines. The cryptic crossword in the Listener magazine became infamous as the hardest such puzzle. The Listener ceased publication in 1991, but the Listener crossword lives on and is now published on Saturdays in the Times newspaper.

Fun’s word-cross puzzle, written by Arthur Wynne in 1913. Each clue gives the position of the first and last letters where the entry should be written. The puzzle features well-known words such as ‘neif’ and ‘nard’, and the solution includes ‘dove’ twice.

In early 2015, we were discussing ideas for regular content in our brand new maths magazine, and wanted to include a more mathematical crossword-style puzzle. Like Venus emerging from the shell, the crossnumber was born.

Of course, we didn’t invent the crossnumber puzzle: the first known crossnumber puzzle was written by Henry Dudeney and published by Strand Magazine in 1926; four times a year, the Listener features a numerical puzzle; and the UKMT (United Kingdom Mathematics Trust) have included a crossnumber as part of their team challenge for many years. There are also plenty of other publications that include crossnumbers, including the very enjoyable Crossnumbers Quarterly, that’s been publishing collections of the puzzles four times a year since 2016. But perhaps we’ll be mentioned in a footnote in a book about the history of puzzles.

But anyway, we’d decided we needed a crossnumber, so I needed to write one…

Making a grid

The first step when creating a puzzle is to create the grid. Like many publications, we restrict ourselves to using grids of squares (for the main crossnumber at least—we allow more freedom in other puzzles that we feature).

Grids with order 2 (left) and order 4 (right) rotational symmetry.

Typically, a publication will impose some restrictions on the placement of the black squares in its puzzle. The most popular restrictions are:

  • The white squares must be connected (for any two white squares, it is possible to draw a path between them that only goes through white squares);
  • The arrangement of black squares must be in some way symmetric;
  • The proportion of squares that are black cannot be too high.

Grids with one, two, and four lines of symmetry.

There are two forms of symmetry that are used in crossword grids: rotational symmetry and reflectional symmetry. Both order 2 (rotate the grid 180° and it looks the same) and order 4 (rotate the grid 90° and it looks the same) rotational symmetry are commonly used in crossword grids, with order 2 rotational symmetry often being the minimal allowable amount of symmetry. You’ll want to be careful when making a grid with order 4 rotational symmetry, as it’s very easy to make your grid into an accidental swastika.
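Checking a candidate grid for symmetry is easy to automate. As a toy sketch in Python (an illustration, not the software actually used for the magazine’s grids), a grid can be stored as a list of strings with ‘#’ for black squares, and order 2 rotational symmetry checked by comparing each square with its 180° partner:

```python
# Toy check for order 2 rotational symmetry of a crossword grid
# (an illustration, not the software used to set the puzzle).
# A grid is a list of strings: '#' for a black square, '.' for white.

def has_order_2_symmetry(grid):
    n = len(grid)
    return all(
        (grid[r][c] == "#") == (grid[n - 1 - r][n - 1 - c] == "#")
        for r in range(n)
        for c in range(n)
    )

symmetric = [
    "#..",
    "...",
    "..#",
]
asymmetric = [
    "#..",
    "...",
    "...",
]
```

The same idea with the map $(r, c) \to (c, n-1-r)$ checks order 4 rotational symmetry.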

The grid for crossnumber #14 experimented with translational symmetry.

With a few notable exceptions, the grids for the Chalkdust crossnumber have some form of symmetry, and also follow the other two restrictions. We do, however, allow some flexibility on these restrictions if it allows us to use an interesting grid. For crossnumber #14 (shown on the left), we experimented with translational symmetry, leading to a grid that was not simply connected and with a greater than usual proportion of black squares. For crossnumbers #11 and #6, the grids had a nice property: rotating the grid leads to the same pattern of squares with the colours inverted. Once again we had to break the requirements on connectedness and the proportion of black squares to do this.

The grid for crossnumber #11. Rotating this grid 90° leads to the same grid with inverted colours. This grid also has order 2 rotational symmetry.

The grid for crossnumber #6. Rotating this grid 180° leads to the same grid with inverted colours.


The grid for crossnumber #17 was not symmetric, and instead included all 18 one-sided pentominoes in black.

For crossnumber #17, we skipped imposing symmetry entirely and instead formed all 18 one-sided pentominoes from the black squares in the grid. This really pleased me, as there were clues in the puzzle related to the number of pentominoes and the number of black squares in the grid.

For American style crosswords, there’s an additional restriction that is imposed: all white squares must be checked. A white square is called ‘checked’ if it is part of an across entry and a down entry—and so you can fill that square in by solving one of two different clues. Due to this, American crosswords will have large rectangles composed entirely of white squares and never have lines of alternating black and white squares as commonly seen in British puzzles.

A valid American crossword (if two letter entries were allowed)

A valid British crossword that is not a valid American crossword

When writing a crossword, it is common to pick the words to include while making the grid, as trying to find valid words or phrases to fill a predetermined grid is a challenging task. (Thankfully, there’s software out there that can help you make grids from a list of words.) For crossnumbers, filling the grid is a much easier task as any string of digits not starting with a zero is a valid entry. This ease of filling the grid is what allowed us to use the interesting restriction-breaking grids mentioned in this section.

For crosswords, it is also common to disallow the use of two-letter words. For the crossnumber, removing two-digit numbers would remove the potential for a lot of fun number puzzles, so we don’t impose this restriction here. We allow two-letter words in the Chalkdust cryptic too, as they can be really useful when trying to make a grid with our additional restriction that the majority of the included words should be related to maths.

Setting the clues

Once I’ve made the grid for a crossnumber, the next task is to write the clues.

Often, I start this task by picking a fun mathematical or logic puzzle to include in the clues. Sometimes, this is a single clue, such as this one from crossnumber #1:

Down

  • 6. This number’s first digit tells you how many 0s are in this number, the second digit how many 1s, the third digit how many 2s, and so on. (10)

Or this clue from crossnumber #5:

Across

  • 9. A number $a$ such that the equation $3x^2+ax+75=0$ has a repeated root. (2)
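Clues like these lend themselves to quick sanity checks by computer. Here is a short Python sketch (an illustration, not the official solution code) for both: the first function tests any candidate for the self-describing clue, and the second searches for two-digit values of $a$, using the fact that a repeated root means the discriminant $a^2 - 4\times 3\times 75$ vanishes.

```python
# Sanity checks for the two clues above (an illustration, not the
# official solution code).

def is_self_descriptive(n):
    # Digit i of n must equal the number of times digit i occurs in n.
    s = str(n)
    return all(int(s[i]) == s.count(str(i)) for i in range(len(s)))

def repeated_root_values():
    # Two-digit a for which 3x^2 + ax + 75 = 0 has a repeated root,
    # ie the discriminant a^2 - 4*3*75 is zero.
    return [a for a in range(10, 100) if a * a - 4 * 3 * 75 == 0]
```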

Other times, this could be a set of clues that refer to each other and reveal enough information to work out what one of the entries should be, such as these clues from crossnumber #10:

Across

  • 13. 49A reversed. (3)
  • 37. The difference between 49A and 13A. (3)
  • 47. 37A reversed. (3)
  • 48. The sum of 47A and 37A. (4)
  • 49. Each digit of this number (except the first) is (strictly) less than the previous digit. (3)
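A few lines of brute force (again an illustrative sketch, not the setter’s code) show how much these interlinked clues pin down. Enumerating every possible 49A and following the chain of definitions:

```python
# Brute force over the interlinked clues above (an illustrative
# sketch, not the setter's code). 49A has strictly decreasing digits;
# 13A is 49A reversed; 37A is their difference; 47A is 37A reversed;
# 48A is the sum of 47A and 37A.

def possible_48a():
    results = set()
    for n in range(100, 1000):          # candidate values of 49A
        a, b, c = n // 100, (n // 10) % 10, n % 10
        if not a > b > c:
            continue                    # digits must strictly decrease
        rev = int(str(n)[::-1])         # 13A
        if rev < 100:
            continue                    # 13A must be three digits
        diff = n - rev                  # 37A
        if not 100 <= diff <= 999:
            continue                    # 37A must be three digits
        total = diff + int(str(diff)[::-1])  # 48A = 37A + 47A
        if 1000 <= total <= 9999:
            results.add(total)
    return results
```

Running this reveals that every consistent chain gives 48A the same value, so that entry is forced before a single digit is written in.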

Once I’ve included a few sets of clues like this, it’s time to write the rest of the clues. As any crossnumber solver will have noticed, my favourite type of clue to add from this point on is a clue that refers to another entry.

More recently, I’ve begun adding an additional mechanic to each crossnumber. This started in crossnumber #13, when all the clues involved two conditions which were joined by an and, or, xor, nand, nor or xnor connective. Mechanics in later puzzles have included the clues being given in a random order without clue numbers (#14), some clues being false (#16), and each clue being satisfied by both the entry and the entry reversed (#19). I really hope that you enjoy the ‘fun’ mechanic I used in this issue’s puzzle.

Around the same time as I started playing with additional mechanics, my taste in puzzles shifted. Older crossnumbers had been quite computational, and often needed some programming for a few of the clues, but more recently I have become a greater fan of logic puzzles and number puzzles that can be solved by hand. To reflect the change in the type of puzzle I was setting, we added the phrase ‘but no programming should be necessary to solve the puzzle’ to the instructions, starting with crossnumber #14.

Thinking like a mega-pedant

One of the most important things to watch out for when writing and checking clues is accidental ambiguity due to writing maths in words.

For example, the clue ‘A factor of 6 more than 2D’ could be read in two ways: this could be asking the solver to add 6 to 2D, then find a factor of the result; or it could be asking the solver to add 1, 2, 3, or 6 (ie a factor of 6) to 2D.

As long as I spot clues like this, they can usually be fixed with some rewording. In this example, I’d rewrite the clue as either ‘2D plus a factor of 6’ or ‘This number is a factor of the sum of 6 and 2D’.

In my time setting the crossnumber, I’ve got a lot better at spotting ambiguity in clues; I do this by reading through the clues, trying to be a mega-pedant, and intentionally misinterpreting them. It can be really helpful to get someone else to help with this check though, as remembering what you intended a clue to mean can make it hard to read it critically.

Checking uniqueness

Perhaps the most difficult part of setting a crossnumber is checking that there is exactly one solution to the completed puzzle.

To help with this task, I’ve written a load of Python code that finds all the solutions to the puzzle. I run this a lot while writing clues to make sure there’s no area of the puzzle where I’ve left multiple options for a digit. I intentionally use a lot of brute force in this code so that it’s really good at catching situations where a puzzle has multiple answers even though I only found one solution by hand.
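As an illustration of the brute-force idea (a toy example, not the real checking code), consider a miniature puzzle with two crossing two-digit entries that share their first digit, where 1A is a square number and 1D is a power of 2:

```python
# A toy version of the brute-force uniqueness check (not the real
# checking code): two crossing two-digit entries share their first
# digit; 1A is a square number and 1D is a power of 2.

squares = [n for n in range(10, 100) if int(n ** 0.5) ** 2 == n]
powers_of_two = [n for n in range(10, 100) if n & (n - 1) == 0]

solutions = [
    (a, d)
    for a in squares
    for d in powers_of_two
    if str(a)[0] == str(d)[0]  # the shared cell must agree
]
```

This toy fill has three solutions, which is exactly the kind of non-uniqueness the checking code is designed to catch: it tells the setter the clues need tightening.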

Once I’ve got all the clues and my code says the solution is unique, I do a full solve of the puzzle by hand. This is both to confirm that the code’s conclusion was not due to a bug, and to check that the difficulty of the puzzle is reasonable.

Following this checking, and a little proofreading, the puzzle is ready for publishing. Then the fun part begins, as I get to chill with a nice cup of tea and wait for people to submit their answers.


On the cover: Voronoi diagrams and generative art

Generative art is all about designing systems and letting those systems create the art. Different generative artists might define it in their own ways. Some see the system itself as the art, while others view the generated pieces as the art. Some might consider themselves the artist, while others might see the computer as the artist. There’s also a group I would call ‘parametric artists’, where people take an existing generative project and tweak the parameters until they get the desired output.

I see generative art as a unique blend of traditional art concepts with an element of randomness that’s hard to replicate with traditional media. It’s a balancing act between algorithmic precision and adding enough unpredictability to keep things interesting.

Curation is a big part of generative art as well. I often write programs that run overnight, generating images. In the morning, I sift through them to find the ones I like, sometimes using randomised parameters in the filenames to make sorting easier.

Some might want to compare it to AI art, but that is different from generative art. Recently, generative AI has become more common, and many people are calling AI-generated work ‘generative art’. While I think these are two different things, I understand why AI art falls under the same term: both involve working with generative systems. It’s still important to distinguish between them.

We might start using new terms like procedural or algorithmic artists for traditional generative artists, and AI artists, parametric artists or prompt engineers for those working with AI. But for now, they all fall under the umbrella of generative art.


‘Here’s what you could have won!’: Some more of Hayden’s artwork.

About the artist

I’ve always had a passion for digital art and using computers in creative ways, exploring systems that behave uniquely. My journey into coding began in primary school when my older brother introduced me to GameMaker. From there, I started experimenting with fun little projects, often creating old-style roguelike games in my free time.

My shift to generative art happened in secondary school when I discovered Daniel Shiffman’s YouTube channel. I learned about the Processing library for Java and began making simulations. My friends and I would sit in the back of our physics class, writing simulations for each new concept we learned. Eventually, I started exploring more abstract routes, creating simulations related to gravity and objects interacting with each other. My first true generative art projects involved making strange attractors and re-implementing flocking algorithms.


The Lorenz attractor is perhaps the best-known example of a strange attractor

Over time, I discovered many generative artists who continue to inspire me, including Ben Kovach, Daniel Shiffman and Tyler Hobbs. Their talks and pieces available online have been a significant source of inspiration. This piece is not really inspired by any particular work; rather, I wanted to pick a topic that comes up in a lot of fields, and in a lot of different generative art projects. I wanted to make something where I wasn’t combining too many different ideas, so that the main focus, Voronoi diagrams, could be seen easily.

While my Instagram page has been fairly inactive since I started teaching and attending grad school, I hope to become more active in sharing my work. My future aspirations include continuing to explore the intersection of art and technology, creating more generative art projects and inspiring others to find creativity in coding.

Voronoi diagrams

Voronoi diagrams are one of the more commonly used tools in a generative artist’s toolbox. Generative artists often aim to mimic organic patterns, and Voronoi diagrams are excellent for this purpose. They can approximate many natural forms, from the patterns on a giraffe’s coat to the structure of plant cells. I find it fascinating whenever we see such close relationships between maths and nature.

A Voronoi diagram is a way of dividing space into regions based on distance to a specific set of points. Each point has a corresponding region consisting of all locations closer to that point than to any other. These regions are called Voronoi cells.

Given a set of points, the Voronoi diagram can be constructed like so:

  • Start with a set of points;
  • For each point, determine the region of space that is closer to it than to any other point. This involves finding the perpendicular bisectors of the lines between each pair of points;
  • The intersection of these bisectors forms the edges of the Voronoi cells.
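The construction above can also be sketched in code. A simple (if inefficient) approach, often used when rasterising Voronoi diagrams for generative art, skips the bisectors entirely and assigns each pixel to its nearest seed point, which is exactly the definition of a Voronoi cell. A minimal sketch (in Python, rather than the p5.js used for the cover, and purely illustrative):

```python
# Rasterised Voronoi diagram (a sketch in Python, not the p5.js used
# for the cover): assign each pixel to its nearest seed point, which
# is exactly the definition of a Voronoi cell.
import math

def voronoi_cells(seeds, width, height):
    def nearest(x, y):
        # Index of the seed closest to the pixel (x, y).
        return min(range(len(seeds)), key=lambda i: math.dist((x, y), seeds[i]))
    return [[nearest(x, y) for x in range(width)] for y in range(height)]

# Two seeds in opposite corners split a 10x10 image into two cells.
cells = voronoi_cells([(0, 0), (9, 9)], 10, 10)
```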

Additionally, I’m showcasing another concept on the cover closely related to Voronoi diagrams: Delaunay triangulations. This is a way of connecting a set of points to form triangles such that no point is inside the circumcircle of any triangle. Given a set of points, the Delaunay triangulation can be constructed like so:

  • Start with a set of points;
  • Connect the points to form triangles such that the circumcircle of each triangle does not contain any other points.

A Delaunay circle is the circumcircle of a triangle in the Delaunay triangulation. It is the circle that passes through all three vertices of the triangle. The property of Delaunay triangulation ensures that no other points lie inside these circumcircles.
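The empty-circumcircle property can be tested numerically with a classic determinant. The following sketch (illustrative; it assumes the triangle $a$, $b$, $c$ is given in counterclockwise order) reports whether a fourth point lies strictly inside the circumcircle:

```python
# The empty-circumcircle test at the heart of Delaunay triangulation
# (an illustrative sketch). For a counterclockwise triangle a, b, c,
# the sign of a 3x3 determinant says whether d is inside the
# circumcircle: positive means strictly inside.

def in_circumcircle(a, b, c, d):
    rows = [
        (p[0] - d[0], p[1] - d[1], (p[0] - d[0]) ** 2 + (p[1] - d[1]) ** 2)
        for p in (a, b, c)
    ]
    (ax, ay, az), (bx, by, bz), (cx, cy, cz) = rows
    det = (ax * (by * cz - bz * cy)
           - ay * (bx * cz - bz * cx)
           + az * (bx * cy - by * cx))
    return det > 0
```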

The Voronoi diagram and Delaunay triangulation are duals of each other. This means that the vertices of the Voronoi diagram correspond to the faces of the Delaunay triangulation and vice versa. You can construct the Voronoi diagram from the Delaunay triangulation by connecting the circumcentres of the Delaunay triangles.

These diagrams have significant applications in various fields, including computer science, geography, biology and urban planning. They help solve proximity problems, such as finding the nearest neighbour, and are used in spatial analysis, map overlay and network planning. In computer graphics, Voronoi diagrams can be used to generate realistic textures and simulate natural phenomena.

Voronoi patterns are found in nature, such as in the distribution of seeds in a sunflower or the pattern of cracks in drying mud. Some of my favourite example use cases include shattering patterns, creating paths that avoid obstacles, and even applying machine learning techniques like $k$-means clustering.


Example of Delaunay constructions.

Behind the scenes

Most of my projects follow an iterative process. I start by writing code that defines a system, then make random changes, experiment with different colour schemes, and tweak parameters to achieve the desired outcome. Here, I use p5.js, a JavaScript version of the Processing library, which allows me to add buttons and interactive elements easily. This makes it simple to tweak and modify the artwork live.

In this project I draw a few separate layers and stack them on top of each other to make a composite image. The program takes a set of points, generates the Voronoi diagram, and builds the layers from there. A helper function lets me apply a blur effect and a threshold to a layer: it takes a buffer and the name of the layer as inputs, first applies a Gaussian blur, then iterates through each pixel to adjust the alpha value based on a threshold. This process helps generate more natural curves in the drawing.

Without blur or threshold, some blur but a threshold at 0, and finally some blur and threshold applied.

In graphics programming, we often work with a one-dimensional array of pixels. Each pixel is represented by four consecutive values (red, green, blue, $\alpha$); by iterating through this array and adjusting the $\alpha$ values, we can create smooth transitions and define edges more clearly.
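As a sketch of that loop (in Python rather than the p5.js used for the artwork, and with the Gaussian blur step omitted), the thresholding pass might look like this:

```python
# A stand-in (in Python, not the p5.js used for the artwork) for the
# thresholding pass described above; the Gaussian blur step is
# omitted. pixels is a flat list holding four values per pixel:
# red, green, blue, alpha.

def threshold_alpha(pixels, cutoff):
    out = pixels[:]
    for i in range(3, len(out), 4):  # every fourth value is an alpha
        out[i] = 255 if out[i] >= cutoff else 0
    return out

# Two pixels: one faint (alpha 40), one strong (alpha 200).
pixels = [10, 20, 30, 40, 50, 60, 70, 200]
```

Thresholding at 128 snaps the faint pixel fully transparent and the strong one fully opaque, which is what sharpens the blurred edges back into clean curves.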

I am sure there are much faster ways to do this with shaders, but I tend to stick to whatever tools are built into what I am using. We can see the effects of this by increasing the blur amount for our Voronoi edge layer (see above). This allows us to get much more organic shapes.

This can be applied to many different layers in different ways, including the triangulation. Since the Delaunay triangulation is the dual graph of our Voronoi diagram, the vertices should be inside each tile. Additionally, the vertices should be darker and less likely to disappear when we blur and threshold them. This will cause star-like shapes to form inside each tile (see below).

Adding blur and threshold distorts the dual graph.

The same can be done for other layers until we have a cohesive image. I also colour in the tiles randomly, pulling from a predefined colour scheme.

Adding some colour to the tiles and a drop shadow completes the work.

The last thing I do is add a drop shadow to make some layers stand out a bit or make it more interesting. The shadows are made by copying the layer, blurring it, and setting the RGB values to be a consistent colour. Once the settings are fine, I generate a few iterations with the same settings, then pick the ones I like best. The variations you see below are all generated with the same parameters.


Oπnions: In defence of toy models

Many people think of mathematics as this incredibly austere and abstract subject, and mathematicians as people who are obsessed with abstraction and divorced from reality. It is true that mathematics can be abstract, and some mathematicians are quite proud of the separation between what they see as real mathematics and useful mathematics, cough Hardy cough. To some of them, being able to show that an integral is finite, or that a solution exists to a system of equations, is enough. They do not actually feel the need to sit down and evaluate the integral, or solve the equations. To them, the existence of the answer is the important thing, not actually constructing it.

However, in my opinion this characterisation of mathematics as purely an abstract approach is an unfair critique. There are plenty of mathematicians who evaluate integrals, construct explicit solutions to systems of equations, and who want some physical, biological, or computer science motivation behind a problem to consider it interesting and worth pursuing. And this is not just true of applied mathematics. Many pure mathematicians also love constructing explicit examples when testing out ideas, such as Dame Alison Etheridge (see here). If you are going to spend a lot of time and effort trying to tackle a problem, you want to know if you are likely to be able to solve it rather than simply banging your head against a brick wall. So we make approximations and simplifications to construct a toy model preserving the key mathematical and physical features but providing a sandbox where we can test our ideas.

Not that kind of toy model…

While there are people who study abstract foundations like quantifying the concept of oneness or, allegedly, spending hundreds of pages to establish that $1+1=2$, many mathematicians are interested in understanding problems motivated by studying the mathematics of fluids, modelling the spread of viruses, or—in my case—effective models of special magnetic materials. As a mathematical physicist, much of what I do is to study mathematical problems and questions motivated by real-world physical systems.

One such model uses a one-dimensional chain of atomic spins to understand magnetic materials. In this case we forget about the structure of an individual atom and just think of it as an arrow that can point in any direction. Next we come up with a rule for how neighbouring atoms interact: if they want to point in the same direction we call the material a ferromagnet; if they prefer opposite directions, it is an antiferromagnet. One approximation that we make is that any given atom only cares about its nearest neighbours, and not about all the other atoms which are further away. Finally we say how the atoms interact with a magnetic field, or whether they prefer to line up with a particular direction, and off we go. We can understand the transition between different magnetic domains without having to study a complicated model incorporating all the details, and I can calculate this on the back of an envelope without having to touch a computer.
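To make this concrete, here is a back-of-envelope version in Python, using an extra simplification of my own for illustration: each arrow can only point up or down, $s_i = \pm 1$, with nearest-neighbour energy $E = -J\sum_i s_i s_{i+1}$.

```python
# Back-of-envelope spin chain (an Ising-style simplification for
# illustration: each arrow points up or down, s = +1 or -1).
# Nearest-neighbour energy: E = -J * sum_i s_i * s_{i+1}; with J > 0,
# aligned neighbours lower the energy (a ferromagnet).

def chain_energy(spins, J=1.0):
    return -J * sum(a * b for a, b in zip(spins, spins[1:]))

aligned = [1] * 8                  # a single ferromagnetic domain
domain_wall = [1] * 4 + [-1] * 4   # two domains separated by a wall
```

For $J>0$, flipping half the chain raises the energy by exactly $2J$, the cost of one domain wall: the kind of result the toy model lets you read off without a computer.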

A spin chain of atoms modelling a domain wall in a ferromagnet.


These simplified toy models can come under criticism from two sides. Proponents of austere abstraction will attack them for lacking mathematical rigour or for involving unfounded leaps of logic, while those who are all about applications can bemoan the time spent simplifying a problem to still attack it with, potentially, complicated mathematics. Why not just study it numerically on a computer? There is definitely merit to this argument; many toy models can be and are studied numerically. However, a numerical solution does not convey the same intuition as a more analytic approach. Yes, if I solve a differential equation numerically I can play with the parameters and see how the solution changes. However, this will often not tell us why the change is happening. A good toy model, like our spin chain, can convey this.

To me, a toy model is more than just a simplified setting to play around with some interesting mathematics. In building them you need to think about what the key features are that you want the model to describe, and how they can be described mathematically. A well constructed toy model will be sophisticated enough to preserve the qualitative, and some of the quantitative, features we want, but be simple enough that we can really get our teeth into it. It will enable us to gain intuition about how changing the parameters of the model changes the results that you get out. In the best cases, this intuition will help you to understand the full situation. From SIR models in epidemiology to effective field theories in physics, toy models are put to good use in many places.

What about in more abstract settings? Again, we can study a simpler object to help us gain understanding. Want to understand something about groups? Study matrix groups first: you can leverage your linear algebra experience to understand how to formulate the problem and identify potential subtleties. In a way this is what a power series or Fourier series expansion of a function is doing. You have a function with some properties that are easy to understand and others that are harder to understand, and approximating it by a power series gives you something more tractable where you can do some quick computations to check what is going on.

I guess that all I am trying to say is that the next time someone says they are studying a simplified situation or playing with a toy model, do not dismiss them out of hand. Remember, toy models can convey intuition and understanding beyond their humble formulation. Maybe when you are stuck on a particularly difficult problem you should think about what simplifications you could make and try one out for yourself.


What a load of noise: A beginner’s mathematical guide to AI diffusion models

These two little vowels have infiltrated daily conversation, and we’re not talking about a decent draw from the Scrabble bag. From geopolitical debates to everyday chitchats—AI is everywhere, and people won’t shut up about it.

Who can blame them? It’s in your pocket when you ask your phone for the weather, on your commute when GPS (shout out to Gladys from issue 15’s significant figures) finds the best route, and in your inbox when your email filters out spam. AI helps pick your playlist, suggests what to watch after a long day, and powers the chatbot answering questions when you shop online.

Most of the time, you don’t even notice it. It’s just there making life a little smoother. But as AI gets smarter, people are asking bigger questions. Will it change the way we work? Is it sustainable? Can we trust it? Whether you’re excited, worried or disillusioned, one thing’s for sure: the conversation is just getting started.

I cannot keep up. The state of the art, the current highest level of performance achieved in a particular domain, is continuously shifting. New models are constantly being released, outperforming their predecessors as new techniques are discovered. Trying to chase and understand the ever-changing best feels like fighting a losing battle. But AI isn’t going anywhere. You could spend a lifetime learning, researching, using and arguing about it (increasingly many people are). So where do you start? I myself, as a trendy individual, am choosing to start by tackling one of the hottest AI tools of the season: diffusion models.



Unlocking sudoku’s secrets

Sudoku has long captivated puzzle enthusiasts worldwide with its logical challenges and addictive nature. While it may seem like a simple game of numbers, beneath the surface lies a fascinating connection to the realm of mathematics. Graph theory and abstract algebra both play a crucial role in unravelling the intricacies of sudoku. Sudoku puzzles consist of a $9 \times 9$ grid, divided into nine $3 \times 3$ sub-grids called regions. The objective is to fill the grid with numbers from 1 to 9, ensuring that each row, column, and region contains every digit exactly once.

A vertex colouring problem

In graph theory, a graph is a mathematical structure that comprises a set of vertices, or nodes, connected by edges. An area rich in mathematical questions, graph theory has a long history of famous thought-provoking problems. One of the most well-known of these problems is the vertex colouring problem: Given an undirected graph $G = (V, E)$, where $V$ represents the set of vertices and $E$ represents the set of edges, can one find an assignment of colours to each vertex in $V$ satisfying the condition that no two adjacent vertices (connected by an edge) have the same colour?

While sudoku may appear as a grid of numbers, we can view it through the lens of graph theory. Interestingly, we can represent a sudoku puzzle as a graph, where each of the $81$ cells in the sudoku grid corresponds to a vertex in the graph. We label the vertices with ordered pairs $(x, y),$ $x$ and $y$ being integers from $1$ to $9$. We then join two distinct vertices $(x, y)$ and $(x', y')$ by an edge if and only if any of these conditions apply:

  • $x = x'$ (the two cells are in the same row),
  • $y = y'$ (same column), or
  • $\displaystyle\left\lceil \frac{x}{3} \right\rceil = \left\lceil \frac{x'}{3} \right\rceil$ and $\displaystyle\left\lceil \frac{y}{3} \right\rceil = \left\lceil \frac{y'}{3} \right\rceil$   (same $3\times3$ region).
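These adjacency conditions are easy to check by machine. Here is a short sketch (our own illustrative code, not part of any standard sudoku library) that builds the sudoku graph and counts its vertices and edges:

```python
from math import ceil

# Vertices are cells (x, y) with x, y in 1..9; edges join cells that
# share a row, a column or a 3x3 region.
vertices = [(x, y) for x in range(1, 10) for y in range(1, 10)]

def adjacent(a, b):
    (x, y), (xp, yp) = a, b
    if a == b:
        return False
    return (x == xp or y == yp
            or (ceil(x / 3) == ceil(xp / 3) and ceil(y / 3) == ceil(yp / 3)))

# Store each edge as a frozenset so (a, b) and (b, a) count once.
edges = {frozenset((a, b)) for a in vertices for b in vertices if adjacent(a, b)}

print(len(vertices), len(edges))  # 81 810
```

Each cell conflicts with 8 row-mates, 8 column-mates and 4 further region-mates, so every vertex has degree $20$ and the graph has $81 \times 20 / 2 = 810$ edges.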

So the top-left region of the sudoku board would be connected like this:

We then propose the vertex colouring problem: Does there exist a 9-colouring of our sudoku graph? That is, can we colour each vertex in the graph, using no more than $9$ colours, in such a way that no vertices which are connected by an edge end up the same colour?

It turns out that this question is equivalent to: does there exist a solution to the original sudoku board? By applying vertex colouring algorithms, such as the greedy algorithm and/or backtracking, we can systematically assign colours (numbers) to the vertices (cells) to find a valid solution to any sudoku board, so long as one exists.

The greedy algorithm and backtracking

The purpose of the greedy algorithm is to produce a vertex colouring of a graph. The algorithm assigns colours to the vertices of the graph by iteratively selecting an uncoloured vertex and assigning it the smallest possible colour that does not conflict with its neighbouring nodes. This means in sudoku, the greedy algorithm takes a sudoku board and systematically assigns numbers (colours) to cells (vertices), ensuring that no conflicts arise within the rows, columns or regions. Backtracking is a systematic search algorithm that explores the solution space by making choices and undoing them when they lead to contradictions or dead ends. We will combine the greedy algorithm with backtracking, particularly when the greedy algorithm fails to find a solution. In sudoku, this will involve filling in cells with numbers, testing their validity, and if a contradiction is detected, `backtracking’ to the previous state to explore alternative choices until a valid solution is found or all possibilities are exhausted. The procedure for combining the greedy algorithm and backtracking in sudoku follows:

  1. Start with the initial board and select the first uncoloured cell.
  2. Assign the smallest possible number to the selected cell.
  3. Move to the next uncoloured cell and assign the next smallest number.
  4. Continue this process, moving from left to right and top to bottom, assigning the smallest valid number to each uncoloured cell.
  5. If a conflict arises, indicating an invalid placement, backtrack to the previous cell and select the next available number.
  6. Repeat the process until the entire grid is filled or until no valid number can be placed.
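The six steps above translate almost line for line into code. The sketch below is our own illustrative implementation of the greedy-with-backtracking procedure (using $0$ to mark an empty cell); it is not taken from any particular solver:

```python
def solve(board):
    """Greedy + backtracking: fill cells left to right, top to bottom with
    the smallest valid number, undoing any choice that leads to a dead end."""
    def valid(r, c, n):
        # No conflict in the row, the column or the 3x3 region.
        if n in board[r] or any(board[i][c] == n for i in range(9)):
            return False
        br, bc = 3 * (r // 3), 3 * (c // 3)
        return all(board[br + i][bc + j] != n
                   for i in range(3) for j in range(3))

    for r in range(9):
        for c in range(9):
            if board[r][c] == 0:            # first unfilled cell
                for n in range(1, 10):      # try smallest numbers first
                    if valid(r, c, n):
                        board[r][c] = n
                        if solve(board):
                            return True
                        board[r][c] = 0     # conflict downstream: backtrack
                return False                # no valid number: dead end
    return True                            # grid full: solved

puzzle = [[0] * 9 for _ in range(9)]   # start from an empty board
solve(puzzle)
print(puzzle[0])  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Seeding `puzzle` with the given clues of a real board instead of zeros solves that puzzle, so long as a solution exists.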

The algorithm in action

Step 1: Suppose our initial board looks like this. We select the first uncoloured cell:

Steps 2–4: Greedy algorithm. Note that $8$ was entered in $(1, 7)$ instead of $4$, $5$ or $6$ because although $4$ is the lowest available number, $8$ is the lowest available valid number.

Step 5.1: Conflict. The only remaining available numbers for $(1,8)$ are already present in column $8$.

Steps 5.2–6: Backtrack. Return to cell $(1,7)$. Enter the next lowest available valid number, $9$. Resume greedy algorithm.

By continuing this same algorithm through all rows, we can reach a solution to our sudoku board!

Other researchers seem to have been taken by the same curiosity as me and have learnt more about sudoku from graph theory—Joshua Cooper and Anna Kirkpatrick linked minimal sets to minimal fair puzzles, and Michael Haythorpe connected Hamiltonian cycles to sudoku puzzles of different sizes.

Graph theory is one way to unlock sudoku puzzles mathematically. But sudoku is really just solving a puzzle subject to a number of constraints. That sounds like something algebra could help us with. And it can! Algebraic geometry possesses a similarly beautiful application to sudoku.

Gröbner bases

If we could write the sudoku problem as a system of polynomials, then solving the system would allow us to complete the puzzle. Solving simultaneous equations might seem like an opportunity for Gauss–Jordan elimination, but this method only applies to linear systems and we will see shortly that ours is not. Moreover, we require our solutions to be the integers 1–9. A more general approach lies in Gröbner bases.

A Gröbner basis for a system of polynomials is a new system of polynomials with the same solutions as the original, but which is easier to solve. For example, the system \begin{align*} x+y-3 &= 0,\\ xy-2&=0 \end{align*} is coupled, as both equations involve both variables. However, computing a Gröbner basis of this system yields \begin{align*} y^2-3y+2 &= 0,\\ x+y-3&=0, \end{align*} where the first equation involves only $y$. We can solve this first equation and then substitute to find $x$, making the system easier to solve.

In order to understand how Gröbner bases work, we will need to learn some abstract algebra terminology. A polynomial ring is a set of polynomials in a certain number of variables. For our purposes, the coefficients of polynomials will come from the field $\mathbb{Q}$ of rational numbers. An ideal is a subset $I$ of elements from a ring $R$ that forms a group under addition and has the property that \[x \in R \text{ and } y\in I \implies xy \in I \text{ and }yx \in I.\] An ideal can be generated by a set of polynomials just like a vector space can be spanned by a set of vectors. For example, if \[ I= \{af+bg : a,b \in R\}, \] then $f$ and $g$ generate $I$, and we can write \[I= \langle f,g \rangle.\]

A term ordering is a method for choosing the order in which terms are placed. The term ordering that we’ll use here is the lexicographical term ordering, abbreviated $\operatorname{Lex}$. This is the ordering you’d expect in a dictionary. For example, if we have variables $x$, $y$ and $z$ then we order the variables as \[x>y>z.\] When we compare products of variables, we look at the first variable in each term and compare, then the second, and so on. For example, in $\operatorname{Lex}$, \begin{align*} xy &> yz, \\ \text{and }x^2 &> xy. \end{align*} Given a term ordering, the leading term of a polynomial $f$ is the term which is greatest in the ordering and will be denoted $\operatorname{lt}(f).$ Given any set $S$ of polynomials in a polynomial ring, we can then define the leading term ideal of $S$ to be the ideal $\operatorname{lt}(S)$ generated by the leading terms of the polynomials in $S.$ That is, \[\operatorname{lt}(S)= \langle \operatorname{lt}(f) : f \in S \rangle.\] Note that the leading term ideal of a set of polynomials is not always equal to the leading term ideal of the ideal generated by that set of polynomials. For example, if $S=\{x,x+1\}$, then $\operatorname{lt}(S)= \langle x\rangle$, but if $I= \langle x,x+1\rangle$, then \[x+1-x= 1 \in I,\] so $\operatorname{lt}(I)=\langle 1 \rangle =R$.
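To see concretely why the triangular form is easier, we can finish the earlier example in a few lines: find the roots of $y^2-3y+2$ with the quadratic formula, back-substitute into $x+y-3=0$, and check each pair against the original coupled system. A quick numerical sketch:

```python
# Solve y^2 - 3y + 2 = 0 (a = 1, b = -3, c = 2), then back-substitute
# x = 3 - y, and confirm each pair satisfies the original coupled system.
disc = (-3) ** 2 - 4 * 1 * 2                               # b^2 - 4ac = 1
roots_y = [(3 - disc ** 0.5) / 2, (3 + disc ** 0.5) / 2]   # y = 1 and y = 2

for y in roots_y:
    x = 3 - y                                  # from x + y - 3 = 0
    assert abs(x + y - 3) < 1e-12 and abs(x * y - 2) < 1e-12
    print(x, y)
```

Both solutions of the triangular system, $(x,y)=(2,1)$ and $(1,2)$, solve the coupled system too, exactly as the definition of a Gröbner basis promises.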

When $\operatorname{lt}(S)=\operatorname{lt}(I),$ where $I=\langle S \rangle,$ we say that $S$ is a Gröbner basis for the ideal $I.$ In other words, a set of nonzero polynomials \[G=\{g_1,g_2,\dots,g_t\} \subset I\] is called a Gröbner basis for $I$ if and only if $\operatorname{lt}(G)= \operatorname{lt}(I).$

Gröbner bases are nice because they simplify problems. For example, if $G$ is a Gröbner basis for an ideal $I$, then we can use simple reduction to determine whether or not a given polynomial $f$ is in the ideal $I$.

A well-known theorem in Gröbner basis theory explains that the Gröbner basis puts a given system into triangular form. For example, if $G=\{g_1,\dots,g_s\}$ is a Gröbner basis in variables $x_1, \dots, x_n$ ($n\leq s$), the theorem says that we can order the polynomials $g_1,\dots,g_s$ so that $g_1$ only involves the smallest variable $x_1$, $g_2$ involves only $x_1$ and $x_2$ and has leading term involving only $x_2$, and so on. This makes solving the system possible through quick, easy substitution.

Bruno Buchberger introduced Gröbner bases in his 1965 PhD thesis and named them after his supervisor, Wolfgang Gröbner

A process known as Buchberger’s algorithm computes the Gröbner basis for a given ideal and term order. The algorithm uses the multivariate division algorithm and least common multiples. Because the algorithm always terminates, it also guarantees that a Gröbner basis exists for any given ideal and term order.

Gröbner basis in sudoku

This is useful for the problem of sudoku. We can:

  1. Create a system of polynomial equations to represent our sudoku problem
  2. Use Buchberger’s algorithm to compute a Gröbner basis for the ideal generated by the polynomials in our system
  3. If the puzzle has a unique solution, the Gröbner basis will consist of 81 linear polynomials, from which we can read off the solutions to our sudoku puzzle

We can represent the constraints given in the rules of sudoku as a system of polynomial equations. We introduce 81 variables, $x_0,\dots,x_{80},$ one for each cell in the sudoku. Formally, we are now dealing with the polynomial ring $\mathbb{Q}[x_0,\dots,x_{80}].$ We wish to encode the conditions on the sudoku as a set of polynomials $F \subset \mathbb{Q}[x_0,\dots,x_{80}].$ Cells can only take on whole number values from 1 to 9, so we can define the following polynomials for all $i=0,\dots,80$: \[(x_i-1)(x_i-2)\cdots(x_i-9)=0.\]

Next, we define polynomials to represent the condition that each number can only be used once in each row, each column and each region. This can be done by requiring that the numbers in each row, column and region sum to $45$ and have product $9! = \text{362,880}$. So if $\{x_{k_1},\dots,x_{k_9}\}$ all lie in the same row, column or region, \[\sum_{n=1}^{9} x_{k_n} = 45 \quad\text{and}\quad \prod_{n=1}^{9} x_{k_n} = \text{362,880}.\] (Strictly speaking, for nine values the sum and product conditions alone do not rule out every duplicate: the values $1,2,4,4,4,5,7,9,9$ also have sum $45$ and product $\text{362,880}$. In practice, the given cells of a well-posed puzzle eliminate such rogue assignments.) Finally, since some of the cells $x_j$ in the grid are already filled out with a number $a_j,$ we define new polynomials for each of these cells, \[x_j-a_j=0.\] These polynomials give a system of 135 equations, plus one for each known cell. We can solve this system by applying Buchberger’s algorithm to compute a Gröbner basis and from there, read off the solution to the initial grid.
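As a sanity check on the bookkeeping, a few lines of code can enumerate the constraint sets: one degree-9 polynomial per cell, plus a sum and a product polynomial for each of the 27 rows, columns and regions. (The indexing scheme below is an illustrative choice of ours, numbering cells $0$–$80$ row by row.)

```python
# Index sets for the 27 "units" of the sudoku grid, cells numbered 0..80.
cells = range(81)
rows = [[9 * r + c for c in range(9)] for r in range(9)]
cols = [[9 * r + c for r in range(9)] for c in range(9)]
regions = [[9 * (3 * R + i) + (3 * C + j) for i in range(3) for j in range(3)]
           for R in range(3) for C in range(3)]

units = rows + cols + regions
assert len(units) == 27 and all(len(u) == 9 for u in units)

# 81 cell polynomials plus a sum and a product polynomial per unit.
n_equations = len(cells) + 2 * len(units)
print(n_equations)  # 135
```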

An example: shidoku

As we’ve seen, the algorithm is quite long. So instead of working through an example with a sudoku board, we will use a shidoku board. Shidoku is a variation of sudoku which uses a $4 \times 4$ grid, divided into four $2 \times 2$ regions.

Introduce 16 variables $x_0,\dots,x_{15}$ to represent the 16 cells. Each cell can only take on the value $1$, $2$, $3$ or $4$, so for $i=0,1,\dots,15$, \[(x_i-1)(x_i-2)(x_i-3)(x_i-4)=0.\] Similar to before, the sum and product of all numbers in a column, row or region must be $1+2+3+4=10$ and $1\times 2 \times 3 \times 4=24,$ respectively. With these conditions we have no duplicate numbers.
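The claim that these conditions rule out duplicates can be verified by brute force: among all multisets of four values drawn from $\{1,2,3,4\}$, only $\{1,2,3,4\}$ itself has sum $10$ and product $24$. A quick enumeration (our own check, not part of the Gröbner machinery) confirms this:

```python
from itertools import combinations_with_replacement

# Every multiset of four values from {1, 2, 3, 4} with sum 10 and product 24.
hits = [m for m in combinations_with_replacement(range(1, 5), 4)
        if sum(m) == 10 and m[0] * m[1] * m[2] * m[3] == 24]

print(hits)  # [(1, 2, 3, 4)]
```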

Let $\{x_i,x_j,x_k,x_l\}$ represent a set of cells that make up a row, column or $2 \times 2$ region, so \begin{align*} x_i+x_j+x_k+x_l-10&=0 \\ \text{and}\quad x_i x_j x_k x_l-24&=0. \end{align*}

This gives us a total of 40 polynomial equations.

We then add any values which are already given. Let’s suppose we have the board below with the given variable assignments:

The shidoku board, with its 16 cells labelled $a$ to $p$ row by row and the given entries filled in

We therefore need to add the equations \begin{align*} a-1=0, \quad f-2=0, \\ i-2=0, \quad k-3=0, \\ p-4=0.\hspace{8mm} \end{align*} All that remains is to give our equations to a computer algebra package—Matlab has the gbasis function—in order to perform Buchberger’s algorithm and compute the Gröbner basis for this system.

In doing so, it returns: \begin{align*} a-1&=0, & b-3&=0, & c-4&=0, & d-2&=0,\\ e-4&=0, & f-2&=0, & g-1&=0, & h-3&=0,\\ i-2&=0, & j-4&=0, & k-3&=0, & l-1&=0,\\ m-3&=0, & n-1&=0, & o-2&=0, & p-4 &= 0. \end{align*} This basis consists of a system of linear polynomials from which we can easily read off the solutions to the shidoku board:
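If you don’t have a computer algebra package to hand, the result can be double-checked by brute force. The sketch below (our own code, with cells $a$ to $p$ indexed $0$–$15$ row by row) searches every completion of the puzzle and finds exactly one, agreeing with the linear basis above:

```python
# Brute-force check, independent of the Gröbner computation.
givens = {0: 1, 5: 2, 8: 2, 10: 3, 15: 4}   # a=1, f=2, i=2, k=3, p=4
units = ([[4 * r + c for c in range(4)] for r in range(4)]            # rows
         + [[4 * r + c for r in range(4)] for c in range(4)]          # columns
         + [[4 * (2 * R + i) + 2 * C + j for i in range(2) for j in range(2)]
            for R in range(2) for C in range(2)])                     # regions
# For each cell, the cells it must differ from.
conflicts = {i: {j for u in units if i in u for j in u if j != i}
             for i in range(16)}

grid = [givens.get(i, 0) for i in range(16)]
solutions = []

def search(i=0):
    if i == 16:
        solutions.append(grid[:])
        return
    if grid[i]:                     # a given clue: leave it and move on
        search(i + 1)
        return
    for v in range(1, 5):           # try each value in the empty cell
        if all(grid[j] != v for j in conflicts[i]):
            grid[i] = v
            search(i + 1)
            grid[i] = 0             # backtrack

search()
print(solutions)
```

The search returns the single grid $1342/4213/2431/3124$, matching the values read off from the Gröbner basis.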

Sudoku intertwines elegantly with both graph theory and abstract algebra. The algorithms we learn in our courses provide valuable insights and strategies that can be applied to conquer sudoku puzzles. So, the next time you pick up a sudoku puzzle, remember this beautiful layer of graphs and polynomial equations that lies beneath its surface.

post

A day in the life: finance

In this issue’s edition of our day in the life series, we hear from three recent graduates who work in finance:

  • Megan Tarry works in audit in London,
  • Bayley McNevin works in audit in Liverpool,
  • Anna Lee works in tax in London.

Megan Tarry

Hey! I’m Megan and I’ve worked in audit in London in a graduate role since August. I’m doing an apprenticeship to become ACA qualified which has involved studying for exams (so far I’ve done 6 out of 15) alongside full-time work. Despite the controversial opinions, I’ve really enjoyed my time in audit so far. Every day, I’ve worked on something different and I’ve learned so much in such a short space of time. Although the maths that I use barely goes beyond GCSE level, the application challenges you to think in different ways which I have found really interesting! Welcome to a random day in my life!

My first challenge of the day is trying to find the cheapest, least congested, most reliable way to get to work… it’s rare to get the luxury of all three! Today I decide to take the tube, which gets me to work around 8.40am. This means I have 20 minutes to settle myself in, make a cup of tea and catch up on emails and admin.

Yesterday I sent out some requests for supporting documents to the client I am currently auditing, so I spend the beginning of my day checking our client portal to see if anything has been submitted.

Then, at 9.30, I have my morning catch up with the rest of my team where we discuss our progress and plans for the day, before moving on to supplier statement testing that I made a start on yesterday. This involves ensuring that all purchase amounts issued by suppliers of goods to the client have been correctly recorded in the client’s purchase records (AKA a lot of Excel work). I then spend a quick 20 minutes before lunch sending an email to our data analytics team to resolve some formatting issues with a document (it has over a million rows of data!).

We break for lunch for an hour and I spend it heading out for a walk in the city before sitting down to chat to others in my intake. The nice thing about a grad scheme like this is that there are a lot of us in the same position—a few years out of uni with a lot to learn about the job.

Megan's lunch (a tupperware of pasta) with some other lunches on a table.

Rate my meal deal (we give it a 6)

After lunch, I speak to my assistant manager who briefs me on my task for the afternoon. I then work on some other areas of the audit, checking that the accounts figures we have for this year reconcile with some supporting documentation that we have been provided.

My day wraps up at 5.30pm and I rush back home to get ready to head out again. Having to balance both work and study has made it harder to prioritise a work–life balance so I joined a local competitive cheerleading team to ensure that I step away from work for at least a couple of hours a week in my busiest times and get some endorphins!

It’s fun to stay at the AAAA

Bayley McNevin

My day starts at 7am, with the usual scramble to get up and ready. By 7.45am, I’m on the Merseyrail, heading into the office in the Liver Building (I know, I know—fancy). The early train is a little bit busy, but it’s worth it to get a head start in the morning. I mostly spend the journey scrolling through emails, mentally preparing myself for another busy day.

POV: your train is arriving and it’s a class 777

I get to the office, get myself set at my desk, and head to my 9am catch-up meeting. Today we’re discussing progress on our current client. It’s the usual—everyone gives updates, talks through any problems, and someone inevitably brings up Excel shortcuts that no one has actually mastered yet. But that’s the audit life—a mix of progress, spreadsheets, and always a dash of confusion.

After the meeting, I do some of my work before I grab a quick lunch. Today, it’s soup that I made last night. I also grab a free fizzy drink from the `can fridge’—a bit of an office legend at this point. The new office was specifically designed with this fridge in mind, so it’s basically a rite of passage to raid it when the midday hunger hits. It’s a small comfort in the middle of the hectic day.

Post-lunch, I’m in a meeting with the client. We’re walking them through some of our findings and discussing next steps. It’s always a bit tense, even when everything’s going well. I try to keep my nerves in check—presenting your work is never as easy as you think it’s going to be, no matter how many times you’ve done it. But we get through it without any disasters.

The highlight of the day? Getting my new laptop. My old one finally died last week, the screen just went black and never came back on. I swear, I heard it give a sigh of relief when it gave up. But today, the shiny new replacement arrives, and I can’t say I’m not a little excited. Setting it up takes far too long—but at least I’ll have something that works now.

By the time 5pm rolls around, I’m definitely ready to switch gears. I meet up with an old uni friend for a pizza at Rudy’s, which is just the break I need. We catch up on life and try not to talk about work too much. It’s the perfect way to unwind after a full day of client calls, deadlines and technology drama.

Spot the former DITL contributor

After tea I head home, where the real work begins: ACA exam revision. I’ve got a law exam coming up, and I still have three exams left to tackle after that. It’s not glamorous, but I know it’s part of the process. One step closer to finishing the qualification—and one step closer to knowing what it’s like to sleep again. And hopefully, one step closer to being able to drink from the can fridge without feeling guilty about it.

Anna Lee

Hi, I’m Anna! Last June, I graduated from Durham University with a BA in liberal arts, focusing on history and economics. Fast forward to today—I’m living in London and working as an indirect tax analyst on a graduate scheme. As part of my role, I’m also studying for two professional qualifications: the Association of Tax Technicians (ATT) and the Chartered Tax Advisor (CTA). That means a few months of the year, I swap spreadsheets for study leave, diving deep into tax law and compliance.

Do you think Anna’s textbook was typeset in \LaTeX?

Since joining my graduate scheme, no two days have been the same. The nature of my work keeps things fresh, and while I’ve tried to capture a typical week, every day brings something new.

We have a hybrid working model, but we’re encouraged to come into the office at least three times a week. Personally, I find I’m most productive in the office, so I usually go in Monday to Thursday. My day typically starts between 8.30 and 9.00am—cycling in when the weather allows (which is a great way to wake up!). With a coffee in hand, I check my inbox, follow up on queries, and review any responses to emails I’ve sent to our member firms.

Most of the client accounts I work on involve overseas trade and VAT compliance for UK-based companies with operations abroad. Businesses often need guidance on navigating international VAT regulations, and we help them through our global network.

By 9.30, the office is buzzing. Once a week, our team—12 people strong—gets together for a catch-up on recent tax updates and ongoing projects. Given our small size, we don’t need daily meetings, but we always stay in the loop on each other’s work.

As the day progresses, I tackle emails, client queries, and VAT return reviews—especially as deadlines approach. There’s a steady flow of work, but the variety keeps it interesting.

At midday, I break up the routine with a lunchtime run alongside a teammate, getting some fresh air before diving back into work. If I’m not running, I’m catching up with colleagues from my training cohort. One of the perks of working in a big firm is meeting loads of people my age who, like me, are navigating their first corporate job post-university.

Once the workday wraps up, I either hit the office gym or meet up with friends. Thursdays, in particular, are big social nights—grads often gather for a post-work drink at a nearby pub before heading home.

Life as a graduate in London is fast-paced, but I love the balance of work, learning and socialising. No two days are ever quite the same, and that’s what makes it exciting!

post

Skimming potatoes

As a mathematician, it is rare that I get to speak to the media. My papers don’t always ‘catch the eye’. Yet one did. On 4 January 2023, a random Wednesday in an arbitrary year, my inbox blew up and my phone kept ringing. That day, a piece of my research with Frank Smith, a professor in the department of mathematics at UCL, was published. As a result of this article, 83 news stories appeared around the world. I even had the chance to speak to various radio presenters and a couple of TV reporters to boot. And in all honesty, I was surprised!

What did we do to receive such global interest, I hear you ask? Well, we studied skimming—how objects hit water and bounce back off. Much like the humble pastime we all know and love of stone skimming, also known as rock skimming, ‘ducks and drakes’, skipping (if you are American—the internet made sure we knew this), or ‘piffing a yonnie’ (no idea!? That’s on you, Australia).

Perhaps it was the time of year. A slow news day. A piece of new year’s ‘fluff’ at a time when nobody wants heavy news—‘a scientist redefining rock skimming?’—that’s the stuff people want!

I had expected it to all blow over quickly, but then I made a fatal error. At some point, I mentioned the joy of trying to skim potato-shaped stones. And, well, that was it. They got obsessed. In one small sentence, I soon became globally (but not widely) known as the scientist who suggested potato stones are better for skimming. Their spud-based mish-mash of interest and enthusiasm boiled over. After that one story, more reporters came, eager to hear about the mysterious potatoes that skim so well.

I learned two lessons that day:

  1. Be careful what you say when the microphone is on you (lest you become forever known as the ‘expert’ of ‘potato skimming’—ouch), and
  2. Don’t take the internet too seriously (people are passionate about skimming… skipping—sorry America).

While it was nice to receive some light-hearted positive recognition, we didn’t really set out to study potatoes… in reality, we were interested in something much more important.

Aircraft icing

All this potato waffle and online roasting began with the study of aircraft, specifically aircraft icing. In the upper reaches of the atmosphere where the air gets cold and clouds get thin, aircraft can meet ice crystals and supercooled droplets (unfrozen water drops that are colder than freezing). Relatively harmless on their own, when lots of these particulates hit an aircraft they can cause ice to form on the wings, engines and other vital components. Without adequate protection, this icing can become a significant hazard. Historically, icing events have led to disaster, with two well-known instances together claiming the lives of almost 300. To prevent future incidents, the aerospace community continues to work to understand the physics of this hazard and protect against it.

This problem was the motivation for our study. When ice and water particles impact warm aircraft surfaces, such as wings and engines, water layers form. Once formed, other ice crystals may follow up and hit these layers. So, what happens next? Does the ice crystal skim much like a rock on a lake or sink like a potato in a pan? In the case of aircraft icing, the skimming ice isn’t some nice flat rock but potentially all sorts of shapes and sizes. So, the natural question to ask is: what influence do the ice’s shape and mass have here?

If an ice crystal sinks into a water layer, freezing may ensue and, without protection, an ice block may form. If the crystal skims, we want to know where it goes and whether that changes how we protect an aircraft. So, we studied the relationship between ice crystal mass and shape to show how various types of ice crystal behave—it’s either sink or skim.

Mathematical modelling allows us to peer beyond the constraints of experimentation. Yes, experimental evidence is vital, but it is limited. If you want specific answers to specific questions, experiments are great. But if you are after something a little more general and want to unpick the underlying physics, this is where the beauty of maths reveals itself. With well-defined physics and a set of equations to hand, mathematical modelling enables us to evaluate and understand hard-to-measure problems and, in our case, inform the efficient and effective design of aircraft ice protection systems.

Making it mathematical

To model what happens when the ice meets water, we need some equations.

Modelling fluids is tricky. Most fluid modelling starts with a classic set of equations (known as the Navier–Stokes equations) but these are a complex beast to solve. To make life easier, we can employ a classic applied maths technique called asymptotic reduction—which is really a fancy way of saying we compared the big and small factors in our problem and kept what was most influential. To do this, we make some assumptions about different parts of the ice–water interaction: that the water layer is shallow (ie thin relative to its length), that the horizontal velocity of the ice is much greater than its vertical velocity (as you would expect when skimming!) and that the fluid has certain physical properties—then our problem simplifies! With some massaging, our set of equations is no longer quite so beastly. We now have a new set called the shallow water equations (included below for awe-inspiring effect):

\begin{align*} \frac{\partial u}{\partial t}+u\frac{\partial u}{\partial x}&=-\frac{\partial p}{\partial x},\\ \frac{\partial h}{\partial t}+\frac{\partial}{\partial x}(hu)&=0. \end{align*} If you are unfamiliar with this notation, don’t worry: broadly these two equations together describe how the water layer’s horizontal velocity, $u$, and height, $h$, relate to each other and vary in time and space. With that, we can model the water layer.

Now, what about the ice (which we will refer to as the ‘body’ from now on)? When the body first descends into the water’s surface, pressure $p(x,t)$ builds up underneath as the water resists the body’s entry. This pressure along the body produces a force, hence we can turn to Newton’s second law, the classic $F=ma$ (force equals mass times acceleration) to understand how the body reacts, lifting (changing its vertical position $y$) and rotating (changing its angle $\theta$) as follows:

\begin{align*} \int_{x_1}^{x_0}p(x,t)\,\mathrm{d}x&=m\frac{\mathrm{d}^2y}{\mathrm{d}t^2},\\ \int_{x_1}^{x_0}xp(x,t)\,\mathrm{d}x&=I\frac{\mathrm{d}^2\theta}{\mathrm{d}t^2}. \end{align*} Note that $m$ and $I$ (the body’s moment of inertia) here are multiplied by vertical acceleration ($\mathrm{d}^2y/\mathrm{d}t^2$) and the angular acceleration ($\mathrm{d}^2\theta/\mathrm{d}t^2$), just as in Newton’s second law.
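To give a feel for how equations like these are used, here is a deliberately simplified sketch: we prescribe a constant net force $F$ and moment $M$ (in the real model these come from integrating the evolving pressure $p(x,t)$ along the wet region) and step the two equations of motion forward in time with semi-implicit Euler. All the numbers are illustrative assumptions of ours, not values from the paper:

```python
# Toy time-stepping of the body's height y and angle theta under a
# *prescribed* constant force F and moment M (illustrative values only).
m, I = 1.0, 0.1            # mass and moment of inertia (assumed)
F, M = 2.0, 0.05           # net pressure force and moment (assumed constant)
dt, steps = 1e-4, 10_000   # time step and number of steps (T = 1)

y = theta = 0.0            # start at the water surface, lying flat
vy = omega = 0.0           # no initial vertical or angular velocity
for _ in range(steps):
    vy += (F / m) * dt         # d^2y/dt^2 = F/m
    omega += (M / I) * dt      # d^2theta/dt^2 = M/I
    y += vy * dt
    theta += omega * dt

# With constant force and moment, y and theta should approach the
# analytic values (F/m)T^2/2 = 1.0 and (M/I)T^2/2 = 0.25.
print(round(y, 3), round(theta, 3))
```

In the genuine skimming problem the force and moment change at every instant as the wet region between $x_0$ and $x_1$ grows and shrinks, so this integration has to be coupled to the shallow water equations; the toy above only shows the Newton’s-second-law half of that loop.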

Now let’s think through what happens while the body skims. Before entering the water, the body is ‘dry’. Then, as soon as the body first touches the water, only a single point is wet. Let’s call it $x_0$. As the body plunges further still into the water, more water contacts the body—as shown above right. Underneath the body then, there is a contact region (the ‘wet bit’) where the body meets water, which we denote as being between two points $x_0$ and $x_1$, the trailing and leading edges of this wet region, respectively.

A stone skimming the surface of a liquid layer. If $x_1$ reaches the body’s leading edge, we deem it to have sunk, while if it reaches the trailing edge, this is a skim.

As the contact region grows larger, the pressure between fluid and body increases too, pushing back against the body and potentially lifting it back out of the water. If the force is sufficient, the body skims! Yet, depending on how the body initially makes contact with the water, it may well sink! If the body is too heavy or descends too quickly, the water will stretch past the end of the body and flood over it, claiming the body to the water’s murky depths.

What does this mean for the avid amateur skimmer?

Through our study we uncovered a relationship between a body’s shape and mass that determines its ability to skim. Greater mass can lead to a ‘super-elastic’ response (a most enticing name)—an almighty leap from the water, with the body leaving at a greater height than it entered. Have we broken physics? Did we find a way to overcome the pesky constraints of the conservation of energy? Fortunately, not!

Bodies with larger masses sink deeper into the water and for longer; this increases the pressure under the body and its contact time with the water. In doing so, the large horizontal velocity of the body (much like your thrown stone) is converted into vertical velocity. Hence the almighty leap. But, interestingly, if the same body is more curved, this effect is inhibited: the body rotates more and leaves the liquid layer faster. A shorter, sharper skim. Altogether then, curvature enables heavier bodies to skim which would otherwise sink if they were flatter. For our aircraft icing problem (and your holiday by the beach), this means a broader range of bodies may skim than expected, with some possibly rather dramatic results.

The body might sink…

…or skim.

This graph shows the vertical position during a skimming motion of three bodies with different amounts of curvature ($c$):

All three bodies enter the water layer at the same height. The flat body ($c=0$) sinks and so the blue curve stops mid-skim. For the largest curvature case ($c=10$, the orange curve), the body undergoes a successful skim, descending into the liquid layer and rising out again, leaving the liquid layer at a lower height than it entered (this is due, in part, to how the body rotates!). The most interesting case here is $c=1$ (the green curve) in which the so-called super-elastic response is seen: the body is ejected from the liquid layer at a far greater height than it entered!

So, while rock skimming is fun—aircraft icing is a more significant application of our findings. Thankfully, our paper is only part of the varied research on skimmers and not the single authority (though one Redditor did question whether we had any authority given ‘they’d never seen a scientist throw anything further than 5 feet…’—it’s the ones closest to the truth that hurt the most).

Still, skimming is as complex as it is fun. So, when you next stand by that body of water, glistening in the sun, ready to skim—you can leave your King Edwards, Maris Pipers and Russets behind. Pick up a weighty stone and see if you can get that almighty super-elastic leap. And when you do, remember that the phenomenon of skimming goes far beyond the lake.

post

A symmetric universe

One of the joys of studying the science of the universe is using lots of lovely equations that can tell you basically anything you want to know. How far can I kick this football? How long will my phone battery last? How many copies of Chalkdust can stop this bullet?

A slightly more fundamental question might be asking how all of these formulae came about. Perhaps even more fundamentally, why do we expect certain equations to work as they do? Most people would be satisfied giving the answer ‘they were introduced in my high school physics class, and therefore they work!’ I am not most people.

This article aims to answer the latter question by unifying maths and physics via group theory. Group theory is built on the idea of mathematical symmetries, whereas physics is the study of describing and explaining observations using equations. One might naturally ask what happens when we observe a symmetry and want to put some equations behind it.

What is a symmetry?

The first of many beautiful theorems a mathematical physicist learns in their career is Noether’s theorem: every continuous symmetry of your favourite physical system corresponds to a specific conservation law. From this, we can build equations that describe the motion of everything in that system: fields, particles, classical balls, you name it.

But what is a symmetry? Fear not; this is not a mathematical term we throw around lightly or link to something completely unrelated—like rings or holes. A mathematical symmetry is a transformation that leaves an object completely unchanged. Take a simple square: what transformations can you perform on this square that leave it unchanged?

You could rotate the square 90° and it would look exactly the same. Rotating it another 90° leaves it visually unchanged too. And the same for two more 90° rotations. This leaves us with four rotational symmetries of the square:

rotating square

But there are more! If we also consider reflections of the square, there are four ways in which this can occur. This leaves us with four reflectional symmetries:

reflecting square

Therefore, the collection of symmetries of the square has eight transformations: four rotational symmetries and four reflectional symmetries. Now that we have discovered all these symmetries, we can collect them into a mathematical group.

Groups in mathematics are collections of symmetries in which the combination of any two symmetries must itself be a symmetry in the group. In the case of squares, a rotation followed by a reflection is just another type of reflection, which is evidently in the group, so this condition is automatically satisfied. We call a group a Lie group if it behaves smoothly: its transformations can be varied continuously.

The case of squares forms the dihedral group of degree 4 (since a square has four corners), denoted $D_4$. In fact, the symmetries of any regular $n$-gon form the dihedral group of degree $n$, $D_n$, which will always have $2n$ elements. The dihedral groups are unfortunately not Lie groups—this can be seen by the fact that squares have corners and the above transformations would not be smooth. The rotational symmetries of a circle, however, do form a Lie group: it is denoted $\operatorname{U}(1)$.
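If you fancy checking this on a computer, here is a quick Python sketch (our own, not from the article) that builds the eight symmetries of the square as $2\times 2$ matrices and verifies that composing any two of them lands back in the group:

```python
import numpy as np
from itertools import product

def rot(k):
    """Rotation of the square by k * 90 degrees, as an integer matrix."""
    c = int(round(float(np.cos(k * np.pi / 2))))
    s = int(round(float(np.sin(k * np.pi / 2))))
    return np.array([[c, -s], [s, c]], dtype=int)

rotations = [rot(k) for k in range(4)]
flip = np.array([[1, 0], [0, -1]], dtype=int)   # reflection in the x-axis
reflections = [r @ flip for r in rotations]     # the four reflections

d4 = rotations + reflections
# Eight distinct symmetries in total: 4 rotations + 4 reflections
assert len({m.tobytes() for m in d4}) == 8

# Closure: composing any two symmetries gives another symmetry in D4
elements = {m.tobytes() for m in d4}
assert all((a @ b).tobytes() in elements for a, b in product(d4, d4))
```

The determinant neatly separates the two families: $+1$ for the rotations and $-1$ for the reflections.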

Enough about shapes though. We’re here to talk about the universe. The universe exhibits symmetries of its own. In the context of physics, these symmetries leave the laws of physics unchanged. If you perform an experiment at one place at a certain time, you should hope to obtain the same observations at a different location at a different time.

Classical symmetries

We would first like to interpret a physical symmetry in an empty universe; one in which we assume no spacetime curvature or air resistance. We know nothing about the laws that describe motion in this space, but we should hope to find them using our intuitions. Suppose I throw a ball away from me while I am standing in one fixed position. It will travel in a straight line, at a constant rate, never interfering with anything and thus never stopping.

spatial translation

Throwing a ball from two platforms in an otherwise empty universe.

It should come as no surprise that if I move the platform I am standing on to a new position and repeat the experiment of throwing the ball away from me, the same exact motions will occur: the ball will travel in a straight line, never stopping. This tells us that the physical laws governing the movement of the ball, whatever they are, should be invariant under a spatial translation. That is, if I translate my initial position through space, the ball still has precisely the same governing motion.

Another way of viewing this is by considering each time step of the ball’s movement. A time step is a regular measurement of time; whether that be a minute, an hour, an aeon, or just one unit. When I release the ball it is in its zeroth time step. After one unit, it is in its first time step and has moved position according to its motion law. The same thing happens for the second time step. If we compare the first and second time steps, these are identical situations in which the ball has moved its position and nothing else.

But remember what we said before: the equations of motion don’t change as the ball changes position. So, at every point in time, the ball keeps its motion. This observation is what Noether identified as the conservation of momentum—any physical system with movement will have a total amount of momentum, usually labelled by a number, and this number is kept the same throughout the entire time evolution of the system. Momentum cannot be created nor destroyed and is a direct result of the spatial translation symmetry of our empty universe.

swing

Gaining momentum on a swing.

In a similar fashion, this empty universe admits a rotational symmetry, which means the laws of physics are invariant when the entire system is rotated through an arbitrary angle. Imagine now you are spinning the ball on the end of a string; almost as if it is orbiting your head, for example. No matter which orientation or which direction you decide to initially set the ball spinning, you observe that it carries on spinning for all eternity. Remember, there is no gravity or air resistance. The key point is that the equations governing the ball’s motion are the same at each point in its orbit.

To see this, we can use the same timestepping argument. After spinning the ball, it enters its first time step and rotates according to its motion law. After the second time step, it obeys the same law and continues orbiting. As mentioned above, the system admits a rotational symmetry, so having the ball in the second time step means it keeps its motion law from the first. At every point in time, the ball continues spinning.

Another conservation law has appeared—the conservation of angular momentum. This is also represented by some number that is unchanged throughout the time evolution of the system. Angular momentum cannot be created nor destroyed and is a direct result of the rotation symmetry of our empty universe.

The main point here is that the translation symmetry allows us to perform experiments in different locations and obtain the same results, and the rotational symmetry lets us do these experiments on a rotating object (like the Earth) and obtain the same results.

basketballs

Spinning basketballs at different angles.

Group actions

It turns out that you’re a genius; knowing nothing about this universe, you have seemingly contrived some beautiful conservation laws that will allow you to harness the true power of physics. But alas, nobody will accept your claim without a mathematical formulation. So we’re going to do exactly that.

As with the dihedral groups laying out the symmetries of polygons, there is a group-based way to represent the symmetries of what we should call space. An easy example is via the two-dimensional Cartesian plane, which will act as our sandbox for this universe. If we have a figure on it, whether that be a graph or tear marks, we would like to preserve all of its angles and distances after rotating it. This is analogous to taking our ball and spinning it around without deforming it.

Polar coordinates

Any point $(x,y)$ on the plane can be represented in polar coordinates $(r,\theta)$ measuring its absolute distance from the origin and how far anticlockwise it is rotated from the $x$-axis. This makes further rotation rather easy; to rotate by an arbitrary angle $\alpha$, we transform the coordinates by \[(r,\theta)\mapsto(r,\theta+\alpha).\] This can be noted by drawing a triangle and doing some quick trigonometry to find \[(x,y)=(r\cos\theta,r\sin\theta).\] Then, by moving by some angle $\alpha$, we arrive at the point \[(x’,y’)=(r\cos(\theta+\alpha),r\sin(\theta+\alpha)).\]

polar coordinates

Groups of linear transformations are often described using matrices, which is helpful in our case since that’s what we’re working with. One can use some trigonometric identities to show that rotation in Cartesian space is nothing but matrix multiplication: \[\begin{pmatrix} r\cos(\theta+\alpha)\\r\sin(\theta+\alpha) \end{pmatrix}=\begin{pmatrix} \cos\alpha&-\sin\alpha\\\sin\alpha&\cos\alpha \end{pmatrix}\begin{pmatrix} r\cos\theta\\r\sin\theta \end{pmatrix}.\] If the laws of physics are to remain invariant after any rotation on this plane, then all possible rotations ($\alpha$ going from 0$^{\circ}$ to 360$^{\circ}$) are admitted. By putting all of these together we have uncovered the special orthogonal group $\operatorname{SO}(2)$, denoting the group of rotations on a 2D system. This group behaves just as nicely as the group of circular symmetries (for a somewhat obvious reason!), and we thus crown it a Lie group.

This also generalises to higher dimensions: the rotations of $n$-dimensional space form the ‘special orthogonal’ group of degree $n$, $\operatorname{SO}(n)$, which is also a Lie group. In a physical sense, any of these rotations can be undone via a rotation in the opposite direction. This kind of backwards rotation is called an inverse.
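As an illustration, these claims are easy to verify numerically. The following Python sketch (our own, with arbitrary values of $r$, $\theta$ and $\alpha$) checks that the rotation matrix reproduces the polar-coordinate formula, that composing rotations adds their angles, and that every rotation has an inverse:

```python
import numpy as np

def R(alpha):
    """Rotation matrix representing an element of SO(2)."""
    return np.array([[np.cos(alpha), -np.sin(alpha)],
                     [np.sin(alpha),  np.cos(alpha)]])

r, theta, alpha = 2.0, 0.3, 1.1
point = np.array([r * np.cos(theta), r * np.sin(theta)])

# Matrix multiplication agrees with shifting the polar angle by alpha
rotated = R(alpha) @ point
expected = np.array([r * np.cos(theta + alpha), r * np.sin(theta + alpha)])
assert np.allclose(rotated, expected)

# Group structure: composing rotations adds the angles,
# and every rotation is undone by rotating the other way (its inverse)
assert np.allclose(R(alpha) @ R(theta), R(alpha + theta))
assert np.allclose(R(alpha) @ R(-alpha), np.eye(2))
```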

Local transformations

Everything I’ve talked about so far is good for mathematical theory. Secretly we were describing global transformations: ones that affect every point in spacetime in the same way. But we’re here to describe interactions within the universe. For that, we need to understand local transformations that depend on where you are in spacetime.

Why is that? The way we described conserved quantities earlier assumed a really boring universe. If you want to put one of these quantities to work by making it interact with other things in nature, while still keeping it conserved, local transformations are the way to do this.

To paint an intuitive picture, imagine that we are in our flat Cartesian land again and at each coordinate we place a clock, or dial, tuned to some time that depends on that specific coordinate. A global transformation could, for example, turn the dials of each clock exactly 90$^{\circ}$—or turn time by three hours. A local transformation would instead turn one clock by three hours, another by five, and perhaps even one back by four.

All of the particles in our universe are quantum objects. In quantum mechanics, objects like these particles are described by a quantity called a wavefunction $\psi(x,y)$ that assigns a complex number to every point $(x,y)$ in two-dimensional space. There are many different representations of complex numbers, but for quantum mechanics it is often useful to choose polar form, where we interpret the radius as the wavefunction’s modulus, $\vert\psi\vert$, and the angle $\theta$ as its phase. If we relate this to clocks on a grid, one could imagine that the position of the clock describes the value of $\vert\psi\vert$ and its ‘time’ describes the phase.

It might then be sensible to suggest that after a local transformation, where we changed the phase rotation differently at each point, we would change the physics we are trying to describe. But this is not true! This is only a mathematical representation of quantum physics and therefore is not actually physical. The wavefunction itself is unobservable and (most—we’re looking at you, Aharonov–Bohm effect) physical predictions only depend on $\vert \psi\vert$, which is left unchanged by a local transformation.
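To see this invariance concretely, here is a small Python sketch (with a made-up wavefunction of our own choosing) that applies a different phase at every point and confirms that the modulus is untouched:

```python
import numpy as np

# A toy wavefunction sampled on a grid of points (hypothetical values)
x = np.linspace(-1.0, 1.0, 101)
psi = np.exp(-x**2) * np.exp(1j * 3 * x)

# A *local* U(1) transformation: a different phase alpha(x) at each point
alpha = np.sin(5 * x) + 2 * x
psi_rotated = np.exp(1j * alpha) * psi

# The observable modulus |psi| is unchanged by the phase rotation
assert np.allclose(np.abs(psi_rotated), np.abs(psi))
```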

change of phase

A wavefunction changing its phase.

This change-of-phase invariance is an intrinsic symmetry of the system. Mathematically, this system is locally $\operatorname{U}(1)$-invariant because the action of an element inside $\operatorname{U}(1)$ (complex circular rotation) does not change the laws of physics.\[\operatorname{U}(1)=\{\mathrm{e}^{\mathrm{i}\alpha}:\alpha\in[0^{\circ},360^{\circ}]\}.\] How might we describe this phenomenon mathematically? All this time we have discussed the motion of objects in flat space, and motion is governed by derivatives. For observable physics we have to make sure that the derivatives themselves obey a local $\operatorname{U}(1)$ transformation so that we can include them in our equations. One of these transformations on $\psi$ looks like $\mathrm{e}^{\mathrm{i}\alpha(\boldsymbol{x})}\psi$, where one should note that $\alpha(\boldsymbol{x})$ depends on position because it is a local transformation. When we differentiate,\[\partial_{\mu}\left(\mathrm{e}^{\mathrm{i}\alpha(\boldsymbol{x})}\psi\right) = \mathrm{e}^{\mathrm{i}\alpha(\boldsymbol{x})}\left(\partial_{\mu}+\mathrm{i}\partial_{\mu}\alpha(\boldsymbol{x})\right)\psi,\] which tells us that differentiation does not obey local $\operatorname{U}(1)$ transformations. This is an issue because one would really like to describe the motion of particles, and this requires derivatives. But we cannot do this with local transformations alone. A change in physics must be made.

Contracted differentiation

The term $\boldsymbol{x}$ or $x^{\mu}$ represents the 4-dimensional vector $(x^0,x^1,x^2,x^3)$, in which the numbers are used as coordinate indexing as opposed to powers. These values represent a point in space and time and are sometimes written as $(ct,x,y,z)$, where $c$ is the speed of light. The shorthand $\partial_\mu$ is used for $\partial/\partial x^{\mu}$, the partial derivative with respect to each coordinate.
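If you would rather not take the product rule on trust, the differentiation above can be checked symbolically. This sketch (our own, reduced to one spatial dimension for simplicity) uses sympy to confirm the extra $\mathrm{i}\,\partial_{\mu}\alpha(\boldsymbol{x})$ term:

```python
import sympy as sp

x = sp.symbols('x', real=True)
alpha = sp.Function('alpha', real=True)(x)  # the local phase alpha(x)
psi = sp.Function('psi')(x)                 # the wavefunction

# Differentiate the locally transformed wavefunction e^{i alpha(x)} psi
lhs = sp.diff(sp.exp(sp.I * alpha) * psi, x)

# The claimed result: the derivative picks up an extra i*alpha'(x) term
rhs = sp.exp(sp.I * alpha) * (sp.diff(psi, x) + sp.I * sp.diff(alpha, x) * psi)

assert sp.simplify(lhs - rhs) == 0
```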

Charge conservation

We’ve seen the general rule of thumb for how local $\operatorname{U}(1)$ transformations give structure to certain mathematical objects: wavefunctions behave nicely, but their dynamics do not. We should probably also come up with a name for this theory that isn’t ‘locally $\operatorname{U}(1)$-symmetric mathematical physics’. To fix this, we need to dive into the existing theory of electromagnetism.

In the early days of scientific experimentation, society thought that the theories of electricity and magnetism were completely separate. An astounding result was formulated in the late 19th century, when Maxwell and notable others showed that these were two aspects of the same phenomenon. How do we understand this? As with previous reasoning, we can package it in a matrix. In 1908, Minkowski introduced the field strength tensor $F_{\mu\nu}$ as \[ F_{\mu\nu}:=\begin{pmatrix} 0&-E_1&-E_2&-E_3\\E_1&0&B_3&-B_2\\E_2&-B_3&0&B_1\\E_3&B_2&-B_1&0 \end{pmatrix},\] which encodes information about electric and magnetic quantities $\boldsymbol{E}$ and $\boldsymbol{B}$. For those interested, the metric signature here is $(-,+,+,+)$ and the speed of light has been normalised. Maxwell’s equations of electromagnetism can be formed by taking derivatives of various components of this tensor in a neat way.
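A quick numerical aside (with hypothetical field values of our own choosing): Minkowski’s matrix is antisymmetric, so its six independent entries are exactly the three components of $\boldsymbol{E}$ and the three of $\boldsymbol{B}$.

```python
import numpy as np

E = np.array([0.3, -1.2, 0.5])   # hypothetical electric field components
B = np.array([2.0, 0.1, -0.7])   # hypothetical magnetic field components

# Minkowski's field strength tensor, laid out as in the matrix above
F = np.array([
    [    0, -E[0], -E[1], -E[2]],
    [ E[0],     0,  B[2], -B[1]],
    [ E[1], -B[2],     0,  B[0]],
    [ E[2],  B[1], -B[0],     0],
])

# Antisymmetry: F_{mu nu} = -F_{nu mu}, so the diagonal vanishes and
# only six independent components remain
assert np.allclose(F, -F.T)
assert np.allclose(np.diag(F), 0)
```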

Maxwell’s equations predict that light itself is an electromagnetic wave. It propagates through all of spacetime in little packets we like to call photons. Because of this, we can describe them with wavefunctions. And we already know that a change of wavefunction phase does not affect physics: it seems as though $\operatorname{U}(1)$ transformations and electromagnetism are linked. But Maxwell’s equations involve derivatives—and derivatives, as we have seen, are not locally $\operatorname{U}(1)$ invariant: $\partial_{\mu}\psi\;\not\mapsto\;\mathrm{e}^{\mathrm{i}\alpha(\boldsymbol{x})}\partial_{\mu}\psi$. There is an extra term involving the derivative of $\alpha(\boldsymbol{x})$ that we would like to remove. What we need to do is introduce a mathematical tool that, under a local $\operatorname{U}(1)$ transformation, would transform as this extra term:
\[A_{\mu}\mapsto A_{\mu}+\partial_{\mu}\alpha(\boldsymbol{x}).\] This $A_\mu$ is a mathematical tool that mediates the electromagnetic field and allows us to describe other vector fields, like $\boldsymbol{E}$ and $\boldsymbol{B}$, using its local rotation properties. We can call it the electromagnetic potential. Some may call it the photon field, as it naturally arises when discussing photons.

Instead of using the standard derivative, we now use this difference $\partial_{\mu}-\mathrm{i} A_{\mu}$ which transforms as \[\partial_{\mu}\psi-\mathrm{i} A_{\mu}\psi\mapsto\mathrm{e}^{\mathrm{i}\alpha(\boldsymbol{x})}\left(\partial_{\mu}\psi-\mathrm{i} A_{\mu}\psi\right).\] Result! With just group transformations, complex numbers and derivatives, we have discovered something wonderful. Real-world physics gives us $\boldsymbol{E}$ and $\boldsymbol{B}$, and the maths behind it uses tools such as the electromagnetic potential.
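The whole calculation can again be checked symbolically. This sympy sketch (ours, again in one dimension for simplicity) transforms both $\psi$ and $A$ and confirms that only the overall phase factor survives:

```python
import sympy as sp

x = sp.symbols('x', real=True)
alpha = sp.Function('alpha', real=True)(x)  # the local phase alpha(x)
psi = sp.Function('psi')(x)                 # the wavefunction
A = sp.Function('A', real=True)(x)          # the electromagnetic potential

def D(field, potential):
    """The modified derivative (partial - iA) acting on a field."""
    return sp.diff(field, x) - sp.I * potential * field

# Apply the local U(1) transformation to both psi and A together
psi_new = sp.exp(sp.I * alpha) * psi
A_new = A + sp.diff(alpha, x)

# The modified derivative of the transformed field picks up only
# the overall phase: the unwanted alpha'(x) terms cancel
covariant = D(psi_new, A_new)
expected = sp.exp(sp.I * alpha) * D(psi, A)
assert sp.simplify(covariant - expected) == 0
```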

This is where Noether’s theorem comes into play, specifically the second part that deals with local transformations. Remember that any symmetry in a physical system has a determined conservation law. By observation, a local $\operatorname{U}(1)$ symmetry requires the interaction with some potential field $A_{\mu}$, which was the field encoding electric and magnetic information. But in order to interact with the electromagnetic field, our particles must have some charge associated with them. From there, to ensure quantum fluctuations don’t spin out of control, we must have conservation of electric charge.

This immediately gives us a physical representation of what a $\operatorname{U}(1)$ transformation is: the symmetry of charge. Dirac in 1928 used this to say that for every charged particle, there must exist its antiparticle counterpart with the opposite charge. Therefore, electric charge cannot be created nor destroyed and is a direct result of the $\operatorname{U}(1)$ symmetry of our particles.

Where do we go from here?

The standard model is one of the greatest feats of mathematical physics in the 20th century. It is the best quantum description we have of the fundamental forces that govern our entire universe. The best part? It is built by imposing certain symmetries on quantum particles.

More specifically, the standard model is the unification of the Lie groups \[\operatorname{U}(1)\times \operatorname{SU}(2)\times \operatorname{SU}(3),\] with the special unitary groups of degrees 2 and 3 now included. We already saw that $\operatorname{U}(1)$ could be thought of as rotation around a unit circle but is physically charge conservation. Similarly, actions of $\operatorname{SU}(2)$ are movements along the surface of a 4D hypersphere, and actions of $\operatorname{SU}(3)$ are movements along the surface of some other hyperobject. These are physically the conservation of weak isospin and colour charge, thanks to Noether.

When combined, the standard model as a whole exhibits charge–parity–time (CPT) symmetry. The action of swapping all particles for their antiparticles, exchanging chirality, and reversing time keeps the standard model invariant—but only when done together. There also exists the concept of symmetry breaking, whereby the breaking of particular symmetries gives otherwise massless particles their mass (this is how the W and Z bosons become massive, while the photon stays massless). Broken symmetries have to be fixed in a nontrivial way.

These three groups mediate the electromagnetic force and the strong and weak nuclear forces among all known particles. Because of that, the equation for their dynamics is huge: it contains 93 terms. You can buy it on a T-shirt at Cern.

The story, however, does not stop there. At the time of writing, there is a huge amount of research going into a new type of symmetry called supersymmetry, which has the potential to relate all types of quantum particles together in a supergroovy way. This symmetry will present us with the conservation of supercharge (no, I am not making that up). We haven’t yet measured it, but when and if we do, it is bound to change the course of mathematical physics forever.