Can computers prove theorems?

23 October 2019

How do we prove that $2 + 2 = 4$? At school, this might have been taught to you in the following way. You were given a box of little plastic cubes, two cubes were put in one of your hands and then two more cubes in the other, and you were challenged to count how many cubes you had in total.

But this doesn’t really prove that $2 + 2 = 4$: it proves that 2 plastic cubes + 2 plastic cubes = 4 plastic cubes. You could try it again with pencils and show that 2 pencils + 2 pencils = 4 pencils, and after you’ve tried it with sufficiently many things you might become convinced that there is an underlying pattern. But is this a proof that $2 + 2 = 4$? Proving it like this feels a bit like an experimental science—it works with cubes, it works with pencils, and this is evidence that it works in general. I think we are all pretty confident that, whatever the actual rules of maths are, they probably don’t mention pencils.

But what are the rules of maths? Are there any rules at all, or do we all just have some inbuilt intuition as to what constitutes a valid mathematical argument? Before the 1900s, people worked intuitively, and there was broad agreement as to what constituted a correct argument. But as people began to do more complex and abstract mathematics, this approach became problematic, because people’s intuitions could differ. Ask a room full of teenagers whether $0.9999999\ldots=1$, and you will get different opinions. This is because different people have different intuitions about what the real numbers actually are. Differences of opinion as to whether arguments were valid forced mathematicians into actually writing down an official rulebook: the axioms of maths.

So how do we prove $2 + 2 = 4$ just using the axioms of maths and no plastic cubes? Come to think of it, how do we define 2, and how do we define +? The Italian mathematician Giuseppe Peano had some ideas about this in 1889, and his ideas are now called Peano’s axioms.

Peano’s axioms

Giuseppe Peano

Peano wanted to define a number system called the natural numbers, which for him started at zero. So the question is how to find axioms that model the set $\{0,1,2,3,4,\dots\}$. Here is how Peano did it. He defined a constant $\mathtt{zero}$ (to be thought of as 0 in our mental model), and a successor function $S$, which takes a natural number $n$ as input, and spits out another natural number $S(n)$ (to be thought of as the number after $n$).

Here then are Peano’s axioms for defining the natural numbers.

Axiom 1: $\mathtt{zero}$ is a natural number.
Axiom 2: If $n$ is a natural number, then its successor $S(n)$ is also a natural number.
Axiom 3: That’s it.

In a computer language, these axioms might look something like this:

inductive nat
| zero : nat
| S (n : nat) : nat

We’ll say more about this computer language later; before we do, let’s play with these axioms for a bit, and see if we can make a few natural numbers.

At the beginning, we have no natural numbers at all, so axiom 2 is useless to us. However we can use axiom 1 and this lets us make the natural number $\mathtt{zero}$. Now axiom 2 comes into play, and we can use it to make a new natural number $S(\mathtt{zero})$. A good name for this new number would be $\mathtt{one}$. We can similarly define $\mathtt{two}$ to be $S(\mathtt{one})$, otherwise written as $\mathtt{two}:=S(\mathtt{one})$, we can set $\mathtt{three}:=S(\mathtt{two})$ and so on. Here’s the computer language again, taking us up to $\mathtt{four}$.

definition one := S(zero)
definition two := S(one)
definition three := S(two)
definition four := S(three)

Note that we’ve defined four to be $S(\mathtt{three})$, but we can unravel the definition completely, and see that we could have defined four to be $S(S(S(S(\mathtt{zero}))))$ (note that there are four occurrences of $S$ in the definition of $\mathtt{four}$). In fact, it is not hard to convince yourself (because of axiom 3) that the only way of making natural numbers is by starting with $\mathtt{zero}$ and then applying $S$ some number of times; that’s all there is.

Each natural number can be expressed in a reduced form $S(S(S(\dots(S(\mathtt{zero}))\dots)$ and two natural numbers were defined to be equal by Peano if and only if their reduced forms were the same. This was how Peano ruled out things such as $0=1$.

We now have most of the players in our equation $\mathtt{two}+\mathtt{two}=\mathtt{four}$; all that is missing is the definition of addition. Of course addition is not a number, it is a function.

Defining functions

Before a child learns to add, they learn to count. They are challenged to do things such as counting to ten and then to one hundred. They realise that they can in theory get to any number just by starting at zero and counting. This idea is what Peano axiomatised. But once a child’s counting is solid, they are ready to learn addition, and this is what we will define next.

The addition function add takes two natural numbers $m$ and $n$ as input, and it outputs their sum $\mathtt{add}(m,n)$, a number more commonly written as $m+n$. Before we make the general definition, let’s step back, forget about axioms, go back to the usual notation for numbers, and think about what we know already about addition. Let’s see if we can isolate the ideas which capture the concept of addition perfectly.

Let’s consider the special case where $n=0$. What should $m+0$ be? We are trying to define addition which agrees with our intuitive notation: of course $m+0$ should be $m$, so this case is easy.

Now here’s a more interesting question. Say we’re building mathematics from scratch, and we’re currently halfway through defining the addition function. The answers to some addition questions have been defined already, but others have not. Say $m$ is a number and we know that $m+58=92$, but we haven’t defined $m+59$ yet. What should we define $m+59$ to be? We could solve for $m$ and work everything out, but there’s a quicker way! We know that $m+59$ is one more than $m+58$, so if $m+58$ is $92$ then $m+59$ should be $93$. Or as Peano might say, if $m+58=92$, then $m+S(58)=S(92)$. And of course $58$ is not special: if $m+n=t$ then if $m+S(n)$ should equal $S(t)$.

We have isolated two general principles of addition here:

Rule 1: $m+0=m$.
Rule 2: If we have defined $m+n$ already, then $m+S(n)$ equals $S(m+n)$.

In code it looks like this:

def add : nat → (nat → nat)
| m zero := m
| m (S(n)) := S(add m n)

Our first rule tells us how to define $m+0$ for every $m$. Because $1=S(0)$, our second rule now tells us how to define $m+1$ for every $m$. Because $2=S(1)$, we can use the second rule again to define $m+2$ for every $m$. This is looking promising! But can we prove that we have defined $m+n$ for every value of $n$? This looks like it should be a proof by induction. The first rule gives us the base case, that $m+0$ is defined. The second rule gives us the inductive step: if $m+n$ is defined, then so is $m+S(n)$.

But how do we know that induction works for Peano’s natural numbers? It’s because of Peano’s axiom 3. Every natural number is either zero, or the successor of a number that was born before it was, and that’s it. Hence if we define something for zero, and if we define it for $S(n)$ given that it’s defined for $n$, then we have defined something for every natural number. We can conclude that we have now defined addition! Now let’s test it, and let’s see if we can prove that $2+2=4$.

2 + 2 = 4

Again let’s use usual mathematical notation for numbers in this section, remembering that they are really all defined in terms of zero and $S$. We can now attempt to give a complete proof that $2+2=4$. The problem with addition is that we only defined $m+0$ and $m+S(n)$, so if we want to work out something like $m+2$ then we have to rewrite $2$ as $S(1)$ and then use the fact that $m+S(1)=S(m+1)$ by definition.

Here is a complete proof that $2+2=4$.

2 + 2 = 2 + S(1)     -- by definition of 2
      = S(2 + 1)     -- by definition of "+ S(n)"
      = S(2 + S(0))  -- by definition of 1
      = S(S(2 + 0))  -- by definition of "+ S(n)"
      = S(S(2))    -- by definition of "+ 0"
      = S(3)       -- by definition of 3
      = 4          -- by definition of 4

And here is how it looks on a computer (using the usual notation $\mathtt{+}$ for $\mathtt{add}$):

theorem two_add_two : two + two = four :=
begin
  refl
end

Wait—whatever does $\mathtt{refl}$ mean? And how come the computer’s proof is shorter than ours?

Proving theorems on a computer

The tactic select menu in a fictional game for the Sega Master System.

Lean is free, open source software being developed by a team led by Leonardo de Moura at Microsoft Research. Lean is a programming language whose commands correspond to the axioms of mathematics. Lean starts off by knowing how to do maths, but not knowing any maths at all. The code we saw above is valid Lean code. What is interesting is that after we taught Lean about natural numbers and addition, Lean figured out how to prove various things like $2+2=4$ all by itself. Here $\mathtt{refl}$ is a tactic, and what it does is that it simplifies both sides as much as it can and then checks that they have become the same. If this works, the proof compiles and we’ve proved our theorem. If it doesn’t work, the tactic fails and we will need to try another tactic. Lean has a bunch of tactics built in, many of them far more powerful than $\mathtt{refl}$.

Lean is not a completely automatic theorem prover though—it often needs help from humans to prove theorems. Lean is an interactive theorem prover. Think about it like this. Lean is a framework which turns mathematical statements into levels of a computer game. Any maths theorem, from $2+2=4$ to Fermat’s Last Theorem, when formalised in Lean, becomes a level of the game. If you manage to use Lean’s tactics to prove a theorem, you have solved the level. My favourite computer game used to be the Zelda series but now my favourite computer game is Lean.

Let me show you some more easy Lean levels. Let’s think about some more simple properties of addition, and see how we can prove them using Lean.

The final boss in Addition World.

By our definition of addition, we know that $n+0=n$ for every integer $n$. But what about $0+n$? We might instinctively just say that this is obviously $n$ too, because this is immediately obvious using our “real world” model of addition. But this is not how the game works! The fact that $a+b=b+a$ is a level in our game, but it’s quite a tricky one for beginners, and we have not got to it yet; it’s like the boss level for the Addition World level set in Lean.

So let’s get back to $0+n$. Recall that the definition of $0+n$ depends on whether $n$ is zero or a successor, and so we are going to have to prove $0+n=n$ by induction.

Base case: $n=0$. Here we have to prove $0+0=0$ and this is true by definition of $+0$, because $n+0$ is defined to be $n$.

Inductive step: let’s assume the hypothesis $H$ : $0+d=d$ and try and prove that $0+S(d)=S(d)$. In fact this is not too hard: $0+S(d)=S(0+d)$ by definition of $+$ and this equals $S(d)$ by the inductive hypothesis. We are done! In Lean this proof looks like this:

theorem zero_add (n : nat) : zero + n = n :=
begin
  induction n with d H,
  { -- base case
    refl -- true by definition
  },
  { -- inductive step
    show S (zero + d) = S(d),
    rewrite H, -- H is the hypothesis that zero + d = d
    -- Lean finishes the proof automatically
  }
end

This is the last Lean proof which I will give here. The reason is that for proofs any more complex than this, it gets hard to follow them on paper. The way to understand these proofs best is to look at them in Lean itself. If a proof were running in Lean on a computer then you could click anywhere on the proof and see Lean’s state (what it knows and what it is trying to prove). For example, just after the show tactic in the above proof, Lean’s state looks like this:

1 goal
d : nat,
H : zero + d = d
⊢ S (zero + d) = S d

Note that Lean writes $\mathtt{S\;d}$ not $\mathtt{S(d)}$, this is just some cool computer science thing that they do in functional programming languages, and Lean is a functional programming language.

It’s not hard to get Lean running on your computer — you don’t even have to install it, you can just use it through a web browser. For more details, go to my blog xenaproject.wordpress.com and search for ‘Chalkdust’. There are links there that take you straight into Lean’s Addition World, which contains a few relatively simple theorems/levels about addition on Peano’s natural numbers.

If however you still prefer pen and paper, I will tell you that the next level in Addition World is to prove that addition is associative, ie that $(a+b)+c=a+(b+c)$ for all natural numbers $a$, $b$, and $c$. Try proving it by induction on $c$. Once you’ve done this, you could try proving addition is commutative, ie, that $a+b=b+a$. That level is the boss level for Addition World, so once you’ve done it you can move onto the next set of levels, called Multiplication World.

Multiplication World, featuring the notorious exponentiation ghost.

How would you define multiplication? Can you prove $1\times m=m$? What about $a\times (b+c)=a\times b+a\times c$? How would you define exponentiation? Can you prove $(a^b)^c=a^{b\times c}$? Note that exponentiation is not associative, in contrast to addition and multiplication. Doing these levels on paper can become a bit tricky; it’s much easier to do them within the software. There are several worlds for Peano’s natural numbers: there is an Inequality World for example, which has got some pretty tricky levels in. And then on top of that is a Diophantine Equation World, which has a bunch of levels about solving equations in natural numbers and contains such levels as Fermat’s Last Theorem. Nobody has ever solved this level in Lean, or in any of the other computer proof verification systems either. Note that the level is extremely easy to make:

theorem FLT (a b c n : ℕ) (h : n > 2) : a ^ n + b ^ n = c ^ n → a * b = 0 :=
begin
  -- insert millions of lines of code here
end

It’s just very very very hard to solve.

Going further

It’s all very well answering questions about the natural numbers. But what other kinds of pure mathematics can be done in this way? I believe that all of pure mathematics can be done in this way. Groups, rings, differential equations, complex manifolds, you name it. Lean’s maths library started in 2017 and is growing and growing. When I started learning about Lean in 2017 the maths library didn’t even have the complex numbers in it, but it did have the real numbers, and all the levels in Real Number World had been solved (computer scientists call this ‘making an API for the real numbers’) so I added the complex numbers myself:

structure complex : Type := 
(re : ℝ) (im : ℝ)

I now had to define addition, subtraction and multiplication and division of complex numbers, and prove basic things such as the fact that multiplication was commutative. All the results I needed about the real numbers were already proved however, so the task was not hard; Basic Complex Number World was a bunch of pretty easy levels. Lean now has a bunch of undergraduate level mathematics, including groups, fields, vector spaces, rings and modules, metric spaces and topological spaces, bilinear maps and so on. The library is growing really quickly. In 2018, I worked with a team of undergraduates at Imperial College London to define schemes in Lean. Schemes were discovered (or invented, depending on your philosophy) by Alexandre Grothendieck in the late 1950s and their discovery heralded the beginning of modern algebraic geometry. Schemes are normally taught at masters or beginning PhD level—but they’re just a definition really, like groups or the natural numbers.

There are lots of levels that still need to be unlocked.

In 2011, the brilliant young German mathematician Peter Scholze discovered the notion of a perfectoid space. Scholze used perfectoid spaces to prove many new mathematical theorems, and in particular he made some great advances in the Langlands Philosophy, a series of conjectures that lie at the heart of modern number theory. Scholze was awarded a Fields medal, one of the highest honours in mathematics, in 2018. In 2019, Patrick Massot, Johan Commelin and I formalised the definition of perfectoid space in Lean. It took well over 10,000 lines of computer code, a substantial part of which involved solving some very cunning levels in Topological Ring World, the computer game version of chapters 2 and 3 of Topologie Générale by Nicolas Bourbaki. Within two years Lean has gone from not knowing what the complex numbers are to knowing some very modern and fashionable mathematics, and one can ask what the future holds for this software. Perfectoid Space World is now unlocked in Lean, and the master puzzler Scholze has set some pretty fiendish levels in there. Unfortunately, we might have to unlock several more worlds before we are powerful enough to solve some of the serious levels in Perfectoid Space World.

Nobody really knows how far things can go. We mathematicians are making a database of theorems. Undergraduates from across the world are contributing. We are teaching Lean a standard undergraduate maths syllabus and then we will teach it more. It’s really good fun and anyone can join in.

Occasionally we mathematicians find that we need a new tactic—for example proving that $(a+b)^3=a^3+3a^2b+3ab^2+b^3$ just from the axioms of a ring is surprisingly difficult! But computer scientists are also involved. We asked the computer scientists if they could make us an “expand out the brackets and tidy up” tactic, and they did. Any mathematical technique that can be explained as an algorithm can be turned into a tactic in Lean.

Computer scientists are interested in writing AI programs that can beat humans at things. Computers can currently beat humans at chess, Go and StarCraft. Will computers one day be able to beat us at mathematics? Machine learning algorithms can learn many things, but they work best with a database. We mathematicians are making this database using Lean.

Software like this can also be used for teaching. If students learnt the language which Lean uses, university lecturers could set homework requiring proofs in Lean, and students could write their solutions in Lean. One advantage of this for the students would be that they would know immediately if their proofs were correct or incorrect, giving them instant feedback. It would also be a neat new way for the students to learn mathematics. The advantages for lecturers like me is that I would not have to mark the work—the computer could mark it for me.

However, if computers do get better than humans at proving theorems, I might be out of a job anyway.

Kevin Buzzard

Kevin is a professor of pure mathematics at Imperial College London. The Xena Project meets to discuss Lean every Thursday evening during term time in the computer room of the maths department at Imperial College; bring a laptop. He is currently writing a book for undergraduate mathematicians about how to learn Lean.

In conversation with Clifford Cocks
We chat to the crypto chief about inventing RSA... but not being able to tell anyone
On √2
Yiannis Petridis connects square roots and continued fractions
Spotlight on: Pamela Harris
Pamela E Harris's story, as told by Talithia Williams
Artificial music
Carmen Cabrera Arnau explores the use of AI in composition
They might not be giants
Angela Brett might not be standing on their shoulders
On the cover: Islamic geometry
Explaining the mathematics of tiling, and the cover of Issue 10