post

In conversation with Ulrike Tillmann

They say variety is the spice of life and to us at Chalkdust, maths is life, so it makes sense that maths is made better by variety. A variety of topics, a variety of people, a variety of poorly constructed maths puns. Ulrike Tillmann embodies this ethos with her work bridging the gap between pure and applied maths. Despite spending most of her academic career in the UK, Ulrike has lived in several other countries. She was born in Germany and then went on to study in the US. She is now a professor of pure mathematics at the University of Oxford and a fellow of the Royal Society, balancing her time between research, teaching, and outreach. She sat down with us to chat about her career and what the future holds, both for her and maths in general.

Taking the reins

If you’ve been following maths news in the past few months, the name ‘Ulrike Tillmann’ may be particularly familiar to you. It was announced recently that she will be the next president of the London Mathematical Society, one of the UK’s five ‘learned societies’ for mathematics. She will also take up the mantle as director of the Isaac Newton Institute, a research institute at the University of Cambridge, in autumn of this year. Research institutes are perhaps the least well-known entities in the academic world (as viewed from the outside), often only visited by some of the most senior academics in a field. We asked Ulrike to explain what they are all about. “The Isaac Newton Institute runs mathematical programmes in quite a broad range of areas. These programmes typically run between four and six months and researchers come from all over the world to concentrate on their research.” The programmes are beneficial not only to individual mathematicians, but to the community as a whole. “Being together with your colleagues who are also experts in your area, and who are often completely spread all over the world, is a fantastic thing. It brings the field forward and it can make a big difference to that research area.” On paper, the role of director will involve overseeing the organisation of these programmes, but she sees it going beyond this, including “making sure that things like equality and diversity are not just observed, but also incorporated.”

Continue reading

post

On conditional probability: Cards, Covid, and Crazy Rich Asians

I was watching the film Crazy Rich Asians the other day, as there’s not a lot to do at the moment besides watching Netflix and watching more Netflix. I thoroughly enjoyed the film and would highly recommend it. However, there was something that happened in the first few minutes which really got me thinking and inspired the subject of this Chalkdust article.

The main character in the film is an economics professor (the American kind where you can achieve the title of professor while still being in your 20s and without having to claw your way over a pile of your peers to the top of your research field). Within the opening scenes of the film we see her delivering a lecture, in which she is playing poker with a student, while also making remarks about how to win using ‘mathematical logic’. The bell rings seemingly halfway through the lecture the way it always does in American films and TV shows, and our professor calls out to her students “…and don’t forget your essays on conditional probability are due next week!” Now, I am not going to delve into the question of what type of economics course she is teaching that involves playing poker and mathematical logic, but it got me thinking—what exactly would an ‘essay on conditional probability’ entail?

What is conditional probability?

Conditional probability is defined as a measure of the probability of an event occurring, given that another event (by assumption, presumption, assertion, or evidence) has already occurred. For all intents and purposes here, for two events $A$ and $B$, we’ll write the conditional probability of $A$ given $B$ as $P(A \mid B)$, and define it as \[P(A \mid B) = \frac{P(A\;\text{and}\;B)}{P(B)}.\] The little bar $ \mid $ can just be thought of as meaning ‘given’.

Now that we’ve got some technicalities out of the way, let’s look at some examples of conditional probability. Imagine you are dealt exactly two playing cards from a well-shuffled standard 52-card deck. The standard deck contains exactly four kings. What is the probability that both of your cards are kings? We might, naively, say it must be simply $(4 / 52)^2 \approx 0.59\%$, but we would be gravely mistaken. There are four chances that the first card dealt to you (out of a deck of 52) is a king. Conditional on the first card being a king, there remain three chances (out of a deck of 51) that the second card is also a king. Conditional probability then dictates that: \begin{align*} P(\textrm{both are kings}) &= P(\textrm{second is a king} \mid \textrm{first is a king}) \times P(\textrm{first is a king}) \\ &= \frac{3}{51}\times \frac{4}{52}\approx 0.45\%. \end{align*} The events here are dependent upon each other, as opposed to independent. In the realm of probability, dependency of events is very important. For example, coin tosses are always independent events. When tossing a fair coin, the probability of it landing on heads, given that it previously landed on heads 10 times in a row, is still $1/2$. Even if it lands on heads 1000 times, the chance of it landing on heads on the 1001st toss is still 50%.
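If you want to check that arithmetic yourself, the exact fractions take only a couple of lines of Python (a quick sketch; the variable names are my own):

```python
from fractions import Fraction

# P(first is a king): 4 kings in a 52-card deck
p_first = Fraction(4, 52)
# P(second is a king | first is a king): 3 kings left among 51 cards
p_second_given_first = Fraction(3, 51)

naive = float(p_first) ** 2              # the tempting (wrong) answer
p_both = p_first * p_second_given_first  # the conditional-probability answer

print(round(100 * naive, 2))             # 0.59 (per cent)
print(round(100 * float(p_both), 2))     # 0.45 (per cent)
```

Using `Fraction` keeps the working exact until the final rounding, which is handy when the two answers differ only in the second decimal place.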

Bayes’ theorem

Any essay on conditional probability would be simply incomplete without a mention of Bayes’ theorem. Bayes’ theorem describes the probability of an event, based on prior knowledge of conditions that might be related to the event. It is stated mathematically as:

Bayes’ theorem

$P(A\mid B) = \frac{P(B \mid A)P(A)}{P(B)}.$

We can derive Bayes’ theorem from the definition of conditional probability above by considering $P(A \mid B)$ and $P(B \mid A)$, and using that $P(A\;\text{and}\;B)$ equals $P(B\;\text{and}\;A)$.

A fun (and topical!) example of Bayes’ theorem arises in a medical test/screening scenario. Suppose a test for whether or not someone has a particular infection (say scorpionitis) is 90% sensitive, or equivalently, the true positive rate is 90%. This means that the probability of the test being positive, given that someone has the infection, is 0.9, or $P(\textrm{positive}\mid\textrm{infected}) = 0.9$. Now suppose that this is a somewhat prevalent infection, and 6% of the population at any given time are infected, ie $P(\textrm{infected}) = 0.06$. Finally, suppose that the test has a false positive rate of 5% (or equivalently, 95% specificity), meaning that 5% of the time, if a person is not infected, the test will return a positive result, ie $P(\textrm{positive}\mid\textrm{not infected}) = 0.05$.

Now imagine you take this test and it comes up positive. We can ask, what is the probability that you actually have this infection, given that your test result was positive? Well, \[P(\textrm{infected} \mid \textrm{positive}) = \frac{P(\textrm{positive} \mid \textrm{infected}) P(\textrm{infected})}{P(\textrm{positive})}.\] We can directly input the probabilities in the numerator based on the information provided in the previous paragraph. For the $P(\textrm{positive})$ term in the denominator, this probability has two parts to it: the probability that the test is positive and you are infected (true positive), and the probability that the test is positive and you are not infected (false positive). We need to scale these two parts according to the group of people that they apply to—either the proportion of the population that are infected, or the proportion that are not infected. Another way of thinking about this is considering the fact that \[P(\textrm{positive}) = P(\textrm{positive and infected}) + P(\textrm{positive and not infected}).\] Thus, we have \[P(\textrm{positive}) = P(\textrm{positive} \mid \textrm{infected})P(\textrm{infected}) + P(\textrm{positive} \mid \textrm{not infected})P(\textrm{not infected}).\]

And we can infer all the probabilities in this expression from the information that’s been given. Thus, we can work out that\[P(\textrm{infected} \mid \textrm{positive}) = \frac{0.9\times 0.06}{0.9\times 0.06 + 0.05\times 0.94}\approx 0.5347.\]
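The whole calculation fits in a short Python function (a sketch of my own; the function name is invented for illustration):

```python
def p_infected_given_positive(sensitivity, false_positive_rate, prevalence):
    """Bayes' theorem: P(infected | positive)."""
    # P(positive) via the law of total probability:
    # true positives among the infected plus false positives among the rest
    p_positive = (sensitivity * prevalence
                  + false_positive_rate * (1 - prevalence))
    return sensitivity * prevalence / p_positive

# Scorpionitis: 90% sensitivity, 5% false positive rate, 6% prevalence
print(round(p_infected_given_positive(0.9, 0.05, 0.06), 4))  # 0.5347
```

Writing the denominator out as the sum of the two branches mirrors the reasoning in the paragraph above: a positive result is either a true positive or a false positive.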

In a population with 6% infected by scorpionitis, this test will come back positive (purple shaded background) in about 10.1% of cases.

Unpacking this result, this means that if you test positive for an infection, and if 1 in 17 people in the population (approximately 6%) are infected at any given time, there is an almost 50% chance that you are not actually infected, despite the test having a true positive rate of 90%, and a false positive rate of 5% (compare to the proportion of the shaded area in the diagram filled by infected people). That seems pretty high. Here are some takeaways from this example: the probability that you have an infection, given that you test positive for said infection, not only depends on the accuracy of the test, but it also depends on the prevalence of the disease within the population.

Unprecedented applicability

If 3% are infected with Covid-19 (purple person), a lateral flow test will come back positive (purple shaded background) 2.7% of the time.

Of course, in a real-world scenario, it’s a lot more complicated than this. For something like (and, apologies in advance for bringing it up) Covid-19, the prevalence of infection (our $P(\textrm{infected})$ value) changes with time. For example, according to government statistics, the average number of daily new cases in July 2020 was approximately 667, whereas in January 2021 it was 38,600. Furthermore, $P(\textrm{infected})$ depends on a vast number of factors including age, geographical location, and physical symptoms to name only a few. Still, it would be nice to get a sense of how Bayes’ theorem can be applied to these uNpReCeDeNtEd times.

An article from the UK Covid-19 lateral flow oversight team (catchy name, I know) released on 26 January 2021 reported that lateral flow tests (which provide results in a very short amount of time but are less accurate than the ‘gold standard’ PCR tests) achieved 78.8% sensitivity and 99.68% specificity in detecting Covid-19 infections. In the context of probabilities, this means that \begin{align*} &P(\textrm{positive} \mid \textrm{infected}) = 0.788 \textrm{ and}\\ &P(\textrm{positive} \mid \textrm{not infected}) = 0.0032. \end{align*} On 26 January 2021, there were 1,927,100 active cases of Covid-19 in the UK. Out of a population of 66 million, this gives us a prevalence of approximately 3%, or $P(\textrm{infected}) = 0.03$.

Taking all these probabilities into account, we have \[P(\textrm{infected} \mid \textrm{positive}) = \frac{0.788 \times 0.03}{0.788 \times 0.03 + 0.0032 \times 0.97} \approx 0.8839,\] which means that the chances of you actually having Covid-19, given that you get a positive result from a lateral flow test, is about 88%. This seems pretty good, but can we make this any better?

If the prevalence increases to 80%, you can be much more certain of a positive result, but there are also more false negatives.

Instead of just taking the number of active cases as a percentage of the total population of the UK to give us our prevalence, we can alternatively consider $P(\textrm{infected})$ for a particular individual. For someone who has a cough, a fever, or who recently interacted with someone who was then diagnosed with Covid-19, we could say that their $P(\textrm{infected})$ is substantially higher than the overall prevalence in the country. The article Interpreting a Covid-19 test result in the BMJ suggests a reasonable value for such an individual would be $P(\textrm{infected}) = 0.8$. It’s worth mentioning that this article has a fun interactive tool where you can play around with sensitivity and specificity values to see how this affects true and false positivity and negativity rates. Taking this new value of prevalence, $P(\textrm{infected})$, into account, then \[P(\text{infected} \mid \text{positive}) = \frac{0.788 \times 0.8}{0.788 \times 0.8 + 0.0032 \times 0.2} \approx 0.9990,\vphantom{\frac{a}{\frac{c}{b}}}\] giving us a 99.9% chance of infection given a positive test result, which is way closer to certainty than the previous value of 88%.
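To see how strongly the answer depends on the prevalence we feed in, here is a rough sweep over a few values, using the lateral flow figures quoted above (the helper name `ppv`, for positive predictive value, is my own):

```python
def ppv(sensitivity, false_positive_rate, prevalence):
    # P(infected | positive) via Bayes' theorem
    true_pos = sensitivity * prevalence
    false_pos = false_positive_rate * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Lateral flow test: 78.8% sensitivity, 0.32% false positive rate
for prevalence in (0.03, 0.3, 0.8):
    print(prevalence, round(ppv(0.788, 0.0032, prevalence), 4))
```

The same test jumps from roughly 88% to 99.9% reliability purely because of who is being tested, with the sensitivity and specificity unchanged.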

Can we do any better than this? Well, compared with the lateral flow Covid-19 tests, it has been found that PCR tests (which use a different kind of technology to detect infection) have substantially higher sensitivity and specificity. Another recent article in the BMJ published in September 2020 reported that the PCR Covid-19 test has 94% sensitivity and very close to 100% specificity. In a survey conducted by the Office for National Statistics in the same month, they measured how many people across England and Wales tested positive for Covid-19 infection at a given point in time, regardless of whether they reported experiencing symptoms. In the survey, even if all positive results were false, specificity would still be 99.92%. For the sensitivity and specificity reported in the BMJ article, this is equivalent to having a false negative rate of 6% and a false positive rate of 0%. If we plug these numbers in, regardless of what the prevalence is taken to be, we have: \[P(\textrm{infected} \mid \textrm{positive}) = \frac{0.94 \times P(\textrm{infected})}{0.94 \times P(\textrm{infected}) + 0 \times P(\textrm{not infected})} = 1.\] So when a test has a false positive rate of almost 0%, if you achieve a positive test result, there is essentially a 100% chance that you do in fact have Covid-19.

So what can we take away from this? Well, we have seen that if a test has higher rates of sensitivity and specificity, then the probability of the result being a true positive is also higher. However, prevalence and the value of the probability of infection also play a big role in this scenario. This could be used as an argument for targeted testing only, for example if only people with symptoms were tested then this would increase the probability of the result being a true positive. Unfortunately, it is the case that a large number of Covid-19 infections are actually asymptomatic—in one study it was found that over 40% of cases in one community in Italy had no symptoms. So, if only people with symptoms were tested, a lot of infections would be missed.

Bae’s theorem

$P(\text{Netflix} \mid \text{chill}) = \dfrac{P(\text{chill} \mid \text{Netflix})P(\text{Netflix})}{P(\text{chill})}$

In conclusion, I’m no epidemiologist, just your average mathematician, and I don’t really have any answers. Only that conditional probability is actually pretty interesting, and it turns out you can write a whole essay on it. The ending of Crazy Rich Asians was much better than the ending to this article. Go watch it, if you haven’t already.

post

The maths of Mafia

It was 8pm on a wintry Saturday and I was pleading for my life. “I would never betray you. I promise.” I searched desperately for someone, anyone, to back me up. Of course, they were right. I had been responsible for the murder of many of their friends but I wasn’t about to admit to that. My co-conspirators had gone quiet, well aware that to support me was to put themselves into the firing line. At 8.55pm a vote was held. By 9pm I had been executed.

OK, so that first paragraph may have been a bit misleading. Thankfully I was not actually put to death, and I haven’t killed anyone in real life. This all happened whilst I was playing Mafia. For those unfamiliar, Mafia is a strategy game in which players are (secretly) assigned to be either citizens or mafia. The game is split up into day and night phases (when playing in person, night is simulated by everybody closing their eyes). During the night phase, the mafia are able to communicate with each other and can vote to kill one person. During the day phase, all the residents (both citizens and mafia) discover who died, and then vote to execute one resident. The aim for each group is to eliminate the other.

Some of you may be reading this thinking “Hmmmm this is sus. It sounds very much like Among Us” and you’d be right. The popular online game was inspired by Mafia and is one of many adaptations of the game. The particular version I was playing—when I was outed as a Mafia member—was Harry Potter themed (if there’s one thing you’ll learn about me during this article, it’s that I’m incredibly cool). People found themselves either on Team Hogwarts or Team Death Eater, and there were some special Potter themed rules, which led to an interesting situation mathematically. Continue reading

post

Who is the best England manager?

My quest to answer this simple question began for the noblest of reasons—to win an argument with my wife. We are both football fans and have been following England all our lives (the phrase ‘long-suffering’ has never been more apt). Our house is usually an oasis of calm and tranquillity, but one thing is guaranteed to get things kicking off: was Sven-Göran Eriksson a good England manager?

This year, we have had more time than usual at home together and the discussion has become heated. I believe that Sven took a golden generation of England players and led them to disappointing performances in three major tournaments, while she points to the team reaching the last 8 at consecutive World Cups under his stewardship. I’ve been a maths teacher for over 25 years, so surely I can prove I am correct using maths, right? (At this point, it is definitely not worth mentioning that in the 15 years we’ve been having this row, I never thought of applying any maths to the problem until my wife suggested it.)

How do I prove that I am right? Well, the first, and most obvious, thing to do is to look at the playing record of Sven-Göran Eriksson. He was manager of England from January 2001 until July 2006. In that time the team played 67 games and won 40 of them—a win percentage of 59.7%.

On its own that is a bit meaningless, so we need something to compare it to. It’s time for a spreadsheet.

Walter Winterbottom

I am going to tidy up the data a little though. Firstly, up to 1946, there was no England coach. Even under Walter Winterbottom, the players were selected by committee, so I am going to exclude them. Secondly, caretaker managers like Stuart Pearce or Joe Mercer (or Sam Allardyce who was sacked after one game for ‘reasons’) didn’t have enough games and so I’ll drop them from consideration. That gives us the trimmed list below, ranked by winning percentage.

This shows that Sven had a pretty average record, only just reaching the top half of the table. I was feeling suitably pleased with myself for having come up with such a convincing statistic, only to be shot down with, “Yeah, but a lot of those games were meaningless friendlies.” I mean, you could argue that playing for England in any game is the pinnacle of a footballer’s career, and that international friendlies are always important games, but I decided to look at this as it seemed interesting (and I was confident that it would support my point even more).

Manager Years P W Win %
Fabio Capello 2008–2011 42 28 66.7%
Alf Ramsey 1963–1974 113 69 61.1%
Glenn Hoddle 1996–1999 28 17 60.7%
Ron Greenwood 1977–1982 55 33 60.0%
Sven-Göran Eriksson 2001–2006 67 40 59.7%
Gareth Southgate 2016–2020 49 29 59.2%
Roy Hodgson 2012–2016 56 33 58.9%
Steve McClaren 2006–2007 18 9 50.0%
Bobby Robson 1982–1990 95 47 49.5%
Don Revie 1974–1977 29 14 48.3%
Terry Venables 1994–1996 23 11 47.8%
Graham Taylor 1990–1993 38 18 47.4%
Kevin Keegan 1999–2000 18 7 38.9%
Selected England managers’ full competitive international results (Correct to Nov 2020)

Since we were looking at competitive internationals, I decided to look at the overall results record rather than using just the win percentage. After all, there are three possible outcomes in football and a draw has value (although this value varies with the opponent—a draw against Brazil is generally seen as a fairly decent result, whereas a draw against Greece is not).

To calculate this, I used 3 points for a win and 1 for a draw. This has been the standard across football since the 1980s as it rewards positive play. This may disadvantage the managers from before it was introduced because playing for a draw would have been more profitable in group games and qualifiers, but I feel it is the best of the options available (and I’m trying to prove that Sven was a negative manager and I think this will help me)…
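The conversion from results to a points percentage is simple enough to sketch (a hypothetical helper of my own; the numbers are Sven's competitive record from the table below):

```python
def points_pct(won, drawn, played):
    # 3 points for a win, 1 for a draw, as standard since the 1980s
    points = 3 * won + drawn
    available = 3 * played
    return 100 * points / available

# Sven-Goran Eriksson's competitive record: P38 W26 D9 L3
print(round(points_pct(26, 9, 38), 1))  # 76.3
```

The same function reproduces any row of the table, eg Kevin Keegan's P11 W4 D3 gives 45.5%.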

Manager P W D L F A Win % Pts Pts available Pts %
Sven-Göran Eriksson 38 26 9 3 69 26 68.4% 87 114 76.3%
Fabio Capello 22 15 5 2 54 16 68.2% 50 66 75.8%
Ron Greenwood 26 17 5 4 48 17 65.4% 56 78 71.8%
Roy Hodgson 31 19 9 3 73 18 61.3% 66 93 71.0%
Glenn Hoddle 15 9 3 3 26 8 60.0% 30 45 66.7%
Don Revie 10 6 2 2 22 7 60.0% 20 30 66.7%
Alf Ramsey 33 20 6 7 56 29 60.6% 66 99 66.7%
Gareth Southgate 36 22 6 8 80 29 61.1% 72 108 66.7%
Bobby Robson 43 22 14 7 90 22 51.2% 80 129 62.0%
Terry Venables 5 2 3 0 8 3 40.0% 9 15 60.0%
Graham Taylor 19 8 8 3 34 14 42.1% 32 57 56.1%
Kevin Keegan 11 4 3 4 17 10 36.4% 15 33 45.5%
Selected England managers’ full competitive international results (Correct to Nov 2020). Terry Venables’ stats are decimated here because in the run up to Euro 96, England were only playing friendlies because they had already qualified as hosts.

This did not go well and there was a significant amount of smugness, which I felt was inappropriate and irritating.

To be honest, this is a compelling result and I needed to come back strong if I was to maintain any credibility in this argument. I felt a little disappointed that I had done all that work to prove this important point and it wouldn’t be any use. I was starting to get concerned that manipulating statistics to get the result I wanted was not working, when a thought occurred to me—I might be able to use the Fifa ranking data to demonstrate that Sven-Göran Eriksson’s England team was only able to beat lesser teams and often struggled against higher ranking sides. In short, I chose to take a leaf from the Trump playbook—when you’re in trouble, smear the opposition.

OK, so it’s not classy but, in this case, I think it is a valid point to explore. Were most of Sven’s competitive games against weaker opposition? This is a possibility because qualifiers and group games are seeded, and so England would be facing so-called lesser teams. For example, let’s consider the 2006 World Cup qualifying group.

Team Pld Pts Ranking
England 10 25 9
Poland 10 24 23
Austria 10 15 72 (=)
Northern Ireland 10 9 101
Wales 10 8 72 (=)
Azerbaijan 10 3 113
The final positions and 2005 Fifa world rankings of Uefa group 6 in the 2006 World Cup qualifying.

Only Poland finished ranked in the world’s top 50 international teams, which supports my contention that England were flat-track bullies under Sven. But this raised two interesting (in my opinion) questions:

  1. How are the Fifa rankings calculated?
  2. How can I use them to win this argument?

The Fifa ranking system

The Fifa ranking system was introduced in December 1992, and initially awarded teams points for every win or draw, like a traditional league table. However, Fifa quickly (five years later) realised that there were many other factors affecting the outcome of a football match and, over time (over twenty years), moved to a system based on the work of the Hungarian–American mathematician Árpád Élő, more on him in a moment. (I mean, why use an established and respected system when you can faff about making your own useless one? To be fair, the women’s rankings have used a version of the Elo system since their inception, which may make Fifa’s unwillingness to use it for the men even stranger.)

The Fifa rankings are not helpful to me because they don’t cover all the managers I’m considering, and because their accuracy, reliability and the many methods used to generate them have frequently been questioned. Luckily, football fans have had these arguments before and there is an Elo ranking for all men’s international teams, which has been calculated back to the first international between England and Scotland in 1872 (a disappointing goalless draw).

Competitors in an esports event

The Elo rating system compares the relative performance of the competitors in two-player games. Although it was initially developed for rating chess players, variations of the system are used to rate players in sports such as table tennis, esports and even Scrabble. Strictly speaking, we should be saying an Elo system, rather than the Elo system, as each sport has modified the formula to suit its own needs.

So how does an Elo system calculate a ranking? Well, at the most basic level, each team has a certain number of points and at the end of each game, one team gives some points to the other. The number of points depends on the result and the rankings of the two teams. When the favourite wins, only a few rating points will be traded, or even zero if there is a big enough difference in the rankings (eg in September 2015, England beat San Marino 6–0, but no Elo points were exchanged). However, if the underdog manages a surprise win, lots of rating points will be transferred (for example, when Iceland beat England at Euro 2016, they took 40 points from England). If the ratings difference is large enough, a team could even gain or lose points if they draw. So teams whose ratings are too low or too high should gain or lose rating points until the ratings reflect their true position relative to the rest of the group.

But how do you know how many points to add or take away after each game? Elo produced a formula for this, but there is a bit of maths—brace yourself.

Firstly, Elo assumed that a team would play at around the same standard, on average, from one game to the next. Sometimes they would play better or worse, but those performances would be grouped towards the average. This is known as a normal distribution (Elo actually uses a logistic distribution rather than the normal, but the differences are small—I mean, what’s a couple of percent between friends?) or bell curve, where outstanding results are possible but rare. In the graph below, the $x$-axis represents the level of performance, and the $y$-axis shows the probability of that level occurring. So, we can see that the chance of an exceptional performance is smaller than that of an unremarkable one, and the bulk of games will show a middling level of skill.

This means that if both teams perform to their standard, we can predict an expected score, which Elo defined as their probability of winning plus half their probability of drawing. Because we do not know the relative strengths of both teams, this expected score is calculated using their current ratings and the formulas \begin{align*} E_A&=\frac1{1+10^{(R_B-R_A)/400}} & &\text{and} & E_B&=\frac1{1+10^{(R_A-R_B)/400}}. \end{align*}

The expected score for a range of differences in team ratings.

In these formulas, $E_A$ and $E_B$ are the expected results for the teams, and $R_A$ and $R_B$ are their ratings. If you plot a graph of the $E$ values for different values of $R_A-R_B$ you get the graph shown to the left.

It’s interesting (again, interesting to me) to note the shape of this graph, which is a sigmoid, a shape that anyone who has drawn a cumulative frequency graph for their GCSE maths will recognise. It is an expression of the area under the distribution (ie the cumulative distribution function). The graph shows that if the difference between ratings is zero, the expected result is 0.5. The system uses values of 1 for a win, 0.5 for a draw and 0 for a loss, so this suggests a draw is the most likely outcome. And if the difference is 380 in your favour, the expected score is 0.9, which suggests you are likely to win (an $E_A$ of 0.9 doesn’t necessarily mean you’ll win 90% of the games and lose the rest, as other combinations also give an expected score of 0.9: for example, winning 80% and drawing the rest, or winning 85%, drawing 10% and losing 5%, gives the same value). The system then compares the actual result to the expected outcome and uses a relatively simple calculation (honestly, it’s easier than it looks) to calculate the number of points exchanged: $$R_A'=R_A+K(S_A-E_A).$$ In this equation, $R_A'$ is the new rating for team A, $S_A$ is the actual result of the game, and $K$ is a scaling factor. We’ll come back to $K$ in a moment.

Recently, England (rating 1969) played Belgium (rating 2087) at the King Power stadium in Leuven, Belgium. It is generally thought that the home team is at an advantage, and to reflect this the home team gets a bonus 100 points added to their rating, which means there is a 218-point difference between the teams. England are clear underdogs, and we can calculate the expected result as follows: $$E_A=\frac1{1+10^{(2187-1969)/400}}\approx0.22.$$ This shows that this will be a tricky game for England, and a draw would be a good result. Unfortunately, England lost the game 2–0, an $S_A$ of 0 (still using 1 for a win, etc). Therefore we can calculate the rating change using the formula: $$R_A'=1969+K(0-0.22).$$

Now we need to understand the $K$ value. In simple terms, the bigger the $K$ value we use, the more the rating will change with each result. We need to choose a suitable value so that it isn’t too sensitive, which would lead to wild swings, but still allows teams to change position when they start to improve. The world football Elo rankings adjust the $K$ value depending on the score and the competition. In our example, which was a Nations League game (a new competition between European teams with similar Fifa rankings), the base value for $K$ is 40. This is multiplied by 1.5 for a win by two clear goals, giving a $K$ value of 60 and a new rating of $$R_A'=1969+60(0-0.22)\approx1956.$$

This is a change of $-13$ points, and so Belgium would change by $+13$ points to a new rating of 2100.
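The whole update can be sketched in a few lines of Python (helper names are my own; the ratings and $K$ value are those from the England–Belgium example above):

```python
def expected_score(rating_a, rating_b):
    # Elo expected score for team A against team B
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))

def new_rating(rating, expected, actual, k):
    # actual: 1 for a win, 0.5 for a draw, 0 for a loss
    return rating + k * (actual - expected)

# England (1969) away to Belgium (2087, plus 100 for home advantage)
e_england = expected_score(1969, 2087 + 100)
print(round(e_england, 2))                        # 0.22
# England lost 2-0, so actual = 0; K = 40 * 1.5 = 60 for a two-goal win
print(round(new_rating(1969, e_england, 0, 60)))  # 1956
```

Because the exchange is zero-sum, Belgium's update is the mirror image: their expected score is $1 - 0.22 = 0.78$, and they gain the 13 points England lose.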

Although I have focused on the world football Elo rankings, the Fifa rankings now use a system which is basically similar, with slight variations in the weightings and allowances. This brings me to the second, and more important part, of the question: can I use this to prove that I’m right?

Unfortunately, this explanation shows that you can only use this type of ranking, whether it’s the Elo or the Fifa system, to compare teams that were playing at the same time. This means that trying to use it to compare managers across different eras is pointless. You can’t compare the performance of Alf Ramsey’s England with that of Steve McClaren using the Elo rankings, because the system is not designed to do that.

What can I do?

I can, however, use a similar idea—looking at England’s performance against differently rated teams—to judge Sven.

To achieve this, I’ve collated all of England’s results in competitive games under Sven and used some spreadsheet magic to create the tables shown to the right. (Do not ask how long this took.)

P W D L F A Win % Points %
11 4 3 4 18 10 36.4% 45.5%
27 22 4 1 51 15 81.5% 86.4%
Performance of England under Sven-Göran Eriksson against teams in the top 20 (top row) and teams outside the top 20 (bottom row). The full data is included in the Appendix.

This is conclusive (it is—just trust me on this). Under Sven-Göran Eriksson, England were brilliant—if the team they were playing were outside the top twenty. Against good teams, England were awful. For comparison, in the 2020–21 season, Manchester United had a win percentage of 63.2% and a points percentage of 70.2%. On the other hand, Chelsea had a win percentage of 42.1% and a points percentage of 50.9% (based on results up to 27 January 2021), and they sacked their manager.

I can finally conclude that I was right. Sven was a rubbish manager who was worse than Frank Lampard.

Appendix – Sven-Göran Eriksson’s competitive match record

Date Opponent Competition Opponent FIFA Ranking In Top Twenty? For Against Result
24/03/01 Finland WCQ 57 0 2 1 W
28/03/01 Albania WCQ 75 0 3 0 W
06/06/01 Greece WCQ 55 0 2 0 W
01/09/01 Germany WCQ 14 1 5 1 W
05/09/01 Albania WCQ 66 0 2 0 W
06/10/01 Greece WCQ 24 0 2 2 D
02/06/02 Sweden WCF (group) 19 1 1 1 D
07/06/02 Argentina WCF (group) 3 1 1 0 W
12/06/02 Nigeria WCF (group) 27 0 0 0 D
15/06/02 Denmark WCF (2nd Round) 20 1 3 0 W
21/06/02 Brazil WCF (1/4 Final) 2 1 1 2 L
12/10/02 Slovakia ECQ 45 0 2 1 W
16/10/02 Macedonia ECQ 90 0 2 2 D
29/03/03 Liechtenstein ECQ 151 0 2 0 W
02/04/03 Turkey ECQ 7 1 2 0 W
11/06/03 Slovakia ECQ 53 0 2 1 W
06/10/03 Macedonia ECQ 86 0 2 1 W
10/10/03 Liechtenstein ECQ 143 0 2 0 W
10/11/03 Turkey ECQ 8 1 0 0 D
13/06/04 France ECF (Group) 2 1 1 2 L
17/06/04 Switzerland ECF (Group) 47 0 3 0 W
21/06/04 Croatia ECF (Group) 25 0 4 2 W
24/06/04 Portugal ECF (1/4 Final) 20 1 2 2 L
04/09/04 Austria WCQ 71 0 2 2 D
08/09/04 Poland WCQ 29 0 2 1 W
09/10/04 Wales WCQ 57 0 2 0 W
13/10/04 Azerbaijan WCQ 114 0 1 0 W
26/03/05 Northern Ireland WCQ 111 0 4 0 W
30/03/05 Azerbaijan WCQ 116 0 2 0 W
03/09/05 Wales WCQ 83 0 1 0 W
07/09/05 Northern Ireland WCQ 116 0 0 1 L
08/10/05 Austria WCQ 73 0 1 0 W
12/10/05 Poland WCQ 24 0 2 1 W
10/06/06 Paraguay WCF (group) 33 0 1 0 W
15/06/06 Trinidad & Tobago WCF (group) 47 0 2 0 W
20/06/06 Sweden WCF (group) 16 1 2 2 D
25/06/06 Ecuador WCF (2nd Round) 39 0 1 0 W
post

How to be the least popular American president

Some people say the US presidential election system is unfair, since one candidate can win the popular vote (that is, receive more votes than any other candidate) but still fail to win the election. In other words, the popular-vote margin is irrelevant to the outcome: if the extra votes were never counted, the result would be the same. This is a consequence of how the electoral system is designed: the presidency is not determined by the popular vote, but by a system called the electoral college, which distributes 538 electoral college votes among the 50 states and DC.

Electoral college votes correspond to seats in Congress, plus three additional votes for DC. Image: Martin Falbisoner, CC BY-SA 3.0

A state’s electoral votes are equal to the number of representatives and senators the state has in Congress. House seats are apportioned based on population and so reflect a state’s size, but the extra two Senate seats per state give smaller states more power in an election. The electoral college is supposed to guarantee that populous states can’t dominate an election, but it also sets up a disparity in representation, giving voters in smaller states disproportionate weight. As a result, it has happened five times since the founding of the republic that a president has won an election without winning the popular vote. Let me invite you to a thought experiment on the implications of such a system in an extreme scenario.
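To see the mechanism in miniature, here is a toy sketch (every number in it is invented) of how narrow wins in many small states can beat a landslide in one big state:

```python
# Toy model: nine small states of 1,000,000 people with 1 House seat
# each, and one big state of 10,000,000 people with 13 House seats.
# Winner takes all within each state, as in most real states.
small_states = 9
big_state_pop = 10_000_000

def electoral_votes(house_seats):
    return house_seats + 2  # two senators per state

# Candidate A wins each small state 50.1% to 49.9%
# and gets no votes at all in the big state.
a_popular = small_states * 501_000                   # 4,509,000
b_popular = small_states * 499_000 + big_state_pop   # 14,491,000
a_college = small_states * electoral_votes(1)        # 9 * 3 = 27
b_college = electoral_votes(13)                      # 15

# A loses the popular vote by ten million but wins the presidency.
print(a_popular < b_popular and a_college > b_college)  # → True
```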

Continue reading

post

Significant figures: John Conway

A game of Sprouts starting with five dots—exactly one more move is possible.

My most memorable encounter with the work of late mathematician John Horton Conway came from a friend of mine I met as a first year graduate student. As we sat across from each other in the department common room, each having made little progress with our research, he slid me a piece of paper with five dots drawn on it. This game, he explained, consisted of us each taking turns to draw a line between any two dots, with the midpoint of the line we drew then counting as an additional dot. Although the lines could bend in any direction, they were not allowed to intersect each other, and each dot could join at most three line segments. The game was over when one player could not make any more moves, and the other player was declared the winner. At first, I was quickly defeated, and I spent quite some time trying to come up with the best strategies against my skilled opponent.

The game that we spent our lunchtime playing was Sprouts, invented by Conway and his friend Michael Paterson during their time at the University of Cambridge, and was later popularised by Martin Gardner in his Scientific American column Mathematical Games. Conway is perhaps best known for his interest in games: he invented many, and the two books On numbers and games and Winning ways for your mathematical plays (the latter written with Elwyn Berlekamp and Richard Guy) include detailed analyses of many two-player games. He was a regular contributor to Gardner’s column, and was a major figure in the world of recreational mathematics in his own right.
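A standard observation about Sprouts (sketched here from the rules above; it isn’t in the article itself) is that a game can’t go on forever, however the lines bend. Counting ‘lives’ makes this precise:

```python
def max_moves(n_dots):
    """Upper bound on the number of moves in Sprouts with n starting dots.

    Each dot can meet at most three line ends, so it starts with 3
    'lives'. A move uses up two lives (one at each end of the new line)
    but the new dot it creates has one life left (two of its three are
    taken by the line through it), so each move costs one life overall.
    The final move always leaves at least one life on the new dot, so
    of the 3n starting lives at most 3n - 1 can ever be spent.
    """
    return 3 * n_dots - 1

print(max_moves(5))  # → 14: our five-dot lunchtime games were mercifully finite
```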

Continue reading

post

Surfing on wavelets

High-speed internet and digital storage keep getting cheaper, yet sharing large files remains a challenge for anybody who spends their time working on computers. Digital images in particular can be a pain if they lead to long loading times and high server costs. If you have ever seen an image on the internet, then you have certainly encountered the JPEG format, because it has been the web standard for almost 30 years. Chances are, however, that you have never heard of its potential successor, JPEG2000, even though it recently celebrated its 20th anniversary. If so, that is unfortunate, because it produces much better results than its predecessor. Continue reading

post

Diagrammatic algebra: On the road to category theory

As trends go, diagrammatic algebra has taken mathematics by storm. Appearing in papers on computer science, pure mathematics and theoretical physics, the concept has expanded well beyond its birthplace, the theory of Hopf algebras. Some use these diagrams to depict difficult processes in quantum mechanics; others use them to model grammar in the English language! In algebra, such diagrams provide a platform to prove difficult ring theoretic statements by simple pictures.

As an algebraist, I’d like to present you with a down-to-earth introduction to the world of diagrammatic algebra, by diagrammatising a rather simple structure: namely, the set of natural numbers! At the end, I will allude to the connections between these diagrams and the exciting world of higher and monoidal categories.

Now—imagine yourself in a lecture room, with many others as excited about diagrams as you (yes?!), plus a cranky audience member, who isn’t a fan of category theory, in the front row:

What we would like to draw today is the process of multiplication for the natural numbers. In its essence, multiplication, $\times$, takes two natural numbers, say 2 and 3, and produces another natural number… Continue reading

post

Chalkdust issue 13 – Coming 1 May

The cover of issue 13

Exciting news: Chalkdust issue 13 will be released at 11am on Saturday 1 May!

We’re really looking forward to letting you read this issue: there’s some really great stuff in it, such as articles about ranking England managers, Jpeg compression, and an interview with Ulrike Tillmann, as well as all your favourite regulars.

The cover of issue 13 features a picture created by an elementary cellular automaton. To help us celebrate the release of issue 13, we are asking our readers to spend some time over the launch weekend creating their own automata-inspired artwork.

Your artwork can take any form you like: you could do a bit of digital art, a painting, a sculpture, or you could even retile your bathroom following the rules of your favourite automaton. If you’ve not learned much about automata yet, then you can read our On the cover article about them on Saturday for inspiration.
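If you fancy experimenting before then, here’s a minimal sketch of an elementary cellular automaton in Python (rule 30 is an arbitrary choice here, not necessarily the rule on the cover):

```python
def step(cells, rule=30):
    """One step of an elementary cellular automaton.

    Each new cell depends on the three cells above it; the binary
    digits of the rule number (0-255) say which of the 8 possible
    neighbourhoods produce a live cell. The row wraps around.
    """
    n = len(cells)
    return [
        (rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

# Start from a single live cell and print a little triangle of history:
row = [0] * 15
row[7] = 1
for _ in range(8):
    print(''.join('█' if c else ' ' for c in row))
    row = step(row)
```

Swap in your favourite of the 256 rule numbers and a bigger grid, and you have the raw material for some bathroom tiling.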

You can share your artwork with us on Twitter using the hashtag #chalkdust13, or email it to us at contact@chalkdustmagazine.com. If we really like it, we might even send you a Chalkdust T-shirt.

You can still order physical copies of issues 11 and 12 here, and preorders of issue 13 will be available shortly.

post

How can we tell that 2≠1?

What if I told you that $2=1$? You may say I’m wrong. OK, well, what if I proved it to you? We can both agree that there’s an $x$ and a $y$ where $x = y$. From there, multiply, subtract, factorise, divide, substitute, divide again, and you get $2 = 1$.

Still not happy? You’re probably unconvinced by my so-called ‘proof’. OK, I say, and, after a minute, hand you a sheet of paper with the following hastily scrawled on it: It’s better, but you’re still displeased. This time, I’ve made clear what steps I’m taking from $x = y$ to $2 = 1$. However, you point out, I don’t connect any of these steps. Nodding slowly, I take my time and write out a very nice, orderly proof, complete with justifications for each line:
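Written out in full (reconstructing the steps listed earlier: multiply, subtract, factorise, divide, substitute, divide again), the proof runs:

```latex
\[
\begin{array}{rll}
1. & x = y                    & \text{given}\\
2. & x^2 = xy                 & \text{multiply both sides by } x\\
3. & x^2 - y^2 = xy - y^2     & \text{subtract } y^2\\
4. & (x+y)(x-y) = y(x-y)      & \text{factorise}\\
5. & x+y = y                  & \text{divide by } x-y\\
6. & y+y = y                  & \text{substitute } x = y\\
7. & 2 = 1                    & \text{divide by } y
\end{array}
\]
```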

At this point, you spot my mistake: in going from line 4 to 5, I have divided both sides by $x-y$. But we began with the assumption that $x = y$, meaning that $x-y = 0$, and dividing by 0 is not defined! This means that lines 5 to 7 are operating on nonexistent values and are therefore meaningless.

You’re happy with yourself, but something is bothering you. To reveal my mistake, you asked me to be more precise. But why stop here? Because you found what you were looking for? That’s not how truth is found.

My proof, like all proofs, is a path from one statement to another, just as we may follow the path from $ax^2 + bx + c = 0$ to $x = \big(-b \pm \sqrt{b^2-4ac}\big)/(2a)$, or from the existence of rectangles to the transitivity of parallelism (see below). Along this path I have made several intermediate statements, and linked them together with justifications. You found that one of my links is flawed, and you wonder how we know that the others aren’t also wrong. You begin to question foundational principles, wondering, for instance, why we’re even allowed to do the same thing to both sides of an equation.

Euclidean geometry: For the unfamiliar—Euclidean geometry (standard geometry on a flat surface) rests on 5 assumptions, one of which (the parallel postulate) has historically been regarded as ugly. In attempting to eliminate the parallel postulate, mathematicians have found numerous other statements that are equivalent to it, such as that a rectangle exists or that parallelism is transitive.

You keep digging deeper and deeper, questioning more and more of what you previously took to be correct. Eventually, you come across a piece of mathematics that is perhaps the most beautiful and elegant thing you’ve ever laid your eyes upon: natural deduction.

Natural deduction

Natural deduction is one result of asking for deeper and deeper justification when doing maths. A system of natural deduction is a set of very simple, almost irrefutable rules that act to formalise our intuition about what is definitely true.

Reiteration (R): if $P$, then $P$.

These rules include such things as reiteration, which simply allows us to repeat ourselves. Precisely, reiteration says that if you know that a statement $P$ is true, then you can conclude that $P$ is true. This is hardly controversial.

Conjunction introduction ($\land$I): if $P$ and $Q$ then $P\land Q$.

There are two rules for the natural idea of ‘and’. First is the so-called conjunction introduction rule, stating that if you know that $P$ and $Q$ are both true, then you may conclude $P \land Q$, pronounced ‘$P$ and $Q$’. On the other side, we have conjunction elimination, stating that if you know that $P \land Q$ is true, then you may conclude $P$ and also may conclude $Q$.

Conjunction elimination ($\land$E): if $P\land Q$, then $P$ and $Q$.

These rules don’t feel like they do much besides swapping out ‘and’ for ‘$\land$’; however, doing so is important for formality and precision.

Disjunction introduction ($\lor$I): if $P$, then $P\lor Q$.

Things start to get tricky with the rules codifying ‘or’. The first, disjunction introduction, tells us that if $P$ is true, then you may conclude $P \lor Q$, pronounced ‘$P$ or $Q$’: if I am hungry, then it’s also true that I’m either hungry or tired.

Disjunction elimination ($\lor$E): if $P\lor Q$ and from $P$ we can prove $X$ and from $Q$ we can prove $X$, then $X$.

The second rule, disjunction elimination, states that if $P \lor Q$ is true, and from $P$ you can prove $X$, and from $Q$ you can prove $X$, then you may conclude $X$. More colloquially, if either $P$ or $Q$ is true, and in both cases $X$ is true, too, then $X$ is true. For example, if I’m either well-rested or well-fed, and being well-rested makes me happy, and being well-fed makes me happy, then I must be happy.

Implication introduction ($\Rightarrow$I): if from $P$ we can prove $Q$, then $P\Rightarrow Q$.

Then come the rules regarding implication. We have implication introduction, stating that if from $P$ we can prove $Q$, then we may conclude $P \Rightarrow Q$, pronounced ‘$P$ implies $Q$’. And we have implication elimination (also known as modus ponens), which states that if $P \Rightarrow Q$ is true and $P$ is true, then we can conclude $Q$. If the weather being rainy implies that I am cosy, and the weather is rainy, then I must be cosy.

Implication elimination ($\Rightarrow$E): if $P\Rightarrow Q$ and $P$, then $Q$.

Finally, we come to the most arcane rules, those handling negation. The negation of $P$ is written $\neg P$ and pronounced ‘not $P$’. Before talking about the $\neg P$ rules, however, we must first introduce a new symbol: $\bot$ (pronounced ‘bottom’), which represents impossibility or contradiction. We can then introduce bottom introduction, which states that if both $P$ and $\neg P$ are true, which is absurd (usually, at least: there are systems of logic, called paraconsistent logics, that admit both $P$ and $\neg P$ at the same time), then we can conclude $\bot$, to represent this impossibility.

Bottom introduction ($\bot$I): if $P$ and $\neg P$, then $\bot$.

We’re then able to make use of $\bot$ through negation introduction, which states that if from $P$ we can prove $\bot$, then we can conclude $\neg P$. This is reasonable; if $P$ being true led to a contradiction, then $P$ isn’t true, so $\neg P$ is.

Negation introduction ($\neg$I): if from $P$ we can prove $\bot$, then $\neg P$.

Finally we have negation elimination. This one is a nice easy way to end: it says that if we know $\neg \neg P$, then we can conclude $P$. If something isn’t not true, then it must be true!

Negation elimination ($\neg$E): if $\neg\neg P$, then $P$.

And with that, we have completed (one kind of) natural deduction, laying out a framework for proofs based on undeniable principles so that we can be completely confident in our results.
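Because every rule above is a claim about truth, we can sanity-check the list mechanically. The sketch below covers the rules that don’t involve subproofs, modelling $P \Rightarrow Q$ as the classical ‘not $P$, or $Q$’ and checking plain truth tables rather than doing any real proof theory; it confirms that none of them can take us from true premises to a false conclusion:

```python
from itertools import product

# Each rule says: whenever all its premises are true, so is its
# conclusion. We check this over every truth assignment to P and Q.
rules = {
    'reiteration':  (lambda P, Q: [P],               lambda P, Q: P),
    'and-intro':    (lambda P, Q: [P, Q],            lambda P, Q: P and Q),
    'and-elim':     (lambda P, Q: [P and Q],         lambda P, Q: P),
    'or-intro':     (lambda P, Q: [P],               lambda P, Q: P or Q),
    'implies-elim': (lambda P, Q: [(not P) or Q, P], lambda P, Q: Q),
    'neg-elim':     (lambda P, Q: [not (not P)],     lambda P, Q: P),
}

for name, (premises, conclusion) in rules.items():
    ok = all(
        conclusion(P, Q)
        for P, Q in product([True, False], repeat=2)
        if all(premises(P, Q))
    )
    print(f'{name}: {"truth-preserving" if ok else "NOT truth-preserving"}')
```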

Now, you may be wondering, hey, maths is about numbers and shapes and functions and vector fields, but all we’ve been working with are $P$s and $Q$s! Not a single $n$ or $x$, let alone an $f$, has been written so far!

Fear not! Purely logical systems such as natural deduction are a key ingredient for building typical maths. For example, to define numbers, we may first extend to predicate logic, then construct the naturals (via the Peano axioms), which we’ll use to make the integers and the rationals (via equivalence classes), and finally the reals (via Dedekind cuts).

So, in fact, we still get to work with all the maths we’re used to! Plus, thanks to natural deduction, we have the added benefit of being confident about what we’re doing at every layer of abstraction!

Predicate and propositional logic: The logic we’ve been building, with $\land$, $\lor$, $\Rightarrow$, $\neg$, and $\bot$, is known as propositional or zeroth-order logic. Predicate or first-order logic is an extension of propositional logic wherein our statements ($P$, $Q$, $X$, etc) may be parametrised. So as well as having $H$ mean that ‘I am hungry’, we may also have $\mathcal H(x)$ mean that ‘$x$ is hungry’. Additionally, predicate logic includes two quantifiers, $\forall$ and $\exists$, which respectively mean ‘for every’ and ‘there exists’: $\forall x \mathcal H(x)$ means that everyone is hungry, and $\exists x \mathcal H(x)$ means that (at least) one person is hungry.

So what?

If you’re anything like I was at age 17, or anything like how I portrayed you in the beginning of this article, you’re drooling right now. It’s like all of your fantasies regarding rigour and precision have been heard and answered by divine mathematicians.

But maybe you’re not intrinsically motivated by rigour, so you’re less excited by natural deduction. Which is fine! I’m not hurt. Maybe a little bit. Or maybe you just feel that this is overkill—did you really need all this work to know that $2 \neq 1$? Or maybe you’re not convinced that these rules are correct; perhaps you don’t agree that from $\neg \neg P$ we can conclude $P$.

Excluding the middle: If you don’t agree, you are not alone! That $\neg \neg P$ entails $P$ is a consequence of a rule called the law of excluded middle, which states that $P \lor \neg P$. (This law is built into the system of natural deduction that we created.) Some mathematicians (the intuitionists or constructivists) reject the law of excluded middle, thus also forfeiting that $\neg \neg P$ entails $P$. One reason to question the law of excluded middle is that it allows us to state that something exists without saying what it is. For instance, we are able to prove that an irrational number raised to the power of an irrational number can be rational, but without giving an actual example. If we reject the law of excluded middle, then all such proofs must actually construct an example.
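The classic argument behind that example uses the law of excluded middle directly. Either $\sqrt{2}^{\sqrt{2}}$ is rational, in which case we are done, or it is irrational, in which case

```latex
\[
\left(\sqrt{2}^{\sqrt{2}}\right)^{\sqrt{2}}
  = \sqrt{2}^{\,\sqrt{2}\cdot\sqrt{2}}
  = \sqrt{2}^{\,2}
  = 2
\]
```

is an irrational number raised to an irrational power that is rational. Either way such a pair exists, yet the proof never tells us which of the two it is, and that is exactly the sort of thing the intuitionists object to.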

Still, I posit, natural deduction is worth your time. Because we’ve been so rigorous in building the system up, we gain the benefit of knowing exactly what we’re talking about. Before establishing such precision, we may have used $P \Rightarrow Q$, but without a sense of what, exactly, it really means. Now we have a precise definition: it means that from $P$ we can derive $Q$ (as per implication introduction); and it means that if we have $P$ then we can conclude $Q$ (as per implication elimination); and it means nothing else.

From this precision, we reap at least two amazing things: metamathematics and computers.

For one, we can now dip our toes into the metamathematical branch of proof theory, where we prove things about proof systems. For instance, we may wonder if natural deduction—or any proof system—is complete, meaning, roughly, that any question we can ask within the system can be answered by the system. Likewise, we may wonder if it’s consistent, meaning we can never prove $\bot$. Or perhaps it’s both? (Interestingly, by Gödel’s incompleteness theorem, no consistent logical system capable of sustaining mathematics can also be complete.) Proof theory is full of fascinating and surprising results, all enabled by being very precise about what we’re talking about.

@mathsproofbot and @mathstableaubot both prove the true statements that @mathslogicbot tweets.

Additionally, with our newfound precision, we get to enlist computers! Because computers generally demand the utmost precision in order to be of any use, they aren’t of much help until we achieve such rigour. Now, however, we are able to get their help writing proofs, using proof assistants such as Coq and Lean, or interactive proof-writing systems such as my own (see maynards.site/items/fitch/full). There are even programs that can write proofs entirely for us, as exemplified by @mathsproofbot and @mathstableaubot.

So let us return to my claim that $2=1$. How can we reject this? ‘By intuition’ is the easiest way: clearly $2$ is not $1$. However, now we may also turn to our system of natural deduction, where we were very careful about what we took to be true, and point out that $2 = 1$ is not true in this system. To exemplify this, we can show that it will be rejected by proof assistants and proof-writing algorithms. Finally, we may rest confident. That is, until we dig deeper once again, questioning the principles of our system of natural deduction…