Machine learning 101

So you want to build an artificial intelligence. Maybe you want to automate a repetitive task. Maybe you want to enslave-slash-destroy humanity. Either way, here’s a quick, easy guide to offloading all your thinking from your fallible, squishy brain to your unstoppable silicon automaton.

1. Give it a goal

Exercise for the reader: train your AI to identify this cat

However the automation algorithm works, you need to define a phase space for the machine and its task. This is something like ‘all possible images’ or ‘all legal chess moves’. Within that phase space, you then need to give it an objective. This could be something like ‘identify cats in this image,’ ‘win at chess,’ or ‘kill all humans.’ The more clearly you define the end state, the easier it is to nudge the AI to it.

2. Define a score function

A knight to remember

In order to navigate the phase space on the way to the correct answer, you need to have a way for your AI to distinguish good answers from bad ones – it needs to be able to navigate the phase space by assigning ‘scores’ to different locations in the phase space and comparing them.

Modern chess-playing computers are much better than ones from 20 years ago not just due to increased processing power. They’re also better because of more sophisticated score function algorithms: the modern ones are much better at assessing their current position on the board, and evaluating move options based on how the move improves their position  (its score). It’s not enough to tell the AI to ‘checkmate the opponent’ because for most of a chess game that’s not your goal – your goal is to secure positions on the board and win pieces, and you need to program your AI to understand that.

3. Give it a way to evolve

Machine learning works by assigning scores to various states the bot can be in, and then giving it a way to update its score to inch closer to its goal. One example is a genetic algorithm, inspired by  biological evolution. You start with a ‘population’ of algorithm variants, you assign them scores, you let them ‘mutate’ by randomly changing the algorithms a bit, then if the mutants’ scores are better, you update the population with the mutants.

Another example is a neural network, like the new champion crushing chess AI AlphaGo Zero. Here you create a network of ‘neurons’, and let them ‘light up’ in varying amounts corresponding to certain stimuli – this is the score function. They ‘evolve’ as the connections between the neurons change (they ‘re-wire’) as the network tries to improve its score function. With enough neurons and iterations these networks can behave in complex, smart ways even their creators can’t predict.

4. Throw processing power at it

Buy lots of these

Even if you optimise your algorithm to need as few iterations as possible to achieve its goal, it never hurts to get some time on a supercomputer. AlphaGo wouldn’t have been able to master chess in a matter of hours without Google’s fancy new hardware with unparalleled speed. Likewise, you’ll need a lot of RAM to hack the Pentagon’s nuke codes!

5. Have fun!

As this is just the bare-bones introduction, I hope I’ve piqued your interest enough to learn this all in more detail. It’s a hot topic right now, so there’s a plethora of options for self-teaching the nitty-gritty of it all. Just remember to show me mercy when your unstoppable swarm intelligence AI ends our feeble human civilisation.


Mike Jordan on machine learning

If you’ve had a cursory eye on the news over the past few years, chances are you’ll have noticed that machine learning and artificial intelligence are getting lots of air time. Whether it’s spooky algorithms employed by your favourite social networking site or fears of a robot rebellion, machine learning has never been more present in the public consciousness. To find out more, Chalkdust  attended a lecture by Mike Jordan—undoubtedly the most influential voice on this hottest of hot topics.

Jordan, an award-winning academic who has also advised several machine-learning related companies and developed toolkits that are now the industry standard, is the perfect person to provide a level-headed response to the hype. He has been involved in machine learning for over 30 years and his broad experience gives him authority in both the academic and commercial spheres.

Machine learning is changing how we shop online

Given the recent explosion in news coverage, you might be forgiven for thinking that machine learning is undergoing something of a revolution. From an academic perspective, this is not quite correct. “Really, the techniques have not dramatically evolved over the last twenty years. The introduction to machine learning class I taught in 1999 is still pretty much the same today.” Instead, Jordan suggests that we are being more creative in our applications, with “every big business in the world employing machine learning on some level”. Previously, these methods had been used in the back-end of businesses, for example by Amazon to reduce their levels of credit card fraud and by shipping companies to predict how many of a product would be needed in a certain part of the world. Now, however, machine learning algorithms are in our homes and our pockets in the form of targeted advertising, personalised recommendations and (soon, according to Jordan) domestic robots.

Although this is an exciting development, as your fridge, television and car all become smarter and more personalised, it isn’t without risks. “If Google messes up and shows you a rubbish search result, it’s frustrating. If machine learning is applied to healthcare and it goes wrong, then lots of people could die.” Like all of mathematics, development of the theory is slow. “People are using these models, and we haven’t got good ways of quantifying the error. We don’t know how well these things scale, and algorithms are being applied in areas that they were never intended for.” Jordan also suggests that journalists, hungry for clickbait, are partly at fault. “I’d love to see people write articles who know about the stuff, and have spent time getting involved in it. But at the same time, I realise people want views and clicks.” (Rising to the challenge, we have given a quick primer on machine learning on page 69.)

Don’t expect a robot rebellion soon (or ever)
Image:, CC BY 2.0

So if the pace of change is slow, does that mean we needn’t fear a robot revolution? “Anybody who tells you that we’ll be making machines smarter than we are, doesn’t know what they’re talking about. Not in our lifetime, not for a long time.” Several times during his lecture, Jordan makes the distinction between learning and intelligence. Machine-learning algorithms are simple and dumb, and we have no idea how to achieve the genuine understanding that constitutes intelligence in a human being. As a concrete example, Jordan talks about natural language processing (“the most challenging problem in the field”). “If I make up a word, and say ‘the gretch walked from London to Cambridge in an hour’, you understand lots about this gretch thing. If I add the word ‘gretch’ to a machine learning algorithm, it shifts its knowledge by a miniscule amount.” Thus, flexibility would seem to be the thing missing from machine learning. The ability to improvise, hypothesise and to infer genuine meaning (rather than statistical likelihood) from a sentence are all natural features of human communication that seem impossible from the current perspective of large data sets and finely tuned parameters. Although progress to true intelligence seems like a slow road, Jordan is adamant that machine-learning enabled devices will fill our homes in the near future. “In ten years, I expect most people to own a household robot. It’ll be like having a cellphone, but it will move around.”

Applying machine learning incorrectly to healthcare could affect thousands of lives
Image: US Army photo by Maria Pinel

It’s important to be prepared for these changes, to teach statistics and computer science together and to make sure that people can retrain as the workplace becomes more automated. “In previous tech revolutions, you had thirty years to adapt. This time it’ll be faster.” Jordan is insistent that these advancements are distributed equitably, regularly travelling abroad to spread the word about machine learning, and developing his software as open source, so that anybody can make use of its incredible power. The transformative effect of machine learning is already starting to be felt in China, a country that Jordan has visited several times. Here, banks have traditionally been hesitant about lending money to individual citizens, and access to cash can often be difficult for those living in rural communities. Now, machine learning methods are being applied by Ant Financial to lend money based on people’s online shopping history. Loans are thus easier to come by, and less risky for the lenders, than ever before. In this case it seems like everybody wins, and that the hype around the impact of ‘dumb algorithms’ is justified after all. Just don’t expect to hold a conversation with your toaster any time soon.

Mike Jordan was speaking as part of the G-Research lecture series, in which world experts in science and technology give accessible talks about cutting-edge research.


Domino tiling & domineering

Probably everyone reading this article has, perhaps some time ago, played dominoes with the standard set of 28 pieces or even with the larger set of 55 pieces. But do not fret if you haven’t: we will not be examining these sets, but merely taking the concept of a domino as a couple of squares joined along an edge.

Asking how we may arrange a set of pieces to cover a plane rectangular board leads to an interesting mathematical problem. If we add some orientation restrictions, we can make an interesting two-player game.

Domino tiling

A chessboard with two opposite corners removed

There is a standard problem in many recreational maths books in which two opposite corners of a chessboard are removed and the reader is asked to place dominoes to cover all the remaining squares of the board.

It is, of course, impossible because the two squares removed have the same colour and each of the dominoes must cover one black square and one white square. However, there is no need to introduce this somewhat contrived condition in order to obtain some relatively simple, but
interesting mathematical problems.

First, we ask the following:

Problem 1

How many different ways can eight dominoes be used to cover a $4\times4$ square?

For problem 2, these two tilings are considered the same, as one is a rotation of the other

Alternatively, tilings may be considered the same if one is a rotation of the other, but still regarded as different if they are mirror images. This change in counting the distinct tilings obviously leads to a smaller number due to symmetry, but there is no simple means of calculating the different possibilities as some patterns have two-fold rotational symmetry and others involving squares may have four-fold rotational symmetry. However, it does provide a nice problem to be examined through a systematic `trial-and-error’ approach:

Problem 2

How many different ways can eight dominoes be used to cover a $4\times4$ square, if rotations of the same pattern are excluded?

We now return to including all the rotations, as in problem 1, take a more general $m\times n$ rectangle and ask a similar question. This problem has troubled mathematicians and the full answer is quite complicated. It was solved in 1961 by Kasteleyn in the context of covering a surface with molecular dimers, and independently solved by Temperley and Fisher in the same year. The resulting equation for the total number of domino tilings of an $m\times n$ rectangular board can be shown to be

$$A_{m,n}:=\left(\prod_{k=1}^n\left(\prod_{j=1}^m\left(4\cos\left(\frac{\mathrm{\pi} j}{m+1}\right)^2+4\cos\left(\frac{\mathrm{\pi} k}{n+1}\right)^2\right)\right)\right)^{1/4}.$$

The 5 ways of tiling a $2\times4$ rectangle

Perhaps surprisingly, this product must therefore always be a non-negative integer! Although we may use this expression to calculate the numbers for various rectangles, it is perhaps more interesting to examine a few specific cases.

We begin by looking at the problem for a $2\times n$ rectangle. As you will see, this has a surprisingly familiar answer. The first few cases with small $n$ can be easily worked out and the answers are:

$n$ 1 2 3 4 5 6
$A(n)$ 1 2 3 5 8 13

Let us now assume that we know the answer, $A(n)$, for some small values of $n$. We see that we can generate the ways of tiling a $2\times(n+1)$ rectangle in two different ways: either by adding a vertical domino to one of our ways of tiling a $2\times n$ rectangle; or by adding two horizontal dominoes to one of our ways of tiling a $2\times(n-1)$ rectangle.

Consequently, we may write $A(n+1)=A(n)+A(n-1)$ and, since we already know $A(1)=1$ and $A(2)=2$, we have a recurrence relation that generates the sequence that we saw above, and that you will immediately recognise as the Fibonacci sequence. We suspect that you didn’t expect it to make an appearance in this context.

Of course, it’s not necessary to restrict ourselves to rectangles, so here are a few more shapes to consider:

Problem 3

Which of the following figures are capable of being completely tiled by dominoes?

Returning swiftly to rectangles, we note that the $2\times n$ rectangle has an interesting solution. This immediately raises a further question: is it possible to similarly analyse the $3\times n$ rectangle? This question is more difficult to address as there are more configurations for the dominoes in the first few columns. One approach has been made by Brundan, but to our knowledge no detailed analysis of the complete solution has been previously published. The way of approaching the problem, along similar lines, is to think in terms of the arrangement of dominoes on the left-hand side of the block and to investigate how these define the number of configurations as the further dominoes are placed on the right-hand side.

If $m = 3$, then clearly there are no tilings unless $n = 2k$ is even. Let $S(\hspace{-1pt}k\hspace{1pt})$ denote the number of tilings of a $3\times2k$ rectangular board. Then $S(0) = 1$ (as there’s only one way to tile a $0\times0$ board: do nothing!) and it is also easily seen that $S(1) = 3$.

In fact, by considering the possible configurations of the dominoes in the leftmost $3\times2$ and $3\times 4$ rectangles, it can be shown that the recurrence relation $S(k) = 4S(k-1) – S(k-2)$ holds for all $k > 1$. Thus $S(2) = 11$, $S(3) = 41$, $S(4) = 153$, and so on. This recurrence relation allows one to determine the number of tilings for any value of $k$. It follows that there are 571 ways to tile a $3\times10$ rectangle. Perhaps you’d like to try drawing all of these to check?

If we only count up inequivalent tilings of an $m\times n$ rectangular board (as we did in problem 2, where we regarded two such tilings as being equivalent if one is a rotation of the other), then the calculation becomes even more intricate! For example, if $m = 2$, then the
number of inequivalent tilings is 1 if $n=2$,
&\frac{F\hspace{1pt}(n+1)+F\hspace{1pt}(\tfrac{n+4}2)}2&&\text{if }n>2\text{ and }n\text{ is even, and}\\
&\frac{F\hspace{1pt}(n+1)+F\hspace{1pt}(\tfrac{n+1}2)}2&&\text{if }n\text{ is odd},
where $F(n)$ is the $n$th Fibonacci number. This result, involving an averaging process, is a special case of Burnside’s lemma from group theory (in this case the group involved is the rotational symmetry group of a rectangle). Thus for a $2\times6$ rectangle, there are $F(7) = 13$ tilings but only $\tfrac12(F(7) + F(5)) = 9$ inequivalent tilings.

Two of the 3,335,651 ways of tiling a $4\times15$ rectangle

It is, of course, possible to extend this method to the general case with $m = 4$. It is obviously more difficult to derive the recurrence relation for this condition but it can be done. Various surprising factors occur and the procedure, which is a littleChalkdust website.

Our method gives the answer for a $4\times10$ rectangle as 18,061 and for a $4\times15$ rectangle as 3,335,651. If you disagree with this result please let us know, but please don’t send all the possible tiling patterns to Chalkdust for checking!

The game of domineering

We now move to a two-player game that uses dominoes. It was invented in 1974 by Goran Andersson, who originally called it crosscram, although it is now more widely known as the game of domineering.

In this game, players take turns to place a domino on a rectangular board but one is restricted to placing them in a vertical orientation (player V) and the other (player H) is restricted to placing them in a horizontal position; it is conventional for player V to start with square boards and this causes no restriction as the board can obviously be rotated by 90°. The first person who is unable to play has lost the game.

An example game of domineering on a $5\times5$ board. Player H cannot make another move so has lost this game.

The rules are therefore very simple but the game has a number of interesting aspects that relate to the tiling problem. A quick (around 5 minute) game can be played on a $5\times5$ board in order to learn the tactics. Play on a $6\times6$ board provides some interesting possibilities that can be easily analysed. So, we now present you with problem 4.

Problem 4

With best play, who wins (a) on the $5\times5$ board and (b) on the $6\times6$ board?

The answers to the puzzles will be given in a few weeks’ time on the Chalkdust blog, when we will reveal a little more about this game. If you require a more challenging game to play and analyse, try domineering on an $8\times8$ board…


In conversation with Chris Budd

“Let’s chat any time, I’m fairly free.” Coming from Chris Budd, a professor of mathematics at both the University of Bath and the Royal Institution, as well as a board member of several of the most influential mathematics organisations in the UK, this is somewhat of a surprise. But he is true to his word, and one Friday afternoon we sat down for a conversation with one of the UK’s most experienced voices in mathematics communication.

The University of Bath. Wikimedia commons, CC BY-SA 3.0

A quick glance at Budd’s website reveals, through a CV that runs to 25 pages, the diversity of his interests and professional experiences. At various times, he has advised on setting A-level examinations, held high-ranking positions in professional societies like the Institute of Mathematics and its Applications, directed the Bath Taps into Science festival and been part of the Vorderman Committee, which produced a report in 2011 about recommendations for mathematics education.

Continue reading