Counting caterpillars

Peter Rowlett uses combinatorics to generate caterpillars


We bought a new toy for our three-year-old, a robot caterpillar. This has a head and eight segments which tell the caterpillar to perform actions like moving and playing music. A caterpillar is formed by the head, always at the front, followed by at least one segment. These form an ordered set of commands which are executed in sequence.

This is a fun toy which principally involves colourful lights and music, while also developing some logical thinking and proto-programming skills in 3– 5-year-olds. But that’s not why I’m writing about it.

On the packaging is made the following claim, acting for me like a red rag to a bull:

“8 pieces. Endless combinations!”

A robot caterpillar

This cannot possibly be true. You cannot arrange eight pieces endlessly and never repeat the same combination. This isn’t how infinity works.

My son is quite into counting, though he becomes a little inconsistent around “five-teen” so perhaps swapping segments into place and counting them isn’t the way to go about this. Thankfully, mathematics has an answer: combinatorics.

Some combinatorics fundamentals

Imagine you have three tiles containing letters A, B and C, and you want to know how many different ways you can arrange these.

Six rearrangements of three different tiles labelled A, B and C.

Notice that there are three options for which tile to place first: A, B or C. Once one of these is chosen, there are two options for which tile to place second. In the first two rearrangements A is placed first leaving two options for which to place second: B or C. Once either is chosen, there is one option remaining for the third and final tile.

This means that there are $3! = 3 \times 2 \times 1$ options for arrangements of three different tiles. This logic continues for any number of tiles, indeed any number of objects. The number of ways of arranging $n$ objects is

$$n\hspace{1pt}!=n \times (n-1) \times (n-2) \times \ldots \times 3 \times 2 \times 1 \text{.}$$

Let’s consider a different problem: say you have a bag of three balls labelled with letters A, B and C, and you want to know how many different ways there are of pulling two balls out of the bag. In particular, imagine you aren’t interested in which order the balls came out of the bag (as is the case in, say, a lottery).

At first, this may appear very similar to the tiles. There are three choices for the first ball, then two choices remaining for the second ball, then one choice for the ball left in the bag. That’s $n\hspace{1pt}!$, and for three balls the $3!=6$ options are listed here:

Six ways of choosing two balls from three labelled A, B and C.

However, if you look at the pairs of balls just above, you will notice that we are counting some twice. Whether you draw A and then B or B and then A, you are still holding A and B at the end of the draw. Actually, there are only three different ways to draw two balls from three. To see how this works for a larger case, consider a bag with ten unique balls from which you desire to draw three, and you aren’t interested in the order of the three.

You could consider drawing balls to be an act of arranging objects: arrange ten balls into a line and then ‘draw’ the three leftmost balls. Doing so, you arrange the balls into two groups: the three you have drawn and the seven you have not.

Arranging ten balls and drawing three of them.

The birthday problem

The birthday problem says that you need only $23$ people in a room for a better than $50\%$ chance that at least two of them will share a birthday.

One way to think about this is to think about pairs of people. Given the birthday of one member of a pair, if the other member of the pair is going to have a different birthday then (ignoring leap years) there are $364$ days left for their birthday. This yields a probability (assuming all birth days of a 365-day year are equally likely) of $\frac{364}{365}$.

Now we notice that there are $\binom{23}{2}=253$ ways to choose a pair from the group of $23$ people. If we assume that each pair of birthdays is an independent event, then we have an approximation for the probability of each and every pair not sharing a birthday as $\left(\frac{364}{365}\right)^{253}\approx 0.4995$. The probability, therefore, of at least two of them sharing a birthday is approximately $1-0.4995=0.5005$, slightly greater than $50\%$.


There are $10!$ ways to arrange these ten balls. But within these $10!$ arrangements, there are $7!$ that simply shuffle the balls left in the bag, which doesn’t affect which three are drawn. So we are down to $\frac{10!}{7!}$ arrangements. This represents all possibilities for drawing three balls from the bag. If we aren’t interested in the order that these three are drawn, then we can further reduce this number by $3!$, the number of ways of arranging them. This gives a formula for the number of ways of drawing three balls from ten when the order of the three drawn isn’t important:

$$\frac{10!}{3! \times 7!} = 120 \text{.}$$

In general, the number of ways of choosing $r$ objects from $n$ without repetition when we don’t care about the order of the objects chosen is


This is the binomial coefficient and can be written using the notation $\displaystyle\binom{n}{r}$.

The binomial coefficient, and the fact that there are $n\hspace{1pt}!$ ways of arranging $n$ objects are key tools at the heart of combinatorics.

We can get very surprising and unintuitive results through combinatorics. I used to give a talk about coincidences which started with two counting problems—the birthday problem and shuffling cards (see the sidebars on this page and the previous page for descriptions of these problems). Though I know the methods involved and intellectually it makes sense, I still find it hard to genuinely, intuitively appreciate that if I collect $23$ unconnected people, there is a $50\%$ chance that two of them will share a birthday. Meanwhile, I find it really hard to (genuinely, intuitively) believe that a bunch of cards I’ve just thrown on a table are arranged in a way such that it is extremely unlikely any pack of cards has been arranged in the history of the universe. Such is the nature of combinatorics, which offers techniques to address counterintuitive counting problems.

Shuffling cards

While I was explaining the birthday problem, I would slowly shuffle and deal all $52$ cards from a standard pack of playing cards onto the table in front of me. I’d observe that $52$ different playing cards can be arranged in $52!=8\times 10^{67}$ ways. So the exact set of cards arranged on the table in front of me (assuming a well-shuffled deck) represented $1$ case in $52!$, with probability approximately $1.24\times 10^{-68}$. The fact we got to witness this hand of 52 cards is a truly astonishing coincidence. I’d invite members of the audience down to the front to view the astonishing layout, but for some reason I’d never get any takers.


Back to the caterpillar

With eight segments to arrange, it feels to me like there can’t be very many different ways to arrange these. But, as I know, numbers in these sorts of problems are counterintuitively large. Let’s put an upper bound on the number, anyway, to dispute the `endless’ claim.

If all eight segments were different, for sequences using $k\leq 8$ positions, we choose $k$ segments from the $8$ possible segments and there are $k\hspace{1pt}!$ ways to arrange these, so the number of possible sequences would be

$$\binom{8}{k} k\hspace{1pt}! = \frac{8!}{(8-k)!} \text{.}$$

Notice that for $k=8$, this reduces to $8!$, as you might expect.

The number of possible sequences using any number from $1$ to $8$ segments would be the sum of this arrangement for all possible lengths, ie

$$\sum_{k=1}^{8} \frac{8!}{(8-k)!} = 109600 \text{.} $$

I think you’ll agree with me that $109,\!600$ is significantly smaller than infinity. For this problem, though, it’s definitely too large.

But it isn’t that simple

The caterpillar has three identical segments which instruct it to move forwards, two to turn left, two to turn right and one to stop and play music.

For the avoidance of doubt, some other features and restrictions:

  • there are eight positions that can hold segments, the head is always present;
  • any number of segments from $1$–$8$ may be used, and not $0$;
  • the order of segments matters, because it is a sequence of commands;
  • segments are connected (by USB), so there can be no gaps in a sequence—if a position is unfilled, the caterpillar ends at the preceding segment.

Let the forwards segments be labelled $F$, the left $L$, the right $R$ and the music $M$, and let a sequence be denoted by a string of letters from left to right (eg the head, not denoted, is on the left of the string). Then some examples of valid caterpillar sequences would be $FFFM$, $FFMF$, $FLRFLRFM$, $LRL$, $F$, and so on.

In the simple analysis that led to the upper bound $109,\!600$, we were over-counting the cases where multiple $F$, $L$ and $R$ segments are used. For example, the sequence $FF$ would be counted twice, even though they produce functionally identical caterpillars.

There is a neat trick in combinatorics for counting in this kind of circumstance-generating functions.

Generating functions-an example

Say I have three cards, two blue and one green, and I want to know how many ways I can arrange them into sequences of length 1-3. This is fairly straightforward to enumerate by hand, as you can see here.

For a generating function approach, I assign two functions. For green, I use $1+g$. Think of this as the $1$ representing no green card being selected and $g$ representing one green card being selected. For blue, I use $1+b+b^{\hspace{1pt}2}$. Similarly to green, $1$ and $b$ represent no blue cards and one blue card, with $b^{\hspace{1pt}2}$ representing two blue cards being selected.

Now we simply multiply the two functions together.

$$(1+g)(1+b+b^{\hspace{1pt}2}) = 1+b+b^{\hspace{1pt}2}+g+gb+gb^{\hspace{1pt}2}$$

Arrangements of two blue cards and one green card of length 1, 2 and 3.

Each term of the expansion represents a selection from the set of three cards. For example, $g$ represents choosing one green card only for the sequence `green’, and $b^{\hspace{1pt}2}$ represents choosing just the two blue cards for the sequence ‘blue, blue’. The term $gb$ represents choosing one green and one blue card, and there are $2!=2$ ways to arrange these two cards: ‘green, blue’ and ‘blue, green’. We didn’t have this rearrangement problem with $b^{\hspace{1pt}2}$ because the two blue cards are identical. The term $gb^{\hspace{1pt}2}$ represents choosing all three cards. There are $3!$ ways to arrange three cards, but $2!$ of them are the same, because the two blue cards are identical. So we can arrange $gb^{\hspace{1pt}2}$ in $\frac{3!}{2!}=3$ ways. The final term in our expansion is $1$, which represents choosing none of the cards. Whether this is a valid sequence depends on the context and the rules you are following.

Generating caterpillars

Since there are three $F$ segments, we can use the function $1+f+f^{\hspace{2pt}2}+f^{\hspace{2pt}3}$ to represent these. For the two $R$ segments, use $1+r+r^{\hspace{1pt}2}$, and for the two $L$ segments use $1+\ell+\ell^2$. Finally, for the one $M$ use $1+m$.

To find out how many different ways there are of selecting from these segments, expand $(1+f+f^{\hspace{2pt}2}+f^{\hspace{2pt}3})(1+r+r^{\hspace{1pt}2})(1+\ell+\ell^2)(1+m)$. The expansion has $72$ terms, each representing a different caterpillar. One of these, however, is formed by multiplying the four $1$s together to get $1$, which represents selecting no segments. Since this isn’t a valid caterpillar (the toy doesn’t work using only the head), there are $71$ different ways of forming caterpillars up to length $8$.

For example, one of the terms of the expansion is $f^{\hspace{2pt}3}r\ell^2m$. This represents choosing $F$, $F$, $F$, $R$, $L$, $L$ and $M$ for a caterpillar of length $7$. A caterpillar of length $7$ can be arranged $7!$ ways, but some of these are duplicates; how many depends on which segments are involved. In this case, a caterpillar involving three $F$s, an $R$, two $L$s and a $M$ can be rearranged $\frac{7!}{3!2!}=420$ different ways, with $7!$ divided by $3!$ for the three identical $F$s and $2!$ for the two identical $L$s.

We complete this calculation for all $71$ valid selections of segment arrangements. The total number of possible caterpillars is 5023.

So the “endless combinations” claimed on the packaging is, in fact, a little over five thousand possible caterpillars. Not endless, but certainly enough to keep us busy for a while! Oh, and the manufacturer sells add-on kits containing extra and different segments, which opens up whole new worlds of combinatorics possibilities…

If you are interested in a gentle-but-thorough introduction to combinatorics, I heartily recommend Robin Wilson’s Combinatorics: A Very Short Introduction.

Peter is a reader in the mathematics group at Sheffield Hallam University, where among other things he teaches a module on game theory and recreational mathematics. His research investigates teaching and learning of mathematics at university level. He blogs at the Aperiodical.

More from Chalkdust