A tale of $n$ cities

Donovan Young finds every connection wherever he goes.

post

The famous Dickens novel highlights the political and social chaos amid the French Revolution, taking place primarily in London and Paris. Sending messages between these two cities was obviously of extreme importance. In fact, the story is set in motion by a cryptic message received from France—but I will leave that analysis to the GCSE students. Our interest lies with these messengers.

We begin with a messenger who flags down a mail coach to deliver an important communication to one of the passengers and waits to receive their reply. The issue is that the mail coach naturally has other passengers, not to mention a driver and security guard, who will hear everything and, of course, gossip about it afterwards.

So what if they wanted a means of secure communication? If a pair of pen pals in different cities wanted to exchange private messages, and they were too paranoid to trust a public mail service, they might want to extend their circle of trust the minimum amount, ie by one more person—an exclusive letter carrier used only by that pair of people and no one else.

In our mathematical version of the tale, we will tell the story of how many ways there are of matching up pen pals in various cities. As we will see, this count is made easier by matching each pen pal pair to an exclusive letter carrier they can trust. With this count in hand, we can also estimate the typical number of letter carriers servicing any pair of cities.

Following Dickens, we open with the book where $n=2$.

Retelling a classic

We would like to assign, to each and every person in these two cities, a unique pen pal from the other city. How many ways are there of pairing up the people of city 1 with those of city 2?

The first obvious point is that the two cities must have the same population, say, $k_1$. Then there are $k_1!$ ways of assigning pen pals under the usual explanation that ‘citizen 1’ of city 1 has $k_1$ choices for their pen pal, then ‘citizen 2’ has $k_1-1$ choices, etc. But there is another way to arrive at the same count which involves the secure messengers that we introduced earlier. If we also include $k_1$ exclusive letter carriers, enough for each pen pal pair, now how many ways are there of assigning them?

Each pen pal pair has one unique letter carrier

Let us first match the people of city 1 to the letter carriers. Citizen 1 of city 1 has $k_1$ choices of letter carrier, citizen 2 has $k_1-1$ choices, etc. So there are $k_1!$ ways of joining the $k_1$ citizens of city 1 with the $k_1$ letter carriers. Citizen 1 of city 2 also has $k_1$ different letter carriers to choose from (and each one has a person of city 1 connected to them). Citizen 2 of city 2 has $k_1-1$ choices, etc. This brings about another factor of $k_1!$, so we have $k_1!k_1!$ different arrangements.

We could also think about this as ‘pen pal pair 1’ having a choice of $k_1$ letter carriers, then ‘pen pal pair 2’ having a choice of the remaining $k_1-1$ letter carriers, etc. Since we know from our previous argument that there are $k_1!$ different ways to match pals, then the extra $k_1!$ has come from deciding which pair gets which letter carrier.

A drawback in this process is that all of the letter carriers are running the same route. This ruins efficiency, not to mention the travel costs for the secure messaging company, and so we introduce the rule that if two letter carriers are going between the same city, they might as well be the same person. From a security perspective, things might actually be safer, as there will be nobody else for the letter carrier to share information with.

Because of this, the extra factor of $k_1!$ which counted the permutations of the letter carriers amongst the pairs needs to be divided out: $k_1!k_1!/k_1! = k_1!$, and we are back to the original count of the number of ways of pairing the citizens of cities 1 and 2.

Why would we introduce the different carriers and count the ways of distributing them amongst the pairs just to undo it again?

A tale of three cities

We’re now going to investigate how the pen pal problem changes when we consider three cities, which should hopefully resolve the cliffhanger ending of the previous book. Let’s have the populations be $k_1$, $k_2$ and $k_3$ for cities 1, 2, and 3, respectively. We no longer need the constraint that the populations be equal, but instead we must have:

  1. the combined population of the three cities be even,
  2. $k_1 + k_2 \geq k_3$,

assuming without loss of generality that $k_1 \leq k_2 \leq k_3$. The combined population must be even because every pen pal pair consists of two unique people. The second condition stems from the fact that if the total population in the two smallest cities is too small to provide a match for each person in the largest city, then there will be people left over in the largest city, and since we don’t have pen pals in the same city, we reach a contradiction. In the special case that $k_1+k_2=k_3$ there will be no pen pals between cities 1 and 2.

In general, let there be $p_1$ pen pal pairs between cities 1 and 2, $p_2$ between cities 2 and 3, and $p_3$ between cities 1 and 3, as pictured below.

A triangle diagram, showing unevenly-sized connections between the three cities k_1, k_2 and k_3.

Doncaster, Wolverhampton and Peterborough

Then it must be that
$$\begin{split} &p_1 + p_3 = k_1,\\ &p_1 + p_2 = k_2,\\ &p_2+p_3=k_3. \end{split} $$
If we take $k_1$, $k_2$, and $k_3$ to be known populations, then we have a linear system of three equations in three unknowns: $p_1$, $p_2$, and $p_3$.

This is, therefore, a solvable system with a unique solution for each $p_i$, namely:
$$\begin{split} &p_1 = \tfrac{1}{2}\left(k_1+k_2-k_3\right),\\ &p_2 = \tfrac{1}{2}\left(k_2+k_3-k_1\right),\\ &p_3 = \tfrac{1}{2}\left(k_3+k_1-k_2\right). \end{split} $$
With our setup now demanding letter carriers for each route, we introduce $p_1$ of them to be assigned to each of the pen pals between cities 1 and 2. Each one is unique and delivers letters between two distinct people. In total, the $p_1$ letter carries serve $k_1 + k_2 – k_3$ people: all of those in the combined populations of cities 1 and 2, less those who are pen pals with someone in city 3. Since each letter carrier works for two people, $(k_1 + k_2 – k_3)/2$ counts the number of carriers who work between cities 1 and 2.

The argument generalises for the expressions for $p_2$ and $p_3$. The notion of the letter carriers will hopefully now not seem so ridiculous; in our last book, the number of letter carriers would have been $p_1 \equiv k_1$. Now, each $p_i$ is of a more complicated form, and so dividing through by the $p_i$ will be nontrivial.

With this in mind, how many ways are there (assuming it is possible at all) to match up the citizens in the three cities with pen pals from a different city?

The same diagram as before, but the solid sections connecting each city are now depicted as several individual lines, labelled 1 through p_1 (or p_2 or p_3).

Each pen pal pair has a unique letter carrier between them.

There are $k_1!$ ways of joining the citizens of city 1 with the $p_1+p_3=k_1$ letter carriers which service city 1; $k_2!$ ways of matching up the $k_2$ citizens with the $p_1+p_2=k_2$ letter carriers which service city 2; $k_3!$ ways of matching up the $k_3$ citizens of city 3 with the $p_2+p_3=k_3$ letter carriers which service city 3. In total we have $k_1!\,k_2!\,k_3!$ ways of assigning citizens to pen pals via unique letter carriers.

One should note that $p_1$ of the $k_2$ letter carriers servicing city 2 have already been assigned a citizen of city 1, and in a similar fashion $p_2$ of the $k_3$ letter carriers servicing city 3 have already been assigned a citizen of city 2 The remaining $p_3$ have been assigned a citizen of city 1.

As before, we now say that a letter carrier is distinguished only by their route, ie the two cities they service, and not by the particular pen pal pair they deliver letters for. So we need to divide by the number of ways $p_1$ pen pal pairs can be assigned to $p_1$ letter carriers, by $p_1!$. We need to do the same for $p_2$ and $p_3$. This provides us with
\begin{equation*} \frac{k_1!\,k_2!\,k_3!}{\left(\dfrac{k_1+k_2-k_3}{2}\right)!\left(\dfrac{k_2+k_3-k_1}{2}\right)!\left(\dfrac{k_3+k_1-k_2}{2}\right)!} \end{equation*} different ways of pairing the citizens with their pen pals, based on the solution for the $p_i$ above.

Let’s try some concrete numbers to get a sense of things. We start with some unnaturally small populations: $\{k_1,k_2,k_3\} = \{3,4,5\}$. This gives $\{p_1,p_2,p_3\} = \{1,3,2\}$, and the number of matchings is $1440$. A slightly larger example is $\{k_1,k_2,k_3\} = \{6,8,12\}$, yielding $\{p_1,p_2,p_3\} = \{1,7,5\}$, however the number of matchings is now nearly $23$ billion.

Notice that the $p_i$ are directly proportional to the population sizes—we could just as easily consider the $k_i$s above to be counted in millions, then the $p_i$s would be also. But the number of matchings would increase much, much quicker and would be of the order of ten to the power of millions!

A tale of four cities

With each next book in a series, the readers are promised more lore and unexpected twists and turns that give the characters a more interesting backstory. In the tale of four cities we do exactly that and introduce entropy.

First, we set up our populations $k_1$, $k_2$, $k_3$, $k_4$ and will again assume $k_1\leq k_2\leq k_3\leq k_4$. We must require that the total population, $k_1+k_2+k_3+k_4$, be even. We will also require that $k_1+k_2+k_3 \geq k_4$, else, as before, there will be unmatched citizens in the largest city. There are now six groups of letter carriers connecting the cities, as pictured below.

A diagram with four "cities", arranged in a square and labelled k_1, k_2, k_3, k_4, with "edges" of different widths between each pair.

The six routes in the tale of four cities and their respective counts of pen pal pairs

On the left, a diagram with cities [1,4] and [2,3] joined; on the right, two possible swaps: [1,2] and [3,4], or [1,3] and [2,4]

Two kinds of swap—a given pen pal pair between cities 1 and 4, and another between cities 2 and 3, can be reconfigured in two different ways

This makes it easy to build a linear system of equations that connect the $p_i$s with the $k_i$s:
$$ \begin{split} &p_1 + p_2 + p_3 = k_1\\ &p_1 + p_4 + p_5= k_2\\ &p_2 + p_4 + p_6=k_3\\ &p_3 + p_5 + p_6=k_4. \end{split} $$
If we take the populations to be fixed, these are four equations for six unknowns—the system is undetermined. Unlike the previous tales, here the numbers $p_i$ are not fixed by the populations $k_i$. Why not?

Consider, for example, the two sets of letter carriers $p_3$ and $p_4$ that work between cities 1 and 4, and cities 2 and 3, respectively. Let’s imagine these two carriers are friends and during their regular deliveries, they decide to meet up somewhere central to chat. What would happen if they left the meeting with the other carrier’s bag by accident?

Well, clearly the concerned pen pals are going to receive messages intended for someone else, but the accident also reveals something interesting about the tale of four cities. Notice that there are two ways this accident could play out, depending on which directions they were heading when they met.

In the first way, the two carriers are simulating the jobs of a $p_1$ carrier and a $p_6$ carrier. In mathematical terms, $p_{3,4} \to p_{3,4} – 1$ and $p_{1,6} \to p_{1,6} + 1$. In the second way, they would be simulating the jobs of a $p_2$ carrier and a $p_5$ carrier, so $p_{3,4} \to p_{3,4} – 1$ and $p_{2,5} \to p_{2,5} + 1$.

In either of these scenarios we see that, without changing the populations of the four cities, there is a freedom in how we choose the values of the $p_i$. In fact, given that there are two ways to swap, this freedom is two-dimensional. This is consistent with having four equations to fix six unknowns; there will be a two-parameter family of solutions to this linear system.

Here’s that fourth-book twist: different ways of matching up citizens with pen pals can result in different numbers of carriers servicing the links between cities! This gave me an exciting, new question to ask: given fixed populations $k_i$, how many carriers should we expect to need to assign to each of the six links between cities on average? That is, assuming all possible ways of matching citizens to pen pals are equally likely to occur, what are the typical values of the $p_i$?

I should first clarify that these ‘typical values’ really only make sense when the cities’ populations are large. In fact, the larger they are, the more likely it is that the $p_i$ are found to be at these typical values; put another way, the more rare it is to find them deviating from these values.

The answer to our question turns out to be that the typical values for the $p_i$ obey the relations
\begin{equation*} p_3p_4=p_1p_6=p_2p_5. \end{equation*} Let’s gain some intuition for why this should be true before going into the mathematical details. Imagine the two kinds of ‘swap’ shown previously in pink were happening randomly at some average rate. There are $p_3p_4$ ways to choose one of the $p_3$ letter carriers (that serve cities 1 and 4) and one of the $p_4$ (that serve cities 2 and 3) to be the ones involved in a swap. In the first kind of swap, we could restore the values of the pre-swapped $p_i$ by randomly choosing a carrier counted amongst the $p_1$ and another counted amongst the $p_6$ (there are $p_1p_6$ ways to do this) and have them ‘swap back’.

When a towel is dry, and you are wet, you can use it to dry yourself. But if the towel becomes too wet, it stops being effective, as it deposits as much water as it removes. This represents a state of maximal entropy. Random processes tend to increase entropy towards a maximal value.

Note that we aren’t quibbling about the fact that $p_1$ and $p_6$ have increased by 1 in the swap; the city’s populations are large meaning the $p_i$ are also large numbers—large being much, much greater than 1. The random process will naturally drive the system towards the state where the probability of swapping is equal to the probability of swapping back, ie when $p_3p_4=p_1p_6$. The same argument could also be applied to the second kind of swap, and the corresponding condition there is $p_3p_4=p_2p_5$.

This argument of the $p_i$s miraculously arranging themselves into a steady state has an intimate relationship with the concept of entropy. For those unfamiliar, I particularly like how Richard Feynman put it—see above. Indeed, we can draw our own analogy here between swapping, swapping back and the deposition or removal of water by the towel.

Maximising entropy

We can turn the $p_i$ into probabilities that a random letter carrier serves the corresponding route. Since each carrier serves two cities, $\sum p_i = 2 \sum k_i$, and this is a constant, fixed by the populations of the four cities. Let $N=\sum k_i$ be the total population across the four cities, then the probability that a random letter carrier is on route $i$ is $p_i/(2N)$. The entropy of a probability distribution is proportional to $-\sum p \ln p$ (more on this later), where $p$ are the probabilities, so for our distribution we have that the entropy is proportional to:
$$ \begin{split} -\sum_{i=1}^6 \frac{p_i}{2N} \ln \frac{p_i}{2N} &= -\frac{1}{2N} \sum_{i=1}^6 \big( p_i \ln p_i – p_i \ln (2N) \big)\\ &=-\frac{1}{2N} \sum_{i=1}^6 p_i \ln p_i + \ln (2N). \end{split} $$
As it turns out, neither the constant factor of $1/(2N)$ nor the additive constant $\ln(2N)$ are important to the maximisation, so we will drop them and define our entropy to be $S$, where
\begin{equation*} S = -\sum_{i=1}^6 p_i \ln p_i. \end{equation*}
Let’s look at the solution to the linear system of six unknowns. We saw that the system was undetermined; we will choose $p_5$ and $p_6$ to be our two independent variables without loss of generality and relabel them $x$ and $y$, respectively. Here’s the solution:
$$ \begin{split} &p_1 = \tfrac{1}{2} (k_1+k_2-k_3-k_4+2 y),\\ &p_2= \tfrac{1}{2}(k_1-k_2+k_3-k_4+2 x),\\ &p_3= k_4-x-y,\\ &p_4= \tfrac{1}{2} (-k_1+k_2+k_3+k_4-2 x-2 y),\\ &p_5=x,\\ &p_6=y. \end{split} $$
To gain some familiarity, let’s look at a specific example. We will take populations of $k_1 = 300$, $k_2=5000$, $k_3=18000$ and $k_4=22000$. In order to satisfy the constraint that none of the $p_i$ are negative, we must have $(x,y)\geq (4350,17350)$, but also have that $x+y \leq 22000$. This leaves some wiggle room for possible values of $x$ and $y$ since $22000-4350-17350=300$. We can allocate all, or a part of the 300 to $x$ and $y$ as we like, to increase them from their minimum values. The most likely configuration turns out to be when $35$ is added to $x$ and $9$ is added to $y$. This gives $p_1 = 9$, $p_2=35$, $p_3 = 256$, $p_4=606$, $p_5=4385$, $p_6 = 17359$.

The number of possible matchings here is very large, and would be measured by numbers near 10 to the power of a hundred thousand. What would happen if we considered a solution not too far away from optimal? For example, if we set $x=p_5=4400$ and $y=p_6=17365$, then we find $p_1=15$, $p_2=50$, $p_3=235$, $p_4=585$. The number of matchings is still astronomically large, but is some 400 times smaller than the optimal value.

Returning to the entropy $S$, we can now view it as a function of $x$ and $y$, since the $p_i$ now depend on them. We therefore seek to find the values of $x$ and $y$ which maximise
$$ S(x,y) = – \sum_{i=1}^6 p_i(x,y) \ln p_i(x,y). $$
The maximum occurs where the derivatives of $S$ with respect to both $x$ and $y$ are zero. Further, the fact that $\sum p_i = 2 N$ is a constant means its derivatives with respect to both $x$ and $y$ are zero also. In essence, we can differentiate our definition of entropy $S$ and be left with
\begin{equation*} \begin{split} &\frac{\partial}{\partial x} S(x,y) = -\sum_{i=1}^6 \left[ \frac{\partial}{\partial x} p_i(x,y)\right] \ln p_i(x,y),\\ &\frac{\partial}{\partial y} S(x,y) = -\sum_{i=1}^6 \left[ \frac{\partial}{\partial y} p_i(x,y)\right] \ln p_i(x,y). \end{split} \end{equation*}
This ultimately results in
$$ \begin{split} &\frac{\partial}{\partial x} S(x,y) = -\ln p_2 + \ln p_3 + \ln p_4 – \ln p_5,\\ &\frac{\partial}{\partial y} S(x,y) = -\ln p_1 + \ln p_3 + \ln p_4 – \ln p_6 , \end{split} $$
and setting these quantities to zero means that $p_2p_5 = p_3 p_4 = p_1p_6$, as we argued before.

Why $\sum p\ln p$?

Now, I wonder how many of you thought ”wait, where does $-\sum p \ln p$ come from?”. Rather than answer that question, let me give you a way to see why finding values of $x$ and $y$ which maximise $S(x,y)$ is the same as asking for the values of $x$ and $y$ which maximise the number of ways of matching citizens with pen pals. Following our earlier theme, the number ${\cal N}$ of matches is
\begin{equation*} {\cal N} = \frac{k_1!k_2!k_3!k_4!}{p_1!p_2!p_3!p_4!p_5!p_6!}. \end{equation*}
Taking a logarithm, we find
\begin{equation*} \ln {\cal N} = \sum_{j=1}^4 \ln (k_j!) – \sum_{i=1}^6 \ln (p_i!) . \end{equation*} There is a nice way to approximate the logarithm of a large factorial called Stirling’s approximation, $\ln(n!) \approx n\ln n – n,$ an approximation that is increasingly good the larger $n$ is. Using this approximation on $\ln {\cal N}$, we find
\begin{equation*} \ln {\cal N} \approx \sum_{j=1}^4 \ln (k_j!) + \sum_{i=1}^6 p_i -\sum_{i=1}^6 p_i\ln p_i. \end{equation*} The first two sums are both constants which don’t depend on $x$ or $y$. We see that, up to an irrelevant additive constant, $\ln {\cal N}$ is the entropy $S$. But since the logarithm is a so-called monotonically increasing function—a function whose graph always goes up—finding the values of $x$ and $y$ which maximise $\ln {\cal N}$ is the same as finding the values which maximise ${\cal N}$ itself.

A touch of irony

We can now work out what the values of $x$ and $y$ are in order to write down the typical values of the $p_i$. Our entropy-maximising constraints are $p_2p_5=p_3p_4$ and $p_1p_6=p_3p_4$, and in terms of $x$ and $y$ these are:
\begin{equation*} \begin{split} x (&k_1-k_2+k_3-k_4+2 x)\\ &=(-k_1+k_2+k_3+k_4-2 x-2 y)(k_4-x-y),\\ y (&k_1+k_2-k_3-k_4+2 y)\\ &=(-k_1+k_2+k_3+k_4-2 x-2 y) (k_4-x-y) . \end{split} \end{equation*} These equations define two intersecting hyperbolas in the $x$-$y$ plane. In order to find the relevant point of intersection in terms of the $k_i$, we can solve the second equation for $y$:
\begin{equation*} y=\frac{(-k_1+k_2+k_3+k_4-2 x) (k_4-x)}{2 (k_2+k_4-2 x)}, \end{equation*} and then substitute this into the first equation. What results from this substitution is a fourth-order polynomial, also called a quartic, in $x$:
\begin{equation*} a x^4 + b x^3 + c x^2 + d x + e = 0, \end{equation*}
where
$$ \begin{aligned} a&=12, \\
b&=8 (k_1 – 2 k_2 + k_3 – 2 k_4),\\ c&= k_1^2 + 7 k_2^2 – 8 k_2 k_3 + k_3^2 + 10 k_2 k_4 – 8 k_3 k_4\\ &\qquad \qquad + 7 k_4^2 – 2 k_1 (4 k_2 + k_3 + 4 k_4),\\ d&=-(k_2 + k_4) \big(k_1^2 + k_2^2 + (k_3 – k_4)^2\\ &\qquad \qquad – 2 k_2 (k_3 + k_4) – 2 k_1 (k_2 + k_3 + k_4)\big),\\ e&=k_2 k_4 (k_1 + k_2 – k_3 + k_4) (k_1 – k_2 – k_3 – k_4). \end{aligned} $$
Famously, the quartic is the highest degree polynomial for which one can write down a solution in terms of basic arithmetical operations $+,-,\times,\div,\surd$. There is no such formula for polynomials of degree $\geq 5$. This is ironic—the tale of four cities is the first time we have a problem to solve, since we had no freedom to alter the $p_i$ in the tale of three cities. It is also the last case where a solution can be ‘written down’—it is a horribly long expression, which would fill an entire page, and I will spare you from having to read it. We can, however, visualise the typical values (see below).

A diagram like the previous one, this time without labels or colours, emphasising how the widths of different "edges" are different. The circles with the most total edge width are the biggest.

Circle radii are proportional to the city’s population; route widths are proportional to the number of pen pals.

A touch of agony

We end our story by considering the tale of $n\geqslant5$ cities, where we begin with some counting. The big question is how many routes are there for $n$ cities? Well, every route serves exactly two cities, so there are ${n\choose 2} = n(n-1)/2$ routes. The linear system will provide $n$ equations and will hence fix $n$ of the $p_i$ (let these be $p_1,p_2,\ldots ,p_n$) in terms of the remaining \[\frac{n(n-1)}{2} – n = \frac{n(n-3)}{2}\] others. So we will have $n(n-3)/2$ variables which will need to be fixed by an equal number of entropy-maximising constraints. To understand where the constraints come from, we will build up our distribution network starting with the first $n$ routes. We are free to choose how these join the $n$ cities—see the picture below.

n cities, joined by n routes: the first n-1 connect City 1 to each of the others, and the nth connects City 2 to City 3.

The first $n$ routes

The remaining routes come in two varieties. First, there are $2(n-3)$ routes which connect each city $j\geq 4$ with cities two and three. For each of these we have a ‘tale of four cities’ scenario which fixes the associated variables, $x$ and $y$. Then, there are $(n-3)(n-4)/2$ routes which connect cities $j$ and $\ell$, where $j,\ell\geq 4$.

The two routes $x$ and $y$ connect city 6 with cities 2 and 3: in total there are $2(n-3)$ routes connecting cities 4 to $n$ to cities 2 and 3. Together, routes $x,y, p_1, p_2, p_5$ and $p_n$ form a tale of four cities.

Route $z$ connects cities 5 and 6: in total there are $(n-3)(n-4)/2$ routes connecting cities $j$ and $\ell$, where $j, \ell \geq 4$.

To fix these variables we introduce a new flavour of swap. We decrease $z$ by $1$, increasing both $p_4$ and $p_5$ by $1$, and also decrease $p_1$ and $p_2$ by $1$, increasing $p_n$ by one. The associated entropy-maximising constraint is $zp_1p_2=p_4p_5p_n$.

Let’s take a tally of these constraints. We have $2(n-3)$ relations of degree two which must involve the product of two $p$s, and $(n-3)(n-4)/2$ relations of degree three. If we were to try to reduce this system of $n(n-3)/2$ equations, by using the various equations to eliminate variables until there was one final polynomial to solve, it would have a degree equal to
\begin{equation*} d_n=2^{2(n-3)}\times 3^{(n-3)(n-4)/2}. \end{equation*} The sequence begins $d_4 = 4$, $d_5 = 48$, $d_6 = 1728$, $d_7 = 186624$, $d_8 = 60466176$.

A polynomial of degree sixty million? Well, that’s the touch of agony.

Donovan is a teacher and former physicist who can’t stop seeing mathematics in just about everything he looks at.

Donovan is a teacher and former physicist who can’t stop seeing mathematics in just about everything he looks at.

More from Chalkdust