In her fabulous book Once Upon a Prime, Sarah Hart shows us myriad ways in which mathematics and literature interact. One of the most straightforward of these is structure: the mathematics defines the shape and literature fills it. And one of the simplest examples of structure is in the rhyming patterns of poems.
The doggerel at the top of the post is something I’m going to call a universal poem. By that I mean that it contains every possible rhyming pattern of a particular length, which in this case is couplets. There are only two such patterns: either the second line rhymes with the first or it doesn’t. In my poem above, the first two lines rhyme giving the first pattern, and the last two lines don’t rhyme, giving the second pattern. It’s also the shortest—in number of lines—universal poem for couplets.
What is a couplet?
If it has been a little while since you have engaged with poetry, a rhyming couplet is two successive (but not necessarily consecutive) lines that rhyme. Rhyming triplets are then three successive lines that rhyme. You could carry on and write poetry with four, five or even $n$ successive lines that rhyme!
There are lots of things to count when it comes to rhymes. The first that probably comes to mind is to count the number of rhyming patterns of a given length.
Rhyming patterns are described using a letter for each line. Each distinct letter denotes a distinct rhyme, so that lines labelled with the same letter are meant to rhyme. So for couplets we get AA (both rhyme) and AB (no rhyme), while my poem above has the rhyming pattern AAB. There are five rhyming patterns for triplets, the full list being AAA, AAB, ABA, ABB, and ABC. It might seem more logical to label the fourth of these as BAA, but rhyming patterns always start with A and then use the next available letter when introducing a new rhyme. The sequence of counts of rhyming patterns is in the Online Encyclopaedia of Integer Sequences (OEIS) as sequence A000110 and starts 1, 2, 5, 15, 52. The problem of counting rhyming patterns is a nice exercise and can lead to some interesting combinatorics. However, that’s not what this article is about. What I want to consider is the idea of finding all the rhyming patterns of a given length within a single poem, which I’m calling a universal poem. This leads me to a couple of questions:
It’s obvious that we can always find a poem, since we can just write a poem with stanzas (that is, verses) of the given length in which each stanza has a different rhyming scheme.
But by overlapping, as in my opening opus, we can do better. Next question:
I could change my opening ode to:
and this flips the rhyming scheme to ABB.
In one sense this is a new universal poem structure, but as we’ll see later, it can also be considered to be equivalent to the original.
Incidentally, by fairly trivial changes I can actually get every rhyming scheme for triplets from this poem:
Truly, it is an amazing piece of art!
The poetry of de Bruijn sequences
To unlock the structure of a universal poem we need to know about de Bruijn sequences. These are a fascinating piece of mathematics that can be illustrated by something that many of us encounter on a daily basis: key codes, such as those that unlock doors.
Many places have these key-coded locked doors where you have to enter a code on a pad to unlock the door. There are two types of these: one where you have to enter the code and then press an enter key to submit it, and one where as soon as you’ve entered the code, the door unlocks immediately.
Suppose you have a door secured by a four-digit numerical code. For the first type of lock, to go through all possibilities it would take up to $4 \times 10^4$ key presses. (There are $10^4$ combinations and each requires 4 key presses—we’ll ignore the enter button for this.) For the second type, however, it turns out we can do a bit better.
Not a lot, but a bit.
Let’s reduce the numbers somewhat and suppose it is a three-digit code but with only two buttons, say 0 and 1. So there are eight possible codes: 000 to 111. That’s $3 \times 2^3 = 24$ key presses to go through them all. But if it were the type that opens as soon as the code is complete, we could run through the lot with just 10 key presses: the sequence 0001110100 contains all eight possible codes within it and so will unlock such a door.
Before you get too set on a life of opening doors, this reduces the number of possibilities from $3 \times 2^3$ to $2^3 + 2$, so our more common door lock reduces from $4 \times 10^4$ to $10^4 + 3$. Still quite a big number.
Such a sequence is called a de Bruijn sequence. I first learnt of them from the mystical book Magical Mathematics by Persi Diaconis and Ron Graham.
I expect you can now see where I’m going: can we find a rhyming pattern that is a de Bruijn pattern for all rhymes of a given length?
Eulerian poetry
There is a standard technique for finding de Bruijn sequences which is very well explained in Diaconis and Graham’s book. The core of it is to set up a certain graph and look for an Euler path (or cycle) of that graph. As a reminder, an Eulerian path goes along every edge of a graph while an Eulerian cycle goes along every edge of a graph starting and ending at the same node.
The graph works like this. We start with a sequence of codes that we want to find a de Bruijn sequence for, say the eight codes 000 to 111. We strip off the last character of each and look at the set of remaining prefixes (so, ignore duplicates): in this case, just 00, 01, 10 and 11. These are the nodes in our graph.
The edges are then the possible suffixes at each node. In this case just 0 and 1 at each node. The target of an edge is found as follows: append the edge label to the node label to form a code, and then remove the first character. This will then be a possible prefix in the node list. So from 01 we could follow edge 1 which produces the code 011 and so the target of that edge is 11.
If the graph has an Euler path (or cycle) we construct our de Bruijn sequence as follows: write down the node label of the starting node, and then append the edge labels as we go round the path.
Starting from 00 we can find an Euler path by following 0 to remain at 00, then 1 to 01, again 1 to 11, again 1 to remain at 11, then 0 to 10, followed by 1 to 01 and 0 back to 10, then finally 0 to return to 00.
This produces: 0001110100, as above.
In fact, since this Euler path is actually a cycle, the last two 00s can be viewed as the same as the first two and so if we join the start to the end we can think of it as an eight digit code 00011101 which contains all 3 digit codes providing we think of it as a cyclical list. I’ll call this a de Bruijn cycle.
The length of any de Bruijn cycle or sequence of the graph is completely determined by the structure of the graph. That of a cycle is the number of edges in the graph, and that of a sequence is the number of edges plus the length of a node label.
Universal poetry
Let’s apply this to rhyming patterns of couplets. There aren’t many: AA and AB. Following the above recipe, we have one node for the prefix, which is just A. Then we have two edges. According to the above, one should be labelled A and the other should be labelled B.
However, there is a slight difference here in that the edge that should be labelled B goes from the node labelled A to itself, not to a node labelled B. This is because the label B says ‘This line has a different rhyme to its predecessor’, but once that line is written then its predecessor is removed from view and so its rhyming pattern is that of a single line. Then according to the convention on rhyming patterns this is labelled just A. So when setting up a suitable graph of rhyming patterns then there is a little bookkeeping to take care of when a rhyme ‘falls off the edge’. To help with remembering this, I’m using lowercase for labelling edges.
Here’s the graph for couplets:
Applying the technique yields two de Bruijn sequences: we have to start at A, after which we have a choice of a or b, but then we have to follow the other one. This yields AAB and ABB as the two rhyming patterns that contain every possible rhyming pattern for couplets. And this leads to the opening doggerel (and its variants).
Combinatorial poetry
The best type of answer is one that raises even more questions than it answers. Indeed, having gotten this far then I found myself with a few questions:
(The last question is just my second question from earlier and we are almost ready to answer it!)
The answers are ‘yes’, ‘no’, and ‘good question’.
Before looking at that last one, let us first deal with the shortest length of a universal poem. By the construction of a de Bruijn sequence from its graph, the length of the sequence is the number of rhyming patterns plus one less than their length. So this sequence starts $1 + 0 = 1$, $2 + 1 = 3$, $5 + 2 = 7$, $15 + 3 = 18$. This is not in the OEIS, the closest is A338735 which differs from this by 1 at each term.
And so to the last question: how many universal poems are there of a given length?
We’ve seen that there are two universal rhyming patterns for couplets: AAB and ABB. But we can split the counting into two parts: it turns out that all the Eulerian paths are actually cycles (which hopefully makes the ‘no’ answer to the second question above slightly intriguing). So from one cycle we can get a number of different rhyming patterns just by choosing a different starting point along it. As the number of starting points is just the number of edges in the graph regardless of the cycle, to count the number of rhyming patterns it is simpler to start by counting the number of cycles. Even here though the story is not quite that simple; it depends on what one considers as ‘the same’ for two rhyming patterns for universal poems. I’m counting two the same if all the appropriate subpatterns are the same. Thus our two rhyming patterns for couplets are actually from the same underlying Eulerian cycle—this is what I meant by saying that my two universal poems at the beginning were really the same.
There’s also just one cycle for singlets (well, what would you call a one-line poem?) so our sequence starts $1, 1$.
To find out how it continues we need to draw some more graphs.
Here’s the graph for triplets:
As with couplets, we have to be careful with the interpretation of the edges.
When we traverse an edge, we have to examine the last two lines of the rhyming pattern. We then have a set number of options depending on whether the last two lines of rhyming pattern rhyme of don’t rhyme:
It’s not hard to see that the only choice here when walking along the edges is as to the order of the loops a and c at node AB so we get two Eulerian cycles.
So the two cycles (starting at AA) look like this:
AAabacb: the final b means ‘match the second rhyme in the preceding couplet’ and that couplet is AC. So we can write this as AAABACC
AAabcab: the a looping at AB means ‘match the first rhyme in the preceding couplet’ and that couplet is BC. So we can write this as AAABCBB.
So our sequence of number of Eulerian cycles in rhyming graphs starts $1,1,2$.
Here’s the graph for four-line poems:
Counting the Eulerian cycles in this seems a bit daunting, but it can be simplified. Firstly, the loops. The loop labelled b at node ABA has to be traversed, and so it can be traversed at any stage when we arrive at node ABA. Since there are two paths that lead to ABA, we must visit it twice and so we can choose either of those visits as the time to traverse the loop. So the presence of that loop means that we get twice the number of Eulerian cycles that we would get if we simply ignored it. Similarly the loops at ABC can be traversed one of six ways since there are two ways to order them, and then three ways to assign an ordered pair to the two times we visit the node: either both on first visit, one on each (in their order), or both on the second. So we get a factor of 6 here. The loop at AAA doesn’t affect things since we only visit AAA once in a cycle. Therefore, removing all the loops simplifies the graph at the expense of a total factor of 12.
When looking for Eulerian cycles, it is easier to pick a starting edge. For aesthetic reasons, I like to start with the edge from ABB to AAA. The first decision is then at AAB where we have three outgoing edges. We’ll see that we can pick these in any order, leading to a factor of 6. Each of these exits from AAB must return to it via ABB (except for the last, which ends the cycle there) and there are two routes left from ABB to AAB, another factor of 2.
Now of the exits from AAB, it turns out that the crucial choice is as to whether the edge labelled a happens before or after that labelled c. Suppose it happens first, then there are three routes from ABA to ABB: direct (along a), via ABC (along c and c), or via ABC and then back to ABA (along b). Once one of those routes has been chosen, the route from AAB that starts along c is fixed. So this gives us three options. A similar analysis for if c happens first gives us also three options. So for each ordering of the exits from AAB there are three possible ways to arrange the edges between AAB and ABB.
Putting all of these factors together gives: \[2 \times 6 \times 6 \times 2 \times 3 = 432.\] So our sequence of number of Eulerian cycles goes: \[1, 1, 2, 432.\] To get the number of rhyming patterns, multiply each by the number of edges in its graph: \[1 \times 1, \, 1 \times 2, \, 2 \times 5, \, 432 \times 15 = 1, 2, 10, 6480.\] Neither of these sequences is in the OEIS!
The real challenge, though, is to write a poem that matches one of these patterns. Or possibly all of them—a universal universal poem would be one that had every possible universal poem (for a given length) contained within it!
Still, as Sarah Hart tells us, someone’s written a book with 100,000,000,000,000 sonnets. So why not challenge your English department?
A universal poem for triplets
I started this article with a poem, so it seems only fitting to end it similarly. Here is a possible universal poem for triplets using the pattern AABACCC which starts at the AA node and then follows the b edge and then the A edge: