The Newton–Raphson fractal

The Newton–Raphson (NR) method is a black box device for finding roots to equations like $f(x) = 0$. Insert guess at input, crank handle, reinsert output as better guess. Repeat until happy with answer. NR is elegant in its simplicity and easy to implement, beloved by mathematicians, physicists, and computer scientists everywhere. Why? Not just because of its speed and efficiency, but because of its innate capacity to create strikingly intricate patterns when the inputs are complex numbers instead of real numbers.

Suppose $z_0$ is an initial guess at a root of $f(z) = 0$, where $z = x + \mathrm{i} y$ and $\mathrm{i}=\sqrt{-1}$. We start by anticipating a more accurate root to be $z_0 + \mathrm{\Delta} z$, where $\mathrm{\Delta} z$ is a small correction. An equation for $\mathrm{\Delta} z$ can be sought by Taylor expanding $f$ around $z_0$ according to \[f(z_0 + \mathrm{\Delta} z) \approx f(z_0) + f^\prime(z_0)\mathrm{\Delta} z \approx 0,\] which yields \[\mathrm{\Delta} z \approx -f(z_0)/f^\prime(z_0).\] The quantity $f^\prime(z_0)$ denotes the derivative of $f$ evaluated at $z_0$, and higher powers in the expansion (such as $\mathrm{\Delta} z^2$, $\mathrm{\Delta} z^3$, etc) have been discarded. Our approximation for $\mathrm{\Delta} z$ is valid so long as $f^\prime(z_0) \neq 0$; moreover, $\mathrm{\Delta} z$ vanishes whenever $z_0$ coincides exactly with a root of $f$. The ‘better guess’ is now computed from the translation \[z_1 = z_0 + \mathrm{\Delta} z = F(z_0),\] where \[F(z) \equiv z-f(z)/f'(z).\] The NR algorithm is the iterated application of this recipe.

In what follows, we will look for the cube roots of $-1$ so that $f(z) \equiv z^3 + 1 = 0$ and $f^\prime(z) = 3z^2$. The ordered pair $(x,y)$ represents a point in the complex plane, just like in coordinate geometry, and the three points we are looking for are the solutions $z = -1$, $(1+\mathrm{i}\sqrt{3})/2$ and $(1-\mathrm{i}\sqrt{3})/2$. They mark the corners $a_1$, $a_2$ and $a_3$ of an equilateral triangle, centred on the origin $O$, and sitting inside a circle with radius 1. That we already know the answers for $z$ is not important; we are not interested in root finding per se. Rather, we want to unpack how NR creates its famous pattern and what some of it means.

An equilateral triangle, pointing left, with vertices at the three roots of -1.

The three solutions $x+i y$ of $z^3=-1$ form an equilateral triangle

On the complex plane, the NR machinery for our cube roots problem is
\begin{align*} z_{n+1} &= F(z_n), &\textrm{where }F(\chi) &= \chi-\frac{\chi^3+1}{3\chi^2}, \end{align*} and $\chi$ denotes the input to black box $F$. The discrete index $n = 0, 1, 2, 3, \dots$ is the iterate number, often interpreted as labelling points in time that are separated by some arbitrary fixed interval. For any starting point $z_0$, repeated application of $F$ generates a sequence of numbers
z_1 &= F(z_0),\\ z_2 &= F(z_1) = F(F(z_0)),\\ z_3 &= F(z_2) = F(F(F(z_0)))
and so on. At each step, the output is fed back into $F$ as a new input. The number $z_n$ in the sequence $z_1, z_2, z_3, \dots, z_{n-1}, z_n, z_{n+1}, \dots$ is therefore computed through $n$ applications of $F$ when the initial condition is $z_0$.

Attractors and their basins

Suppose there exists a number $a$ with the property $a = F(a)$ so that feeding $a$ into $F$ returns the number $a$. Given $z_0 = a$, under iteration map $F$ must generate a very simple sequence $a, a, a, a, a, \dots$ so that input and output are identical to each other. We say that $a$ is a fixed point of $F$. For our chosen $F$, there exist three such fixed points—they are the roots of $a^3 + 1 = 0$ that we wrote down earlier. It is not a coincidence that these particular values of $a$ have appeared. By design, the NR scheme naturally has the roots we want as the fixed points of $F$.

A sequence produced by $F$ for a given initial condition $z_0$ is called a trajectory (or, sometimes, an orbit). The trajectory is really just a discrete set of points but it is often instructive to connect successive points by straight line segments to create a kind of zig-zag in the complex plane. However, it must always be borne in mind that those line segments are not themselves meaningful in any dynamical sense. They simply provide visual cues so as one can trace out a continuous path—albeit a fictitious one!—showing where a trajectory has been in the past and where it is going in the future.

The paths that NR follows if we start at the blue, green, and orange dots: the winning attractor is evidently not always the root closest to the initial condition

For almost any $z_0$, the trajectory typically bounces around in the complex plane, jumping instantaneously from one point in the sequence to the next and gradually zeroing in on $a_1$, $a_2$, or $a_3$. It is as though the trajectory is being pulled by invisible forces, attracted simultaneously towards all three fixed points until one of them ultimately ‘wins’. With that conceptual picture in mind, the three fixed points are known as attractors. But which one wins? And how does the winner vary with the choice of $z_0$?

There is an appealing graphical way to start addressing those questions on a computer. We consider a region of the complex plane, typically dividing it up into a square grid of points. Taking each point on that grid in turn, the trajectory is computed and the winning attractor identified. The attractors are colour coded (blue for $a_1$, green for $a_2$, and orange for $a_3$) and the collection of winners then overlaid back on top of the complex plane. The result is an abstract kind of map where the colour at each grid point indicates where a trajectory starting at that point will end up in the long term.

The blue area, comprising all the points whose trajectories converge on $a_1$, is the basin of attraction for $a_1$. In the same way, the green and orange areas are the basins of attraction for $a_2$ and $a_3$, respectively. The pattern along the arms where the three colours are intertwined is incredibly complicated and detailed:

Zoom in on any small region and we see the same string-of-pearls motif repeated only at a shorter length scale.

This string of pearls is the Newton–Raphson fractal for the cube roots problem, an absolute classic example of a scale-free—or self-similar—pattern. It survives under arbitrary magnification and it never goes away.

The existence of boundaries between the colours can be inferred simply from inspection: whenever the colour changes, we have implicitly crossed a boundary. But there is so much more to it than that. What, exactly, is the boundary? How is it defined? What properties does it have? What role does it play? We will now attempt to answer some of those questions from a practical perspective.

Period 2 orbits

Suppose we can find two numbers, say $b$ and $c \neq b$, such that, upon iteration, map $F$ produces a sequence $b, c, b, c, b, c, b, c, \dots$. The output of $F$ now alternates, jumping back and forth between $z = b$ and $z = c$ according to the rules $c = F(b)$ and $b = F(c)$. The numbers $b$ and $c$ are called period 2 points and, taken together, they prescribe a period 2 orbit.

Eliminating $c$, it follows that $b = F^{(2)}(b)$, where the symbol $F^{(2)}$ denotes the second iterate of $F$ (the result of feeding the number $F(b)$ back into $F$ to get $F(F(b))$)—so whenever we start with $z = b$, two iterations later we are guaranteed to return to $z = b$. Since the labelling of $b$ and $c$ is arbitrary, we must also have $c = F^{(2)}(c)$. The important point to take away is that $b$ and $c$ are two distinct values of $z$ satisfying the equation $z = F^{(2)}(z)$.

These very general concepts can now be applied to the cube roots problem. A period 2 point can be identified by the condition $z_{n+2} = z_n$, whereby the value of $z$ is repeated every two iterations. Since $z_{n+1} = F(z_n)$, replacing index $n$ with $n+1$ throughout gives
z_{n+2} &= F(z_{n+1}) \\ &= F(F(z_n)) \\ &= F^{(2)}(z_n).
We can thus find all the period 2 points of $F$ by solving \[ z = F(F(z)) = z-\dfrac{z^3+1}{3z^2} – \dfrac{\left(z-\dfrac{z^3+1}{3z^2}\right)^3 + 1}{3\left(z-\dfrac{z^3+1}{3z^2}\right)^2}, \] which is a decidedly daunting task.

With a bit of algebra and a bit of patience, we can show that its roots are also the roots of a simpler equation that can be written as the product of two polynomials: \[ (z^3+1)(20z^6-5z^3 + 2) = 0. \] The possibility that $z^3 + 1 = 0$ must be discarded because its solutions are the three fixed points of $F$ that we already know. Recall that fixed points correspond to sequences like $a, a, a, a, \dots$, clearly repeating after every two iterations but not period 2 in the $b, c, b, c, \dots$ ‘flipping’ sense. They are reappearing here after being unintentionally snagged by the condition $z_{n+2} = z_n$. The remaining true period 2 points must instead be the six solutions to $20z^6-5z^3 + 2 = 0$, which eagle eyed observers will recognise as a quadratic in $z^3$. Solve to find $z^3 = (5 \pm \sqrt{-135})/40 = (5 \pm \mathrm{i} 3\sqrt{15})/40$, then take the cube roots. This procedure gives us these period 2 points:

Period 2 points (dots) and their orbits (lines)

Period 3 orbits

We can do the whole thing again, only this time looking for a triplet of different numbers $d$, $e$ and $f$ such that $F$ produces the sequence $d, e, f, d, e, f, d,$ $e, f, \dots$. The output now repeats after every three iterations according to the rules $e = F(d)$, $f = F(e)$ and $d = F(f)$. Using exactly the same type of elimination as before, it is easy to show that $d$, $e$ and $f$ must each be a unique value of $z$ satisfying $z = F^{(3)}(z)$, where $F^{(3)} \equiv F(F(F))$ denotes the third iterate of $F$. Those numbers are referred to as period 3 points and together they prescribe a period 3 orbit.

The period 3 condition is $z_{n+3} = z_n$, and by shifting the index according to $n \rightarrow n + 2$ we end up with
z_{n+3} &= F(z_{n+2}) \\&= F(F(z_{n+1})) \\& = F(F(F(z_n))) \equiv F^{(3)}(z_n).
Hence, period 3 points can be found by identifying the solutions of $z = F^{(3)}(z)$, which is an equation far too big to write out here! In brief, we get a degree 27 polynomial, where three of the roots are the fixed points of $F$ and so can be factorised away (just as they were in the period 2 case).

After a lot more algebra and a lot more patience, we are obliged to solve $\text{19,456}z^{24} – \text{98,368}z^{21} + \text{195,820}z^{18} – \text{140,530}z^{15} + \text{60,493}z^{12} – \text{14,413}z^9 + 2149z^6 – 196z^3 + 16 = 0$ to find these period 3 points:

Period 3 orbits break up into two distinct groups with different shapes

Each period 3 orbit comprises three points and the set of $24/3 = 8$ such orbits naturally separates into two groups. The first contains six scalene triangle shaped orbits distributed about the arms of the NR fractal. The second contains two equilateral triangle shaped orbits centred on the origin that are slightly rotated versions of each other. The organisation of sets of periodic orbits into distinct families is absolutely essential, and it follows from symmetries present in map $F$ (not by accident, the three reflections and three rotations of an equilateral triangle).

Beyond period 3 orbits

The barriers to finding period $N$ orbits with $N = 4, 5, 6, \dots$ are fairly stark. Prior to discarding any unwanted solutions, the equation $z = F^{(N)}(z)$ leads to a degree $3^N$ polynomial and so the size of the task increases exponentially with $N$. For example, the case of $N = 4$ gives degree 81. After removing the three fixed points and six period 2 points, that 81 is reduced to 72. For Chalkdust readers’ pleasure, we have persevered and solved the period 4 problem. The $72/4=18$ orbits are:

The 72 period 4 points comprise two groups of bat orbits (top row), a central cobweb, and stretched quadrilaterals along the arms

The situation is even worse for $N = 5$, where degree 243 reduces only slightly to 240 and there exist $240/5=48$ orbits to find and classify. One can reasonably expect many kinds of exotic solutions, but we are not brave enough to attempt that here.

As suggested by the terminology, the three fixed point attractors are all stable in the sense that a trajectory following an $a, a, a, a, \dots$ sequence is robust. Any small disturbance will knock the trajectory off course a little bit, but it will always tend back toward $a, a, a, a, \dots$ as $n \rightarrow \infty$. Crucially, the same is not true of the periodic solutions—all of which are unstable. Reconsider our description of a period 3 orbit. For an initial condition such as $z_0 = d$, map $F$ would lead to a sequence $d, e, f, d, e, f, \dots$ that repeats indefinitely. But what about when $z_0 = d + \varepsilon$, where $\varepsilon$ is an arbitrarily small disturbance? In that case, $F$ will produce a sequence that is approximately $d, e, f, d, e, f, \dots$ in the short term—but ‘approximately’ is not good enough. Tiny differences due to the $\varepsilon$ seed grow rapidly under repeated iteration and the trajectory eventually wanders off in the complex plane, never to return.

The periodic orbits in $F$ are said to be repellors since perturbed trajectories tend to move away from them. Such is their precariousness that they do not tend to survive for very long in numerical calculations, a fact due to the finite size (and, thus, precision) of computers. Hardcore mathematicians will have already clocked, for example, that periodic points tend to involve irrational numbers and so they cannot be reproduced as finite decimals with perfect accuracy.

The Julia set of $F$

The final conceptual leap ties everything together. Take all of the periodic points of $F$: the six period-2 points, 24 period-3 points, 72 period-4 points, 240 period-5 points, …, 515,377,520,732,011,331,036,460,411,867,633,580,846,032,026,400 period-100 points, and so on. That uncountable collection of repelling periodic points is the Julia set of $F$, denoted by $J(F)$. Why is $J(F)$ so important? Because the Julia set is the boundary between the three basins of attraction.

The Julia set (black) and the period 2 (blue), period 3 (green), and period 4 (orange) points. %The distribution of points is necessarily mirror symmetric about a line down the centre of each arm.

Julia sets are some of the most fascinating objects in modern mathematics. They are nearly always fractal, and have many mind boggling properties, but we mention only a handful. If $z_0$ is an element of $J(F)$, then all the subsequent iterates, $z_1, z_2, z_3, \dots, z_n$ as $n \rightarrow \infty$, are also elements of $J(F)$. Trajectories on $J(F)$ are thus bounded—they jump to and fro indefinitely between other points on the Julia set, neither converging onto one of the attractors nor, importantly, going off to infinity.

The dynamics on $J(F)$ are also chaotic, where long term trajectories are highly sensitive to tiny fluctuations in the initial condition. In marked contrast, the dynamics inside a basin of attraction are regular because such susceptibility is absent.

Finally, take any point in $J(F)$ and imagine drawing a circle of radius $\varepsilon$ around it. No matter how arbitrarily small we make $\varepsilon$, there are always uncountably many other points of $J(F)$ within that circle. Also inside the same circle are points lying in all three basins of attraction. Any point in $J$ is therefore on the boundary between all three colours. Put another way, no two colours can ever form a solid interface because the third always manages to squeeze in between. That is the essence of the Wada property, and it is by no means an intuitive concept to grasp.

Concluding remarks

We have hopefully left you with a sense that fractals are endlessly fascinating (literally!) and they are enormous fun to play with, just for the pleasure of it. More seriously, they often appear at the intersection of many research problems in applied mathematics and theoretical physics.

At an abstract level, Julia sets are closely related to their much more famous Mandelbrot set cousins while, in the root finding arena, they persist far beyond the simple NR approximation we started with. Sophisticated tweaks to $F$ can yield fractal basin boundaries with ever greater complexity compared to their NR counterparts. Julia sets also play a fundamental role in discrete dynamics, particularly when unravelling the phenomenon of chaos—broadly speaking, where systems can be deterministic (prescribed by equations in which there are no uncertainties) and yet still show wildly erratic behaviours that have an apparent air of randomness about them.

The key to unlocking the mystery of a Newton–Raphson fractal turns out to be its Julia set. But that is really just the start; so far as this article goes, we have probably raised more questions than we could possibly answer here. Why are the fixed points attracting (or, more correctly, superattracting)? Why are the periodic orbits repelling (some more so than others)? What happens when a trajectory lands on the origin (and which trajectories end up there in the first place)? What are the implications of fractal basin boundaries for predictability (and for physics more generally)? Why is all this stuff so utterly addictive? The list is almost as endless as the Julia set itself…


What’s odd about Pascal’s triangle?

It was towards the end of the school year. Teachers were winding down for the summer. Issue 17 of Chalkdust had just been published. Life was good. But maths doesn’t stop when exams stop. In fact, this is when the fun really starts.

As a teacher, I like nothing more than getting my students to challenge their mathematical intuition. We’d already debated whether maths is created or discovered (inspired by issue 17’s big argument). It was time for another challenge. I asked my students to use their mathematical intuition to determine whether they thought that there were more or less (or the same number of) even numbers than odd numbers in Pascal’s triangle.

Continue reading