post

How do calculators do trigonometry?

How your calculator DOESN’T do it

The easiest and least accurate way to calculate sine and cosine for an angle is to measure a physical object. Want to know $\sin(40^\circ)$? Then get out your protractor, draw a right-angled triangle with a $40^\circ$ angle, and divide the measured length of the opposite side by the hypotenuse.

Finding $\sin(40^\circ) \approx$ 6cm/10cm through measurement

During the all-too-brief reign of the analogue calculator, you could effectively do this electro-mechanically by measuring the induction in a coil of wire angled $40^\circ$ out of phase with another coil — but unless your calculator was salvaged from the navigation computer of a 50s Boeing bomber aircraft, this is not how your calculator does it. Nor does your calculator consult tables of values painstakingly hand-derived from the angle addition and half-angle trigonometric identities, as was the norm for mathematicians, scientists, and engineers until the mid-century.

If you’re like me, you assumed that modern calculators approximate sine and cosine with polynomial functions using techniques from calculus and analysis. (If you’re very like me, you told a high school student this in your first year of teaching, struggling to make the key ideas of Taylor series accessible without referencing calculus.) However, these algebraic approximations prove slow to converge. A sophisticated computer can take a shortcut by storing the old immense tables in memory and then Chebyshev-approximating to get the last few digits of accuracy. But what about your pocket or scientific calculator, or, for that matter, your TI graphing calculator — where adding that architecture would be an expensive hassle?

A weirdly capitalised acronym

Enter CORDIC, the COordinate Rotation DIgital Computer — an algorithm designed by Convair’s Jack Volder to exploit the strengths of the then-emerging digital computers to their fullest.

The idea that Volder laid out in “Binary Computation Algorithms for Coordinate Rotation and Function Generation” is simple: Program the computer to be able to perform a set of progressively smaller rotations, which it can then apply on one of the points of a known right-angled triangle—say, the $45^\circ-45^\circ-90^\circ$ isosceles right triangle with hypotenuse $1$—until it reaches the angle, and thus the measurements, of desired triangle.

Rotating a vertex to find $\sin(75^\circ) \approx 0.97\ldots/1$.

By itself, this eliminates the need to store an immense table—roughly speaking, each new smaller rotation refines the result to an additional binary digit of accuracy, and all that needs to be stored are the instructions for rotating by that angle. Indeed, similar methods were used for those hand-generated tables: the method described so far is a geometric perspective on using the angle-sum trigonometric identities. Volder’s algorithm, however, doesn’t stop there. It adds in two crucial details that computers just adore:

  • Don’t choose your set of rotations by progressively halving the angle—choose them by halving their tangents: $\tan(45^\circ)=1$, $\tan(26.56\ldots^\circ)=\frac{1}{2}$, $\tan(14.03\ldots^\circ)=\frac{1}{4}$…
  • Instead of performing a true rotation, attach the side of a right triangle with the desired angle to the old hypotenuse, creating what Richard Parris calls a fan

Building a fan to approximate a $40^\circ$- angle.

Both of these refinements to the scheme would seem to make our lives more difficult; after all, we lose both the nice rational angles and the hypotenuse that stays obligingly fixed at length 1. We shall see, however, that the actual computations are more straightforward, and are even simpler in binary arithmetic. If you are particularly fond of transformation matrices, you can pause your reading right now to work out why this is on your own (or read it in Richard Parris’ excellent “Elementary Functions and Calculators“)—but if you’d like to see a more purely geometric explanation, read on!

CORDIC: A geometric justification

Let’s look at a simple example of these rules in action. Say we have a right-angled triangle with sides of length $a$, $b$, and $c$, following well-trod convention. Let $c$ be the hypotenuse, and let side $a$ be adjacent to an angle $\theta$:

A right-angled triangle.

We then attach to the hypotenuse a triangle with a tangent ratio of $\frac{1}{2}$—that is, such that the side opposite the angle $\varphi$ is half the length of the side next to it. This opposite side, then, has length $\frac{c}{2}$:

Attaching a triangle with a tangent ratio of $\frac{1}{2}$.

And what of the triangle we want—a right-angled triangle with angle $\theta+\varphi$? Consider a vertical drawn from the opposite angle of the new triangle, perpendicular to side $a$, and a horizontal from the opposite angle of the original triangle perpendicular to side $b$:

A triangle with angle $\theta + \varphi$.

We then have the marked congruent angles—so the right triangle created by the intersection of the vertical and the horizontal is similar to the original triangle, and indeed is simply a triangle half the size, rotated by $90^\circ$! Thus, the dimensions of a right triangle with angle $\theta+\varphi$ can be $a-\frac{b}{2}$ on the adjacent side, and $b+\frac{a}{2}$ on the opposite side:

The side-lengths of the triangle with angle $\theta + \varphi$.

Similarly, we find sides of $a+\frac{b}{2}$ and $b-\frac{a}{2}$ for the triangle with angle $\theta-\varphi$, and would find $b+\frac{a}{2^n}$ and $a-\frac{b}{2^n}$ if we had sought $\varphi$ with a tangent halved n times. The change in the side lengths will always be given by that little triangle, which — as its sides are constructed parallel to those of the original triangle — will always be similar to the original triangle, with its scale determined by the tangent ratio of the new triangle in the fan.

What your calculator (maybe) does

To build out the fan and track the coordinates of the point, we only need to repeat this little divide-and-add process until we get close enough to the desired angle. For a binary computer, this is easy. Dividing by two is a simple matter of moving the decimal over by one place; dividing by $2^n$, moving the decimal point over by $n$ places. Many simpler calculators use binary-coded decimal instead of pure binary—storing each digit of a decimal number individually in binary representation—which adds some wrinkles, but follows fundamentally the same principle. All that remains is addition.

For calculation of the sine and cosine ratios, we need only one further piece of information: the length of the hypotenuse. Here the “fan” is helpful, especially if we make one further proviso—we never leave an angle, or triangle, out. Then the component triangles of the fan are fixed, and in return for losing a bit of speed in narrowing in on the correct angle, we can have the computer memorise the length of the hypotenuse of the final, thinnest triangle in the fan—that same line will be the hypotenuse of the desired triangle!

Let us walk through this process, in binary, with the example a four-step fan for $40^\circ$ illustrated above. We start with an isosceles right triangle with legs of length 1. This gives us our initial coordinates $\left(x_0,y_0\right)$ of $\left(1,1\right)$. We then subtract the the half-tangent right triangle, about $26^\circ$ (below, a subscript “$d$” indicates the number is in base-10):
\[x_1 = x_0+\frac{y_0}{2_d^{1_d}} = 1+\frac{1}{2_d} = 1+0.1=1.1\]
\[y_1 = y_0-\frac{x_0}{2_d^{1_d}} = 1-\frac{1}{2_d} = 1-0.1 = 0.1\]
Then we add the quarter-tangent triangle, about $14^\circ$:
\[x_2 = x_1-\frac{y_1}{2_d^{2_d}} =1.1-\frac{0.1}{4_d}= 1.1-0.001=1.011\]
\[y_2 = y_1+\frac{x_1}{2_d^{2_d}} =0.1+\frac{1.1}{4_d} = 0.1+0.011 = 0.111\]
And finally, we add the eighth-tangent triangle, or about $7^\circ$, to arrive at an angle of $39.6^\circ$:
\[x_3 = x_2-\frac{y_2}{2_d^{3_d}} =1.011-\frac{0.111}{8_d}= 1.011-0.000111=1.010001\]
\[y_3 = y_2+\frac{x_2}{2_d^{3_d}} =0.111+\frac{1.011}{8_d} = 0.111+0.001011 = 1.000011\]
Transitioning back into decimal, this gives us an $x$-coordinate of about 1.27 and a $y$-coordinate of about 1.047. Repeated application of the Pythagorean theorem, meanwhile, tells us that the length of the hypotenuse of the eighth-tangent triangle—shared by the hypotenuse of the combined $39.6^\circ$ triangle—is about 1.64. So, $\sin(40^\circ)\approx \frac{1.047}{1.64} = 0.638$. This is pretty close to the actual three-place value of 0.643—but it should be noted that $40^\circ$ is one of the more convenient angles to reach in this fashion. A more usual implementation of CORDIC would be set to halve the tangent twenty or thirty times to ensure accuracy past five decimal places.

So, we can calculate since and cosine with only one computationally costly task: dividing by a set number at the end. Those computational costs, though, have been decreasing, and CORDIC’s clever algorithmic thriftiness has begun to fall from favour. Intel’s CPUs haven’t used CORDIC since the days of the 486, and even some modern graphing calculators have dropped the method in favour of the polynomial approach. But even though CORDIC may someday cease to be a tool people use themselves, its incredible simplicity, cheap implementation, and proven accuracy will keep it in use in dedicated processors and purpose-built machines for as long as speciality electronics needs to know how to measure triangles.

An exercise for the reader

As the hyperbolic trigonometric functions share many basic properties with their circle-trigonometric counterparts, it may come as little surprise that CORDIC can also evaluate $\sinh$, $\cosh$, $\tanh$, and the rest of the hyperbolic ratios — though values on hyperbola can shift fast enough to leave ‘gaps’ that hyperbolic-tangent–halving alone can’t reach. Curiously, though, the fact that $\cosh(x)+\sinh(x)=e^x$ means that the basic CORDIC method can even be adapted to evaluate exponents and logarithms. If you can describe a geometric justification for the hyperbolic, exponential, or logarithmic variants like the circular case above, please let me know — I’m tired of looking at matrices.

post

Counting all the numbers between zero and one

Heavy as the setting sun
Oh, I’m counting all the numbers
between zero and one
Happy, but a little lost
Well, I don’t know what I don’t know
so I’ll kick my shoes off and run

Sir Sly, &Run

There’s a song popular on the radio right now (‘&Run’ by Sir Sly) that contains the lyric, “I’m counting all the numbers between zero and one.” The first time I heard it, I thought, that’s not going to take very long. I assumed the singer was referring to integers, of which there are none strictly between zero and one. Was he trying to tell the woman he was singing to that he was impatient to see her again? Or suggesting that they were as close as two adjacent integers, with literally nothing keeping them apart?

The second time I heard the song, though, it occurred to me that he might be referring to the real numbers, of which there are an infinite number between zero and one. Counting those would keep him busy for, well, ever. Was he expressing his (infinite) patience? Or was he in despair at an impossible task? It all depends on what Sir Sly means by ‘numbers.’ (Also ‘counting’, and even ‘between’.) Continue reading

post

Unknown pleasures: pulsars and the first data revolution

Many of the 20th century’s most famous albums are as celebrated for their artwork as they are for their music. This is particularly true of albums from the 1970’s: think of Bowie’s Aladdin Sane,  Fleetwood Mac’s Rumours, or Pink Floyd’s Dark side of the moon. The 1970’s were a golden era for concept albums in popular music, meaning that the world’s biggest artists spent a lot of time and money producing beautiful covers to fit the sound and message of the songs within.

A recurring theme in artistic images from that decade is space. In the age of the moon landing, when much of technological development was focused on pushing the human frontier, cosmic images were everywhere in popular culture. The original Star Trek series was first broadcast in 1966, the first Star Wars film was released in 1977, and E.T. followed five years later. Similarly, Parliament’s Mothership connection (1975) and Supertramp’s Crime of the century (1974) both feature space-inspired covers. With space comes science, so it’s no surprise that many of the most famous album covers have an interesting scientific tale to tell.

One spacey image with a particularly rich backstory comes from Joy Division’s 1979 debut, Unknown pleasures. The cover of the album features an array of wavy, organic-looking lines on a black background, with no information suggesting where they might come from. (Neither the title of the album nor the name of the artist appears on the front cover.) In fact the image is a ‘stacked plot’ depicting radiation emitted by a pulsar, a type of star that had been discovered 12 years earlier, and gives a glimpse into the transformative effect that early computers had on scientific research.

The cover of Unknown pleasures.

Continue reading

post

Talkdust, episode 4

It’s time for episode four of Talkdust!

You can listen to the podcast using the player below, or download an mp3, or search for us in your in favourite podcast app (or add us to your app manually using the RSS feed).

The presenters of this episode of Talkdust are Sean Jamshidi and Eleanor Doman. Music and editing by Matthew Scroggs, and announcements by Tom Rivlin. In this episode:

  • Eleanor and Sean talk about the launch of issue 09.
  • Stephen Muirhead tells us about his article in issue 09.
  • Eleanor and Sean talk some more about the launch of issue 09.
  • Eleanor and Sean play a game in which they try to describe mathematical words to each other and take the second place position on the leaderboard.
post

Read Issue 09 now!

Our latest edition, Issue 09, is available now. Enjoy the articles online or scroll down to view the magazine as a PDF.

Features

Fun

Printed versions

or

Download Issue 09 as a PDF

post

When Truchet met Chladni

While catching up on past issues of Chalkdust over the Christmas break, I was intrigued to see that issue 07 featured an article on Chladni figures (On the cover: Chladni figures of a square drum by James Cann), followed in issue 08 by an article on Truchet tiles (Too good to be Truchet by Colin Beveridge). This coincidence fascinated me because, although not the focus of the articles, there is a remarkable conjecture that links these two apparently unrelated objects, via the mathematics of percolation theory. Continue reading

post

Playing billiards with cue-bics

Icy patches sit in wait and bare trees shiver in anticipation but spring is still dawdling, likely enjoying a final cuppa before going back to work. What are we to do while days are dreary and the wind rummages through empty branches? Why, stay inside and play with geometry software, of course! Yet winter was long and snowy, and after one too many hours indoors the treasure trove of maths is perhaps running low.

What to do once ancient theorems are re-proved and the excitement of a parametric plot has wilted? Well, there is still a nugget of algebra that could flourish into hours of exploratory fun. So grab your Geometer’s Sketchpads, your Geogebra, or simply some transparencies and coloured markers, and let’s play billiards with polynomials. Continue reading

post

Hiding in plain sight

For a very long time now, people have been coming up with ingenious ways of exchanging messages securely. From the basic Caesar cipher to the Enigma, however, there has been a universal caveat: how to agree on the encryption/decryption key without meeting face to face?

To put it simply, consider the following situation: Alice and Bob want to communicate with one another using an Enigma machine. Alice adjusts the settings of her machine, tells Bob what those settings are, types a secret message, and sends the encrypted text to Bob, who then adjusts his own machine’s settings and proceeds to decrypt the message.

But hang on a minute. How exactly did Alice give her settings to Bob? Alice can’t send encrypted messages unless Bob has those settings, but at the same time Alice has no way of secretly passing those settings on to Bob. Is there a way for them to somehow come up with a shared secret key (ie the settings for the machines) by exchanging messages visible to anyone listening to their conversation?

The Diffie–Hellman protocol allows us to do exactly that. Let’s dig in and see how it works!

Some modular arithmetic

We are all used to the regular addition and multiplication of numbers. Indeed no one would ever question that $4 + 11 = 15$. In modular arithmetic, we change the rules slightly, by saying two numbers $a$ and $b$ are the same mod ${p}$, or congruent mod ${p}$, if their difference is an integer multiple of $p$, and we write $a \equiv b \bmod{p}$. For example, since $15 – 1 = 2\times7$, we would write $15 \equiv 1 \bmod{7}$. This is sometimes know as clock arithmetic because we kind of wrap around every time we reach $p$.

Let’s think of what happens when we take the integers mod ${5}$. Observe that any integer will be congruent to one of $0,1,2,3,4$, and that none of $0,1,2,3,4$ are congruent to each other. The set $\{0,1,2,3,4\}$ is therefore enough to “represent” all the integers mod ${5}$, and we call this set $\mathbb{Z}_5$.

Just as with the regular integers, we can add and multiply integers mod ${p}$ together. Of particular interest to us is the multiplication, because it gives an interesting structure to the set. Notice that almost every element in the set $\mathbb{Z}_p$ has what we call a multiplicative inverse. For example in $\mathbb{Z}_5$, $2 \times 3 \equiv 1 \bmod{5} $, so we say that $3$ is the inverse of 2 mod ${5}$ (and vice-versa that $2$ is the inverse of 3 mod ${5}$). Similarly, $4\times 4\equiv 1 \bmod{5}$, so we say that $4$ is its own inverse mod ${5}$.

I did say almost every element, because $0$ never has a multiplicative inverse. This could cause some problems down the road, so we decide to do away with zero entirely, and call the resulting set $\mathbb{Z}_p^*$. Now, $\mathbb{Z}_p^*$ together with multiplication has a lot of structure, and it is in fact an example of a group. We will see another example of a group later on in the article.

 

There is something special about $\mathbb{Z}_p^*$ that makes it very useful in cryptography. Let’s look at $\mathbb{Z}_7^*$ and enumerate the powers of 3 mod ${7}$. We have:

The elements of $\mathbb{Z}_7^*$ written in a circle, with arrows showing multiplication by 3

 

$3^1 \equiv 3 \bmod{7}$
$3^2 \equiv 2 \bmod{7}$
$3^3 \equiv 6 \bmod{7}$
$3^4 \equiv 4 \bmod{7}$
$3^5 \equiv 5 \bmod{7}$
$3^6 \equiv 1 \bmod{7}$

 

 

What do we notice? Every one of $1,2,3,\ldots,6 \bmod{7}$ can be expressed as a power of 3 mod ${7}$! Because of this, we say that $3$ is a primitive root mod ${7}$ and that $\mathbb{Z}_7^*$ is a cyclic group. Note that not every element of $\mathbb{Z}_7^*$ is a primitive root: if we instead looked at powers of 2 mod ${7}$ for example, we would only get half of all possible values. Because $7$ is prime, however, we are guaranteed the existence of primitive roots in $\mathbb{Z}_7^*$. This simple fact will be paramount in making sure our Diffie–Hellman protocol is secure, but before we get there, let me give you a small problem: can you list all the primitive roots mod ${7}$?

It shouldn’t take very long to determine that these are $3$ and $5$. But $7$ is a small prime; what about finding all the primitive roots mod ${23}$?

This is a bit tedious if you try every number up to $23$, but you should find exactly $10$ distinct primitive roots. Notice that $10$ is exactly the number of integers smaller than $22$ which are coprime to $22$. This is not a coincidence! In general, there are exactly $\phi(p-1)$ distinct primitive roots in $\mathbb{Z}_p^*$, where $\phi(n)$ is the number of integers between $1$ and $n$ that are coprime to $n$. Although this doesn’t tell us how to find primitive roots, it does give an upper bound on how many numbers we have to try before finding a primitive root. Computers are reasonably good at finding primitive roots, but the larger the prime the longer it takes. Because of this, we tend to reuse primes whose primitive roots we’ve already computed when doing cryptography.

Discrete logarithms and the Diffie–Hellman key exchange

Now let’s think about another problem: if I give you some $k$, say $k=3$, can you compute $5^k \bmod{7}$? This is not too difficult, and computers are quite fast at computing powers of elements mod ${p}$.

The inverse problem, though, is not so straightforward: given that $5^k \equiv 3 \bmod{7}$, can you find $k$? Yet again, because $7$ is a small prime, you shouldn’t have a hard time figuring out the answer. I invite you to think about your process in solving the problem.

My guess is that you raised $5$ to different powers until you found that $5^5 \equiv 3 \bmod{7}$. What if I told you that $5^k \equiv 48 \bmod{53}$? Can you find $k$ then? Going about it the same way as before, it might take you a while. The general problem of finding $k$ such that $g^{k} \equiv m \bmod{p}$, where $g$ is a primitive root, is known as the discrete logarithm problem, and it is a very difficult problem indeed when $p$ is very large. Hence the function
$$
k \mapsto g^{k} \bmod{p}
$$
is an example of a trapdoor function: it’s easy to compute $g^{k} \bmod{p}$, but hard to find $k$ if you only know $g$ and $g^{k} \bmod{p}$.

We can use this trapdoor function to publicly exchange keys. Let’s see how:

That’s it! I personally find it hard not to be taken aback by the simplicity of the procedure: because Alice knows $n$, she can raise $g^{m}$, which she received from Bob, to the $n$th power and get $g^{nm}$, and similarly for Bob. After an exchange of information taking place entirely in plain sight, Alice and Bob now have a secret key that they can use to encrypt all further communication! This is what made the Diffie–Hellman protocol revolutionary.

To understand why this is secure, let us put ourselves in the shoes of an attacker. In the course of the whole exchange, what do we see? Because the communications are not encrypted, we will know the values $p$, $g$, $g^{n} \bmod{p}$, and $g^{m} \bmod{p}$. To obtain $g^{nm} \bmod{p}$, one essentially needs to solve the discrete logarithm problem to find $n$ or $m$.

Ironically enough, there is no formal proof that the discrete log problem is hard to solve. We do have quite a few algorithms that compute discrete logarithms, but none of them performs particularly well. There is an efficient quantum algorithm due to Peter Shor: it is an algorithm that would run on a quantum computer, which could pose a threat to current systems should practical quantum computers ever hit the market.

Another type of logjam

But quantum computing aside, Diffie–Hellman is not perfect. As mentioned before, choosing such large primes and finding their primitive roots is not something you want to do every single time you initiate a link, and, in practice, the same primes are reused. This is fine in general, as the discrete logarithm problem is considered hard enough for very large groups, even if those groups are known in advance.

 

In 2015, however, a group of computer scientists reported that pre-computations were feasible for those primes which we know are used—essentially building a lookup table of the powers of primitive roots for those specific primes. These pre-computations would take a lot of resources, and the estimated cost of such an operation is in the hundreds of millions of dollars, which has been pointed out to be well within the budget of some national agencies.

This vulnerability is known as logjam. The ways to circumvent it are to either use larger primes, or to use a variant of the Diffie–Hellman protocol known as elliptic-curve Diffie–Hellman.

What is an elliptic curve?

An elliptic curve is the solution set of an equation of the form: $y^{2} = x^{3} + ax^{2} + bx + c$. For us, the numbers $a$, $b$, and $c$ will be integers. We also require, as a technicality, that the curve has no cusps and does not cross itself. We have to specify which values of $x$ and $y$ we allow: rational numbers? real numbers? complex numbers? We will denote the set (field) we draw our values from by $k$ and the elliptic curve by $\mathcal{C}(k)$. On the next page, there are some examples of elliptic curves over the real numbers.

Later on, we will be considering curves over finite fields, in particular the $\mathbb{Z}_p$ we talked about earlier. Here’s what our curves look like over those fields:

You might be thinking that these don’t exactly look like curves, but they are: they’re the set of points that satisfy the equation over $\mathbb{Z}_p$. What we’re interested in is the structure of the points on those curves, and in practice we may have lots of questions about those points: how many are there? Is there an effective way of finding all of them?

A theorem of Hasse tells us that in general, there should be about $p+1$ points on an elliptic curve over $\mathbb{Z}_p$, and another result by Mordell and Weil tells us that starting with finitely many points on an elliptic curve, we can generate all the points on the curve. We won’t
understand the proof of this theorem by the end of this article, however it would be nice to understand what it is saying.

How can we generate new points from old points? It turns out that’s not particularly complicated. Take two points $P$ and $Q$ on the elliptic curve. Draw a line through $P$ and $Q$. This line will intersect the curve at some third point, which we name $P*Q$.

$y^{2}=x^{3}-x+4$ with point addition constructions

This point is on the elliptic curve, so we could stop here, but we go one step further, and for good reasons. Draw a vertical line through $P*Q$. It will intersect the curve at one other point, and this point we name $P+Q$.

If $P$ and $Q$ are the same point, then we do the same thing except that we draw a line tangent to $P$ instead of through $P$ and $Q$ at the beginning.

This new “addition” of points is the first step to giving some structure to the set of points on an elliptic curve.

Let’s make another definition: starting from a point $P$, draw a vertical line through it. Just as before this vertical line intersects the curve at one other point, and we call this point $-P$. Now I want you to add $P$ and $-P$ together. What happens? It seems like the whole thing breaks down: the line through $P$ and $-P$ doesn’t intersect the curve anywhere else, so we don’t get a third point at all. We can circumvent this by thinking of there being an extra point on the elliptic curve, a point which exists at infinity and where all vertical lines meet.

To make this notion clear requires some projective geometry, and this might be a good moment to ask for a leap of faith–it works. To give you some intuition, you could think of our flat plane as being embedded on some very large sphere. Then any two vertical lines we draw will look parallel as seen from within the plane, but if we pull back and see them as being on the sphere, we will see that they all intersect at the north pole of the sphere. Therefore we can think of the north pole as our extra point at infinity, and we call it $\mathcal{O}$. This analogy is very limiting and should really be taken with a grain of salt, but I hope it gives some intuition for what $\mathcal{O}$ is. Now it becomes clear that $P + (-P) = \mathcal{O}$. Furthermore $P + \mathcal{O} = P$, so $\mathcal{O}$ serves as an additive identity.

Lastly, addition of points on elliptic curves defined like this is associative. The bottom line is that the set of points on an elliptic curve with addition is quite similar to the set $\mathbb{Z}_p^*$ with multiplication: it’s a group!

Though the geometric explanation for the group law makes sense, it has its shortcomings: it doesn’t tell us how to compute the coordinates of $P+Q$ in a way a computer could follow, and it doesn’t work when we consider our curve over $\mathbb{Z}_p$. These issues are easily taken care of: since all we’re doing is playing with equations of lines and cubics, we can derive formulae for the coordinates of $P+Q$ in terms of the coordinates of $P$ and $Q$ (I invite the motivated reader to try and find these formulae). Once we have the formulae, we can use them to define addition of points and forget altogether about the previous geometric description.

Elliptic curves over finite fields and our trapdoor function

Let’s actually go through an example of an elliptic curve over a finite field to get a sense of what’s going on. Consider the curve $y^{2} = x^{3} + x$ over $\mathbb{Z}_5$, the integers modulo $5$. We are going to compute all the points on this curve. The way to do this is to plug values of $x$ into the right-hand side and to see if we get a square modulo $5$, in which case we get a solution. For example plugging $x=0$, we get $y^{2}\equiv 0 \bmod{5}$, which has solution $y=0$, so $(0,0)$ is a point on our curve. Plugging $x=1$, we get $y^{2} \equiv 2 \bmod{5}$, which doesn’t have any solutions. Plugging $x=2$, we get $y^{2} \equiv 0 \bmod{5}$, so $(2,0)$ is on our curve.

Continuing like this, we find $(3,0)$ and no other points. We also shouldn’t forget $\mathcal{O}$ (it’s there even over finite fields!). Hence in this case $\mathcal{C}(\mathbb{Z}_5) = \{\mathcal{O}, (0,0), (2,0), (3,0)\}$.

A slightly more interesting example is the curve $y^{2} = x^{3} + 2x + 2$ over $\mathbb{Z}_{17}$. I invite you to check that $P = (3,1)$ is a point on this elliptic curve. If we do some computations (most likely with a computer), we find that $19P = \mathcal{O}$ and that $P, 2P, \ldots, 18P \ne \mathcal{O}$. We say that $P$ has order 19. We also find that $P, 2P,\ldots, 18P$ are all distinct. An interesting question is now: given the point $P$ and a point $kP$ on our elliptic curve, can we find $k$?

The problem is analogous to the discrete logarithm problem, and is called the elliptic curve discrete logarithm problem. Even though it is relatively easy to state, it is very hard to solve, and our methods for tackling it are only marginally better than brute-forcing. This is why we can use smaller prime numbers in the protocol without fear of getting hacked. Choosing a point $G$ on our elliptic curve, our trapdoor function is then $$k \mapsto kG,$$ and we can implement Diffie–Hellman as before:

How much better are elliptic curves?

The whole point of using elliptic curve cryptography instead of ‘classical’ algorithms is that we are able to achieve comparable cryptographic strength with much smaller key sizes. For example, a 228-bit elliptic curve key is equivalent to a 3072-bit Diffie–Hellman key. This means that choosing a 69-digit prime $p$ for our $\mathbb{Z}_p$ in elliptic curve Diffie–Hellman gives us roughly the same security as using a 925-digit prime for regular Diffie–Hellman. As you can imagine, computations are much faster with smaller numbers.

Personally, I think the sheer fact that elliptic curves, which are objects used extensively in pure mathematics, can be impactful in everyday applications is amazing in and of itself!