I am an applied mathematician. Like, very applied. I use mathematics, sure, but I’m far more interested in what maths can do to solve real problems. Specifically, how can mathematics be used to tackle issues in infectious diseases?
The best thing about maths, from my point of view, is that it has one incredible superpower: it can predict the future. That’s amazing. And very useful, of course.
When teaching my undergraduate students the required details to make these predictions, I stumbled upon a very profound realisation. Namely, that the Jacobian matrix – a technical thing from linear algebra – is a) just about the most massively useful thing ever and b) a glorious way to reconcile two apparently disparate strands of mathematics.
Don’t believe me? Read on and hopefully you too will become a convert to the church of the Jacobian…
Eigenvalues: why?
Being an applied mathematician, the issue of eigenvalues and eigenvectors always puzzled me. Specifically, why on earth would you want to deal with them in the first place? I’m all for doing maths for its own sake, but I often have to explain mathematics to people who a) aren’t mathematicians and b) live their life in sheer terror of equations.
(No, since you ask, they’re not all politicians. Okay, some of them are politicians. But not all. I work with biologists, epidemiologists and policy-makers, who need to use maths to make predictions, but would often be very happy if they never saw a Greek letter ever again.)
So to explain dizzying mathematical concepts to such people, you need to do it in a very grounded way. That is, absolutely everything needs a rock-solid foundation. Usually that’s not a problem, because the types of maths that they’ll encounter (disease models using differential equations) all have a very sensible grounding in the real world that can give them a handle on it. But not eigenvalues. That’s a tough one.
The best explanation I could come up with was this: If $\boldsymbol{\mathsf{A}}$ is a matrix and $\mathbf x$ is a vector, then $\boldsymbol{\mathsf{A}}\mathbf x$ is also a vector. So could it be possible that $\boldsymbol{\mathsf{A}}\mathbf x=\mathbf x$ somehow? (Okay, this isn’t exactly ‘the real world’ at its finest, but bear with me.)
First off, if this is the case, then that tells us something about the dimension of $\boldsymbol{\mathsf{A}}$: it has to be a square matrix. That is, there must be as many columns in $\boldsymbol{\mathsf{A}}$ as there are rows in $\mathbf x$ (or else we wouldn’t be able to multiply them together). But if $\boldsymbol{\mathsf{A}}\mathbf x=\mathbf x$, then there must also be as many rows in $\boldsymbol{\mathsf{A}}$ as there are in $\mathbf x$ (or else the product would produce a vector of a different size). That’s pretty neat, actually.
Second, my non-mathematician friends can solve this themselves, because the answer is clearly $\mathbf x=\mathbf 0$. Hooray, that was easy! Except… well duh, of course it is. So the real question is: are there any vectors other than zero? Sorry non-maths buddies, we’re not letting you off the hook that easily.
Third, let’s be slightly more general. Let’s suppose that multiplying $\boldsymbol{\mathsf{A}}$ by $\mathbf x$ doesn’t just equal $\mathbf x$, but can equal some scalar multiple of $\mathbf x$ (which we’ll call $\lambda$). This gives us the core equation
$$\boldsymbol{\mathsf{A}}\mathbf x=\lambda \mathbf x. \qquad\qquad(1)$$
At this point, if you’re a non-mathematician, you’ll probably throw your hands up and storm off. Not because you’ve seen an equation (heaven forbid), but because you probably know just enough to know that there’s a fundamental issue here and it’s this: this is a ridiculous problem.
No, really, it is. If $\boldsymbol{\mathsf{A}}$ is known and I can safely say that $\boldsymbol{\mathsf{A}}$ is an $n\times n$ matrix, then solving $\boldsymbol{\mathsf{A}}\mathbf x=\mathbf x$ is bad enough, because you have to solve $n$ equations for $n$ unknowns. Okay, fine. But if we add in the issue of $\lambda$, then you have at least $n+1$ unknowns… and that’s if there’s only one $\lambda$. (Spoiler: there isn’t.) In fact, there’s now a second problem, which is: how do we even know how many $\lambda$s will satisfy this equation? Those crazy mathematicians…
And, I have to say, that’s an excellent question. The problem appears far too big to solve in any sensible way. But let’s proceed nonetheless. (Yep, we mathematicians are kind of crazy. That part isn’t in doubt.) If these were numbers, I’d just move everything over to one side and divide. I can’t quite do that with matrices and vectors, but let’s see how we go anyway:
\begin{align}
\boldsymbol{\mathsf{A}}\mathbf x &=\lambda \mathbf x\\
\boldsymbol{\mathsf{A}}\mathbf x -\lambda \mathbf x&=\mathbf 0 && \text{(subtracting the right-hand side)}\\
(\boldsymbol{\mathsf{A}} -\lambda) \mathbf x&=\mathbf 0 &&\text{(factoring out the $\mathbf x$).}
\end{align}
Bzzzzt! Uh-uh, you can’t do that. Why not? Because $\boldsymbol{\mathsf{A}}$ is a matrix and $\lambda$ is a number, so they can’t be subtracted from one another. (You can multiply a matrix by a number, sure, but you can only add or subtract things of the same dimension.) This is where the fact that we’re dealing with matrices rather than numbers starts to throw up issues.
Fortunately, we have a secret weapon at our disposal and – you guessed it – it’s the identity matrix. (Okay, maybe you didn’t guess it. That’s okay, that’s why it’s a secret.) The identity matrix is essentially the matrix equivalent of the number 1. If you multiply something by it, that something is unchanged. It also happens to be square.
In particular, $\boldsymbol{\mathsf{I}}\mathbf x=\mathbf x$. Which… okay, sure, ‘is this going somewhere?’ ask my non-mathematician friends, whose patience is surely being tried at this point. But the answer is yes, because of the fact that equals signs work in both directions.
Usually we’d think that we’re multiplying $\mathbf x$ by $\boldsymbol{\mathsf{I}}$. Which isn’t terribly interesting. But because equals signs work both ways, we can do the reverse too: we can take $\mathbf x$ and insert the identity matrix $\boldsymbol{\mathsf{I}}$ in front of it. Why would we want to do that? Because it solves our ‘$\lambda$’ problem! Let’s see:
\begin{align}
\boldsymbol{\mathsf{A}}\mathbf x -\lambda \mathbf x&=\mathbf 0 && \\
\boldsymbol{\mathsf{A}}\mathbf x -\lambda \boldsymbol{\mathsf{I}}\mathbf x&=\mathbf 0 && \text{(inserting $\boldsymbol{\mathsf{I}}$)}\\
(\boldsymbol{\mathsf{A}} -\lambda \boldsymbol{\mathsf{I}}) \mathbf x&=\mathbf 0 &&\text{(factoring out the $\mathbf x$).}
\end{align}
See, there’s no problem now. $\boldsymbol{\mathsf{A}}$ is a square matrix (it had to be, remember) and $\boldsymbol{\mathsf{I}}$ is also a square matrix, so things line up perfectly.
If these were numbers, I’d just divide… so long as I knew I wasn’t dividing by zero. With matrices, we do something similar: we take the inverse… so long as the matrix is actually invertible.
Actually, I have no idea whether the matrix $(\boldsymbol{\mathsf{A}}-\lambda \boldsymbol{\mathsf{I}})$ is invertible or not. How would I? I don’t have the first clue what $\lambda$ is. So let’s consider both possibilities.
Possibility 1: $(\boldsymbol{\mathsf{A}}-\lambda \boldsymbol{\mathsf{I}})$ is invertible. Right, off we go.
\begin{align*}
(\boldsymbol{\mathsf{A}} -\lambda \boldsymbol{\mathsf{I}}) \mathbf x&=\mathbf 0 &&\\
(\boldsymbol{\mathsf{A}} -\lambda \boldsymbol{\mathsf{I}})^{-1}(\boldsymbol{\mathsf{A}} -\lambda \boldsymbol{\mathsf{I}}) \mathbf x&=(\boldsymbol{\mathsf{A}} -\lambda \boldsymbol{\mathsf{I}})^{-1}\mathbf 0 &&\text{(apply the inverse to both sides, on the left)}\\
\mathbf x &=\mathbf 0.
\end{align*}
D’oh. My non-maths friends could have told you that. In fact, they did. So if $(\boldsymbol{\mathsf{A}}-\lambda \boldsymbol{\mathsf{I}})$ is invertible, then there’s only the trivial solution and we explicitly excluded that. So that means that $(\boldsymbol{\mathsf{A}}-\lambda \boldsymbol{\mathsf{I}})$ cannot be invertible.
Possibility 2: $(\boldsymbol{\mathsf{A}}-\lambda \boldsymbol{\mathsf{I}})$ is not invertible. There’s another way to say this:
$$\det(\boldsymbol{\mathsf{A}}-\lambda \boldsymbol{\mathsf{I}})=0. \qquad\qquad(2)$$
Woah. Woah! Did we just figure out $\lambda$ independently of $\mathbf x$? Hells yes! In fact, I’ll go one better: since $\boldsymbol{\mathsf{A}}$ is an $n\times n$ matrix, this equation is an $n$th order polynomial in $\lambda$. No, wait, I can do even better than that: the fundamental theorem of algebra states that an $n$th order polynomial not only has a solution, it has precisely $n$ solutions in the complex field (as long as we count repeated solutions with their multiplicity). (In fact, this is why we want complex numbers in the first place: to solve polynomials.)
This is magnificent. It means a) I know how many $\lambda$s there are; b) I can solve equation (2) for these special values (aka ‘eigenvalues’) first; and c) for each eigenvalue, I can use equation (1) to find the special vectors (aka ‘eigenvectors’) $\mathbf x$.
Because of the noninvertibility of $(\boldsymbol{\mathsf{A}}-\lambda \boldsymbol{\mathsf{I}})$, I’ll get an infinite number of eigenvectors corresponding to each eigenvalue, but that’s okay. I solved my maths problem.
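If you’d rather let a computer grind through that recipe, here’s a minimal sketch in Python (my own illustration, using NumPy and a made-up $2\times 2$ matrix) that finds the eigenvalues and eigenvectors and then checks that $\boldsymbol{\mathsf{A}}\mathbf x=\lambda\mathbf x$ really does hold:

```python
import numpy as np

# A made-up 2x2 matrix, purely for illustration
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# np.linalg.eig returns the eigenvalues and a matrix whose
# columns are the corresponding eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)

for k, lam in enumerate(eigenvalues):
    x = eigenvectors[:, k]
    # Check that A x = lambda x (up to floating-point error)
    print(lam, np.allclose(A @ x, lam * x))
```

NumPy works numerically, so ‘holds’ here means ‘up to floating-point error’, but the idea is exactly equation (1).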
Except, I still haven’t solved my biological problem. And my non-maths friends are being very patient here. Why would I want to do this in the first place?
How the Jacobian helps disease modelling
When I use differential equations to create a mathematical model of a disease (e.g. HIV, Ebola, zombies), I might write something like this:
\begin{align*}
{\mathrm{d}S\over \mathrm{d}t} &= \Lambda-\mu S-\beta SI \\
{\mathrm{d}I\over \mathrm{d}t} &=\beta SI - \mu I-\gamma I.
\end{align*}
This is a classic disease model, called the “SI” model. It stands for Susceptible–Infected. To create such a model, you figure out what comes in and what goes out. Here, I have susceptible individuals, who can come in only by being born, at a rate $\Lambda$. Two things can happen to them: they can die (from things unrelated to the disease) or they can become infected. So they leave (for good) at rate $-\mu S$ and they transfer from the susceptible compartment to the infected compartment at rate $-\beta SI$.
Likewise, the infected individuals can only come into being by becoming infected, at rate $\beta SI$. They leave either by dying (from things unrelated to the disease), at rate $-\mu I$, or by dying from the disease itself, at rate $-\gamma I$.
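Before analysing the model, it helps to see what it actually does. Here’s a minimal sketch in Python, using SciPy’s `solve_ivp`, with parameter values chosen purely for illustration (they’re not fitted to any real disease):

```python
from scipy.integrate import solve_ivp

# Illustrative parameter values -- not fitted to any real disease
Lambda, mu, beta, gamma = 10.0, 0.1, 0.01, 0.5

def si_model(t, y):
    S, I = y
    dS = Lambda - mu * S - beta * S * I
    dI = beta * S * I - mu * I - gamma * I
    return [dS, dI]

# Start near the disease-free equilibrium, with one infected individual
sol = solve_ivp(si_model, (0, 200), [Lambda / mu, 1.0])
print("Final state (S, I):", sol.y[:, -1])
```

With these made-up numbers the infection takes hold; drop $\beta$ to $0.001$ and it fizzles out instead. Why? That’s exactly the question the rest of this article answers.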
Because this is a nonlinear system of differential equations, I have no hope of solving it. But I’d still like to ask crucial questions, like ‘What happens eventually?’. Specifically, will the disease die out or will it become endemic? If the disease is the flu, this might be useful information to have. If the disease is zombies or Ebola, this might be really useful information to have.
In general, I can’t solve differential equations. But I can solve equations! So the first step is to look at equilibria. That is, values when both derivatives are zero:
\begin{align*}
0 &= \Lambda-\mu S-\beta SI \\
0 &=\beta SI - \mu I-\gamma I.
\end{align*}
These are two simultaneous equations with two unknowns. Easy peasy. I can solve this to find two equilibria:
\begin{align*}
(S,I\,)=\left({\Lambda\over \mu},0\right), \left({\mu+\gamma\over \beta}, {\Lambda\over \mu+\gamma}-{\mu \over\beta}\right).
\end{align*}
The first one, called the disease-free equilibrium, always exists; the second, known as the endemic equilibrium, only exists some of the time: the formula always hands you a point, but it only counts as an equilibrium of the biological model if $I$ comes out positive. Pipe down please, mathematicians: these are infected individuals we’re talking about – how can they be negative? Or, to put it another way, we’d like to avoid negative people.
Finding the endemic equilibrium can be tough as models get more complicated. But the disease-free equilibrium always exists (in any sensible model) and is easy to find because it comes with the additional constraint that $I=0$.
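If the algebra gets tedious in a bigger model, a computer algebra system will happily find the equilibria for you. Here’s a minimal sketch using SymPy (my own illustration, with symbols named to match the model above):

```python
import sympy as sp

S, I = sp.symbols('S I')
Lam, mu, beta, gamma = sp.symbols('Lambda mu beta gamma', positive=True)

eq1 = sp.Eq(Lam - mu * S - beta * S * I, 0)
eq2 = sp.Eq(beta * S * I - mu * I - gamma * I, 0)

# Returns both equilibria: the disease-free one (with I = 0)
# and the endemic one (with I != 0)
print(sp.solve([eq1, eq2], [S, I]))
```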
What I’d really like to know is the stability of the disease-free equilibrium. Why is this so crucial? If I start on the equilibrium, I stay there of course. But suppose I start close to my equilibrium (e.g. a few infected individuals). If the equilibrium is stable, the disease will die off. If the equilibrium is unstable, then the disease will persist.
Stability of equilibria in one dimension is equivalent to looking at the slope of the tangent line at the equilibrium. In higher dimensions, we can generalise: we linearise around our equilibrium and find the analogue of the ‘slope’. In one dimension, we simply use the derivative. But in higher dimensions, there’s no such thing as ‘the’ derivative; instead, we have many partial derivatives.
The way to make this work is to create a matrix of partial derivatives, called—you guessed it!—the Jacobian. Essentially, it’s the analogue of ‘the’ derivative in higher dimensions. You create it by differentiating every equation with respect to every variable. Thus
\begin{align*}
\boldsymbol{\mathsf{J}}&=\begin{bmatrix} {\partial F_1\over \partial x_1} & {\partial F_1\over \partial x_2} & \cdots & {\partial F_1\over \partial x_n} \\[2mm]
{\partial F_2\over \partial x_1} & {\partial F_2\over \partial x_2} & \cdots & {\partial F_2\over \partial x_n}\\[2mm]
\vdots & \vdots & \ddots & \vdots\\[2mm]
{\partial F_n\over \partial x_1} & {\partial F_n\over \partial x_2}& \cdots & {\partial F_n\over \partial x_n}\end{bmatrix}.
\end{align*}
It turns out that this is just about the most massively useful thing ever. Okay, my non-maths friends are arching their eyebrows at this point, because it sure doesn’t look like it from where they’re sitting. But let’s work through our example:
\begin{align*}
\boldsymbol{\mathsf{J}}&=\begin{bmatrix}-\mu-\beta I & -\beta S \\
\beta I & \beta S-\mu-\gamma
\end{bmatrix} & \begin{array}{l}\text{(the first column is the derivatives with }\\
\text{respect to $S$, the second with respect to $I$)}\end{array}\\
\boldsymbol{\mathsf{J}}\bigg|_{(S,I\,)=\left({\Lambda\over\mu},0\right)}&=\begin{bmatrix}\;\,\,-\mu \;\;& -\beta \Lambda/\mu \\
\;0 & \beta \Lambda/\mu-\mu-\gamma
\end{bmatrix} & \begin{array}{l}\text{(substituting the disease-free equilibrium). }\end{array}
\end{align*}
Happily, this matrix is upper triangular, so the eigenvalues lie on the diagonal (not true in general, of course). Our eigenvalues are thus $$\lambda=-\mu \qquad \text{and} \qquad {\beta\Lambda\over\mu}-\mu-\gamma.$$ Hooray!
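If you’d rather not do the differentiation and substitution by hand, the same computer algebra approach works here too. Here’s a minimal sketch using SymPy that builds the Jacobian and evaluates its eigenvalues at the disease-free equilibrium:

```python
import sympy as sp

S, I = sp.symbols('S I')
Lam, mu, beta, gamma = sp.symbols('Lambda mu beta gamma', positive=True)

# The right-hand sides of the two differential equations
F = sp.Matrix([Lam - mu * S - beta * S * I,
               beta * S * I - mu * I - gamma * I])

# Differentiate every equation with respect to every variable
J = F.jacobian([S, I])

# Substitute the disease-free equilibrium (S, I) = (Lambda/mu, 0)
J_dfe = J.subs({S: Lam / mu, I: 0})

# eigenvals() returns a dict of {eigenvalue: multiplicity}
print(J_dfe.eigenvals())
```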
For any two-dimensional linear system (which ours isn’t, of course) with two distinct eigenvalues, solutions look like
\begin{align}
\left(\begin{array}{c}S\\I\end{array}\right) &=c_1\mathrm{e}^{\lambda_1t}\mathbf v_1+c_2\mathrm{e}^{\lambda_2t}\mathbf v_2 \qquad\qquad (3)
\end{align}
where $c_1$ and $c_2$ are arbitrary constants and $\mathbf v_1$ and $\mathbf v_2$ are eigenvectors corresponding to the eigenvalues $\lambda_1$ and $\lambda_2$, respectively. That is, we take linear combinations of two fundamental solutions. The eigenvector part of these solutions gives us the dimensionality. But the eigenvalue part is exponential, which turns out to be hugely important.
How do we know that solutions are always in the form of equation (3)? Because we know everything there is to know about linear systems. For nonlinear systems, we only know some special cases, but linear systems have been completely solved.
Finding an eradication threshold
So let’s recap. We can fully solve any linear system and the solution depends on the eigenvalues. Actually, better than that, it only depends on the sign of the eigenvalues. Actually, even better: it only depends on the sign of the real part of the eigenvalues. Why? Because these are exponentials. So if $\lambda=a+\mathrm{i}b$, then solutions are of the order
\begin{align*}
\mathrm{e}^{\lambda t}=\mathrm{e}^{(a+\mathrm{i}b)t}&=\mathrm{e}^{at}(\cos bt + \mathrm{i} \sin bt)
\end{align*}
using Euler’s formula. If $a>0$, solutions increase without bound. If $a<0$, solutions decrease to zero.
Thus linear solutions either blow up or go to zero at a rate of $\mathrm{e}^{at}$. That is, if we have complex eigenvalues, we can ignore the imaginary part, because the imaginary part only contributes oscillations: i.e. it’s bounded, so plays no part in the stability. If there’s more than one eigenvalue, then we need all the eigenvalues to have negative real part for stability; if just one eigenvalue has positive real part, then solutions blow up. This makes sense, because while the others are happily going to zero (exponentially fast), if one is heading to infinity, then the solution overall is going to be unstable.
(If you’re paying attention, you’ll notice that I’ve left something out. What did I miss? Before you read on, try to think about the case I didn’t mention.)
But that’s for linear systems, whereas we’re usually dealing with nonlinear systems. So here’s the rub: if my initial conditions are sufficiently close to my equilibrium, then – in almost all cases – the behaviour in the nonlinear system can be approximated by the behaviour in the linear system. This is because a function looks a lot like its tangent, if you’re sufficiently close to the point in question. That’s not just true of one-dimensional functions, it’s true in higher dimensions too.
So the same result applies: we take our nonlinear system, find equilibria, calculate the Jacobian and find the eigenvalues. If the real part of any eigenvalue is positive, then solutions blow up. If the real part of all the eigenvalues is negative, then solutions converge (locally) to the equilibrium. The case I missed? If the real part is zero. In that case, we have to use more complicated methods. But those cases are a) few in number and b) irrelevant if we take a broader view.
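That recipe translates almost line for line into code. Here’s a minimal sketch in Python (the function name and the illustrative parameter values are mine) that classifies an equilibrium from a numerical Jacobian:

```python
import numpy as np

def classify_equilibrium(jacobian):
    """Classify an equilibrium from the Jacobian evaluated at it."""
    real_parts = np.linalg.eigvals(jacobian).real
    if np.all(real_parts < 0):
        return "stable: all eigenvalues have negative real part"
    if np.any(real_parts > 0):
        return "unstable: some eigenvalue has positive real part"
    return "borderline: a zero real part, so linearisation isn't enough"

# The disease-free Jacobian of the SI model, with illustrative parameters
Lambda, mu, beta, gamma = 10.0, 0.1, 0.01, 0.5
J = np.array([[-mu, -beta * Lambda / mu],
              [0.0, beta * Lambda / mu - mu - gamma]])
print(classify_equilibrium(J))
```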
Looking at our example, we see that our eigenvalues are $-\mu$ and ${\beta\Lambda\over\mu}-\mu-\gamma$. The first one is always negative, so it plays no part in the stability (it’s not enough to prove stability, but it doesn’t rule it out either). But the second one could be either positive or negative, depending. This gives us a threshold criterion. Define
\begin{align*}
R_0&={\beta \Lambda \over \mu(\mu+\gamma)}.
\end{align*}
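The link between that second eigenvalue and $R_0$ is direct: the eigenvalue is negative precisely when $R_0<1$, since
\begin{align*}
{\beta\Lambda\over\mu}-\mu-\gamma<0 \quad\iff\quad {\beta\Lambda\over\mu}<\mu+\gamma \quad\iff\quad {\beta \Lambda \over \mu(\mu+\gamma)}<1.
\end{align*}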
If $R_0<1$, the disease dies out (because all the eigenvalues are negative), whereas if $R_0>1$, the disease persists (because there’s a positive eigenvalue). This of course depends on the parameters of the disease, like the transmission rate $\beta$ and the death (or recovery) rate $\gamma$, as well as characteristics of the population like the birth rate $\Lambda$ and the background death rate $\mu$.
Why did I put that as a fraction instead of just talking about the eigenvalues? Two reasons: 1) because $R_0$ actually represents the average number of secondary infections caused by a single infected individual in an otherwise susceptible population; and 2) because my non-maths friends don’t know what an eigenvalue is, but they do know what $R_0$ is. It’s a well-established value in the biological sciences and is one of the things we have in common with them.
So there you go. I can now fully predict the progress of my disease. I simply calculate $R_0$ (which I can get by finding the eigenvalues, which I get from the Jacobian matrix) and then I know everything I need to know. Even better, if I can intervene in some way, like by lowering the transmission rate $\beta$ (such as through a vaccine or using condoms or changing patterns of behaviour), then I lower my $R_0$. If my intervention lowers $R_0$ sufficiently, then it will fall below 1 and then my disease will no longer persist and will instead be eradicated.
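To make that concrete, here’s one last minimal sketch in Python (again with illustrative numbers of my own choosing) that computes $R_0$ and the critical transmission rate below which the disease is eradicated:

```python
def R0(beta, Lambda, mu, gamma):
    # Basic reproduction number for the SI model above
    return beta * Lambda / (mu * (mu + gamma))

# Illustrative parameter values, as before
Lambda, mu, gamma = 10.0, 0.1, 0.5
beta = 0.01
print("Before intervention: R0 =", R0(beta, Lambda, mu, gamma))

# R0 = 1 exactly when beta = mu*(mu + gamma)/Lambda, so any intervention
# that pushes beta below this critical value eradicates the disease
beta_critical = mu * (mu + gamma) / Lambda
print("Critical transmission rate:", beta_critical)
print("After intervention: R0 =", R0(0.5 * beta_critical, Lambda, mu, gamma))
```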
What this means is that I have a very powerful method for telling the future. I can assess my intervention methods in advance and know whether they’ll be effective or not. This can save me a great deal of money and, potentially, very many lives as well.
And it all comes about because of a little thing called the Jacobian matrix. A matrix you build from a mathematical model using calculus, that you analyse using algebra and which, as we’ve now seen, turns out to be just about the most massively useful thing in the history of ever.
Actually, I’m not joking about that last point. It’s been estimated that malaria has killed half of all humans who ever lived. Half. One in two humans who ever lived died of malaria. And yet, today, most of us in the developed world don’t die of malaria. Why not? Because disease modellers used $R_0$ to show that spraying insecticide would switch the system from persistence to eradication. So they sprayed the world with DDT. Hence there’s a very good chance that most of us are alive today, thanks to this. And thanks to the Jacobian.
So don’t underestimate the power of mathematics. It might just save your life. And probably has.
(Title image from NIAID, licensed under Creative Commons CC BY 2.0)