I was watching the film Crazy Rich Asians the other day, as there’s not a lot to do at the moment besides watching Netflix and watching more Netflix. I thoroughly enjoyed the film and would highly recommend it. However, there was something that happened in the first few minutes which really got me thinking and inspired the subject of this Chalkdust article.
The main character in the film is an economics professor (the American kind where you can achieve the title of professor while still being in your 20s and without having to claw your way over a pile of your peers to the top of your research field). Within the opening scenes of the film we see her delivering a lecture, in which she is playing poker with a student, while also making remarks about how to win using ‘mathematical logic’. The bell rings seemingly halfway through the lecture the way it always does in American films and TV shows, and our professor calls out to her students “…and don’t forget your essays on conditional probability are due next week!” Now, I am not going to delve into the question of what type of economics course she is teaching that involves playing poker and mathematical logic, but it got me thinking—what exactly would an ‘essay on conditional probability’ entail?
What is conditional probability?
Conditional probability is defined as a measure of the probability of an event occurring, given that another event (by assumption, presumption, assertion, or evidence) has already occurred. For all intents and purposes here, for two events $A$ and $B$, we’ll write the conditional probability of $A$ given $B$ as $P(A \mid B)$, and define it as \[P(A \mid B) = \frac{P(A\;\text{and}\;B)}{P(B)}.\] The little bar $ \mid $ can just be thought of as meaning ‘given’.
Now that we’ve got some technicalities out of the way, let’s look at some examples of conditional probability. Imagine you are dealt exactly two playing cards from a well-shuffled standard 52-card deck. The standard deck contains exactly four kings. What is the probability that both of your cards are kings? We might, naively, say it must be simply $(4 / 52)^2 \approx 0.59\%$, but we would be gravely mistaken. There are four chances that the first card dealt to you (out of a deck of 52) is a king. Conditional on the first card being a king, there remain three chances (out of a deck of 51) that the second card is also a king. Conditional probability then dictates that: \begin{align*} P(\textrm{both are kings}) &= P(\textrm{second is a king} \mid \textrm{first is a king}) \times P(\textrm{first is a king}) \\ &= \frac{3}{51}\times \frac{4}{52}\approx 0.45\%. \end{align*} The events here are dependent upon each other, as opposed to independent. In the realm of probability, dependency of events is very important. For example, coin tosses are always independent events. When tossing a fair coin, the probability of it landing on heads, given that it previously landed on heads 10 times in a row, is still $1/2$. Even if it lands on heads 1000 times, the chance of it landing on heads on the 1001st toss is still 50%.
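If you don’t trust the algebra, the two-king calculation can be checked by brute force. The following is a minimal Python sketch rather than anything from the film or a textbook: it builds an illustrative 52-card deck, counts the two-card hands made up of two kings, and compares the result with the conditional-probability product. The names `deck`, `p_enumerated`, `p_conditional` and `p_naive` are made up for this example.

```python
from fractions import Fraction
from itertools import combinations

# Illustrative deck: 13 ranks in each of 4 suits (names are assumptions for this sketch).
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = [(rank, suit) for rank in ranks for suit in suits]

# Enumerate every unordered two-card hand and count those made of two kings.
hands = list(combinations(deck, 2))
king_hands = [hand for hand in hands if all(rank == "K" for rank, _ in hand)]
p_enumerated = Fraction(len(king_hands), len(hands))

# Conditional-probability route: P(second king | first king) * P(first king).
p_conditional = Fraction(3, 51) * Fraction(4, 52)

# Naive (incorrect) route that treats the two cards as independent draws.
p_naive = Fraction(4, 52) ** 2

print(p_enumerated, p_conditional, p_naive)
print(f"{float(p_conditional):.4%} versus naive {float(p_naive):.4%}")
```

Using `Fraction` keeps everything exact, so the enumeration and the conditional-probability product can be compared for equality rather than to within rounding error.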
Bayes’ theorem
Any essay on conditional probability would be simply incomplete without a mention of Bayes’ theorem. Bayes’ theorem describes the probability of an event, based on prior knowledge of conditions that might be related to the event. It is stated mathematically as:
Bayes’ theorem
$P(A\mid B) = \frac{P(B \mid A)P(A)}{P(B)}.$
We can derive Bayes’ theorem from the definition of conditional probability above by considering $P(A \mid B)$ and $P(B \mid A)$, and using that $P(A\;\text{and}\;B)$ equals $P(B\;\text{and}\;A)$.
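Spelled out, the derivation takes two lines. Assuming $P(A)$ and $P(B)$ are both non-zero (so that both conditional probabilities are defined), we rearrange the definition for each ordering of the events and then divide by $P(B)$: \begin{align*} P(A \mid B)\,P(B) &= P(A\;\text{and}\;B) = P(B\;\text{and}\;A) = P(B \mid A)\,P(A), \\ P(A \mid B) &= \frac{P(B \mid A)\,P(A)}{P(B)}. \end{align*}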
A fun (and topical!) example of Bayes’ theorem arises in a medical test/screening scenario. Suppose a test for whether or not someone has a particular infection (say scorpionitis) is 90% sensitive, or equivalently, the true positive rate is 90%. This means that the probability of the test being positive, given that someone has the infection is 0.9, or $P(\textrm{positive}\mid\textrm{infected}) = 0.9$. Now suppose that this is a somewhat prevalent infection, and 6% of the population at any given time are infected, ie $P(\textrm{infected}) = 0.06$. Finally, suppose that the test has a false positive rate of 5% (or equivalently, has 95% specificity), meaning that 5% of the time, if a person is not infected, the test will return a positive result, ie $P(\textrm{positive}\mid\textrm{not infected}) = 0.05$.
Now imagine you take this test and it comes up positive. We can ask, what is the probability that you actually have this infection, given that your test result was positive? Well, \[P(\textrm{infected} \mid \textrm{positive}) = \frac{P(\textrm{positive} \mid \textrm{infected}) P(\textrm{infected})}{P(\textrm{positive})}.\] We can directly input the probabilities in the numerator based on the information provided in the previous paragraph. For the $P(\textrm{positive})$ term in the denominator, this probability has two parts to it: the probability that the test is positive and you are infected (true positive), and the probability that the test is positive and you are not infected (false positive). We need to scale these two parts according to the group of people that they apply to—either the proportion of the population that are infected, or the proportion that are not infected. Another way of thinking about this is considering the fact that \[P(\textrm{positive}) = P(\textrm{positive and infected}) + P(\textrm{positive and not infected}).\] Thus, we have \[P(\textrm{positive}) = P(\textrm{positive} \mid \textrm{infected})P(\textrm{infected}) + P(\textrm{positive} \mid \textrm{not infected})P(\textrm{not infected}).\]
And we can infer all the probabilities in this expression from the information that’s been given. Thus, we can work out that \[P(\textrm{infected} \mid \textrm{positive}) = \frac{0.9\times 0.06}{0.9\times 0.06 + 0.05\times 0.94}\approx 0.5347.\]
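The same calculation can be wrapped up in a few lines of Python. This is a minimal sketch, assuming the three inputs are given as probabilities between 0 and 1; the function name `posterior_given_positive` and its parameter names are invented for this example.

```python
def posterior_given_positive(sensitivity, false_positive_rate, prevalence):
    """Return P(infected | positive) using Bayes' theorem.

    sensitivity         = P(positive | infected)
    false_positive_rate = P(positive | not infected)
    prevalence          = P(infected)
    """
    # Numerator: positive AND infected.
    true_positive = sensitivity * prevalence
    # Other half of the denominator: positive AND not infected.
    false_positive = false_positive_rate * (1 - prevalence)
    return true_positive / (true_positive + false_positive)


# Scorpionitis example: 90% sensitivity, 5% false positive rate, 6% prevalence.
print(f"{posterior_given_positive(0.9, 0.05, 0.06):.4f}")  # prints 0.5347
```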
Unpacking this result, this means that if you test positive for an infection, and if 1 in 17 people in the population (approximately 6%) are infected at any given time, there is roughly a 47% chance that you are not actually infected, despite the test having a true positive rate of 90%, and a false positive rate of 5% (compare to the proportion of the shaded area in the diagram filled by infected people). That seems pretty high. The takeaway from this example: the probability that you have an infection, given that you test positive for said infection, depends not only on the accuracy of the test, but also on the prevalence of the infection within the population.
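One way to see where this number comes from is to count people rather than multiply probabilities. The sketch below is an illustration only: it assumes a hypothetical population of 10,000 people (chosen so that every count is a whole number) and applies the scorpionitis rates from the example. The variable names are made up.

```python
# Hypothetical population size, chosen so that every count is a whole number.
population = 10_000

infected = population * 6 // 100           # 6% prevalence
not_infected = population - infected

true_positives = infected * 90 // 100      # 90% sensitivity
false_positives = not_infected * 5 // 100  # 5% false positive rate

# Everyone who tests positive, whether infected or not.
all_positives = true_positives + false_positives
share_infected = true_positives / all_positives

print(true_positives, false_positives, all_positives)
print(f"{share_infected:.4f}")
```

Of the 1,010 people who test positive, 540 are infected and 470 are not. The false positives are drawn from a much larger group (9,400 uninfected people against 600 infected), which is why a 5% false positive rate produces nearly as many positives as a 90% true positive rate.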
Unprecedented applicability
Of course, in a real-world scenario, it’s a lot more complicated than this. For something like (and, apologies in advance for bringing it up) Covid-19, the prevalence of infection (our $P(\textrm{infected})$ value) changes with time. For example, according to government statistics, the average number of daily new cases in July 2020 was approximately 667, whereas in January 2021 it was 38,600. Furthermore, $P(\textrm{infected})$ depends on a vast number of factors including age, geographical location, and physical symptoms to name only a few. Still, it would be nice to get a sense of how Bayes’ theorem can be applied to these uNpReCeDeNtEd times.
An article from the UK Covid-19 lateral flow oversight team (catchy name, I know) released on 26 January 2021 reported that lateral flow tests (which provide results in a very short amount of time but are less accurate than the ‘gold standard’ PCR tests) achieved 78.8% sensitivity and 99.68% specificity in detecting Covid-19 infections. In the context of probabilities, this means that \begin{align*} &P(\textrm{positive} \mid \textrm{infected}) = 0.788 \textrm{ and}\\ &P(\textrm{positive} \mid \textrm{not infected}) = 0.0032. \end{align*} On 26 January 2021, there were 1,927,100 active cases of Covid-19 in the UK. Out of a population of 66 million, this gives us a prevalence of approximately 3%, or $P(\textrm{infected}) = 0.03$.
Taking all these probabilities into account, we have \[P(\textrm{infected} \mid \textrm{positive}) = \frac{0.788 \times 0.03}{0.788 \times 0.03 + 0.0032 \times 0.97} \approx 0.8839,\] which means that the chance of you actually having Covid-19, given that you get a positive result from a lateral flow test, is about 88%. This seems pretty good, but can we make this any better?
Instead of just taking the number of active cases as a percentage of the total population of the UK to give us our prevalence, we can alternatively consider $P(\textrm{infected})$ for a particular individual. For someone who has a cough, a fever, or who recently interacted with someone who was then diagnosed with Covid-19, we could say that their $P(\textrm{infected})$ is substantially higher than the overall prevalence in the country. The article ‘Interpreting a Covid-19 test result’ in the BMJ suggests a reasonable value for such an individual would be $P(\textrm{infected}) = 0.8$. It’s worth mentioning that this article has a fun interactive tool where you can play around with sensitivity and specificity values to see how this affects true and false positivity and negativity rates. Taking this new value of prevalence, $P(\textrm{infected})$, into account, then \[P(\textrm{infected} \mid \textrm{positive}) = \frac{0.788 \times 0.8}{0.788 \times 0.8 + 0.0032 \times 0.2} \approx 0.9990,\] giving us a 99.9% chance of infection given a positive test result, which is way closer to certainty than the previous value of 88%.
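The two lateral flow calculations differ only in the value used for $P(\textrm{infected})$, so they are easy to compare side by side. The following is a rough Python sketch using the sensitivity and false positive rate quoted above; the function name `posterior_given_positive` and the dictionary labels are hypothetical, chosen just for this illustration.

```python
def posterior_given_positive(sensitivity, false_positive_rate, prevalence):
    """Return P(infected | positive) using Bayes' theorem."""
    true_positive = sensitivity * prevalence
    false_positive = false_positive_rate * (1 - prevalence)
    return true_positive / (true_positive + false_positive)


# Lateral flow figures quoted in the text.
SENSITIVITY = 0.788           # P(positive | infected)
FALSE_POSITIVE_RATE = 0.0032  # P(positive | not infected), i.e. 1 - 0.9968

# The two values of P(infected) considered in the text (labels are illustrative).
priors = {
    "UK-wide prevalence": 0.03,
    "symptomatic or exposed individual": 0.8,
}

results = {
    label: posterior_given_positive(SENSITIVITY, FALSE_POSITIVE_RATE, prior)
    for label, prior in priors.items()
}

for label, value in results.items():
    print(f"{label}: {value:.4f}")
```

The test itself is identical in both rows; only the prior probability of infection changes, and that alone moves the answer from about 88% to about 99.9%.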
Can we do any better than this? Well, compared with the lateral flow Covid-19 tests, it has been found that PCR tests (which use a different kind of technology to detect infection) have substantially higher sensitivity and specificity. Another recent article in the BMJ published in September 2020 reported that the PCR Covid-19 test has 94% sensitivity and very close to 100% specificity. In a survey conducted by the Office for National Statistics in the same month, they measured how many people across England and Wales tested positive for Covid-19 infection at a given point in time, regardless of whether they reported experiencing symptoms. In the survey, even if all positive results were false, specificity would still be 99.92%. For the sensitivity and specificity reported in the BMJ article, this is equivalent to having a false negative rate of 6% and a false positive rate of 0%. If we plug these numbers in, regardless of what the prevalence is taken to be, we have: \[P(\textrm{infected} \mid \textrm{positive}) = \frac{0.94 \times P(\textrm{infected})}{0.94 \times P(\textrm{infected}) + 0 \times P(\textrm{not infected})} = 1.\] So when a test has a false positive rate of almost 0%, if you achieve a positive test result, there is essentially a 100% chance that you do in fact have Covid-19.
So what can we take away from this? Well, we have seen that if a test has higher rates of sensitivity and specificity, then the probability of a positive result being a true positive is also higher. However, prevalence and the value of the probability of infection also play a big role in this scenario. This could be used as an argument for targeted testing only: for example, if only people with symptoms were tested, this would increase the probability of a positive result being a true positive. Unfortunately, it is the case that a large number of Covid-19 infections are actually asymptomatic; in one study it was found that over 40% of cases in one community in Italy had no symptoms. So, if only people with symptoms were tested, a lot of infections would be missed.
Bae’s theorem
$P(\text{Netflix} \mid \text{chill}) =$
$\frac{P(\text{chill} \mid \text{Netflix})P(\text{Netflix})}{P(\text{chill})}$
In conclusion, I’m no epidemiologist, just your average mathematician, and I don’t really have any answers. Only that conditional probability is actually pretty interesting, and it turns out you can write a whole essay on it. The ending of Crazy Rich Asians was much better than the ending to this article. Go watch it, if you haven’t already.