It’s Thanksgiving! Happy Thanksgiving if you are reading this on the day. Cadence is closed, of course. But here is a post anyway. This has nothing to do with EDA or the semiconductor ecosystem or even Cadence. It is also a puzzle and an important lesson.
There is a test for toenailitis that has an accuracy of 99%. That is, if you have toenailitis, there is a 99% chance that the test will correctly come back positive. If you don't have toenailitis, there is a 99% chance that the test will correctly come back negative. There is actually no reason these two numbers would be the same, and typically they are not, but let's keep things simple. Luckily, toenailitis is rare: only one in 10,000 people has it.
So here's the puzzle. You go to the doctor. She runs the test. Unfortunately it is positive. What is the chance that you have toenailitis?
Let's look at 1 million people and test everyone, and get to the probability that way. Since 1 in 10,000 people has toenailitis, on average 100 of this million will have the disease. Of those 100 people, on average, 99 will test positive correctly, and 1 will test negative incorrectly.
Of the remaining 999,900 people, 99% will correctly test negative, which comes to 989,901. But also 1% will incorrectly test positive, which is 9,999.
So out of our 1 million people, on average 9,999 + 99 = 10,098 will test positive. But only 99 of them have the disease. So if you test positive, the chance you have the disease is 99 / 10,098 = 0.009804, just under 1%. That is surprisingly low. You test positive with a test that is 99% accurate, and it is still 99% likely that you are free of the disease.
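The counting argument above is short enough to sketch in a few lines of Python (the numbers are the ones from the post: 99% accuracy in both directions, prevalence of 1 in 10,000):

```python
# Counting argument: test 1 million people and see who tests positive.
population = 1_000_000
prevalence = 1 / 10_000  # 1 in 10,000 has toenailitis
accuracy = 0.99          # same accuracy for positives and negatives

sick = population * prevalence               # 100 people have the disease
healthy = population - sick                  # 999,900 do not

true_positives = sick * accuracy             # 99 correct positives
false_positives = healthy * (1 - accuracy)   # 9,999 incorrect positives

# Of everyone who tests positive, what fraction is actually sick?
p_sick_given_positive = true_positives / (true_positives + false_positives)
print(p_sick_given_positive)  # about 0.0098, just under 1%
```

The false positives from the huge healthy group (9,999) dwarf the true positives from the tiny sick group (99), which is the whole story in one division.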
It turns out that most doctors get this wildly wrong, with only around 15% getting it right. That 15% is not a number I invented. The experiment is easy to run, and the result so surprising, that it has been replicated many times over the last 40 years.
Here is one place to start: Statistical Literacy Among Doctors Now Lower Than Chance.
This is actually an example of Bayes' Theorem (named after the Reverend Thomas Bayes). For a good introduction to this if you are not familiar with it, look at Eliezer Yudkowsky's An Intuitive Explanation of Bayes' Theorem.
Bayes' Theorem is a way of combining new information with what you already know (or guess), your prior. One lesson is that the prior is much more important than you think. In the example I gave, the chance of having toenailitis before the test was 1 in 10,000, the incidence in the general population. After a positive test, that probability increased, but only to about 1 in 100. A 99%-accurate test makes a positive result seem almost conclusive, but the disease is so rare that the false positives among the overwhelming majority who don't have it swamp the small number of people who do.
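To see how much the prior matters, here is a small sketch applying Bayes' Theorem directly (the helper function and its parameter names are my own; the 99% sensitivity and specificity are from the post):

```python
# P(disease | positive test) via Bayes' Theorem:
#   posterior = P(+|D) * P(D) / P(+)
# where P(+) = P(+|D) * P(D) + P(+|not D) * P(not D).
def posterior(prior, sensitivity=0.99, specificity=0.99):
    p_positive = sensitivity * prior + (1 - specificity) * (1 - prior)
    return sensitivity * prior / p_positive

# Same 99%-accurate test, wildly different answers depending on the prior:
print(posterior(1 / 10_000))  # rare disease: about 0.0098
print(posterior(1 / 2))       # 50/50 prior: 0.99
```

With a 1-in-10,000 prior the positive test only gets you to about 1%, but if the disease were a coin flip to begin with, the same test would get you to 99%.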
Before I let you go, here's another little puzzle about probabilities that people often get wrong.
Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations. Which is more probable?

1. Linda is a bank teller.
2. Linda is a bank teller and is active in the feminist movement.
The answer is 1, although many people think 2. This is known as the "conjunction fallacy": two things together cannot be more probable than one of them on its own.
Even if it is highly unlikely that Linda is a bank teller and almost certain that she is a feminist, it is still more probable that she is a bank teller than that she is both.
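The arithmetic behind the fallacy fits in a couple of lines (the probabilities here are made up purely for illustration):

```python
# Conjunction fallacy: P(A and B) can never exceed P(A), because the
# conjunction is computed by multiplying P(A) by a factor <= 1.
p_teller = 0.05                  # unlikely that Linda is a bank teller
p_feminist_given_teller = 0.95   # almost certain she is a feminist

p_teller_and_feminist = p_teller * p_feminist_given_teller
print(p_teller_and_feminist)     # 0.0475, still less than 0.05
assert p_teller_and_feminist <= p_teller
```

However close the second factor gets to 1, multiplying by it can only shrink the probability, never grow it.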
On which note, I'll leave you to your Thanksgiving turkey, safe in the knowledge that you don't have toenailitis.