unspurious.

The inference illusions · An interactive tool

The base-rate explorer.

Pick how rare a condition is and how accurate the test is. Then watch a thousand people sort themselves — and find out what a positive result is really worth.

1,000 people, screened

Each square is a person. Move the dials and they re-sort in real time.

Positive & truly sick (true positive) · Positive but healthy (false alarm) · Negative but sick (missed) · Negative & healthy (true negative)
The confusion matrix
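|        | Actually sick | Actually healthy | Total               |
|--------|---------------|------------------|---------------------|
| Test + | 9 (caught)    | 89 (false alarm) | 98 (all positives)  |
| Test − | 1 (missed)    | 901 (cleared)    | 902 (all negatives) |
| Total  | 10 (sick)     | 990 (healthy)    | 1,000 (people)      |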
A negative result is correct 99.9% of the time (rare disease — easy to rule out).
What a positive is worth
9% of positives are truly sick
The point. "Sensitivity" and "specificity" describe the test's skill at sorting the already-known sick and healthy. But a person holding a positive result needs the reverse: among everyone who tested positive, what fraction are truly sick? That number — the positive predictive value — depends just as much on how rare the condition was to begin with. Make the disease rare enough and even a superb test spends most of its positives on false alarms.
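
For readers who want the relationship written down, the positive predictive value can be sketched with Bayes' rule. The symbols are assumptions introduced here, not notation from the tool: p is the prevalence, Se the sensitivity, Sp the specificity. The second line plugs in the explorer's default settings (1% prevalence, 90% sensitivity, 91% specificity).

```latex
\mathrm{PPV}
  = \frac{\text{true positives}}{\text{all positives}}
  = \frac{\mathrm{Se}\cdot p}{\mathrm{Se}\cdot p + (1-\mathrm{Sp})(1-p)}

\mathrm{PPV}
  = \frac{0.90 \times 0.01}{0.90 \times 0.01 + 0.09 \times 0.99}
  = \frac{0.0090}{0.0981}
  \approx 0.092
```

The matrix reaches the same 9% by counting whole people (9 out of 98); the small difference from 0.092 comes only from rounding 89.1 false alarms to 89.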

01 · The dial that does the damage

Drag prevalence down and watch the colour drain out

Leave the test untouched — fix it at a flattering 90% sensitivity and 91% specificity — and move only the prevalence dial. At 30% prevalence a positive is trustworthy; the grid is mostly azure. Slide prevalence toward 1% and the azure shrinks to a few squares while the ochre false alarms flood in, even though the test never changed. The instrument is the same; the meaning of its verdict is not.

This is the whole base-rate fallacy in one motion. The test's accuracy is a property of the test. The worth of a positive is a property of the test and the population it's pointed at — and at low prevalence the population wins.
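
The same motion can be reproduced without the grid. The following is a minimal sketch, not the tool's own code: the function name `ppv` and its parameter names are hypothetical, and the test is held at the 90% sensitivity and 91% specificity used above while only the prevalence changes.

```python
# Hypothetical helper, not taken from the explorer's source code.
def ppv(prevalence, sensitivity=0.90, specificity=0.91):
    """Fraction of positive results that are truly sick."""
    # Share of the whole population that is sick AND tests positive.
    true_pos = sensitivity * prevalence
    # Share of the whole population that is healthy AND tests positive.
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Same test every time; only the population it is pointed at changes.
for prevalence in (0.30, 0.10, 0.05, 0.01):
    print(f"prevalence {prevalence:.0%}: a positive is right {ppv(prevalence):.0%} of the time")
```

At 30% prevalence roughly four positives in five are real; at 1% fewer than one in ten are, although the two test parameters never moved.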

Switch to "just the positives" and the screen shows only the pool a doctor actually faces: a mix of azure and ochre whose ratio is the positive predictive value. That ratio, not the accuracy on the box, is what your result is worth.

02 · Why counting beats percentages

The numbers are easier as people

Ask someone the probability directly — "90% sensitive, 91% specific, 1% prevalence, what's a positive worth?" — and most people, including clinicians, guess far too high. Lay the same facts out as a count of people, the way the matrix does, and the answer becomes almost obvious: of 1,000 people, 10 are sick and 9 of them test positive; of the 990 healthy, about 89 test positive too; so 9 out of roughly 98 positives are real. That's the explorer's headline number, and it's the format the psychologist Gerd Gigerenzer showed can rescue people from the fallacy.
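
The counting approach translates directly into a few lines of code. This is a sketch under stated assumptions: `count_people` and its dictionary keys are hypothetical names, and rounding to whole people is a choice made here to match the matrix, not something confirmed from the tool's implementation.

```python
# Hypothetical helper: sorts a concrete population into the four matrix cells.
def count_people(population=1000, prevalence=0.01, sensitivity=0.90, specificity=0.91):
    # Split the population into sick and healthy, in whole people.
    sick = round(population * prevalence)
    healthy = population - sick

    # Among the sick: some are caught, the rest are missed.
    caught = round(sick * sensitivity)
    missed = sick - caught

    # Among the healthy: some are cleared, the rest are false alarms.
    cleared = round(healthy * specificity)
    false_alarms = healthy - cleared

    positives = caught + false_alarms
    negatives = missed + cleared
    return {
        "caught": caught,
        "missed": missed,
        "false_alarms": false_alarms,
        "cleared": cleared,
        "positives": positives,
        "negatives": negatives,
        "ppv": caught / positives,   # what a positive is worth
        "npv": cleared / negatives,  # what a negative is worth
    }


c = count_people()
print(f"{c['caught']} of {c['positives']} positives are truly sick ({c['ppv']:.0%})")
# prints "9 of 98 positives are truly sick (9%)"
```

With the default arguments the four cells come out as 9, 1, 89 and 901, the same figures as the confusion matrix, and the negative result is right about 99.9% of the time.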

The lesson generalises past medicine. Any time a rare thing is hunted with an imperfect filter — fraud flags, spam catchers, security alerts, DNA database matches — the same arithmetic applies, and the same trick defuses it. Stop multiplying percentages in your head and count a concrete thousand.

03 · Read more

The full story behind the dials

This sandbox is the playable companion to the compendium's entry on the base-rate fallacy, which walks through David Eddy's mammography problem, the natural-frequency cure in detail, and the courtroom version — the prosecutor's fallacy that helped convict Sally Clark. If the grid above has made the mechanism click, the entry is where it gets its history and its sharper edges.
