What a positive test result actually means
Your test is 90% accurate and it came back positive. Why the honest answer is ‘about 9%’ — counted out in pictures.
You take a test for a condition you have no symptoms of — a routine screen, a workplace check, an at-home kit. The leaflet says the test is “90% accurate.” It comes back positive. How worried should you be?
Most people, when asked some version of this, answer “about 90% worried.” So, in repeated studies stretching back four decades, do most physicians. For a genuinely rare condition the right answer is usually closer to “mildly concerned, and ready for the follow-up” — often a single-digit percentage. The gap between those two answers is the base-rate fallacy, and closing it is one of the highest-value pieces of statistical literacy a person can own. This post closes it with pictures and a little counting.
Two questions wearing the same words
“Accuracy” quietly bundles together two different promises, and the leaflet only ever makes the first one. Sensitivity says: if you are sick, the test will probably say so. Specificity says: if you are healthy, the test will probably clear you. Both describe the test's behaviour when the truth is already known — the arrow runs from your condition to the result.
But a person holding a positive result needs the arrow to run the other way: given that the test said positive, how likely am I to be sick? That quantity has its own name — the positive predictive value — and it is not printed on the box, because it isn't a property of the box. It depends on who is being tested. When the condition is rare, the two arrows are not even close.
Count a thousand people, not percentages
Why is the second arrow so weak? Because the healthy massively outnumber the sick, and even a small error rate applied to an enormous group produces a crowd. The psychologist Gerd Gigerenzer showed that this becomes nearly self-evident the moment you stop multiplying percentages and start counting actual people — a format he calls natural frequencies. So let's count.
Take 1,000 people screened for a condition with a 1% base rate, with a test that catches 90% of true cases and falsely flags 9% of healthy people (numbers in the ballpark of real mammography screening, which is where this example was first studied). Ten of the thousand are truly sick, and the test catches nine of them. But the other 990 are healthy, and 9% of them, about 89 people, get flagged anyway. Follow the branches:
1,000 people screened
  10 sick: 9 test positive, 1 tests negative
  990 healthy: about 89 test positive, about 901 test negative
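The count above can be reproduced in a few lines of Python. This is a minimal sketch, not part of the original example: the function name `natural_frequencies` and its parameter names are invented for illustration, and it rounds to whole people because that is the point of the natural-frequency format.

```python
def natural_frequencies(population, base_rate, sensitivity, false_positive_rate):
    """Sort a screened population into four piles of whole people."""
    sick = round(population * base_rate)
    healthy = population - sick
    true_positives = round(sick * sensitivity)               # sick and flagged
    false_negatives = sick - true_positives                  # sick but missed
    false_positives = round(healthy * false_positive_rate)   # healthy but flagged
    true_negatives = healthy - false_positives               # healthy and cleared
    return {
        "true_positives": true_positives,
        "false_negatives": false_negatives,
        "false_positives": false_positives,
        "true_negatives": true_negatives,
    }


# The numbers from the text: 1,000 people, 1% base rate,
# 90% of true cases caught, 9% of healthy people falsely flagged.
counts = natural_frequencies(1000, 0.01, 0.90, 0.09)
for pile, people in counts.items():
    print(f"{pile}: {people}")
```

Changing the first argument to 10,000 or 100,000 scales every pile but leaves the proportions between them untouched, which is why counting a round thousand is enough.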
It is worth dwelling on what just happened, because nothing about the test was bad. It caught nine of the ten sick people, exactly the 90% promised. The problem is purely that it was hunting something rare: the nine true positives are simply outnumbered by the residue of error from the 990 healthy people. Gather everyone holding a positive result into one room and it holds about 98 people: 9 who are sick and 89 who are not. The chance that any one of them is sick is 9 in 98, or about 9%.
The same test, everywhere on one curve
Here is the deeper point the counting hides: the worth of a positive is not one number. Hold the test fixed and let only the base rate vary, and the positive predictive value sweeps along a curve: nearly worthless when screening the symptomless general population, a coin flip once roughly one in eleven of the people tested has the condition, and genuinely decisive among high-risk patients in a specialist clinic.
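To trace that curve numerically, the sketch below holds the test fixed at the figures used earlier (90% of true cases caught, 9% of healthy people falsely flagged) and varies only the base rate. The function name `ppv` and the particular base rates sampled are choices made here for illustration.

```python
def ppv(base_rate, sensitivity=0.90, false_positive_rate=0.09):
    """Positive predictive value: the share of positives who are truly sick."""
    true_pos = sensitivity * base_rate                   # sick and flagged
    false_pos = false_positive_rate * (1 - base_rate)    # healthy but flagged
    return true_pos / (true_pos + false_pos)


# Same test, different populations: only the base rate changes.
for base_rate in (0.001, 0.01, 0.05, 1 / 11, 0.25, 0.50):
    print(f"base rate {base_rate:6.1%} -> PPV {ppv(base_rate):5.1%}")
```

With these test characteristics the value crosses 50% at a base rate of one in eleven, because that is where the sick who are caught exactly balance the healthy who are falsely flagged.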
This curve explains a lot of otherwise confusing medical practice. It is why doctors ask about symptoms, family history and risk factors before testing — each “yes” slides you rightward along the curve, raising the base rate and therefore the meaning of any positive. It is why a positive result in a 25-year-old with no risk factors is treated differently from the same result in a symptomatic 60-year-old. And it is why mass screening programmes agonise over who to invite: screen too broadly and the programme manufactures false alarms by the thousand, each carrying real anxiety, follow-up procedures and cost.
So what do you actually do with a positive?
Not nothing — and not panic. The practical reading is that a screening positive is the start of a diagnostic process whose entire design anticipates this arithmetic. The screen's job is to concentrate the search: it takes a population where the condition is 1-in-100 and produces a much smaller group where it is 1-in-11. Inside that group, a second, more specific test now operates at a far friendlier base rate — the curve above, entered further to the right — which is why confirmatory testing works so much better than the first screen did.
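The two-stage logic can be sketched by feeding the output of the first test in as the starting point for the second. The screening numbers come from the example above; the confirmatory test's figures (95% of true cases caught, 1% of healthy people falsely flagged) are hypothetical, chosen only to illustrate a more specific follow-up, and the sketch assumes the two tests err independently of each other, which real tests may not.

```python
def update(prior, sensitivity, false_positive_rate):
    """Probability of being sick after one positive result (Bayes' rule)."""
    true_pos = sensitivity * prior
    false_pos = false_positive_rate * (1 - prior)
    return true_pos / (true_pos + false_pos)


# Stage 1: the screen from the text, applied to the general population.
after_screen = update(0.01, sensitivity=0.90, false_positive_rate=0.09)

# Stage 2: a HYPOTHETICAL confirmatory test (illustrative numbers only),
# applied to people who already screened positive.
after_confirmation = update(after_screen, sensitivity=0.95, false_positive_rate=0.01)

print(f"after a positive screen:       {after_screen:.1%}")
print(f"after a positive confirmation: {after_confirmation:.1%}")
```

Under these assumptions a second positive lifts the probability from about 9% to around 90%. The second test is better, but most of its advantage comes from starting at a base rate of roughly 1 in 11 rather than 1 in 100.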
The same logic, run in reverse, also tells you when to be sceptical of reassurance: a negative result on a rare condition was overwhelmingly likely anyway, so it carries less information than it feels like it does — though happily, for rare conditions negatives are also almost always right.
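The same arithmetic for a negative result, using the example's numbers, shows how little ground there was left to gain. The variable names are chosen here for illustration.

```python
base_rate = 0.01
sensitivity = 0.90
false_positive_rate = 0.09

# Chance of being healthy before any test is run.
healthy_before = 1 - base_rate

# Chance of being healthy given a negative result.
true_neg = (1 - false_positive_rate) * (1 - base_rate)   # healthy and cleared
false_neg = (1 - sensitivity) * base_rate                # sick but missed
healthy_after = true_neg / (true_neg + false_neg)

print(f"healthy before testing:  {healthy_before:.2%}")
print(f"healthy after a negative: {healthy_after:.2%}")
```

In the thousand-person count this is 901 healthy people among the 902 who test negative: the negative is almost always right, but you were 99% likely to be healthy before the test was run.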
The courtroom version
Before closing, one sobering note: this exact confusion of conditionals has a name in law, the prosecutor's fallacy, and a body count of miscarried justice. “The chance of this evidence arising by innocent coincidence is one in a million” is the leaflet's arrow; “the chance the defendant is innocent is one in a million” is the reversed one, and equating them ignores the base rate exactly as our test-taker did. In the British case of Sally Clark, an expert's “one in 73 million” figure for two natural infant deaths helped convict a grieving mother; the Royal Statistical Society publicly protested the statistic, and the verdict was eventually quashed. DNA database trawls raise the same spectre: search several million profiles and a one-in-a-million match probability makes coincidental hits on innocent people all but certain. The full entry treats both cases properly.
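The database-trawl arithmetic is short enough to check directly. The sketch below assumes a database of 5 million profiles (a hypothetical stand-in for "several million"), treats every profile as belonging to someone unconnected to the crime, and assumes profiles match independently; none of these figures comes from a real case.

```python
database_size = 5_000_000     # HYPOTHETICAL: stands in for "several million"
match_probability = 1e-6      # the one-in-a-million figure from the text

# Average number of purely coincidental matches in one full trawl.
expected_coincidental_matches = database_size * match_probability

# Chance that the trawl turns up at least one coincidental match.
p_at_least_one = 1 - (1 - match_probability) ** database_size

print(f"expected coincidental matches: {expected_coincidental_matches:.1f}")
print(f"chance of at least one:        {p_at_least_one:.1%}")
```

Under these assumptions a trawl is expected to flag about five unrelated people, so a match on its own says far less than the one-in-a-million figure suggests.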
The one-sentence habit
Before you read anything into a positive result, ask how rare the thing being tested for is, then count it out in people. That habit, base rate first and people not percentages, defuses the fallacy in medicine, security, hiring filters and spam folders alike. If you want it in your hands rather than your head, the base-rate explorer runs the whole calculation live: set the rarity and the accuracy, and watch a thousand people sort themselves into exactly the piles drawn above.
This is an explainer about statistical reasoning, not medical advice. A real result always deserves a conversation with a clinician who knows your history, your risk factors, and which test you actually took — all the things that move you along the curve.