‹ The blog12 June 20269 min read

Five statistical paradoxes that change how you read the news

Honest numbers, dishonest conclusions — and the five questions that catch them before you do.

Open any news site and you are reading the output of a thousand quiet statistical decisions: which average to print, which groups to compare, which survivors to interview, which of many analyses to headline, and where to start the axis on the chart. Most of those decisions are made honestly, by people in a hurry. That is precisely what makes the resulting errors so durable — there is no villain to catch, no doctored figure to expose. The numbers are true. It's the step from number to conclusion that quietly breaks.

This post walks through five of the most common breakages, each with a picture of the mechanism and a real case where it bit. None of them requires any mathematics beyond arithmetic, and every one of them is somewhere in this week's news. They are also the front doors to this site's compendium, where each gets a full interactive treatment.

1. The reversal hiding inside every average

Here is a true pair of facts about the United States between 2000 and 2013. Adjusted for inflation, the median wage rose slightly — about one per cent. Over the same period, the median wage of workers without a high-school diploma fell. So did the median wage of high-school graduates. And of those with some college. And of degree-holders. Every single education group went backwards while the overall number went forwards.

Nothing in that paragraph is a misprint. It is Simpson's paradox: a pooled figure is a weighted blend of its groups, and if the mix shifts — here, the workforce became markedly more educated, moving people from lower-paid groups into higher-paid ones — the blend can move against the grain of every component. The economy didn't raise anyone's wages; it reshuffled who was being averaged.

The wage that rose while falling, 2000–2013US median weekly earnings, inflation-adjusted, indexed to 2000 · approximate, after F. Norris (2013)

All workers (pooled) By education group

Every group fell. The total rose. The claret line is the headline; the four group lines are the reality for every kind of worker. The pooled median climbed only because the workforce's composition shifted toward higher-earning education groups — a reshuffle, not a raise.

The defensive habit: whenever a headline reports one overall rate — a national average, a combined success rate, a whole-company figure — ask what is it averaging over, and has the mix changed? The full entry has the Berkeley admissions case, the famous batting averages, and an interactive figure where you can watch the reversal happen.

2. The success stories you never hear

Business journalism runs on a renewable resource: interviews with winners. The founder who dropped out and bet everything; the fund that beat the market a decade running; the octogenarian who credits a daily glass of wine. Each story is true. The problem is the people who are structurally unavailable for comment — the founders who dropped out, bet everything and vanished; the funds that closed; the wine-drinkers who didn't make it to the interview.

This is survivorship bias, and its geometry is brutally simple: a filter sits between the world and the dataset, and the filter's discards are silent. A trait shared by the winners tells you nothing until you know how common it was among the losers — and if the losers can't be observed at all, the lesson is folklore, not evidence. The canonical case is Abraham Wald's wartime insight that returning bombers' bullet holes mapped the survivable wounds, so the armour belonged exactly where the holes weren't.

The interview poolOne startup cohort, drawn to scale

The sliver writes the books. Of a thousand startups in a cohort, a handful exit big — and the keynotes, podcasts and bestsellers are built almost entirely from that sliver. Whatever habits the eight share, most of the other 992 probably shared them too.

The defensive habit: ask who left the sample, and why? If exiting the dataset is correlated with the very outcome being celebrated — failure, closure, death — the remaining sample is biased by construction, no matter how large or how inspiring it is.

3. The test that's 99% accurate and usually wrong

Health stories love a test's accuracy figure, and accuracy figures are genuinely impressive — until the thing being tested for is rare. Suppose a condition affects 1% of people, and a screening test catches 99% of true cases while falsely flagging 5% of the healthy. Screen a thousand people and the arithmetic is unforgiving: about 10 of them are sick, and the test duly catches them. But 5% of the other 990 — roughly 50 healthy people — get flagged too. Sixty positives, ten of them real.

What “99% accurate” delivers at a 1% base rate1,000 people screened · sensitivity 99% · specificity 95%

Truly sick False alarm

The positive pile, gathered. The test performed exactly as advertised. But the rare condition supplies only ten true positives, while the enormous healthy majority supplies fifty false ones — so a positive result is right about 17% of the time.

That mismatch — between how often the test is right about the sick and how often a positive result is right about you — is the base-rate fallacy, and it shapes far more than medicine: security screening, fraud detection, doping tests and DNA database trawls all hunt rare things with imperfect filters. The defensive habit is one question: how common was this before the test? You can feel the whole mechanism in about ten seconds with the base-rate explorer, which re-sorts a thousand people live as you drag the rarity slider.

4. The discovery that was really a hunt

“Scientists find link between X and Y” deserves one question before any excitement: how many links did they test? A p-value below .05 promises that a result this strong would arise by luck only one time in twenty — for a single, pre-aimed test. Run twenty tests and that promise inverts: chance alone will hand you a “discovery” about 64% of the time. The figure below is twenty tests of pure noise; one of them duly comes up significant, and that one is the press release.

Twenty shots at a blank wallSimulated: both groups drawn from identical distributions in all twenty tests

One of these is a headline. Every hypothesis here is false by construction, yet a dot crosses the significance line — exactly as the arithmetic of multiple comparisons predicts. The nineteen misses stay in the drawer; the hit gets written up as though it were the only shot fired.

This is the Texas sharpshooter fallacy — fire first, paint the target afterwards — and it needn't involve any dishonesty: subgroup analyses, multiple outcomes and flexible analysis choices fire the extra shots invisibly. Famous demonstrations include the dead salmon that “responded” to photographs in an fMRI scanner, and the cardiology trial that solemnly reported aspirin doesn't work on people born under Gemini. The defensive habit: ask how many shots were fired before this target was painted? Or fire them yourself in the p-hacking sandbox, where you can manufacture a significant result from noise in four clicks.

5. The chart that lies with true numbers

The final illusion needs no statistics at all — only a designer's thumb on the y-axis. A bar chart encodes value as length, and length is only meaningful from a baseline of zero. Start the axis somewhere else and the bars stop depicting the quantities and start depicting their distance from an arbitrary line; a 3% difference can be drawn as a 30% difference or a 300% one, with every printed number correct.

Two honest numbers, two very different chartsQuarterly revenue: £4.80M then £4.96M — a true rise of 3.3%

The baseline does the lying. Left: the axis starts at zero and the change looks like what it is. Right: the identical bars on an axis starting at £4.70M — the visual difference is roughly nine times the real one. The most-shared offenders compound it by hiding the axis labels altogether.

The rule is narrower than “all axes must start at zero” — line charts, which encode position rather than length, can often defensibly zoom — but for bars it is close to absolute. The defensive habit: before reacting to a bar chart's shape, find where its axis starts; if it isn't zero, redraw it in your head. Or drag the baseline yourself in Lie With This Chart and watch a meter price the exaggeration in real time.

The thread that ties them together

Five illusions, one anatomy. In each, every individual number is true — the wages, the interviews, the test accuracy, the p-value, the bar heights — and the error is smuggled in at the interpretive step: what the average is over, who survived into the sample, which direction the probability points, how many comparisons were made, where the axis begins.

The defence is not more mathematics. It is a short list of questions, asked before you believe the number.

The compendium keeps that list as a pocket checklist, one question per illusion, each linked to a full interactive explainer. Read the news with it for a week and you will start spotting these five everywhere — which is unsettling at first, and then much better than the alternative.

Five statistical paradoxes that change how you read the news

1. The reversal hiding inside every average

2. The success stories you never hear

3. The test that's 99% accurate and usually wrong

4. The discovery that was really a hunt

5. The chart that lies with true numbers

The thread that ties them together

Keep reading

Simpson's Paradox

The Base-Rate Explorer

Simpson's paradox in the wild