
The aggregation illusions · The Will Rogers phenomenon

Move one thing, and two averages go up.

Take a member out of one group and drop it into another, and you can raise the average of both — without a single value changing. It sounds impossible. It has flattered cancer statistics for decades.

[Interactive figure: The migration machine. Clicking any dot moves it to the other group; the display tracks the Group A mean, the Group B mean, and the unchanging overall mean.]

Fig. 1 — Nobody changed; both averages did. Each dot is a fixed value. Click one in the shaded band (below Group A's mean but above Group B's) and send it across: Group A's mean rises because a below-average member left, and Group B's mean rises because its newest member is above its old average. The overall mean (claret line) never budges. No value moved on the scale; only its label changed.

The short answer

What is the Will Rogers phenomenon?

The Will Rogers phenomenon is when moving an item from one group to another raises the average of both groups, even though no individual value changes — because the item is below the average of the group it leaves and above the average of the group it joins. It is named after a joke attributed to Will Rogers about Okies migrating from Oklahoma to California raising the average intelligence of both states. In medicine it appears as “stage migration”: better scanners reclassify cancer patients between stages, making survival improve in every stage without anyone being treated differently.

The fast check: “Did the groups improve, or did someone change groups?”

01 · What just happened

The averages moved because the labels did

The phenomenon takes its name from a line attributed to the humorist Will Rogers during the Depression-era westward migration: “When the Okies left Oklahoma and moved to California, they raised the average intelligence level in both states.” It is a joke about two groups at once, and it is also a precise piece of statistics. If the people who leave are below their old state's average but above their new state's average, then their departure lifts the average they leave and their arrival lifts the average they join. Both go up. Nobody got smarter.

The machine above is that joke made literal. There is a band between the two group means, and any value sitting in it has a peculiar property: it is dragging down the higher group and propping up the lower one. Move it across and you relieve both burdens at once. Crucially, the overall average — every value pooled together — cannot change, because no value changed; only its group membership did. That gap, between healthy-looking group averages and an unmoved total, is where the illusion lives.

It is a close cousin of Simpson's paradox: both are about how regrouping the same numbers tells a different story. Simpson's reverses a trend by pooling; Will Rogers improves every group by reshuffling.
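A single relabelling of this kind can be sketched in a few lines of Python. The values and the helper names (`mean`, `move`) are hypothetical, chosen only so that one value sits in the band between the two group means; this is an illustration, not the code behind the interactive figure.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def move(value, source, target):
    """Return copies of both groups after relabelling one value.

    The value itself is untouched; only its group membership changes.
    """
    new_source = list(source)
    new_source.remove(value)          # drop one occurrence from the old group
    new_target = list(target) + [value]
    return new_source, new_target


# Hypothetical data: Group A averages 76, Group B averages 27.
group_a = [95, 90, 85, 60, 50]
group_b = [45, 30, 23, 10]

# 60 lies in the band: below A's mean, above B's mean.
new_a, new_b = move(60, group_a, group_b)

print(mean(group_a), "->", mean(new_a))   # Group A's mean rises
print(mean(group_b), "->", mean(new_b))   # Group B's mean rises
print(mean(group_a + group_b), "->", mean(new_a + new_b))  # pooled mean is unchanged
```

The pooled list holds exactly the same nine numbers before and after, which is why the last line cannot show a change.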

02 · The anatomy

One band, two improvements

The condition is exact and worth stating plainly. Moving a value out of a group raises that group's average only if the value was below it. Adding a value to a group raises that group's average only if the value is above it. So to lift both at once you need a value that is below the average of the group it leaves and above the average of the group it joins — a value living in the gap between the two means.
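Each half of the condition follows from one line of algebra. As a sketch, using notation that is not in the original text: let Group A hold n values (n at least 2) with mean \mu_A, let Group B hold m values with mean \mu_B, and let x be the value that moves from A to B. Primes mark the means after the move.

```latex
\begin{align*}
\mu_A' - \mu_A &= \frac{n\mu_A - x}{n-1} - \mu_A = \frac{\mu_A - x}{n-1} > 0
  &&\iff\quad x < \mu_A, \\
\mu_B' - \mu_B &= \frac{m\mu_B + x}{m+1} - \mu_B = \frac{x - \mu_B}{m+1} > 0
  &&\iff\quad x > \mu_B.
\end{align*}
```

Both differences are positive exactly when \mu_B < x < \mu_A. The pooled sum n\mu_A + m\mu_B is the same before and after, because x is subtracted from one term and added to the other, so the overall mean is unaffected.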

[Figure: The value that helps both groups by leaving one. Group A's mean (76) sits above Group B's (27); the migration band between the two means is the magic zone, and a value in it moves from Group A to Group B.]
Fig. 2 — The migration band. The highlighted value is mediocre by Group A's standards but excellent by Group B's. Subtract it from A and A looks better; add it to B and B looks better. Every value in the shaded band has this dual effect. Reclassify enough of them and you can keep marching both averages upward for as long as any value remains in the band: a free lunch paid for entirely by relabelling.

Nothing here is a trick of arithmetic gone wrong; every average is computed correctly. The deception is in the comparison. When you compare “Group A before” with “Group A after,” you are quietly assuming the two groups contain comparable members. The moment membership is allowed to shift, that assumption fails, and a rising average can mean improvement, reshuffling, or — most treacherously — pure reclassification with no improvement at all.
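How far repeated reclassification can go is easy to check by brute force. The sketch below uses hypothetical values and a hypothetical helper, `migrate_until_band_empty`, which keeps moving any value that lies strictly between the two current means from A to B. The band is recomputed after every move, since each move shifts both means.

```python
def mean(values):
    return sum(values) / len(values)


def migrate_until_band_empty(group_a, group_b):
    """Move band values from A to B one at a time.

    Returns the final groups and the (mean_a, mean_b) pairs recorded
    before the first move and after each move.
    """
    a, b = list(group_a), list(group_b)
    history = [(mean(a), mean(b))]
    while True:
        lo, hi = mean(b), mean(a)
        # Values strictly above B's mean and strictly below A's mean.
        band = [v for v in a if lo < v < hi]
        if not band:
            break                      # nothing left that helps both groups
        a.remove(band[0])
        b.append(band[0])
        history.append((mean(a), mean(b)))
    return a, b, history


# Hypothetical data: Group A averages 76, Group B averages 27.
final_a, final_b, history = migrate_until_band_empty(
    [95, 90, 85, 60, 50], [45, 30, 23, 10]
)
for mean_a, mean_b in history:
    print(f"A: {mean_a:6.2f}   B: {mean_b:6.2f}")
```

With these numbers every recorded step raises both means, and the loop stops after four moves, when no value remains in the band and Group A is down to a single member. The pooled mean is identical at every step.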

03 · The case that named it

How better scanners 'cured' cancer on paper

In 1985 the physician Alvan Feinstein gave the phenomenon its medical name in The New England Journal of Medicine, after spotting it in lung-cancer survival data. He compared a group of patients treated in 1977 with an earlier group from the 1950s and 60s at the same hospitals, and found that survival had improved — not just overall, but within every single stage of the disease. It looked like a triumph. It was an artefact.

The newer patients had been examined with better imaging. Those scanners revealed small metastases that the older technology had missed — metastases that had always been there, silently, in patients previously filed as “early stage.”

[Figure: Stage migration, patient by patient. Early- and late-stage groups before and after a more sensitive scan: a better scanner, not a better treatment. Before (old scanner): early-stage survival is good, but two patients are secretly sick; late-stage survival is poor. The scanner now sees the hidden metastases. After (new scanner), with nobody treated differently: early-stage survival is better because the sick ones left; late-stage survival is better because the new ones are healthier.]
Fig. 3 — The same patients, re-sorted. The early-stage group used to hide a few patients with undetected metastases (dashed). A better scanner finds them and moves them to the late-stage group. They were the sickest of the early group, so the early group's survival rises once they leave; and they are healthier than the typical late-stage patient, so the late group's survival rises once they arrive. Not one patient received different care.

In Feinstein's own words, the migrants' prognosis was “worse than that for other members of the good-stage group” but “better than that for other members of the bad-stage group,” so survival rose in each group “without any change in individual outcomes.” When he re-classified both cohorts by symptoms — a yardstick the new scanners couldn't shift — the apparent progress evaporated. The two eras had nearly identical survival all along.
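The mechanism is easy to reproduce with a made-up cohort. In the sketch below every number is hypothetical (none comes from Feinstein's data), as are the class and function names. Each patient has a fixed outcome and a fixed true extent of disease; the only thing that differs between the two runs is whether the scanner can see small metastases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    metastases: str   # "none", "small" or "large" (hypothetical categories)
    survived: bool    # fixed outcome; staging never changes it


def stage(patient, scanner_sees_small):
    """Late stage if the scanner detects metastases, otherwise early."""
    detected = patient.metastases == "large" or (
        patient.metastases == "small" and scanner_sees_small
    )
    return "late" if detected else "early"


def survival_by_stage(cohort, scanner_sees_small):
    """Fraction of survivors within each stage, plus the pooled total."""
    rates = {}
    for label in ("early", "late"):
        members = [p for p in cohort if stage(p, scanner_sees_small) == label]
        rates[label] = sum(p.survived for p in members) / len(members)
    rates["total"] = sum(p.survived for p in cohort) / len(cohort)
    return rates


def make_group(metastases, n, survivors):
    """n patients with the given disease extent, of whom `survivors` survive."""
    return [Patient(metastases, i < survivors) for i in range(n)]


# Hypothetical cohort of 100 patients.
cohort = (
    make_group("none", 60, 48)     # 80% survive
    + make_group("small", 20, 8)   # 40% survive: the future migrants
    + make_group("large", 20, 2)   # 10% survive
)

old = survival_by_stage(cohort, scanner_sees_small=False)
new = survival_by_stage(cohort, scanner_sees_small=True)
print("old scanner:", old)
print("new scanner:", new)
```

In this toy cohort early-stage survival goes from 70% to 80% and late-stage survival from 10% to 25%, while total survival stays at 58%, because the same 58 patients survive in both runs.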

04 · The signature

Every stage improves; nobody is saved

Stage migration leaves a fingerprint so distinctive it has become a diagnostic test for the illusion itself: survival improves in every subgroup while the survival of the whole population barely moves. That ought to feel impossible: if every part got better, surely the whole did too? It did not, because the parts were repopulated. Each subgroup's rate rose, but the worse-prognosis subgroup now holds a larger share of the patients, and the two effects cancel in the total. Patients shuffled from the good group to the bad group lift the statistics of both while changing the total not at all.

[Chart: The tell-tale pattern. Stage-specific survival vs. total survival, before and after reclassification. Every stage's survival improves, yet total survival is unchanged. Stage I: 68% before, 78% after. Stage II: 42% before, 52% after. Stage III: 18% before, 26% after. Total: 40% before, 40% after (no change).]
Fig. 4 — The impossible-looking chart. Every stage posts better survival after reclassification, yet the total is flat. Whenever you see improvement in all subgroups but none in the whole, suspect that the subgroups were redrawn and that people, not outcomes, did the moving. This is the modern era's recurring trap: each new imaging tool, from CT to PET to PSMA scans, can re-trigger stage migration and make treatments look better than they are.
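The signature can be turned into a rough screening check. The sketch below is an illustration under stated assumptions, not an established test: the function name `migration_suspected`, the tolerance of one percentage point, and all the counts are hypothetical. Each stage is given as a pair (survivors, patients).

```python
def rate(survivors, patients):
    return survivors / patients


def pooled_rate(stages):
    """Survival over all patients, ignoring stage labels."""
    survivors = sum(s for s, _ in stages.values())
    patients = sum(n for _, n in stages.values())
    return survivors / patients


def migration_suspected(before, after, tolerance=0.01):
    """True when every stage improves but the pooled rate barely moves."""
    every_stage_up = all(rate(*after[k]) > rate(*before[k]) for k in before)
    total_flat = abs(pooled_rate(after) - pooled_rate(before)) <= tolerance
    return every_stage_up and total_flat


# Hypothetical counts: (survivors, patients) per stage.
before = {"I": (57, 70), "II": (16, 50), "III": (3, 30)}
relabelled = {"I": (45, 50), "II": (24, 50), "III": (7, 50)}    # same 150 people, re-sorted
truly_better = {"I": (63, 70), "II": (20, 50), "III": (6, 30)}  # same labels, more survivors

print(migration_suspected(before, relabelled))    # True
print(migration_suspected(before, truly_better))  # False
```

A flat total alongside rising stage rates does not prove migration on its own; it signals that the comparison should be repeated with a grouping the new test cannot shift.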

The stakes are not academic. Stage migration can make a useless new test or therapy look effective, justify expensive screening that never extended a life, and corrupt any comparison of cancer outcomes across hospitals, countries or decades whenever diagnostic standards differ. It also rides alongside its relative, lead-time bias — detecting disease earlier makes survival from diagnosis look longer even when the date of death is unchanged. Both inflate the same hopeful statistics; both are illusions of the clock and the label, not the cure.
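Lead-time bias is simple arithmetic, sketched here with a hypothetical patient whose course of disease is fixed. The ages and the helper name are invented for illustration.

```python
def survival_from_diagnosis(age_at_diagnosis, age_at_death):
    """Years lived after diagnosis: the quantity survival statistics measure."""
    return age_at_death - age_at_diagnosis


AGE_AT_DEATH = 70   # hypothetical, and identical in both scenarios

late_detection = survival_from_diagnosis(67, AGE_AT_DEATH)    # found via symptoms
early_detection = survival_from_diagnosis(64, AGE_AT_DEATH)   # found by screening

print(late_detection, early_detection)              # 3 6
print(late_detection >= 5, early_detection >= 5)    # counted as a five-year survivor?
```

The patient dies at the same age either way, yet only the screened scenario counts as a five-year survivor.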

05 · Field notes

Wherever the boxes can be redrawn

Beyond medicine. Any time a population is split into labelled groups and the labelling can shift, the Will Rogers phenomenon is available. Move a mid-tier student from the top set to a lower one and both sets' averages can rise. Reclassify a struggling fund from “growth” to “value” and both categories' returns can tick up. Demote a below-average player from the majors to the minors and you may lift the standard of both leagues. Redraw the boundary of a credit rating, a tax bracket, a diagnostic threshold, and the averages on both sides can move with nobody underneath them changing.

The structural twin. Feinstein found his example while thinking hard about how categories deceive, and the lesson generalises the one behind Simpson's paradox and the ecological fallacy: a statistic computed within a group is only comparable across time if the group is the same group. Change the membership and you have changed the question, however identical the label looks.

When a category's average improves, there are two possible reasons — its members got better, or its membership changed. Only one of them is progress.

So the question to keep in your pocket is the one that separates them: did the groups improve, or did someone change groups? If the boundary moved (a new scanner, a new rule, a new definition) between the two numbers you are comparing, treat any improvement as suspect until you have ruled out the migration. The rest of the compendium is full of honest numbers behaving badly; this is the one where every group's number rises while the total stands still.
