Survivorship bias: the evidence is in the graveyard
From Wald's bombers to falling cats and founder folklore — how filters write datasets, drawn out in five figures.
Every dataset you will ever meet has been through a doorway, and the doorway had rules. Companies had to stay solvent to appear in the database. Buildings had to stay standing to be admired. Patients had to return for the follow-up. Manuscripts had to be published, songs replayed, founders interviewed. Survivorship bias is what happens when we read the room and forget the door — when conclusions are drawn from whatever passed a filter, while the filter and everything it removed stay invisible.
It is, by a comfortable margin, the most intuitive illusion in this compendium and also the most quietly expensive. This post walks through its anatomy and four of its habitats — a war, a fund database, a veterinary journal and an airport bookshop — each with a picture of where the missing data went.
Wald and the planes that didn't come home
The founding story is from 1943. The mathematician Abraham Wald, working with the Statistical Research Group in New York, was handed a military damage problem: armour is heavy, bombers can't be plated everywhere, and surveys of returning aircraft showed exactly where the bullet holes accumulated — wings, tail, rear fuselage. The instinct, reasonable on its face, was to reinforce the places taking the most damage.
Wald's contribution was to name the doorway. The survey was not a sample of “where bombers get hit”; it was a sample of “where bombers get hit and survive to be surveyed.” Anti-aircraft fire doesn't aim for wingtips — hits were spread roughly evenly — which means the clean patches over the engines and cockpit weren't lucky. They were the wounds nobody came home from. His memos worked out how to estimate each section's true vulnerability from the survivors alone, and the recommendation inverted the instinct entirely: armour where the holes aren't.
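The doorway can be reproduced in a few lines. What follows is a minimal Python sketch of the filter, not Wald's actual estimation method. It assumes each plane takes exactly one hit, that hits land on four sections with equal probability, and that the per-section loss probabilities in `LOSS_PROB` are invented for illustration.

```python
import random
from collections import Counter

# Hypothetical chance that a single hit in each section downs the plane.
# These numbers are invented for illustration, not taken from Wald's memos.
LOSS_PROB = {"engine": 0.8, "cockpit": 0.7, "fuselage": 0.2, "wings": 0.1}

def survey_returning_planes(n_planes=10_000, seed=0):
    """Count hits by section for every plane and for the planes that return."""
    rng = random.Random(seed)
    sections = list(LOSS_PROB)
    all_hits, surveyed_hits = Counter(), Counter()
    for _ in range(n_planes):
        section = rng.choice(sections)          # hits are spread evenly
        all_hits[section] += 1
        if rng.random() >= LOSS_PROB[section]:  # the plane makes it home
            surveyed_hits[section] += 1
    return all_hits, surveyed_hits

all_hits, surveyed_hits = survey_returning_planes()
for section in LOSS_PROB:
    print(f"{section:9s} hit: {all_hits[section]:5d}   "
          f"seen on returning planes: {surveyed_hits[section]:5d}")
```

Under these assumptions every section is hit about equally often, yet the returning planes show the fewest holes on the engine and cockpit, which are exactly the sections where a hit is most often fatal.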
The anatomy of the trap
Strip the war story to its skeleton and you get a structure worth memorising, because every case in this post — and most you'll meet in the wild — is the same machine with different labels. A filter sits between the world and your data. The filter's output is vivid, countable and easy to study. Its discards are silent, uncounted, and usually exactly where the answer lives.
Note the crucial condition: the bias bites when exiting the sample is correlated with the thing being measured. A filter that discards at random merely shrinks your data. A filter that discards the failures, the dead, the demolished and the forgotten, while you study success, longevity, durability and greatness, manufactures conclusions wholesale.
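The difference between the two kinds of filter can be checked directly. This is a minimal sketch under invented assumptions: a hypothetical population of 10,000 outcomes drawn from a standard normal distribution, a random filter that drops half of them by coin flip, and a selective filter that drops every outcome below zero.

```python
import random
from statistics import mean

rng = random.Random(1)

# Hypothetical population: 10,000 outcomes centred on zero.
population = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

# Random filter: each unit is kept with probability 0.5, whatever its outcome.
random_sample = [x for x in population if rng.random() < 0.5]

# Selective filter: every unit whose outcome is below zero leaves the sample.
selective_sample = [x for x in population if x > 0.0]

print(f"population mean:        {mean(population):+.2f}")
print(f"after random filter:    {mean(random_sample):+.2f}  (n={len(random_sample)})")
print(f"after selective filter: {mean(selective_sample):+.2f}  (n={len(selective_sample)})")
```

Both filters discard roughly half the data. Only the second moves the average, because only the second decides who leaves by looking at the outcome.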
The graveyard in your portfolio
Finance runs Wald's problem every day, in reverse. Fund databases list the funds that exist now; the ones that performed badly were closed or quietly merged away, taking their track records with them. Compute “the average fund's historical return” from such a list and you are averaging over survivors — the corpses have been removed from the denominator before you arrived. Academic studies of US mutual funds have put the resulting inflation at roughly one to two percentage points a year, which is easily the difference between “active management earns its fees” and “it doesn't.”
The simulation below makes the mechanism visible with no fraud anywhere in it: one hundred funds launched, identical rules, a closure threshold for sustained losses, and ten years of honest noise.
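A minimal sketch of such a cohort in Python follows. The parameters are assumptions chosen for illustration, not calibrated to any real market: every fund draws its yearly return from the same normal distribution (`mu`, `sigma`), and a fund is closed the first time the value of one unit invested at launch falls below `close_below`.

```python
import random
from statistics import mean

def simulate_cohort(n_funds=100, years=10, mu=0.05, sigma=0.15,
                    close_below=0.8, seed=42):
    """Return (value, survived) per fund; value is the growth of 1 unit at launch."""
    rng = random.Random(seed)
    cohort = []
    for _ in range(n_funds):
        value, survived = 1.0, True
        for _ in range(years):
            value *= 1.0 + rng.gauss(mu, sigma)  # identical rules, honest noise
            if value < close_below:              # sustained losses: fund closes
                survived = False
                break
        cohort.append((value, survived))
    return cohort

cohort = simulate_cohort()
everyone = [value for value, _ in cohort]
survivors = [value for value, alive in cohort if alive]

print(f"funds launched:            {len(everyone)}")
print(f"funds still in database:   {len(survivors)}")
print(f"mean growth, full cohort:  {mean(everyone):.2f}x")
print(f"mean growth, survivors:    {mean(survivors):.2f}x")
```

Closed funds are frozen at their closing value, so the full-cohort figure is what an investor spread across all one hundred launches would actually have earned. The survivors-only figure is what the database reports. No fund in the sketch has any skill; the gap between the two numbers comes entirely from the closure rule.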
The same arithmetic haunts backtests (strategies that would have failed were never brought to market), indices (failing companies are removed and replaced), and every “stocks always recover” argument built on the markets that happen to still exist. The investors of 1900 could have made the same argument about the St. Petersburg exchange.
The cat that fell eight storeys
In 1987, two New York veterinarians published a study of 132 cats brought in after falling from high-rise buildings, and reported one of the most delightfully counterintuitive curves in the literature: injuries rose with the height of the fall up to about six or seven storeys — and then declined. Cats falling from nine storeys arrived in better shape than cats falling from six. The proposed explanation was elegant physics: past a few storeys a cat reaches terminal velocity, relaxes, and spreads itself to land more softly.
It might even be true. But the dataset has a doorway, and the doorway is a veterinary clinic. A cat that dies on the pavement — or is so obviously beyond help that its owner never makes the journey — does not become a data point. The higher the fall, the more selective the doorway plausibly becomes, and the survivors of nine-storey falls may simply be the sturdy, lucky tail of a much grimmer distribution. The study can't tell the two stories apart, because the data it would need is precisely the data the filter removed.
Once you see the cats, you see their cousins everywhere. Old buildings seem better built than today's — because a century of demolition removed the flimsy ones, leaving a curated exhibition of the sturdiest. The music of past decades seems uniformly great — because radio replays only the survivors of a brutal forgetting. “They don't make them like they used to” is, almost always, a true statement about the filter and a false one about the making.
Advice from the survivors
The most lucrative habitat of all is the airport bookshop. Success literature is survivorship bias with a publishing deal: study a hundred celebrated founders and you will reliably find they took bold risks, ignored the doubters, dropped out, persisted past all reason. Every word true. The problem is the comparison group that never gets drawn: the thousands who took the same bold risks, ignored the same doubters, and quietly vanished, unprofiled and uninterviewed.
This is why “what do successful people have in common?” is the wrong question, no matter how rigorously it is answered. The right question — what distinguishes them from the equally bold failures — requires data the filter destroyed. Absent that, treat founder folklore the way Wald treated the bullet holes: as a map of what is survivable, not of what works.
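The gap between the two questions is plain arithmetic. The sketch below uses an invented cohort in which 90,000 founders were bold and 10,000 were cautious, and in which boldness has no effect at all: both groups succeed 1% of the time. The function name `trait_table` and every number in it are hypothetical.

```python
def trait_table(n_bold, n_cautious, success_rate_bold, success_rate_cautious):
    """Compare how common a trait is among winners with how well it predicts winning."""
    bold_wins = n_bold * success_rate_bold
    cautious_wins = n_cautious * success_rate_cautious
    bold_losses = n_bold - bold_wins
    cautious_losses = n_cautious - cautious_wins
    return {
        "share of winners who were bold": bold_wins / (bold_wins + cautious_wins),
        "share of losers who were bold": bold_losses / (bold_losses + cautious_losses),
        "success rate if bold": success_rate_bold,
        "success rate if cautious": success_rate_cautious,
    }

# Hypothetical cohort: the same 1% success rate for both groups.
table = trait_table(90_000, 10_000, 0.01, 0.01)
for label, value in table.items():
    print(f"{label}: {value:.1%}")
```

In this cohort nine out of ten winners were bold, which is all a study of winners can ever report. Nine out of ten losers were bold too, and that is the number the filter destroys.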
Three questions that find the graveyard
What was the full starting cohort? “Funds in the database” and “funds ever launched” are different denominators. “Buildings from 1900” and “buildings built in 1900” are different populations. Always ask which one you are holding.
Who left, and was leaving correlated with the outcome? Random attrition shrinks a sample; selective attrition bends it. If exiting the data meant failing, dying, closing or being forgotten — and you are studying success, survival, longevity or greatness — the sample is biased by construction.
Would the celebrated trait also appear among the missing? Before crediting grit, risk, daily wine or terminal-velocity relaxation, ask whether the graveyard plausibly shares the trait. If you cannot observe the graveyard at all, hold the conclusion as folklore.
The full entry has the interactive bomber and the fund cohort live; its mirror image — Berkson's paradox, where the question is not who left the sample but how anyone got in — completes the pair of selection illusions in the compendium.