unspurious.

The presentation illusions · Anscombe's quartet

Four datasets. One set of statistics. Four different worlds.

In 1973 Francis Anscombe built four tiny datasets with the same mean, variance, correlation and regression line, matching to two or three decimal places. Summarised, they are indistinguishable. Plotted, they could not be more different. It remains arguably the most elegant argument ever made for looking at your data.

The quartet, live. The statistics panel never changes as you switch datasets; only the picture does. Toggle the layers to see what the numbers miss.

Fig. 1 — The numbers can't tell them apart. Click between the four datasets and watch the statistics on the right sit perfectly still — same mean, same variance, same correlation, same line — while the cloud of points rearranges into a line, a curve, a line-plus-outlier, and a single point holding up the whole result. Turn on the residuals to see what a regression diagnostic would have caught instantly.
The short answer

What is Anscombe's quartet?

Anscombe's quartet is four datasets, built by statistician Francis Anscombe in 1973, that share nearly identical summary statistics — the same mean and variance of x and y, the same correlation of 0.816, and the same regression line y = 3.00 + 0.500x — yet look completely different when plotted. One is linear, one is curved, one has an outlier, and one is dominated by a single high-leverage point. It is the classic demonstration that you must graph data, not just summarise it, because identical statistics can hide entirely different relationships.

The fast check: “Have you actually plotted it?”

01 · What just happened

The summary is not the data

A summary statistic is a compression — it throws away almost everything and keeps a single number. Usually that is exactly what we want. Anscombe's quartet is the unforgettable demonstration of what compression can hide: four datasets that agree on every standard summary a regression report would print, yet describe four completely different situations. One is a fair linear relationship. One is a perfect curve. One is a clean line knocked askew by a single outlier. One has no relationship at all, propped up entirely by one stray point.

If you had been handed only the table of statistics — mean, variance, correlation, the fitted line y = 3.00 + 0.500x, an R² of 0.67 — you could not tell which world you were in. You would fit a straight line to all four and report the same confident result for each. Three of those four reports would be wrong, and the numbers would never warn you. The only thing that does is the picture.

Anscombe's own framing, in his 1973 paper, was a rebuke to a belief he saw among statisticians: that “numerical calculations are exact, but graphs are rough.” His quartet flips it — here the calculations are the rough instrument and the graph is the exact one.

02 · What is held constant

Seven numbers, frozen solid

The trick is engineered with care. Across all four datasets, seven of the most-reported quantities in applied statistics agree to the precision a typical printout shows: two decimal places, with the variance of y differing only in the third. The agreement is there by design, not by accident.

The statistics that agree
Computed identically for datasets I, II, III and IV

Every one of the four datasets has these statistics:

Mean of x          9.00
Variance of x      11.00
Mean of y          7.50
Variance of y      4.125 (to within about 0.003)
Correlation (r)    0.816
Regression line    y = 3.00 + 0.500x
R²                 0.67

Identical to the decimal places shown, yet four utterly different pictures.
Fig. 2 — Everything a regression printout shows. Mean and variance of both variables, the Pearson correlation, the least-squares line, and the coefficient of determination — all the same. A purely numerical workflow treats these four datasets as the same dataset. That is the whole problem.
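The agreement in the table can be checked by hand. The sketch below recomputes the seven numbers for each dataset using only the Python standard library. The data values are the ones commonly reproduced from Anscombe's paper and are typed in here as an assumption, so verify them against your own copy; the names `QUARTET`, `seven_numbers` and `STATS` are illustrative.

```python
import math

# Anscombe's four datasets, as commonly reproduced (assumed, not taken from this page).
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def seven_numbers(xs, ys):
    """Means, sample variances, Pearson r, least-squares line and R squared."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    r = sxy / math.sqrt(sxx * syy)
    return {
        "mean_x": mean_x,
        "var_x": sxx / (n - 1),
        "mean_y": mean_y,
        "var_y": syy / (n - 1),
        "r": r,
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
        "r2": r * r,  # for a one-predictor fit, R squared is r squared
    }


STATS = {name: seven_numbers(xs, ys) for name, (xs, ys) in QUARTET.items()}
for name, stats in STATS.items():
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

Printing three decimals rather than two is deliberate: it is where the variance of y starts to differ between datasets, which shows how close "identical" really is.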

These are not obscure quantities; they are the backbone of a first-pass analysis. Mean and spread describe each variable, correlation and R² claim to describe their relationship, and the regression line claims to summarise it. Anscombe's point is that this entire battery can be satisfied by data that violates every assumption underneath it.

03 · Four different realities

What the eye sees that the numbers can't

Switch through the four panels in the explorer above and each tells its own story. Dataset I is the honest one: a noisy straight line, exactly the situation linear regression is built for. Dataset II is a clean curve; the relationship is real and strong, but it bends, so a straight line and a correlation coefficient are answering the wrong question. Dataset III is ten points on a near-perfect line plus one outlier that tilts the fitted line and drags the correlation down from almost exactly 1 to 0.816. Dataset IV is the most unsettling: every point shares the same x value (x = 8) except one, and that single high-leverage point at x = 19 fabricates the entire correlation out of nothing.
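The claims about datasets III and IV are easy to test by deleting the one suspicious point and recomputing the correlation. This is a minimal sketch, assuming the commonly reproduced data values; `pearson_r` and `drop` are hypothetical helper names, and `pearson_r` returns `None` when the correlation is undefined because one variable has no spread.

```python
import math

# Datasets III and IV, as commonly reproduced (assumed values).
X3 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
Y3 = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]
X4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
Y4 = [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]


def pearson_r(xs, ys):
    """Pearson correlation, or None if either variable is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def drop(values, index):
    """Copy of the list with one element removed."""
    return values[:index] + values[index + 1:]


outlier_3 = Y3.index(12.74)  # the point at x = 13
lever_4 = X4.index(19)       # the only point not at x = 8

r3_full = pearson_r(X3, Y3)
r3_trimmed = pearson_r(drop(X3, outlier_3), drop(Y3, outlier_3))
r4_full = pearson_r(X4, Y4)
r4_trimmed = pearson_r(drop(X4, lever_4), drop(Y4, lever_4))

print("III:", round(r3_full, 3), "->", round(r3_trimmed, 5))
print("IV: ", round(r4_full, 3), "->", r4_trimmed)
```

Deleting points is used here only as a diagnostic. Whether an outlier should actually be removed from an analysis is a separate question that the numbers cannot answer.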

Here is the part most retellings leave out, and the part Anscombe actually cared about. His paper was an argument for graphical regression diagnostics — and the sharpest of those is the residual plot, which shows how far each point sits from the fitted line. Plot the residuals and the four datasets stop hiding.

The diagnostic Anscombe was really selling: residuals from the identical fitted line
[Figure: residual plots, what a statistician actually looks at. Panels: Dataset I, structureless, a good fit; Dataset II, a clear arch, wrong model; Dataset III, one huge residual, outlier; Dataset IV, degenerate, no spread in x.]
Fig. 3 — The residuals confess. The same regression line leaves four utterly different footprints. Dataset I scatters with no pattern — a healthy fit. Dataset II's residuals form a clean arch, the unmistakable signature of a missed curve. Dataset III shows one residual towering over the rest — an outlier flag. Dataset IV collapses to a single vertical strip, revealing there is essentially no information about a slope at all. None of this is visible in the seven summary numbers; all of it is obvious here.
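The arch and the towering residual also show up without any plotting library. The sketch below fits the least-squares line and lists residuals in order of x for datasets II and III. It assumes the commonly reproduced data values, and `residuals_by_x` is an illustrative helper, not part of any package.

```python
# Datasets II and III share the same x values (assumed, as commonly reproduced).
X = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
Y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]
Y3 = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]


def residuals_by_x(xs, ys):
    """Fit y = a + b*x by least squares; return (x, residual) pairs sorted by x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sorted((x, y - (intercept + slope * x)) for x, y in zip(xs, ys))


RES_2 = residuals_by_x(X, Y2)
RES_3 = residuals_by_x(X, Y3)

# Dataset II: negative at both ends, positive in the middle (the arch).
for x, res in RES_2:
    print(f"II  x={x:2d}  residual={res:+.2f}")

# Dataset III: one residual dwarfs the others.
worst_x, worst_res = max(RES_3, key=lambda pair: abs(pair[1]))
print(f"III largest residual {worst_res:+.2f} at x={worst_x}")
```

Reading the dataset II column top to bottom, the sign pattern alone (minus, plus, minus) is enough to flag a missed curve, which is the same signal the residual plot gives at a glance.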

This is why “always plot your data” is too weak a moral. The richer one is: plot the right thing. A scatter reveals shape; a residual plot reveals whether your chosen model fits; a leverage check reveals whether any single point is quietly running the show.

04 · Feel the leverage

One point can own the whole line

Datasets III and IV are really about influence — the unequal power of individual points over a fit. The cleanest way to understand it is to do it. Below is a small linear cloud; drag the red point and watch the regression line, the correlation and R² lurch in response. Pull it far out along the x-axis, the way dataset IV's stray point sits, and you will find you can set the slope to almost anything you like, single-handed.

The leverage sandbox. Drag the red point. The further out in x it sits, the more of the line it controls.
Fig. 4 — Influence is not democratic. A point near the centre of the x-range barely moves the line however far you drag it up or down. A point far out along x — a high-leverage point — can swing the entire fit by itself, and inflate or destroy the correlation while the other ten points sit unchanged. This is exactly how one observation manufactures (dataset IV) or distorts (dataset III) a result that the summary statistics then report with a straight face.
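The same experiment can be run in a few lines. In simple linear regression the leverage of point i is h_i = 1/n + (x_i - mean_x)^2 / Sxx, and lifting that point by some amount d changes the slope by d * (x_i - mean_x) / Sxx. The sketch below applies both facts. The ten-point cloud is made up for illustration and is not the page's sandbox data; the dataset IV x values are the commonly reproduced ones.

```python
def fit(xs, ys):
    """Least-squares slope and intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def leverages(xs):
    """Hat values h_i = 1/n + (x_i - mean_x)^2 / Sxx for a one-predictor fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return [1 / n + (x - mean_x) ** 2 / sxx for x in xs]


# Dataset IV (assumed values): ten points at x = 8, one at x = 19.
X4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
H4 = leverages(X4)
print("leverage at x=19:", round(H4[7], 3), " at x=8:", round(H4[0], 3))

# Hypothetical sandbox: ten points exactly on y = 2 + 0.5x, plus one red point.
CLOUD_X = list(range(1, 11))
CLOUD_Y = [2 + 0.5 * x for x in CLOUD_X]


def slope_with_red_point(red_x, red_y):
    return fit(CLOUD_X + [red_x], CLOUD_Y + [red_y])[0]


# Lift the red point 10 units above the line, first at the centre, then far out.
centre_slope = slope_with_red_point(5.5, 2 + 0.5 * 5.5 + 10)
far_slope = slope_with_red_point(30, 2 + 0.5 * 30 + 10)
print("slope, red point lifted at x=5.5:", round(centre_slope, 3))
print("slope, red point lifted at x=30: ", round(far_slope, 3))
```

A leverage of 1 is the maximum possible value: the fitted line is forced to pass through that point, so its residual is zero and a residual plot alone would not flag it. That is why the leverage check earns its own place next to the scatter and the residuals.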

05 · Field notes

From a quartet to a dinosaur

The modern sequel. For decades nobody knew how Anscombe built his datasets. Then in 2017 Justin Matejka and George Fitzmaurice showed how to generate any shape you like with a fixed set of summary statistics, using simulated annealing to nudge points around while holding the numbers still to two decimal places. Their showpiece builds on the “Datasaurus,” a dinosaur-shaped scatter plot first drawn by Alberto Cairo: its mean, variance and correlation match a boring blob's, and a dozen other shapes share the same statistics too.

Same statistics, any picture at all (after Matejka & Fitzmaurice, 2017; illustrative)
[Figure: same seven numbers, even as a dinosaur. Panels: the “Datasaurus” and a dozen other shapes. Every shape shown shares one set of summary statistics.]
Fig. 5 — The quartet's wildest descendant. If four datasets sharing seven numbers feels like a contrivance, the Datasaurus dozen settles it: there is no limit. A dinosaur, a star, a set of stripes — all can be tuned to identical means, variances and correlation. Summary statistics constrain the data far less than we imagine.
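The nudge-and-check idea behind these shapes can be sketched in a few dozen lines. This is not Matejka and Fitzmaurice's code: it is a simplified illustration that assumes a random starting blob, a circle as the target shape, a linear cooling schedule and Gaussian nudges, all chosen here for brevity. A move is kept only if the means, standard deviations and correlation are unchanged after rounding to two decimals.

```python
import math
import random


def summary(points):
    """Means, sample standard deviations and Pearson r, rounded to 2 decimals."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    values = (mean_x, mean_y, math.sqrt(sxx / (n - 1)),
              math.sqrt(syy / (n - 1)), sxy / math.sqrt(sxx * syy))
    return tuple(round(v, 2) for v in values)


def distance_to_circle(point, centre=(50.0, 50.0), radius=21.0):
    """How far a point sits from the target shape (an assumed circle)."""
    return abs(math.hypot(point[0] - centre[0], point[1] - centre[1]) - radius)


def total_distance(points):
    return sum(distance_to_circle(p) for p in points)


def make_blob(n=60, seed=1):
    rng = random.Random(seed)
    return [(rng.gauss(50, 15), rng.gauss(50, 15)) for _ in range(n)]


def anneal(points, steps=10000, seed=0):
    rng = random.Random(seed)
    points = list(points)
    target = summary(points)
    for step in range(steps):
        temperature = 0.4 * (1 - step / steps)  # assumed linear cooling
        i = rng.randrange(len(points))
        old = points[i]
        candidate = (old[0] + rng.gauss(0, 0.1), old[1] + rng.gauss(0, 0.1))
        closer = distance_to_circle(candidate) < distance_to_circle(old)
        # Take moves toward the shape; occasionally take a worse one while hot.
        if not (closer or rng.random() < temperature):
            continue
        points[i] = candidate
        if summary(points) != target:
            points[i] = old  # the nudge changed a rounded statistic: undo it
    return points


BLOB = make_blob()
SHAPED = anneal(BLOB)
print("before:", summary(BLOB), round(total_distance(BLOB), 1))
print("after: ", summary(SHAPED), round(total_distance(SHAPED), 1))
```

The rounding inside `summary` is what gives the points room to move: the statistics only have to stay within the same two-decimal bucket, not stay exactly equal. A run this short will not produce a clean circle; it only shows the points drifting toward the target while the printed statistics stay put.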

Why it still matters. It would be comforting to think Anscombe's quartet is a museum piece from before everyone had plotting software. It isn't. Automated pipelines, dashboards and machine-generated reports increasingly hand people summary numbers — a correlation here, an R² there — with no plot attached, at a scale Anscombe never imagined. The failure mode he diagnosed is more common now, not less.

Every correlation, every trend line, every “strong relationship (r = 0.8)” is consistent with infinitely many datasets. The only way to know which one you have is to look at it.

So the checklist question is almost insultingly simple, and it is the one professionals still skip under deadline: have you actually plotted it? Not summarised it, not correlated it — plotted the raw points, and then plotted the residuals. The rest of the compendium is full of subtle illusions; this is the one with the simplest cure, and it is still the one most often left undone.
