Mar 29, 2026 · 9 min read

Regression to the Mean: Why Extremes Are Their Own Correction

A student scores brilliantly on their first exam — the best in the class. On the second exam, they do less well. Did pressure get to them? Did they slack off after the early success? Almost certainly not, at least not entirely. Some of that first exceptional score was genuine ability; some was luck — a good night's sleep, questions that played to their strengths, a favourable guess or two. On the next attempt, the luck component averages out. The score moves toward something closer to their true mean. This is regression to the mean: one of the most robust, universal, and consistently misunderstood phenomena in statistics — and in everyday life.

Francis Galton and the Discovery of Regression

The phenomenon was formally identified by Sir Francis Galton in the 1880s while studying the relationship between the heights of parents and their children. Galton noticed that tall parents tended to have children who were shorter than they were, and short parents tended to have children taller than they were. The children's heights "regressed" toward the population average. He called this "regression towards mediocrity in hereditary stature" — later shortened simply to "regression to the mean."

Galton's insight was mathematical: any time a variable is imperfectly correlated with itself across measurements, the extreme values in one measurement will tend to be less extreme in the next. The statistical term "regression" derives entirely from this discovery: one of the field's most important concepts bears its name because of its original application to hereditary height.

The mechanism is not mysterious. Any observed measurement has two components: a "true" underlying value and a random error component. If your true height is 6 feet but today's measurement reads 6′2″ due to measurement variation, the next measurement will likely be closer to 6 feet. If a child's genetic potential predicts a height of 5′10″, but both parents happened to be at the tall end of their genetic potential, the child's actual height tends toward the mean of the genetic potential, not the extreme parental expression. The luck component doesn't sustain itself.
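
The two-component model is easy to make concrete with a small simulation (all numbers here are illustrative, not from any real dataset): select the individuals whose first measurement came out extreme, then look at their second measurement.

```python
import random
import statistics

random.seed(42)

# Hypothetical model: each observed score = true value + independent noise.
N = 100_000
true_vals = [random.gauss(0, 1) for _ in range(N)]   # underlying "true" component
obs1 = [t + random.gauss(0, 1) for t in true_vals]   # first measurement
obs2 = [t + random.gauss(0, 1) for t in true_vals]   # second measurement, fresh noise

# Look at individuals whose FIRST measurement was extreme (top 5%).
cutoff = sorted(obs1)[int(0.95 * N)]
extreme = [i for i in range(N) if obs1[i] >= cutoff]

mean_first = statistics.mean(obs1[i] for i in extreme)
mean_second = statistics.mean(obs2[i] for i in extreme)

print(f"mean first measurement of the extreme group: {mean_first:.2f}")
print(f"mean second measurement of the same group:   {mean_second:.2f}")
# The second measurement is pulled back toward the population mean (0),
# even though nothing about the individuals changed between measurements.
```

The group's second-measurement average lands roughly halfway between its first-measurement average and the population mean, because in this setup signal and noise have equal variance.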

The Sports Illustrated Jinx and the Madden Curse

Athletes who appear on the cover of Sports Illustrated — or on the cover of the Madden NFL video game — are widely believed to subsequently suffer a drop in performance, an injury, or some other misfortune. Athletes, coaches, and fans invoke "the jinx" and "the curse" as quasi-supernatural explanations. The statistical explanation requires no supernatural machinery.

Athletes appear on magazine covers and video game covers when they are performing at exceptional levels — typically after their best season, their best game, or their most dramatic achievement. An exceptional performance is made up of genuine skill and an unusually favourable run of luck, health, and circumstance. In subsequent seasons, the luck component regresses toward the mean. Performance drops not because a jinx struck, but because the luck that inflated the exceptional performance doesn't persist.

The same mechanism explains the "sophomore slump" in baseball and other sports: a rookie who has an exceptional debut season is, on average, somewhat less exceptional in their second season — not because the league "figured them out" (though that is a real additional factor), but because extreme initial performance partly reflects a lucky first season that won't replicate at the same level.
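The sophomore slump falls out of the same two-component model. In this sketch (player numbers and variances are invented for illustration), each season's performance is stable skill plus season-specific luck, and we track the top 10% of debut seasons into year two:

```python
import random
import statistics

random.seed(0)

# Hypothetical model: season performance = stable skill + season-specific luck.
N = 50_000
skill = [random.gauss(100, 10) for _ in range(N)]
rookie = [s + random.gauss(0, 10) for s in skill]   # debut season
second = [s + random.gauss(0, 10) for s in skill]   # sophomore season

# Select the "breakout rookies": top 10% of debut seasons.
cutoff = sorted(rookie)[int(0.9 * N)]
stars = [i for i in range(N) if rookie[i] >= cutoff]

debut_avg = statistics.mean(rookie[i] for i in stars)
soph_avg = statistics.mean(second[i] for i in stars)
skill_avg = statistics.mean(skill[i] for i in stars)

print(f"breakout rookies, debut average: {debut_avg:.1f}")
print(f"same players, sophomore average: {soph_avg:.1f}")
print(f"their true skill average:        {skill_avg:.1f}")
# The sophomore average falls back toward true skill: the "slump" is the
# debut-season luck failing to repeat, not a change in the players.
```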

Nobel laureate Daniel Kahneman has identified the sports jinx as his favourite example of regression to the mean being misread as a causal narrative. The jinx story is compelling because humans are pattern-recognition machines who instinctively construct causal explanations for sequences of events. Cover feature → decline = the magazine caused the decline. The causal story is vivid and memorable; the statistical truth is abstract and unsatisfying.

The Israeli Air Force and the Kahneman Revelation

Kahneman describes in his book Thinking, Fast and Slow a pivotal moment in his career when he recognised regression to the mean operating in military training. He was teaching Israeli Air Force instructors about reward and punishment in training. An instructor challenged him: the instructors had observed that praising cadets for excellent manoeuvres led to worse performance on the next attempt, while criticising poor performance led to improvement. Didn't this prove that punishment works better than reward?

Kahneman recognised immediately that regression to the mean explained everything. An exceptionally good manoeuvre reflects both skill and a lucky moment. The next attempt will, on average, be less exceptional — regardless of whether the instructor praises or ignores the performance. An exceptionally poor manoeuvre reflects skill level and an unlucky moment. The next attempt will, on average, be less poor — regardless of whether the instructor criticises or ignores it. The instructors had built a confident causal theory (punishment beats reward) on a statistical artefact. Their observations were perfectly accurate; their causal inference was completely wrong.

This example has broad implications for management, parenting, and coaching. Supervisors who react to unusually bad performance with criticism and unusually good performance with praise will observe that criticism seems to "work" and praise seems to "backfire" — because regression to the mean systematically produces improvement after extremes that coincide with criticism, and regression after peaks that coincide with praise. The causal narrative writes itself and is entirely false.
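The instructors' illusion can be reproduced with no feedback mechanism in the model at all. In this sketch (cadet counts and variances are invented), each attempt is skill plus attempt-to-attempt luck, and we simply condition on the worst and best first attempts:

```python
import random
import statistics

random.seed(7)

# Hypothetical model: each manoeuvre = cadet skill + attempt-to-attempt luck.
# No praise, criticism, or learning is modelled anywhere in this code.
cadets = 20_000
skill = [random.gauss(0, 1) for _ in range(cadets)]
attempt1 = [s + random.gauss(0, 1) for s in skill]
attempt2 = [s + random.gauss(0, 1) for s in skill]

ranked = sorted(range(cadets), key=lambda i: attempt1[i])
worst = ranked[: cadets // 10]    # bottom 10% of first attempts
best = ranked[-(cadets // 10):]   # top 10% of first attempts

improve_after_worst = statistics.mean(attempt2[i] - attempt1[i] for i in worst)
change_after_best = statistics.mean(attempt2[i] - attempt1[i] for i in best)

print(f"avg change after a terrible attempt:  {improve_after_worst:+.2f}")
print(f"avg change after a brilliant attempt: {change_after_best:+.2f}")
# Cadets "improve" after bad attempts and "decline" after good ones with no
# instruction whatsoever -- exactly the pattern the instructors attributed
# to punishment working and praise backfiring.
```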

Medicine: Miracle Cures and Natural Recovery

Regression to the mean is a fundamental confound in medical research, particularly in uncontrolled studies and case reports. Patients typically seek medical attention when symptoms are at their worst — when pain is most severe, when depression is deepest, when blood pressure is most elevated. This means the baseline measurement is captured at a statistical extreme. Any subsequent measurement, whether or not any treatment is administered, will tend to be less extreme — simply because the initial extreme was partly random, and random extremes don't persist.

This is why patients who receive fake treatments, alternative remedies, and elaborate rituals often report feeling better: they sought the treatment at the worst point of their illness, the illness naturally improved (regression to the mean plus the natural course of most conditions), and the temporal coincidence of treatment and improvement creates an irresistible causal narrative. The patient genuinely improved; the treatment was not the reason. Distinguishing regression to the mean from genuine treatment effects requires randomised controlled trials with placebo groups — because the placebo group experiences the same regression while receiving no active treatment, providing a baseline against which the treatment can be evaluated.

A particularly important medical application involves treating extreme laboratory values. A single high blood pressure reading is likely partly elevated due to measurement error, anxiety about the test, or a temporarily stressful day. The next measurement will, on average, be lower — simply due to regression — regardless of whether any treatment is prescribed. This is one reason clinical guidelines recommend confirming extreme readings before initiating treatment: the initial extreme is not reliable evidence of the patient's true underlying level.
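The value of a confirmatory reading can be sketched the same way. In this simulation (thresholds, means, and noise levels are illustrative, not clinical parameters), patients are flagged on a single reading and then remeasured with no treatment in between:

```python
import random
import statistics

random.seed(1)

# Hypothetical model: a reading = true systolic BP + visit-to-visit noise.
patients = 50_000
true_bp = [random.gauss(125, 10) for _ in range(patients)]
first = [t + random.gauss(0, 8) for t in true_bp]
repeat = [t + random.gauss(0, 8) for t in true_bp]

# Flag anyone whose FIRST reading crosses an illustrative 140 mmHg threshold.
flagged = [i for i in range(patients) if first[i] >= 140]

first_avg = statistics.mean(first[i] for i in flagged)
repeat_avg = statistics.mean(repeat[i] for i in flagged)
true_avg = statistics.mean(true_bp[i] for i in flagged)

print(f"flagged patients, initial reading: {first_avg:.1f} mmHg")
print(f"same patients, repeat reading:     {repeat_avg:.1f} mmHg")
print(f"their true underlying pressure:    {true_avg:.1f} mmHg")
# The repeat reading drops toward the true level with no treatment at all:
# the single extreme reading overstated how elevated these patients are.
```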

Educational Testing and Policy

Regression to the mean causes systematic errors in educational policy evaluation. Schools or districts that score worst on standardised tests in one year are targeted for special interventions — additional resources, new programmes, leadership changes — and then measured again the following year. They typically improve, and the intervention is credited. But statistically, the worst-performing schools in any year include many schools that had an unusually bad year due to temporary factors (high staff turnover, unusual student intake, flu season, data anomalies). They would have improved the following year even without intervention.

Similarly, schools that score best are given recognition and held up as models — and then in subsequent years often perform less exceptionally, which is misinterpreted as them "resting on their laurels" or the recognition having made them complacent. The actual driver is regression to the mean from an unusually good year.

This creates policy illusions: interventions at low-performing institutions appear to work (regression upward); success at high-performing institutions appears fragile (regression downward). Neither observation is reliable evidence about the interventions or about the institutions. Properly controlled evaluation — comparing schools that received interventions with similar schools that didn't — is the only way to separate regression effects from genuine treatment effects.

The Investor Who Picked the Right Funds

Mutual fund performance follows the familiar pattern: funds that outperform in one period tend, on average, to perform much closer to the market average in the next, and the laggards tend to recover. Investors who select last year's top-performing funds — a common strategy explicitly encouraged by performance tables in financial advertising — systematically earn less than the track record led them to expect, because they are buying at a performance peak that regression will moderate.

This observation is one of the pillars of the argument for passive index investing: the apparent skill that produced last year's exceptional fund performance is partly luck, and luck regresses. The funds that consistently beat the market over 20-year periods are few and nearly impossible to identify in advance; the ones that occasionally appear to beat it for 3–5 years are numerous and predominantly lucky.
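How little a top ranking predicts can be shown with a persistence check. In this sketch (the fund count and the skill-to-luck ratio are invented, though the ratio is deliberately small), we ask what fraction of one year's top-quartile funds repeat the feat the next year:

```python
import random

random.seed(3)

# Hypothetical model: a fund's annual excess return = small persistent skill
# + large year-specific luck (values in %/yr).
funds = 20_000
skill = [random.gauss(0.0, 0.5) for _ in range(funds)]
year1 = [s + random.gauss(0.0, 5.0) for s in skill]
year2 = [s + random.gauss(0.0, 5.0) for s in skill]

q1 = sorted(year1)[3 * funds // 4]   # top-quartile cutoff, year 1
q2 = sorted(year2)[3 * funds // 4]   # top-quartile cutoff, year 2

top_year1 = [i for i in range(funds) if year1[i] >= q1]
stayed_on_top = sum(1 for i in top_year1 if year2[i] >= q2) / len(top_year1)

print(f"top-quartile funds that repeat next year: {stayed_on_top:.0%}")
# With skill this small relative to luck, repeating is barely better than
# the 25% expected from pure chance: last year's winners are mostly lucky.
```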

Recognising Regression in Everyday Life

The practical antidote to regression errors is the question: "How much of this extreme result is likely to be true signal, and how much is likely to be noise that will average out?"

  • Before attributing an improvement after a trough (or a decline after a peak) to any intervention, ask whether natural regression explains it.
  • Before concluding that praise is ineffective or punishment effective, consider whether performance improvement after the low point was inevitable.
  • Before investing in last year's top-performing fund, check whether its track record spans enough independent years to distinguish luck from skill.
  • Before treating a single extreme measurement as defining, request confirmation from an independent second measurement.
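
The signal-versus-noise question has a standard quantitative answer: the best linear prediction of the next measurement shrinks the current one toward the mean by the test-retest correlation. A minimal sketch (the exam numbers are hypothetical):

```python
def expected_next(observed: float, mean: float, r: float) -> float:
    """Best linear prediction of the next measurement.

    r is the correlation between successive measurements: r = 1 means the
    measure is pure signal (no regression); r = 0 means pure noise (full
    regression to the mean).
    """
    return mean + r * (observed - mean)

# An exam with class mean 70 and test-retest correlation 0.6:
print(expected_next(observed=95, mean=70, r=0.6))  # 85.0 -- the star regresses
print(expected_next(observed=40, mean=70, r=0.6))  # 52.0 -- the struggler rebounds
```

The further an observation sits from the mean, and the noisier the measure, the larger the expected pull back toward the mean.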

Regression to the mean interacts with the availability heuristic: dramatic improvements and declines are memorable and vivid, while the statistical gravity pulling extremes toward the mean is invisible. It also feeds false cause reasoning: the temporal coincidence of an intervention and an improvement feels like causation, even when regression would have produced the improvement regardless. Understanding it is closely connected to understanding p-hacking, where the testing of many extreme measurements produces "significant" results that are partly regression artefacts, and to ghost variables, where the uncontrolled initial severity acts as a confounder masquerading as treatment effect.

Sources

  • Galton, F. (1886). "Regression towards mediocrity in hereditary stature." Journal of the Anthropological Institute of Great Britain and Ireland, 15, 246–263.
  • Kahneman, D. (2011). Thinking, Fast and Slow. Farrar, Straus and Giroux. Chapter 17: "Regression to the Mean."
  • Barnett, A. G., van der Pols, J. C., & Dobson, A. J. (2004). "Regression to the mean: What it is and how to deal with it." International Journal of Epidemiology, 34(1), 215–220.
  • Senn, S. (2011). "Francis Galton and regression to the mean." Significance, 8(3), 124–126.
  • McDonald, J. H. (2014). "Regression to the mean." Handbook of Biological Statistics. Sparky House Publishing.
