Simpson's Paradox: When the Numbers Tell Two Opposite Truths
The Plot Twist
Two hospitals. You need surgery. Which one do you choose?
Hospital A: 900 patients survived out of 1,000. That's a 90% survival rate.
Hospital B: 800 patients survived out of 1,000. That's an 80% survival rate.
Easy choice, right? Hospital A is clearly better.
Now someone shows you the breakdown by patient condition:
| Patient condition | Hospital A | Hospital B |
|---|---|---|
| Healthy patients | 870/900 = 96.7% survival | 245/250 = 98% survival |
| Serious cases | 30/100 = 30% survival | 555/750 = 74% survival |
Wait.
Hospital B has a higher survival rate for both healthy patients and serious cases — but Hospital A has a better overall survival rate?
That's not a typo. That's Simpson's Paradox.
And it means the same numbers, depending on how you look at them, can point to completely opposite conclusions.
What Just Happened?
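The trick is in what statisticians call aggregation. A hospital's overall rate is a weighted average of its subgroup rates, weighted by how many patients fall into each subgroup. So when you combine groups, the mix of who's in each group can flip the result.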
Hospital A mostly treats healthy patients — 900 of their 1,000 cases are low-risk. Easy wins. Good overall survival rate.
Hospital B mostly treats serious cases — 750 of their 1,000 cases are high-risk. They're harder to save, which drags down the overall number. But for each type of patient, they're actually doing better.
If you're seriously ill, your chance of surviving at Hospital B is about 2.5 times your chance at Hospital A (74% versus 30%). But if you just look at the headline number, you'd walk into Hospital A.
The numbers aren't wrong. They're just answering different questions. And if you don't ask the right question, the answer will mislead you.
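To see the arithmetic for yourself, here is a minimal Python sketch that recomputes every rate from the counts in the table. The variable and function names (`hospitals`, `rate`, `overall_rate`) are made up for this illustration; only the patient counts come from the example above.

```python
# (survived, patients) per condition, copied from the breakdown table.
hospitals = {
    "A": {"healthy": (870, 900), "serious": (30, 100)},
    "B": {"healthy": (245, 250), "serious": (555, 750)},
}

def rate(survived, patients):
    """Survival rate for a single group of patients."""
    return survived / patients

def overall_rate(groups):
    """Pooled survival rate: all survivors divided by all patients."""
    survived = sum(s for s, _ in groups.values())
    patients = sum(n for _, n in groups.values())
    return survived / patients

for name, groups in hospitals.items():
    parts = ", ".join(
        f"{label}: {rate(s, n):.1%}" for label, (s, n) in groups.items()
    )
    print(f"Hospital {name}: {parts}, overall: {overall_rate(groups):.1%}")
```

Hospital B wins on both rows, yet Hospital A wins on the pooled number, because 90% of A's patients sit in the easy row while 75% of B's sit in the hard one.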
Real-World Cases (Where This Actually Happened)
UC Berkeley admissions and gender bias (1973)
The university's overall graduate admission rates seemed to show bias against women: 44% of male applicants were admitted, but only 35% of female applicants.
The gap looked like grounds for a discrimination lawsuit.
Then researchers broke the numbers down by department. Most departments showed no significant bias against women, and several admitted women at higher rates than men. The overall numbers looked biased because women applied disproportionately to departments with low acceptance rates for everyone (English, for example), while men applied more often to departments that admitted a larger share of their applicants (engineering, for example).
The aggregate data showed apparent discrimination. The disaggregated data showed the opposite.
COVID death rates by age (2020)
Early in the pandemic, Italy's reported case fatality rate (deaths per confirmed case) was noticeably higher than China's. A head-scratcher, until you looked at the age breakdown. Italy has one of the oldest populations in the world, a larger share of its confirmed cases were elderly, and COVID kills older people at much higher rates. When you compare age group by age group, the rates were similar, and in one widely cited analysis they were actually lower in Italy. The overall difference was Simpson's Paradox driven by the age mix of the cases.
Batting averages in baseball
David Justice had a higher batting average than Derek Jeter in both 1995 and 1996 — but Jeter had a better combined average across both years. This is a famous Simpson's Paradox example from real sports data. (Look it up: the math checks out.)
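If you'd rather check than take it on faith, the sketch below recomputes the averages. The hits and at-bats are the figures usually quoted for this example; they are not given in this article, so treat them as an assumption and confirm them against an official record before relying on them. The names `seasons`, `average`, and `combined` are invented for the illustration.

```python
# (hits, at_bats) per season. Commonly quoted figures, assumed here, not verified.
seasons = {
    "Derek Jeter": {1995: (12, 48), 1996: (183, 582)},
    "David Justice": {1995: (104, 411), 1996: (45, 140)},
}

def average(hits, at_bats):
    """Batting average for one season."""
    return hits / at_bats

def combined(player):
    """Batting average over both seasons pooled together."""
    hits = sum(h for h, _ in seasons[player].values())
    at_bats = sum(ab for _, ab in seasons[player].values())
    return hits / at_bats

for player, years in seasons.items():
    per_year = ", ".join(
        f"{year}: {average(h, ab):.3f}" for year, (h, ab) in years.items()
    )
    print(f"{player}: {per_year}, combined: {combined(player):.3f}")
```

On these figures the mechanism is the same as with the hospitals: almost all of Jeter's at-bats fall in 1996, the year both players hit well, while most of Justice's fall in 1995, the year both hit worse.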
Why Your Brain Struggles With This
Humans are bad at holding multiple comparisons in mind at once. We love a single number — a rank, a percentage, a score — because it feels like the truth.
But "which hospital is better?" isn't a question with one answer. It depends on:
- What kind of patient are you?
- What are you measuring?
- What are you comparing?
Simpson's Paradox shows that collapsing a complex situation into one number can actively reverse the truth.
It's not that the data lied. It's that the summary lost something critical.
How to Protect Yourself
When you see an overall statistic that surprises you:
1. Ask: What groups make up the "overall" number?
Are there subgroups that might behave differently? Break it down.
2. Ask: Who's in each group?
Are the groups being compared actually comparable? Pooling groups with very different compositions (high-risk and low-risk patients, competitive and easy departments) can flip results.
3. Ask: What's the question I'm actually trying to answer?
"Which hospital is better overall?" and "Which hospital gives me the best chance?" can have opposite answers.
4. Be suspicious of aggregates in competitive contexts.
Rankings, league tables, averages across very different groups — these are Simpson's Paradox breeding grounds.
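The first two questions can be turned into a quick check. The sketch below is one possible way to do it, not a standard recipe from this article: `is_reversal` tests whether one option wins in every subgroup while losing overall, and `standardized_rate` applies direct standardization (a technique brought in here for illustration), which recomputes each option's rate as if both had the same mix of cases. All function names are hypothetical, and the choice of reference mix (both hospitals' patients pooled) is an assumption. The counts are the hospital figures from the opening example.

```python
def rate(successes, total):
    """Success rate for one subgroup."""
    return successes / total

def overall_rate(groups):
    """Pooled rate across all subgroups of one option."""
    return sum(s for s, _ in groups.values()) / sum(n for _, n in groups.values())

def is_reversal(a, b):
    """True if b beats a in every subgroup while a still wins overall."""
    b_wins_every_group = all(rate(*b[g]) > rate(*a[g]) for g in a)
    return b_wins_every_group and overall_rate(a) > overall_rate(b)

def standardized_rate(groups, reference_mix):
    """Rate an option would show if its subgroup shares matched reference_mix."""
    return sum(share * rate(*groups[g]) for g, share in reference_mix.items())

# (survived, patients) per condition, from the hospital example.
hospital_a = {"healthy": (870, 900), "serious": (30, 100)}
hospital_b = {"healthy": (245, 250), "serious": (555, 750)}

# Reference mix: the patients of both hospitals pooled (one possible choice).
total_patients = sum(n for h in (hospital_a, hospital_b) for _, n in h.values())
reference_mix = {
    g: (hospital_a[g][1] + hospital_b[g][1]) / total_patients for g in hospital_a
}

print("Reversal present:", is_reversal(hospital_a, hospital_b))
print(f"A on common mix: {standardized_rate(hospital_a, reference_mix):.1%}")
print(f"B on common mix: {standardized_rate(hospital_b, reference_mix):.1%}")
```

With these figures the check reports a reversal, and once both hospitals are scored on the same patient mix, Hospital B comes out ahead (roughly 88% versus 68%). A different reference mix would give different numbers, which is the point of question 3: the answer depends on whose chances you are asking about.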
The Challenge
Find (or invent) a scenario where something looks "better" within every individual group but "worse" once the groups are combined.
Start here: A tutoring program claims it improves grades. School A ran the program and saw overall scores rise. School B ran it too and saw overall scores rise. But statewide, schools that used the program had lower average scores than schools that didn't.
How is this possible? (Hint: think about which schools chose to run the program.)
Write your explanation. If you can explain Simpson's Paradox in plain language to someone who's never heard of it — you actually understand it. Most adults don't.
Numbers don't lie. But they can tell you a truth that's completely useless — or worse, dangerously backwards — if you don't know what question you're asking.