Berkson's Paradox (Collider Bias)

Also Known As: collider bias, Berkson's bias, selection-distortion effect, explain-away effect

Statistical Error ID: berksons_paradox

Definition

Berkson's Paradox occurs when conditioning on a shared consequence (a collider variable) of two independent causes creates a spurious correlation between those causes, negative in the classic case. When a sample is selected on a criterion that both variables influence, the selection itself introduces a relationship that does not exist in the general population. It is the mirror image of confounding: in confounding, an unadjusted upstream common cause produces a spurious association; here, conditioning on a downstream common effect produces one.
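The mechanism can be sketched with a small simulation. This is an illustrative sketch, not a prescribed method: the two traits, the selection rule (sum of the traits above a threshold), and the names simulate and pearson are all assumptions made for the example.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=20_000, threshold=1.0, seed=0):
    """Return (population correlation, correlation within the selected sample)."""
    rng = random.Random(seed)
    # Two causes drawn independently, so their true correlation is zero.
    a = [rng.gauss(0, 1) for _ in range(n)]
    b = [rng.gauss(0, 1) for _ in range(n)]
    # Hypothetical collider: a unit is selected when the sum of its causes is large.
    selected = [(x, y) for x, y in zip(a, b) if x + y > threshold]
    sel_a = [x for x, _ in selected]
    sel_b = [y for _, y in selected]
    return pearson(a, b), pearson(sel_a, sel_b)


population_r, selected_r = simulate()
print(f"population r = {population_r:+.3f}")
print(f"selected   r = {selected_r:+.3f}")
```

Under these assumptions the population correlation should sit near zero while the correlation inside the selected sample is clearly negative, even though nothing links the two traits causally.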

Examples

Among hospitalized patients, diabetes and bone fractures appear negatively correlated. This does not mean diabetes prevents fractures. Either condition alone is enough to put someone in the hospital, so a hospitalized patient without diabetes must have some other reason for admission, such as a fracture. Selecting only hospitalized people creates the illusion that two independent conditions are inversely related.
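The hospital example can be worked through exactly. The sketch below assumes made-up prevalences (10% diabetes, 5% fracture), assumes the two conditions are independent in the general population, and uses a deliberately simplified admission rule (admitted if diabetes or fracture); none of these figures come from real data.

```python
from fractions import Fraction
from itertools import product

# Assumed prevalences, chosen only for illustration.
P_DIABETES = Fraction(1, 10)
P_FRACTURE = Fraction(1, 20)


def joint(diabetes, fracture):
    """Joint probability under independence in the general population."""
    p_d = P_DIABETES if diabetes else 1 - P_DIABETES
    p_f = P_FRACTURE if fracture else 1 - P_FRACTURE
    return p_d * p_f


def fracture_rate(diabetes, hospitalized_only):
    """P(fracture | diabetes status), optionally restricted to hospital patients."""
    num = den = Fraction(0)
    for d, f in product([True, False], repeat=2):
        if d != diabetes:
            continue
        # Simplified admission rule: admitted if diabetes OR fracture.
        if hospitalized_only and not (d or f):
            continue
        den += joint(d, f)
        if f:
            num += joint(d, f)
    return num / den


for in_hospital in (False, True):
    label = "hospital" if in_hospital else "population"
    print(label,
          "| with diabetes:", fracture_rate(True, in_hospital),
          "| without diabetes:", fracture_rate(False, in_hospital))
```

In the general population the fracture rate is 1/20 with or without diabetes. Inside the hospital it stays 1/20 for diabetic patients but becomes 1 for non-diabetic patients, because under this rule a fracture is the only way a non-diabetic person gets admitted.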

A talent agency notices that among its signed actors, those who are highly attractive tend to be less talented, and vice versa. This seems to suggest that looks and talent are negatively correlated. In reality, both traits independently increase the chance of being signed, so an actor signed despite modest looks is likely to be unusually talented, and vice versa, which creates the illusion of a tradeoff.

A dating app analyst observes that among users who receive many matches, kindness and physical attractiveness appear negatively correlated. In the general population, the two traits are unrelated — but getting many matches requires being high on at least one dimension, distorting the apparent relationship.

Verification Steps

Binary (yes/no) questions an LLM must answer to identify this aspect:

1. Is the sample selected based on a criterion that both variables can influence? (Type: binary)
2. Could the observed correlation be an artifact of the selection process? (Type: binary)
3. Does the relationship hold in the general population, not just the selected subgroup? (Type: binary)
4. Is a common effect (collider) being conditioned on, creating a spurious association? (Type: binary)

Description

Why It Works

Selection bias is invisible to someone analyzing only the selected sample. The data genuinely shows the negative correlation within the sample; the problem lies in the sampling process itself.

How to Counter

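Identify whether your sample was selected based on a variable that could be a collider. Draw a causal diagram and check whether the selection criterion is a common effect of both variables, or a descendant of such an effect; conditioning on it can create a spurious association. Where possible, compare the association in the selected sample with the association in the unselected population.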
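The diagram check can be roughed out in code. The sketch below is a screening heuristic only, not a full d-separation test: it asks whether the selection variable lies downstream of both variables in a hand-written causal graph. The graph, the node names, and the helper names descendants and is_common_descendant are hypothetical.

```python
def descendants(graph, node):
    """All nodes reachable from `node` by following directed edges."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for child in graph.get(current, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def is_common_descendant(graph, x, y, selected_on):
    """True if `selected_on` is downstream of both x and y in the graph."""
    return selected_on in descendants(graph, x) & descendants(graph, y)


# Hypothetical causal diagram: each key points to its direct effects.
graph = {
    "diabetes": ["hospitalized"],
    "fracture": ["hospitalized"],
    "hospitalized": ["in_study"],
    "age": ["diabetes"],
}

print(is_common_descendant(graph, "diabetes", "fracture", "in_study"))
print(is_common_descendant(graph, "diabetes", "fracture", "age"))
```

A True result flags a sample worth scrutinizing; it does not by itself prove that the observed association is spurious, since the two variables may also be genuinely related.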

Real-World Context

Berkson's Paradox appears in hospital-based epidemiological studies, dating pools (the 'attractiveness vs. niceness' tradeoff perceived in available partners), and university admissions studies.