Collider Bias

Also Known As: Endogenous Selection Bias, Berkson's Bias (specific case)

Discourse Mechanics ID: collider_bias

Definition

A statistical error in which conditioning on a variable (by selecting on it, stratifying by it, or adjusting for it) that is causally affected by two other variables creates a spurious association between those two variables. In a causal diagram, a collider is a variable where two causal arrows converge; conditioning on it opens a non-causal path between its causes.

Examples

Among hospitalized patients, two diseases that are independent in the general population appear negatively correlated, because having either disease is sufficient for hospitalization (the collider).

A study of professional athletes finds a puzzling negative correlation between raw strength and cardiovascular endurance. In the general population the two traits are unrelated, but because both independently increase the chance of making it to elite sport (the collider), conditioning on being a professional athlete creates a spurious trade-off.

Researchers studying successful tech startups find that companies with charismatic founders tend to have weaker initial products. In the broader startup population, charisma and product quality are unrelated — but investors fund startups that have at least one of the two, so among funded companies (the collider), the two traits appear negatively correlated.
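The startup example can be reproduced with a small simulation. This is a minimal sketch, not part of the original entry: the standard-normal trait distributions, the funding threshold of 1.0, and the names `pearson` and `simulate` are all illustrative assumptions.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n=20000, threshold=1.0, seed=0):
    """Return (correlation among all startups, correlation among funded ones)."""
    rng = random.Random(seed)
    # Assumption: charisma and product quality are independent by construction.
    charisma = [rng.gauss(0, 1) for _ in range(n)]
    quality = [rng.gauss(0, 1) for _ in range(n)]
    # Funding is the collider: a startup is funded if EITHER trait is high.
    funded = [(c, q) for c, q in zip(charisma, quality)
              if c > threshold or q > threshold]
    r_all = pearson(charisma, quality)
    r_funded = pearson([c for c, _ in funded], [q for _, q in funded])
    return r_all, r_funded


r_all, r_funded = simulate()
print(f"all startups:    r = {r_all:+.3f}")
print(f"funded startups: r = {r_funded:+.3f}")
```

Under these assumptions the correlation across all startups should sit near zero, while the correlation among funded startups should come out clearly negative, even though no trade-off was built into the data.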

Verification Steps

Binary (yes/no) questions an LLM must answer to identify this aspect:

1. Is a statistical relationship between two variables being analyzed?
   Type: binary

2. Is the analysis conditioned on a third variable that is causally influenced by both?
   Type: binary

3. Does conditioning on this collider variable create a spurious association between the two variables of interest?
   Type: binary

4. Would the association disappear or reverse without conditioning on the collider?
   Type: binary

Description

Why It Works

Conditioning on a common effect creates a mathematical dependency between its causes, even when they are truly independent. This is counterintuitive because controlling for variables is usually seen as beneficial.
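The dependency can be checked exactly in the smallest possible case. The sketch below assumes two independent fair binary causes A and B and a collider C that equals "A or B"; this setup and the helper name `prob` are illustrative assumptions, not taken from the entry.

```python
from fractions import Fraction
from itertools import product


def prob(event, given=lambda a, b, c: True):
    """Exact P(event | given) for independent fair bits A, B and C = A or B."""
    weight = Fraction(1, 4)  # each (a, b) pair has probability 1/2 * 1/2
    num = den = Fraction(0)
    for a, b in product((0, 1), repeat=2):
        c = int(a or b)  # the collider is a common effect of A and B
        if given(a, b, c):
            den += weight
            if event(a, b, c):
                num += weight
    return num / den


def a_is_one(a, b, c):
    return a == 1


# Without conditioning on C, knowing B says nothing about A.
p_a = prob(a_is_one)                                        # 1/2
p_a_given_b = prob(a_is_one, lambda a, b, c: b == 1)        # 1/2

# Within the stratum C = 1, knowing B changes the probability of A.
p_a_given_c = prob(a_is_one, lambda a, b, c: c == 1)                  # 2/3
p_a_given_c_b1 = prob(a_is_one, lambda a, b, c: c == 1 and b == 1)    # 1/2
p_a_given_c_b0 = prob(a_is_one, lambda a, b, c: c == 1 and b == 0)    # 1

print(p_a, p_a_given_b, p_a_given_c, p_a_given_c_b1, p_a_given_c_b0)
```

Once C is known to be 1, learning that B is absent forces A to be present, and learning that B is present "explains away" C and lowers the probability of A. That is the dependency conditioning introduces.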

How to Counter

Draw the causal diagram before deciding which variables to control for. Never condition on a common effect of the exposure and the outcome, or on any descendant of such a common effect, without understanding the causal structure.
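Once the diagram is written down, part of this check can be automated. The sketch below is a hedged illustration only: the graph is a hypothetical version of the startup example (including the invented nodes `press_coverage` and `founder_age`), and `descendants` and `risky_controls` are made-up helper names. It flags candidate control variables that lie downstream of both the exposure and the outcome; it is not a full adjustment-set algorithm.

```python
def descendants(graph, node):
    """Return every node reachable from `node` by following causal arrows."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for child in graph.get(current, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def risky_controls(graph, exposure, outcome, candidates):
    """List candidates that are descendants of both exposure and outcome."""
    shared = descendants(graph, exposure) & descendants(graph, outcome)
    return [v for v in candidates if v in shared]


# Hypothetical causal diagram: each key points to the variables it causes.
graph = {
    "founder_age": ["charisma"],
    "charisma": ["funded"],
    "product_quality": ["funded"],
    "funded": ["press_coverage"],
}

flagged = risky_controls(
    graph,
    exposure="charisma",
    outcome="product_quality",
    candidates=["founder_age", "funded", "press_coverage"],
)
print(flagged)  # ['funded', 'press_coverage']
```

Here `funded` is the collider and `press_coverage` is its descendant, so both are flagged, while `founder_age` is upstream of the exposure and is not.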