Garden of Forking Paths

Also Known As: Researcher Degrees of Freedom, Analytical Flexibility

Statistical Error ID: garden_of_forking_paths

Definition

The Garden of Forking Paths describes how researchers, even without malicious intent, make numerous small analytical decisions (how to define variables, which outliers to exclude, which covariates to include, when to stop collecting data) that collectively inflate the false positive rate. Unlike p-hacking (deliberate fishing for significance), this can happen unconsciously: each individual decision seems reasonable, but because the choices are made while looking at the data, their cumulative effect can push the chance of a spurious result well above the nominal significance level. The term was coined by Andrew Gelman and Eric Loken (2013), after the Jorge Luis Borges short story of the same name.
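
The inflation described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a procedure taken from Gelman and Loken: there is no true effect, the analyst has two hypothetical outcome definitions and three hypothetical outlier rules, and a study counts as "significant" if any of the six resulting analyses reaches p < 0.05. The function names, cutoffs, and sample sizes are all illustrative, and the p-value uses a normal approximation rather than an exact t-test.

```python
import math
import random

def mean_var(xs):
    """Return the sample mean and unbiased sample variance of xs."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, v

def two_sided_p(a, b):
    """Two-sided p-value for a difference in means (normal approximation)."""
    mean_a, var_a = mean_var(a)
    mean_b, var_b = mean_var(b)
    z = (mean_a - mean_b) / math.sqrt(var_a / len(a) + var_b / len(b))
    return math.erfc(abs(z) / math.sqrt(2))

def trim(xs, cutoff):
    """Drop values farther than `cutoff` from zero; cutoff=None keeps everything."""
    if cutoff is None:
        return xs
    return [x for x in xs if abs(x) <= cutoff]

def simulate(n_studies=2000, n=50, alpha=0.05, seed=0):
    """Compare one pre-specified analysis with 'best of six' forking analyses."""
    rng = random.Random(seed)
    cutoffs = [None, 2.5, 2.0]  # three hypothetical exclusion rules
    n_outcomes = 2              # two hypothetical outcome definitions
    prespecified_hits = 0
    forking_hits = 0
    for _ in range(n_studies):
        # No true effect: both groups come from the same distribution.
        data = [
            ([rng.gauss(0, 1) for _ in range(n)],
             [rng.gauss(0, 1) for _ in range(n)])
            for _ in range(n_outcomes)
        ]
        p_values = [
            two_sided_p(trim(treated, c), trim(control, c))
            for treated, control in data
            for c in cutoffs
        ]
        # Pre-specified path: first outcome, no exclusions.
        if p_values[0] < alpha:
            prespecified_hits += 1
        # Forking path: report whichever analysis "worked".
        if min(p_values) < alpha:
            forking_hits += 1
    return prespecified_hits / n_studies, forking_hits / n_studies
```

Calling `simulate()` returns two false positive rates: the first for the single pre-specified analysis, the second for the analyst who is free to pick among all six. The second can never be lower than the first, because the pre-specified analysis is one of the six paths.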

Examples

A psychology study finds that listening to classical music before a test improves scores by 15%. The finding depends on: which music tracks were played (three were tried), how 'improvement' was defined (two definitions were considered), which participants were excluded as outliers (three exclusion rules were tested), and which covariate was included (age was added after seeing the initial results). With different but equally defensible choices, the effect disappears.
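
The number of paths in this example can be counted directly. The sketch below assumes the covariate choice has two options (with and without age) and uses placeholder labels for the tracks, definitions, and exclusion rules, since the example does not name them. The final figure treats the paths as independent tests, which they are not; analyses of the same data are usually correlated, so the real probability is typically lower, but the count still shows how quickly reasonable choices multiply.

```python
from itertools import product

# Placeholder labels; only the number of options per fork comes from the example.
forks = {
    "music track": ["track A", "track B", "track C"],
    "improvement definition": ["definition 1", "definition 2"],
    "exclusion rule": ["rule 1", "rule 2", "rule 3"],
    "covariate": ["without age", "with age"],
}

# Every combination of choices is one possible analysis of the same data.
paths = list(product(*forks.values()))
n_paths = len(paths)

alpha = 0.05
# Chance that at least one path is significant with no true effect,
# under the simplifying assumption that the paths are independent.
p_any_if_independent = 1 - (1 - alpha) ** n_paths

print(f"{n_paths} possible analyses")
print(f"P(at least one p < {alpha}) if independent: {p_any_if_independent:.3f}")
```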

A replication team follows the original paper's reported methods exactly but still finds no effect. The original researchers made dozens of small decisions (data collection timing, exact wording of instructions, analysis software) that shaped the result but weren't reported. The forking paths are invisible in the published article.

An economics paper finds that a minimum wage increase had no effect on employment in treated counties. The result holds with their chosen control counties, their chosen time window, and their model specification. Three independent teams reanalysing the same public data with different (equally valid) analytical choices find effects ranging from −8% to +4%.
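
One way to make this spread visible is to run every defensible specification and report the full range of estimates, an approach often called multiverse or specification-curve analysis. The sketch below is an illustration only: the panel is synthetic with no built-in policy effect, the county groups ("treated", "neighbor", "distant"), the window lengths, and the simple difference-in-differences estimator are all assumptions made for the example, and none of them come from the paper or the reanalyses described above.

```python
import itertools
import random

def make_panel(seed=1):
    """Synthetic county-year employment data with no built-in treatment effect."""
    rng = random.Random(seed)
    panel = []
    for county in range(40):
        if county < 10:
            group = "treated"
        elif county < 20:
            group = "neighbor"
        else:
            group = "distant"
        base = 100 + rng.gauss(0, 5)   # county-specific level
        trend = rng.gauss(0, 0.5)      # county-specific yearly drift
        # Years -4..-1 are before the policy change, years 0..3 after it.
        employment = {year: base + trend * year + rng.gauss(0, 2)
                      for year in range(-4, 4)}
        panel.append({"group": group, "employment": employment})
    return panel

def mean(xs):
    return sum(xs) / len(xs)

def did_estimate(panel, control_groups, window):
    """Difference-in-differences: treated change minus control change."""
    def change(groups):
        pre, post = [], []
        for county in panel:
            if county["group"] not in groups:
                continue
            for year, value in county["employment"].items():
                if -window <= year < 0:
                    pre.append(value)
                elif 0 <= year < window:
                    post.append(value)
        return mean(post) - mean(pre)
    return change({"treated"}) - change(control_groups)

def multiverse(panel):
    """Run every combination of control group and time window; sort by estimate."""
    control_choices = {
        "neighbors only": {"neighbor"},
        "distant only": {"distant"},
        "all untreated": {"neighbor", "distant"},
    }
    window_choices = [2, 3, 4]
    results = []
    for (label, groups), window in itertools.product(control_choices.items(),
                                                     window_choices):
        results.append((did_estimate(panel, groups, window), label, window))
    return sorted(results)
```

Printing the output of `multiverse(make_panel())` lists nine estimates from lowest to highest. Reporting that whole list, rather than the single specification the authors happened to choose, lets readers judge how much the conclusion depends on the path taken.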

Verification Steps

Binary (yes/no) questions an LLM must answer to identify this aspect: