garden_of_forking_paths
The Garden of Forking Paths describes how researchers, even without malicious intent, make numerous small analytical decisions (how to define variables, which outliers to exclude, which covariates to include, when to stop collecting data) that collectively inflate the false positive rate. Unlike p-hacking (deliberate fishing for significance), this can happen unconsciously: each individual decision seems reasonable, but the cumulative effect of many 'reasonable' choices made while looking at the data dramatically increases the chance of finding a spurious result. The term was coined by Andrew Gelman and Eric Loken (2013), after the Jorge Luis Borges short story of the same name.
A psychology study finds that listening to classical music before a test improves scores by 15%. The finding depends on: which music tracks were played (three were tried), how 'improvement' was defined (two definitions were considered), which participants were excluded as outliers (three exclusion rules were tested), and which covariate was included (age was added after seeing the initial results). With different but equally defensible choices, the effect disappears.
A replication team follows the original paper's reported methods exactly but still finds no effect. The original researchers made dozens of small decisions (data collection timing, exact wording of instructions, analysis software) that shaped the result but weren't reported. The forking paths are invisible in the published article.
An economics paper finds that a minimum wage increase had no effect on employment in treated counties. The result holds with their chosen control counties, their chosen time window, and their model specification. Three independent teams reanalysing the same public data with different (equally valid) analytical choices find effects ranging from −8% to +4%.
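To make the inflation concrete, the classical-music example can be turned into a small simulation. Counting the covariate as an in-or-out choice, that example implies 3 × 2 × 3 × 2 = 36 candidate analyses. The sketch below is only an illustration under stated assumptions: it drops the covariate fork for brevity (leaving 18 paths), generates data in which music has no effect at all, uses a normal approximation for the p-value, and treats the worst case in which whichever path reaches significance gets reported. All names, sample sizes, and cutoffs are hypothetical and are not taken from any real study.

```python
import itertools
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()
OUTCOMES = ("post", "gain")   # two definitions of "improvement" (assumed)
CUTOFFS = (None, 2.0, 2.5)    # three outlier rules in SD units; None keeps everyone


def mean_sd(xs):
    """Sample mean and sample standard deviation."""
    m = sum(xs) / len(xs)
    var = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, var ** 0.5


def trim(xs, k):
    """Drop values more than k standard deviations from the mean."""
    if k is None:
        return xs
    m, s = mean_sd(xs)
    return [x for x in xs if abs(x - m) <= k * s]


def p_value(a, b):
    """Two-sided p-value for a difference in means (normal approximation)."""
    (ma, sa), (mb, sb) = mean_sd(a), mean_sd(b)
    se = (sa ** 2 / len(a) + sb ** 2 / len(b)) ** 0.5
    return 2 * (1 - STD_NORMAL.cdf(abs(ma - mb) / se))


def simulate_group(rng, n):
    """Scores for one group under the null: the group label changes nothing."""
    pre = [rng.gauss(0, 1) for _ in range(n)]
    post = [0.5 * p + rng.gauss(0, 1) for p in pre]
    return {"post": post, "gain": [b - a for a, b in zip(pre, post)]}


def false_positive_rates(n_sims=1000, n=40, alpha=0.05, seed=0):
    """Return (rate for one fixed path, rate when any of the 18 paths may be reported)."""
    rng = random.Random(seed)
    single = forked = 0
    for _ in range(n_sims):
        control = simulate_group(rng, n)
        tracks = [simulate_group(rng, n) for _ in range(3)]  # three music tracks
        pvals = [
            p_value(trim(t[o], k), trim(control[o], k))
            for t, o, k in itertools.product(tracks, OUTCOMES, CUTOFFS)
        ]
        single += pvals[0] < alpha    # path fixed in advance: track 1, raw score, no exclusions
        forked += min(pvals) < alpha  # best-looking path chosen after seeing the data
    return single / n_sims, forked / n_sims


if __name__ == "__main__":
    single, forked = false_positive_rates()
    print(f"fixed path: {single:.3f}   any of 18 paths: {forked:.3f}")
```

Under these assumptions the fixed-path rate should stay near the nominal 5%, while the any-path rate comes out noticeably higher, even though every individual fork is a defensible choice. Real forking paths are subtler, because the analyst usually walks only one path and never sees the others, so the simulation is best read as an upper-bound illustration rather than a model of any particular study.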
Binary (yes/no) questions an LLM must answer to identify this aspect: