

Mar 29, 2026 · 8 min read

Publication Bias: The Silent Graveyard of Null Findings

In 2008, Erick Turner, a former FDA reviewer, and colleagues published a landmark paper in the New England Journal of Medicine. They had obtained something almost no one outside regulatory agencies ever sees: every clinical trial submitted to the FDA in support of 12 approved antidepressants — published and unpublished alike. The contrast was stark. Of 74 registered trials, 38 showed positive results; 37 of those were published. Of the 36 trials with negative or questionable results, 22 were simply not published, 11 were published in a way that conveyed a positive outcome, and only 3 were published as negative. The published literature therefore showed 94% of trials with positive results. The complete registered record showed 51%. Publication bias had turned a coin flip into an apparent landslide.
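The arithmetic behind those two headline percentages is worth spelling out. A quick check in Python, using only the counts quoted above (the three negative trials published as negative follow from 36 − 22 − 11):

```python
# Counts reported by Turner et al. (2008), as quoted above.
registered = 74
positive = 38                  # positive per the FDA's own review
positive_published = 37
negative = 36                  # negative or questionable per the FDA
negative_unpublished = 22
negative_spun = 11             # published as if positive
negative_published_as_such = negative - negative_unpublished - negative_spun  # 3

# What a reader of the journals sees:
published = positive_published + negative_spun + negative_published_as_such   # 51
apparently_positive = positive_published + negative_spun                      # 48
print(f"Published literature: {apparently_positive / published:.0%} positive")  # → 94%

# What the complete FDA record shows:
print(f"Registered evidence:  {positive / registered:.0%} positive")            # → 51%
```

Note that the 94% figure counts the 11 "spun" trials as positive, because that is how they read to anyone consulting the journals rather than the FDA record.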

What Is Publication Bias?

Publication bias is the systematic tendency for studies that produce statistically significant, positive, or theoretically interesting results to be published at higher rates than studies that find null, negative, or inconclusive results. It operates through multiple mechanisms: researchers choose not to write up null findings; journals prefer novel positive results over replications or failures to find effects; peer reviewers unconsciously hold null results to higher standards; and editors exercise selection pressure toward findings that will attract citations.

The "file drawer problem" — a term coined by Robert Rosenthal in 1979 — describes the consequence: a hidden population of null findings sitting in researchers' filing cabinets (or, today, on hard drives), never submitted, never reviewed, never counted. The published literature is the visible tip of an iceberg whose submerged bulk is invisible. When anyone — clinicians, policymakers, researchers conducting meta-analyses — reads the literature, they are reading a systematically biased sample.

The Mechanism: How Bias Enters the System

Publication bias does not require bad actors. It emerges from the aggregated incentive structures of academic science:

Researchers. Academic careers are built on publications, and publications require significant findings. A null result requires the same effort to collect as a positive result but produces far less career benefit. Researchers rationally allocate writing effort toward findings that will be accepted. Null results are rarely written up, and when they are, they often languish in submission loops.

Journals. Scientific journals face competitive pressures to publish findings that will be cited, discussed, and generate readership. Significant positive findings attract citations; null results rarely do. Journal impact factors — and the career incentives that flow from them — create selection pressure toward the novel and the positive.

Peer review. Studies have found that reviewers apply inconsistent standards to null versus positive findings, often demanding more stringent methodology from null-result papers. This is not always conscious bias; reviewers may genuinely find it harder to believe that an intervention has no effect than to believe it does.

Funders. Industry funding in medical research creates explicit incentives for positive outcomes. Pharmaceutical companies that fund clinical trials have commercial interests in positive findings. While data falsification is illegal and rare, selective submission and spin in the interpretation of results are legal and well-documented. In the Turner et al. antidepressant analysis, all of the FDA-registered trials were sponsor-submitted, and the publication asymmetry ran through the entire set.

Consequences: Inflated Effect Sizes and the Replication Crisis

When you conduct a meta-analysis — combining results from multiple studies to estimate an overall effect — publication bias systematically inflates your estimate. If you are averaging over published studies, you are averaging over a selected sample in which positive findings are over-represented. The pooled effect is larger than the true population effect.
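The inflation is easy to demonstrate by simulation. The sketch below uses made-up parameters (a true standardized effect of 0.2, two arms of 30 participants each): it filters 2,000 simulated studies through a crude "positive and significant" publication rule and compares the mean effect among the survivors with the mean across all studies.

```python
import numpy as np

rng = np.random.default_rng(0)
true_effect, n_per_arm, n_studies = 0.2, 30, 2000

estimates = np.empty(n_studies)
significant = np.empty(n_studies, dtype=bool)
for i in range(n_studies):
    treat = rng.normal(true_effect, 1.0, n_per_arm)   # treatment arm outcomes
    ctrl = rng.normal(0.0, 1.0, n_per_arm)            # control arm outcomes
    d = treat.mean() - ctrl.mean()                    # observed effect estimate
    se = np.sqrt(2 / n_per_arm)                       # SE of the mean difference
    estimates[i] = d
    significant[i] = d / se > 1.96                    # the publication filter

print(f"Mean effect, all studies:         {estimates.mean():+.2f}")
print(f"Mean effect, 'published' studies: {estimates[significant].mean():+.2f}")
```

With these parameters the filter roughly triples the apparent effect: only estimates exceeding about 1.96 × SE ≈ 0.51 survive, so the "published" mean cannot fall below the very threshold that selected it, even though the true effect is 0.2.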

This has been demonstrated repeatedly in medical research. A 2010 analysis of Cochrane systematic reviews found that industry-funded trials reported significantly higher effect sizes than independent trials on the same interventions. A substantial part of this difference is attributable to publication bias and selective reporting, not to outright falsification.

The connection to the replication crisis is direct. The replication crisis — the discovery, beginning around 2011, that a substantial fraction of landmark findings in psychology and other social sciences cannot be reproduced — has multiple causes, of which publication bias is one of the most important. When the published literature contains primarily studies that happened to produce significant results (partly through chance, partly through p-hacking), replication studies — which test the same interventions with fresh samples — frequently fail to find the same effects. The original findings were not necessarily fraudulent; they were, in part, the product of systematic filtering that removed null results and left only the positives.

In 2015, the Open Science Collaboration published a replication attempt of 100 published psychology experiments. Only 36 of the 100 produced results consistent with the original at conventional significance thresholds. Many of the non-replications involved findings that were widely cited and had been considered established. Publication bias was not the only cause — researcher degrees of freedom and p-hacking also played significant roles — but the filtered literature those practices helped produce was the foundation on which the replication failures rested.

Funnel Plots: Visualising the Invisible

Statisticians have developed methods to detect publication bias in a body of literature even without access to unpublished studies. The most widely used is the funnel plot. In a funnel plot, each study in a meta-analysis is plotted as a point, with effect size on the horizontal axis and study precision (often standard error or sample size) on the vertical axis. In the absence of bias, the plot should resemble a symmetric inverted funnel: large studies cluster tightly around the true effect; small studies scatter widely but symmetrically above and below it.

Publication bias creates asymmetry: the small studies that produced null results are absent, leaving a gap in the lower-left quadrant of the funnel. This asymmetry is detectable with statistical tests (Egger's test, Begg and Mazumdar's rank correlation) and is a standard diagnostic in meta-analytic practice. Trim-and-fill methods attempt to impute the missing studies and produce a corrected estimate. These methods are imperfect — funnel plot asymmetry has other causes, and trim-and-fill makes strong assumptions — but they represent the best available tools for correcting a bias that cannot be directly observed.
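Egger's test itself is just a regression and fits in a few lines. The sketch below (Python with NumPy and SciPy; the simulated "biased literature" and all of its parameters are illustrative, not from any real meta-analysis) regresses the standardized effect on precision and tests whether the intercept departs from zero:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Simulate a biased literature: studies of varying size around a true
# standardized effect of 0.2, with only "significant" results published.
d_list, se_list = [], []
while len(d_list) < 40:
    n = rng.integers(10, 200)            # participants per arm
    se = float(np.sqrt(2 / n))           # SE of the mean difference
    d = rng.normal(0.2, se)              # observed effect for this study
    if d / se > 1.96:                    # the publication filter
        d_list.append(d)
        se_list.append(se)

d, se = np.array(d_list), np.array(se_list)

# Egger's test: regress the standardized effect (d / SE) on precision (1 / SE).
# An unbiased, symmetric funnel gives an intercept near zero; selection of
# significant results pushes the intercept away from zero.
res = stats.linregress(1 / se, d / se)
t_int = res.intercept / res.intercept_stderr
p_int = 2 * stats.t.sf(abs(t_int), df=len(d) - 2)
print(f"Egger intercept: {res.intercept:.2f} (p = {p_int:.3g})")
```

On this simulated literature the intercept comes out strongly positive, because the small, imprecise studies that survived the filter all sit well above where the true effect would place them.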

Registered Reports and Pre-Registration: The Structural Fix

The most promising response to publication bias is structural rather than statistical: pre-registration and registered reports. In pre-registration, researchers register their hypothesis, study design, and analysis plan in a public database (such as OSF.io, ClinicalTrials.gov, or the AEA Social Science Registry) before data collection begins. This creates a public record that null results can be compared against, making the file drawer visible.

Registered reports go further: journals commit to publish the study based on the quality of the design and the importance of the question, regardless of the result. The publication decision is made before any data exists. This removes the mechanism by which journals select for positive findings, since no one knows yet what the findings will be.

Pre-registration does not prevent all forms of bias — researchers can deviate from pre-registered plans, report additional unregistered analyses, or spin interpretations — but it creates accountability that was previously absent. ClinicalTrials.gov registration has been mandatory for many FDA-regulated trials since the FDA Amendments Act of 2007, and compliance rates have improved substantially. Comparing registered trials to published outcomes has become a standard method for documenting publication bias in specific therapeutic areas.

The Broader Epistemic Problem

Publication bias is a specific instance of a broader epistemic problem: the evidence base on which decisions are made is not a random sample of the evidence that exists. It is a filtered, incentive-shaped subset. This means that evidence-based medicine, evidence-based policy, and evidence-based practice in any field are, to an unknown and variable degree, building on a distorted foundation.

The implications are uncomfortable. Systematic reviews and meta-analyses — the gold standard of evidence synthesis — are only as good as the literature they synthesise. A meta-analysis of a biased literature produces a biased synthesis, no matter how technically sophisticated the synthesis process. The Cochrane Collaboration, the Campbell Collaboration, and other evidence synthesis bodies now routinely assess and report publication bias as a component of systematic review quality, but they cannot fully correct for it.

This connects to the base rate problem in a specific way: if we do not know the base rate of null findings in a research area — because they are systematically suppressed — we cannot correctly calibrate our confidence in the positive findings we observe. We are, in effect, reasoning about effects while missing the denominator.
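Ioannidis (2005, listed in the sources below) formalised this point with the positive predictive value of a significant finding, which depends directly on the base rate of true hypotheses — exactly the quantity that suppressed nulls hide. A minimal sketch, with illustrative numbers:

```python
def ppv(prior_odds, power=0.8, alpha=0.05):
    """Positive predictive value: P(effect is real | result is significant)."""
    return (power * prior_odds) / (power * prior_odds + alpha)

# If only 1 in 10 hypotheses tested in a field is true (prior odds 1:9),
# a well-powered significant result is correct far less often than
# "p < .05" intuitively suggests:
print(f"{ppv(1 / 9):.0%}")   # → 64%
```

When publication bias hides the nulls, readers cannot even estimate those prior odds from the literature, so this calibration becomes impossible — the missing denominator in practice.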

What Readers Can Do

For non-specialist readers engaging with scientific claims, several heuristics help:

  • Seek pre-registered evidence. Findings from pre-registered studies — especially registered reports — are less susceptible to publication bias and should be weighted more heavily.
  • Check trial registries. For medical interventions, ClinicalTrials.gov allows comparison between registered trials and published results. Gaps between the two are informative.
  • Be sceptical of meta-analyses without funnel plot analysis. A meta-analysis that does not assess or report publication bias should be treated with increased caution.
  • Discount extraordinary effect sizes. Very large effect sizes in small samples are often products of publication bias and researcher degrees of freedom, not reflections of genuine effects. Effect sizes that shrink substantially in large, pre-registered replications are a tell.
  • Weight replications heavily. A finding that has been directly replicated in an independent sample — especially a pre-registered replication — carries far more evidential weight than a finding that exists only as a single published study.

Sources & Further Reading

  • Turner, E.H. et al. (2008). Selective Publication of Antidepressant Trials and Its Influence on Apparent Efficacy. New England Journal of Medicine, 358(3), 252–260.
  • Rosenthal, R. (1979). The File Drawer Problem and Tolerance for Null Results. Psychological Bulletin, 86(3), 638–641.
  • Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716.
  • Ioannidis, J.P.A. (2005). Why Most Published Research Findings Are False. PLOS Medicine, 2(8), e124.
  • Egger, M. et al. (1997). Bias in meta-analysis detected by a simple, graphical test. BMJ, 315(7109), 629–634.
  • Chambers, C.D. (2013). Registered Reports: A new publishing initiative at Cortex. Cortex, 49(3), 609–610.
