lindleys_paradox
Lindley's Paradox occurs when frequentist and Bayesian statistical methods produce contradictory conclusions from the same data. Specifically, a result can be statistically significant (low p-value) in a frequentist test while the Bayesian posterior probability strongly favors the null hypothesis. This disagreement becomes more pronounced with large sample sizes.
A clinical trial with 100,000 participants finds a treatment effect of 0.01 units with p = 0.03. The frequentist rejects the null hypothesis. However, a Bayesian analysis with a reasonable prior concludes there is a 95% probability that the null hypothesis is true, because the observed effect is so small that it is more consistent with noise than a real effect at the prior's scale.
A large government survey of 500,000 households finds that people in one region earn on average $200 more per year than the national average, with p = 0.04. The frequentist analyst declares a statistically significant regional wage gap. A Bayesian economist, incorporating prior knowledge that regional wage differences of that magnitude are extremely rare, concludes the posterior probability of a true gap is less than 15%.
A genomics study scanning 1 million genetic variants finds one SNP associated with a disease at p = 0.04 after correction. The frequentist flags it as significant. A Bayesian analysis incorporating the prior that most of the million tested variants have no true effect concludes the probability that this specific variant is a true positive is below 20%, suggesting the result is likely a false discovery.
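The role of the prior in examples like these can be sketched with Bayes' rule applied to the event "the test came out significant". This is a simplified illustration, not the analysis behind the figures quoted above: the function name, the power of 0.8, the 0.05 threshold, and the three prior values are all assumptions chosen for the example.

```python
def prob_true_positive(prior_true, power, alpha):
    """P(effect is real | test is significant), by Bayes' rule.

    prior_true: assumed prior probability that the effect is real
    power:      assumed P(significant | effect is real)
    alpha:      assumed P(significant | no effect), the significance threshold
    """
    true_pos = power * prior_true
    false_pos = alpha * (1 - prior_true)
    return true_pos / (true_pos + false_pos)

# Hypothetical priors: the rarer a real effect is believed to be,
# the less a single significant result should move us.
for prior in (0.5, 0.1, 0.01):
    ppv = prob_true_positive(prior, power=0.8, alpha=0.05)
    print(f"prior = {prior:<5} P(true positive | significant) = {ppv:.3f}")
```

Under these assumed inputs, the same significant result is far less convincing when real effects are rare than when they are common.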
Binary (yes/no) questions an LLM must answer to identify this aspect:
Is a statistically significant result being reported from a frequentist hypothesis test?
Type: binary
Would a Bayesian analysis with a reasonable prior assign high probability to the null hypothesis despite the significant p-value?
Type: binary
Is the sample size very large, making even tiny effects statistically significant?
Type: binary
Has the prior probability of the alternative hypothesis been considered alongside the p-value?
Type: binary
With large samples, frequentist tests can detect arbitrarily small effects and produce significant p-values for practically meaningless differences. A Bayesian analysis, by contrast, penalizes vague alternative hypotheses: a diffuse prior spreads its probability thinly across the parameter space, so the alternative places little weight near the observed estimate and the precise null hypothesis receives comparatively more support. For a fixed p-value, this penalty grows with the sample size.
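This mechanism can be sketched numerically. The example below is a minimal illustration under assumed conditions, not a general recipe: a normal mean with known standard deviation sigma, a point null H0: theta = 0, and an alternative with a normal prior theta ~ N(0, tau^2). The function names, the values sigma = tau = 1, and the 50/50 prior odds are assumptions made for illustration. The z-statistic is held fixed at 2.17 (two-sided p of about 0.03) while n varies.

```python
import math

def two_sided_p(z):
    """Two-sided p-value of a z-statistic under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2))

def bayes_factor_01(z, n, sigma=1.0, tau=1.0):
    """Bayes factor for H0: theta = 0 against H1: theta ~ N(0, tau^2).

    Assumes the sample mean is N(theta, sigma^2 / n), so that
    BF01 = N(xbar; 0, sigma^2/n) / N(xbar; 0, tau^2 + sigma^2/n),
    written here in terms of z = xbar * sqrt(n) / sigma.
    """
    r = n * tau**2 / sigma**2  # ratio of prior variance to sampling variance
    return math.sqrt(1 + r) * math.exp(-0.5 * z**2 * r / (1 + r))

def posterior_null(bf01, prior_null=0.5):
    """Posterior probability of H0 given BF01 and an assumed prior P(H0)."""
    odds = bf01 * prior_null / (1 - prior_null)
    return odds / (1 + odds)

z = 2.17  # fixed test statistic, so the p-value is the same in every row
print(f"two-sided p = {two_sided_p(z):.3f}")
for n in (100, 10_000, 100_000, 1_000_000):
    bf = bayes_factor_01(z, n)
    print(f"n = {n:>9,}  BF01 = {bf:8.2f}  P(H0 | data) = {posterior_null(bf):.3f}")
```

Because the p-value is identical in every row, any change in the posterior comes only from the sample size: in this setup the Bayes factor in favor of the null grows roughly with the square root of n.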
Report effect sizes alongside p-values. Consider Bayesian approaches or Bayes factors when sample sizes are large. Evaluate whether a statistically significant result is also practically meaningful. Be explicit about prior assumptions and the distinction between statistical and substantive significance.
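One way to check practical meaning is to ask how large a standardized effect a given p-value implies at a given sample size. The sketch below assumes a one-sample, two-sided z-test, where the standardized effect is d = z / sqrt(n); the function name is hypothetical, and other test designs would need a different conversion.

```python
import math
from statistics import NormalDist

def implied_effect_size(p_value, n):
    """Return (|z|, d) implied by a two-sided p-value and sample size n.

    Assumes a one-sample z-test, so d = |mean| / sigma = |z| / sqrt(n).
    """
    z = NormalDist().inv_cdf(1 - p_value / 2)
    return z, z / math.sqrt(n)

# The same p-value corresponds to very different effect sizes
# depending on how many observations produced it.
for n in (100, 10_000, 100_000):
    z, d = implied_effect_size(0.03, n)
    print(f"n = {n:>7,}  |z| = {z:.2f}  standardized effect d = {d:.4f}")
```

Reporting d next to p makes it visible when a significant result rests on an effect that is a tiny fraction of a standard deviation.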
This paradox frequently arises in large-scale epidemiological studies, genomics (genome-wide association studies with millions of data points), and social science research with big data, where tiny effects routinely reach statistical significance.
Rejecting a true null hypothesis – finding a signal in noise.
Failing to reject a false null hypothesis – missing a valid signal.
Ignoring general statistical base rates in favor of specific individual-case info.
A model with higher accuracy can have worse predictive power than a less accurate one on imbalanced data.
A study with too few participants or observations to reliably detect the effect being investigated. Low statistical power increases both false negatives and the rate at which significant findings are false positives.