Data Dredging (Fishing Expedition) — When Numbers Lie

Picture this: a researcher has access to a large health database with 500 variables, tests every possible pairing, and reports only the one that looks exciting.

Also known as: fishing expedition, HARKing (Hypothesizing After Results are Known), post-hoc analysis disguised as a priori

What's Actually Happening

Data dredging is the practice of exhaustively searching through data for any statistically significant patterns without a prior hypothesis, then presenting discovered patterns as if they were predicted in advance. While exploratory data analysis is legitimate when labeled as such, data dredging crosses the line by disguising exploratory findings as confirmatory results. The sheer number of possible correlations in any dataset virtually guarantees that some will pass significance thresholds by chance.

The published result looks identical to a hypothesis-driven finding: clean data, clear statistical test, significant p-value. The reader has no way to know how many tests preceded the reported one.
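To see why enough tests virtually guarantee a "significant" result, here is a minimal sketch. It assumes every test is independent, that there is no real effect anywhere, and that the significance cutoff is 0.05. The function name chance_of_false_positive is made up for this illustration.

```python
def chance_of_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of n_tests independent tests on pure
    noise comes out 'significant' at the given alpha."""
    # Each test avoids a false positive with probability (1 - alpha);
    # all n_tests avoid one with probability (1 - alpha) ** n_tests.
    return 1.0 - (1.0 - alpha) ** n_tests


# Assumed test counts, chosen only to show how fast the risk grows.
for n in (1, 10, 100, 1000):
    print(f"{n:>5} tests -> {chance_of_false_positive(n):.3f}")
```

Real tests on the same dataset are rarely fully independent, so treat these figures as a rough guide rather than an exact answer. The trend is the point: run more tests and a chance hit becomes close to certain.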

Real Talk: You See This Every Day

A researcher has access to a large health database with 500 variables. After testing all 124,750 possible pairwise correlations, they find that ice cream consumption is significantly correlated with drowning deaths. They publish this as a confirmed finding without mentioning that it was one of nearly 125,000 tests, or that both variables are driven by warm weather.
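The numbers in this scenario can be checked with a short sketch. It counts the pairs among 500 variables, then simulates what happens if every single test is pure noise. The assumptions: a 0.05 cutoff, and p-values that are uniformly random between 0 and 1 when there is no real effect. The variable names and the random seed are arbitrary choices for this example.

```python
from math import comb
import random

N_VARIABLES = 500   # from the scenario above
ALPHA = 0.05        # assumed significance cutoff

# Number of distinct pairs of variables: "500 choose 2".
n_tests = comb(N_VARIABLES, 2)

# Average number of false positives if no real relationship exists.
expected_hits = n_tests * ALPHA

# Simulate one fishing expedition on pure noise: draw one random
# p-value per test and count how many fall below the cutoff.
rng = random.Random(0)  # arbitrary seed, only for repeatability
simulated_hits = sum(rng.random() < ALPHA for _ in range(n_tests))

print(f"pairwise tests:            {n_tests}")
print(f"expected false positives:  {expected_hits:.1f}")
print(f"false positives this run:  {simulated_hits}")
```

Under these assumptions, thousands of "significant" correlations turn up even though nothing real is there. Finding one that sounds interesting is not a discovery; it is what the arithmetic predicts.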

Big data and machine learning make data dredging easier: the more variables a dataset holds, the more spurious correlations it is bound to contain. The website 'Spurious Correlations' by Tyler Vigen illustrates the absurdity of uncritical data mining.

Your BS Detector

Distinguish between exploratory and confirmatory analyses. Require replication on independent data for any dredged finding. Apply multiple comparison corrections appropriate to the number of tests actually conducted.

The Challenge

Next time someone throws a statistic at you — in class, online, in the news — don't just accept it. Ask: what's missing from this picture?


Part of the TellDear Teen Book — criticalthinking.guide
