🧪 This platform is in early beta. Features may change and you might encounter bugs. We appreciate your patience!
harking
HARKing is the practice of presenting a hypothesis that was developed or refined after examining the data as though it had been formulated before data collection. This transforms exploratory analysis into what appears to be confirmatory research, creating a false impression that a specific prediction was confirmed. HARKing inflates the apparent evidential value of findings because post-hoc hypotheses are fitted to the data and therefore almost guaranteed to be supported by it.
A researcher studies the effect of a drug on 20 health outcomes. Only the effect on blood pressure is statistically significant. The published paper presents a focused hypothesis about blood pressure, with no mention of the other 19 outcomes tested, making it appear as though the drug's blood pressure effect was the predicted finding all along.
A marketing team runs an A/B test on five different ad designs and measures click-through rates, purchase conversions, and time-on-page. Only one metric — time-on-page — differs significantly for one ad variant. The final report is presented to leadership with the confident headline 'We hypothesized that Ad Variant C would boost user engagement,' framing a post-hoc observation as a planned prediction.
A sociologist collects survey data on 30 demographic and attitudinal variables, then runs correlations across all of them. She notices that people who own houseplants report slightly higher life satisfaction. She writes up the finding with an introduction citing theories of biophilia and human-nature connection, presenting it as a theoretically motivated hypothesis rather than a pattern she stumbled upon while data-mining.
Binary (yes/no) questions an LLM must answer to identify this aspect:
Does the study present a hypothesis that fits the observed results suspiciously well?
Type: binaryWas the hypothesis plausibly formulated after seeing the data rather than before?
Type: binaryIs there a registered pre-analysis plan or pre-registration that confirms the hypothesis was stated a priori?
Type: binaryDoes the narrative present exploratory findings as if they were confirmatory?
Type: binaryHARKing is the practice of presenting a hypothesis that was developed or refined after examining the data as though it had been formulated before data collection. This transforms exploratory analysis into what appears to be confirmatory research, creating a false impression that a specific prediction was confirmed. HARKing inflates the apparent evidential value of findings because post-hoc hypotheses are fitted to the data and therefore almost guaranteed to be supported by it.
Readers cannot tell from a published paper whether a hypothesis was formulated before or after data analysis. The narrative format of academic papers naturally lends itself to coherent storytelling, and presenting a clean prediction-confirmation story is more compelling and publishable than reporting exploratory findings.
Pre-register hypotheses and analysis plans before data collection. Distinguish clearly between confirmatory and exploratory analyses in publications. Require access to pre-registration records during peer review. Value and publish exploratory research as such rather than disguising it as confirmatory.
Widespread across psychology, biomedical research, and economics. The replication crisis revealed that many published findings were likely HARKed, contributing to inflated effect sizes and failure to replicate.
Running multiple analyses until p<0.05 and only reporting significant results.
Searching through large datasets for any statistically significant pattern without a prior hypothesis. Found patterns are presented as confirmatory when they are actually exploratory and likely to be spurious.
Finding a pattern in data and testing significance on the same subset (circular analysis).
Studies with statistically significant or positive results are more likely to be published, while null results remain unpublished. This distorts the published literature and inflates apparent effect sizes in meta-analyses.
Using information that was not available at the point in time being analyzed.
Research funded by parties with financial interests tends to produce favorable results.
Splitting a single study into multiple publications to inflate publication count.
Use these tools to detect, analyze, or train this aspect.