neyman_bias
Neyman bias occurs in cross-sectional or prevalence studies when cases that are fatal, short-lived, or lead to rapid recovery are systematically missed. Because the study captures only those who currently have the condition at the time of measurement, it overrepresents chronic or slowly progressing cases and underrepresents the full spectrum of disease outcomes.
A study of heart attack survivors at a cardiac clinic finds that most patients have mild to moderate disease. It misses the fact that many severe heart attack patients died before reaching the clinic, leading to an underestimate of the condition's true severity.
A survey of adults aged 60+ asks about past smoking habits and finds no strong link between heavy smoking and stroke. The association is weakened because many heavy smokers who suffered fatal strokes died decades earlier and were never available to be surveyed.
A cross-sectional study of office workers finds that anxiety disorders are relatively rare, leading researchers to conclude the workplace environment is low-stress. In reality, employees who developed severe anxiety had already quit or taken long-term sick leave and were therefore absent from the sample entirely.
Binary (yes/no) questions an LLM must answer to identify this aspect:
Does the study examine prevalent (existing) cases rather than incident (new) cases?
Type: binary
Could cases with rapid fatality or quick recovery be systematically missed?
Type: binary
Is there a significant time gap between exposure and case identification?
Type: binary
Are conclusions drawn about causation or risk without accounting for missing cases?
Type: binary
Prevalent cases are the ones visible at any given moment. Fatal cases disappear from the observable pool, and quickly resolving cases leave before they can be counted. This creates a distorted snapshot of the condition's actual distribution and risk.
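The distortion described above can be illustrated with a small deterministic sketch. Everything in it is an assumption made for illustration: the `Case` class, the two case kinds, and the durations (1 month for rapid cases, 24 months for chronic ones) are hypothetical and do not come from any real study. One case of each kind begins every month, so the true incident mix is 50/50; the sketch then compares that mix with what a single cross-sectional snapshot would see.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    onset_month: int
    duration_months: int  # time from onset until death or recovery (hypothetical)
    kind: str             # "rapid" or "chronic"


def build_cases(n_months=120):
    """One rapid and one chronic case begin every month (equal incidence)."""
    cases = []
    for month in range(n_months):
        cases.append(Case(month, 1, "rapid"))
        cases.append(Case(month, 24, "chronic"))
    return cases


def prevalent_at(cases, snapshot_month):
    """Cases still ongoing at the snapshot: what a cross-sectional study can see."""
    return [
        c for c in cases
        if c.onset_month <= snapshot_month < c.onset_month + c.duration_months
    ]


def share_rapid(cases):
    """Fraction of the given cases that are of the rapid kind."""
    return sum(c.kind == "rapid" for c in cases) / len(cases)


all_cases = build_cases()
snapshot = prevalent_at(all_cases, snapshot_month=100)

incident_share = share_rapid(all_cases)   # share among all new cases
prevalent_share = share_rapid(snapshot)   # share among cases visible at the snapshot

print(f"rapid share among incident cases:  {incident_share:.2f}")
print(f"rapid share among prevalent cases: {prevalent_share:.2f}")
```

Under these assumed durations, the snapshot contains 24 chronic cases for every rapid one, even though both kinds arise at the same rate. Changing the duration ratio changes the size of the gap, not its direction.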
Use incident (new-case) study designs or prospective cohorts instead of cross-sectional studies. Track cases from onset rather than from a single point in time. Acknowledge the limitation when prevalence data is the only option.
Early HIV research underestimated mortality because studies conducted at a single time point captured long-term survivors, missing those who had already died. Similarly, occupational studies of toxic exposures often miss workers who left or died before the study began.
The statistical error of drawing conclusions from a dataset that has been filtered by a survival or success criterion, without accounting for the filtered-out cases. The surviving sample is systematically different from the full population, and conclusions drawn from it are biased.
Occupational studies overestimate worker health because severely ill people exit the workforce.
A bias in observational studies where a period of follow-up during which the outcome cannot occur (because the exposure has not yet happened) is misclassified as exposed person-time. This artificially inflates the exposed group's survival time and makes the exposure appear protective.
Systematic exclusion of certain participants from a study distorts results.
Diagnostic test accuracy varies when evaluated across different disease severity levels.
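The person-time misclassification in the immortal time bias description above can be made concrete with a minimal sketch. The four patients, the field names, and the `death_rates` helper are all hypothetical, with made-up numbers chosen only to show the arithmetic. The naive analysis counts a treated patient's entire follow-up as exposed; the corrected analysis counts the months before treatment starts as unexposed.

```python
# Hypothetical patients; times are months since cohort entry (made-up numbers).
# treatment_start is None for patients who never receive the exposure.
patients = [
    {"treatment_start": 6, "follow_up_end": 12, "died": True},
    {"treatment_start": 4, "follow_up_end": 10, "died": True},
    {"treatment_start": None, "follow_up_end": 5, "died": True},
    {"treatment_start": None, "follow_up_end": 8, "died": True},
]


def death_rates(patients, split_at_treatment_start):
    """Return (exposed_rate, unexposed_rate) in deaths per person-month."""
    exposed_time = unexposed_time = 0.0
    exposed_deaths = unexposed_deaths = 0
    for p in patients:
        start, end = p["treatment_start"], p["follow_up_end"]
        if start is None:
            # Never treated: all follow-up and any death are unexposed.
            unexposed_time += end
            unexposed_deaths += p["died"]
            continue
        if split_at_treatment_start:
            # Corrected: the wait before treatment is unexposed person-time.
            unexposed_time += start
            exposed_time += end - start
        else:
            # Naive: the pre-treatment wait is misclassified as exposed.
            exposed_time += end
        # In this toy data every treated patient dies after treatment starts.
        exposed_deaths += p["died"]
    return exposed_deaths / exposed_time, unexposed_deaths / unexposed_time


naive_rates = death_rates(patients, split_at_treatment_start=False)
corrected_rates = death_rates(patients, split_at_treatment_start=True)

print("naive     (exposed, unexposed):", naive_rates)
print("corrected (exposed, unexposed):", corrected_rates)
```

In this toy data the naive analysis gives the exposed group a lower death rate than the unexposed group, and the ordering reverses once the pre-treatment months are reassigned. The size and direction of the corrected result are properties of the invented numbers, not a general finding.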