ecological_inference_fallacy
The error of drawing conclusions about individuals from aggregate (group-level) data. Correlations observed at the group level may not hold at the individual level because of within-group variation, confounding, and aggregation effects. This is the statistical formalization of the Ecological Fallacy, which is also classified as a logical fallacy (D1).
States with higher average income have higher Democratic vote shares, but this does not mean that higher-income individuals within those states vote Democratic (in fact, the opposite may be true).
Countries with higher average chocolate consumption per capita have more Nobel Prize winners per capita, leading a journalist to suggest chocolate boosts cognitive achievement. This says nothing about whether the specific individuals eating more chocolate are the ones winning prizes — many other country-level factors explain both variables.
Cities with more libraries per capita have higher crime rates, leading a local politician to argue that libraries somehow contribute to crime. In reality, both variables are driven by population density — denser cities have more of everything, including libraries and crime — and individuals who use libraries are not more likely to commit crimes.
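The income and voting example can be made concrete with a small synthetic dataset. The sketch below is illustrative only: the four states, the income figures, and the `support_score` rule are all invented for the demonstration and are not drawn from real election data. It assumes that richer states lean toward one party on average, while within any one state richer residents lean away from it, and then compares the correlation computed on state averages with the correlation computed on individuals.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

# Hypothetical data: four states with different mean incomes (in $1000s).
STATE_MEAN_INCOME = [40, 45, 50, 55]
# Each state has five residents whose income deviates from the state mean.
DEVIATIONS = [-10, -5, 0, 5, 10]

def support_score(state_mean, deviation):
    # Assumed rule (invented for this sketch): support rises with the
    # state's mean income but falls with a resident's income relative
    # to that mean.
    return 20 + state_mean - 2 * deviation

# Build one row per individual: (state index, income, support score).
rows = []
for state, mean_income in enumerate(STATE_MEAN_INCOME):
    for dev in DEVIATIONS:
        rows.append((state, mean_income + dev, support_score(mean_income, dev)))

# Individual-level correlation, pooling everyone.
individual_r = pearson([r[1] for r in rows], [r[2] for r in rows])

# Group-level correlation on state averages, plus within-state correlations.
group_income, group_score, within_r = [], [], []
for state in range(len(STATE_MEAN_INCOME)):
    xs = [r[1] for r in rows if r[0] == state]
    ys = [r[2] for r in rows if r[0] == state]
    group_income.append(sum(xs) / len(xs))
    group_score.append(sum(ys) / len(ys))
    within_r.append(pearson(xs, ys))
group_r = pearson(group_income, group_score)

print("state-level correlation:     ", round(group_r, 3))
print("individual-level correlation:", round(individual_r, 3))
print("within-state correlations:   ", [round(r, 3) for r in within_r])
```

In this construction the state-level correlation is strongly positive while the pooled individual-level and within-state correlations are negative, so reading the aggregate pattern as a statement about individuals would get the sign wrong. Real data are rarely this clean, but the mechanism is the same.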
Binary (yes/no) questions an LLM must answer to identify this aspect:
Is an inference about individual behavior or characteristics being made?
Type: binary
Is the inference based on aggregate (group-level) data?
Type: binary
Could the aggregate pattern be driven by compositional effects that do not apply to individuals?
Type: binary
Aggregate data is often the only data available, and it seems reasonable to assume that group-level patterns reflect individual-level relationships. The disconnect between levels of analysis is non-obvious.
Use individual-level data whenever possible. When only aggregate data is available, explicitly acknowledge the ecological inference limitation and avoid individual-level conclusions.
Political science (voting behavior inference), epidemiology (disease risk from regional data), and economics (prosperity correlations).
Structural and content errors in reasoning.
Generic generalisation occurs when a generic statement — one that captures a typical or characteristic property of a kind — is treated as a strict universal claim. Generic sentences like 'dogs have four legs' or 'mosquitoes carry malaria' express statistical tendencies, characteristic features, or normative expectations, but they tolerate exceptions. The fallacy arises when these defeasible generics are deployed as though they were exceptionless universal quantifications, licensing conclusions about specific individuals.
Statistical results change depending on how geographic boundaries are drawn or aggregated.
Nearby observations are correlated, violating the independence assumption in standard analyses.
Use these tools to detect, analyze, or train this aspect.