Sieve Bias

Also Known As: Cascading selection bias, Sequential filtering bias
Statistical Error ID: sieve_bias

Definition

Sieve bias occurs when data passes through multiple filtering or selection steps, each of which may introduce its own subtle bias. While any single filter might have a minor effect, the cumulative result of successive filtering can produce a final sample that is profoundly unrepresentative of the original population. Because each filter acts only on what the previous filters left behind, the distortions compound rather than simply add, so the total bias can be much larger and harder to predict than any individual step would suggest.
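
The compounding can be made concrete with a minimal sketch. It assumes a population split evenly into two hypothetical groups, A and B, and filters that act independently with fixed retention rates. The function name `share_after_filters` and the rates (90% of A and 70% of B kept at each step) are illustrative assumptions, not figures from any study.

```python
def share_after_filters(initial_share, retention_pairs):
    """Track group A's share of the sample as filters are applied in order.

    initial_share   -- fraction of the original population in group A
    retention_pairs -- one (keep_a, keep_b) pair per filter: the fraction
                       of each group that survives that filter
    """
    a = initial_share          # surviving mass of group A
    b = 1.0 - initial_share    # surviving mass of group B
    shares = [initial_share]
    for keep_a, keep_b in retention_pairs:
        a *= keep_a            # retention rates multiply across steps
        b *= keep_b
        shares.append(a / (a + b))
    return shares


# Hypothetical rates: each filter keeps 90% of group A but only 70% of group B.
shares = share_after_filters(0.5, [(0.9, 0.7)] * 3)
for step, share in enumerate(shares):
    print(f"after {step} filter(s): group A share = {share:.3f}")
```

Under these assumed rates, a single filter moves group A from 50% to about 56% of the sample, which might look tolerable in isolation, while three such filters push it to roughly 68%.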

Examples

A clinical study starts with 10,000 patients, then restricts to those who completed intake forms (excluding the sickest), then to those with follow-up data (excluding dropouts who experienced side effects), then to those with complete lab results (excluding the poorest). The final 2,000 patients are healthier, wealthier, and more compliant than the original population.

A tech company surveys employees about workplace satisfaction, but only workers with a company email account are invited, then only those who open the HR newsletter see the survey link, then only those who feel strongly enough bother to respond. Each filter quietly removes a different type of employee — contractors, disengaged staff, and those with mild opinions — leaving a final sample that bears little resemblance to the actual workforce.

An economics study on the returns to education uses administrative records that first exclude anyone without a social security number, then drop records with incomplete wage data, then remove individuals who changed jobs more than twice. Immigrants, gig workers, and the most economically mobile people disappear through successive cuts, and the estimated wage premium for a college degree reflects only a narrow, stable slice of the labor market.

Verification Steps
Each of the 452 aspects has verification steps — simple yes/no questions designed to systematically detect whether a pattern appears in a text. For ad hominem: "Does the argument attack a person rather than their claim?" For false dichotomy: "Are only two options presented when more exist?" This ensures consistent, reproducible analysis.

Binary (yes/no) questions an LLM must answer to identify this aspect:

  1. Has the data been filtered through multiple sequential selection criteria?
     Type: binary
  2. Could each filtering step disproportionately remove certain types of observations?
     Type: binary
  3. Is the remaining sample systematically different from the original population after all filters are applied?
     Type: binary
  4. Has the cumulative effect of all filtering steps on sample composition been assessed?
     Type: binary

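The fourth question asks whether the cumulative effect of filtering has been assessed. One way to do that is to record, after every filter, how many records remain and how far a covariate of interest has drifted from its value in the original data. The sketch below is only an illustration: `audit_filters`, the field names, and the four toy records are hypothetical and exist solely to show the mechanics.

```python
from statistics import mean


def audit_filters(records, filters, covariate):
    """Apply filters in order; report sample size and covariate drift per step.

    Returns a list of (step_name, n_remaining, covariate_mean, drift) tuples,
    where drift is the covariate mean minus its mean in the original data.
    """
    baseline = mean(r[covariate] for r in records)
    report = [("original", len(records), baseline, 0.0)]
    current = list(records)
    for name, keep in filters:
        current = [r for r in current if keep(r)]
        if not current:  # nothing survived: record the step and stop
            report.append((name, 0, float("nan"), float("nan")))
            break
        step_mean = mean(r[covariate] for r in current)
        report.append((name, len(current), step_mean, step_mean - baseline))
    return report


# Toy, made-up records (not real patient data).
records = [
    {"age": 30, "intake_done": True, "has_followup": True},
    {"age": 45, "intake_done": True, "has_followup": True},
    {"age": 60, "intake_done": True, "has_followup": False},
    {"age": 75, "intake_done": False, "has_followup": False},
]
filters = [
    ("completed intake", lambda r: r["intake_done"]),
    ("has follow-up", lambda r: r["has_followup"]),
]

report = audit_filters(records, filters, "age")
for step, n, avg, drift in report:
    print(f"{step:<18} n={n}  mean age={avg:.1f}  drift={drift:+.1f}")
```

Reporting the drift relative to the original data, rather than relative to the previous step, is a deliberate choice here: it keeps the accumulated shift visible even when each individual step looks small.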
Deep Dive

Hierarchical Context