undercoverage_bias
Undercoverage bias occurs when some members of the target population have zero or near-zero probability of being included in the sample because the sampling frame does not cover them. Classic examples include telephone surveys that miss people without phones, or online surveys that miss people without internet access. The excluded groups often differ systematically on the very variables being measured.
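To make the size of the distortion concrete, here is a minimal Python sketch of a population split into a covered and an uncovered group. All group sizes and means are made-up illustrative numbers, and the names `Group`, `population_mean`, `frame_mean`, and `coverage_bias` are hypothetical helpers, not part of any library. It assumes a perfect sample of everyone the frame can reach, so the only error left is the coverage gap itself.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    size: int      # number of people in the target population
    mean: float    # true mean of the outcome within this group
    covered: bool  # True if the sampling frame can reach this group


def population_mean(groups):
    """Size-weighted mean of the outcome over the given groups."""
    total = sum(g.size for g in groups)
    return sum(g.size * g.mean for g in groups) / total


def frame_mean(groups):
    """Mean over only the groups the sampling frame covers."""
    return population_mean([g for g in groups if g.covered])


def coverage_bias(groups):
    """Frame mean minus the true mean of the full target population."""
    return frame_mean(groups) - population_mean(groups)


# Hypothetical online survey: 20% of the target population has no internet
# access and differs on the outcome (share answering "yes").
groups = [
    Group("has internet", size=8000, mean=0.70, covered=True),
    Group("no internet", size=2000, mean=0.40, covered=False),
]

uncovered_share = 2000 / 10000
print(round(frame_mean(groups), 4))       # what the survey would report
print(round(population_mean(groups), 4))  # what is true of the target population
print(round(coverage_bias(groups), 4))    # gap caused by undercoverage alone
print(round(uncovered_share * (0.70 - 0.40), 4))  # same gap via the identity
```

With these numbers the frame mean is 0.70 against a true mean of 0.64. The gap equals the uncovered share times the difference between the covered and uncovered group means, so the bias grows with both the size of the excluded group and how much it differs on the outcome. A larger sample from the same frame does not shrink it.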
The 1936 Literary Digest poll predicted a landslide victory for Alf Landon based on about 2.4 million responses, but its mailing list was drawn largely from automobile registrations and telephone directories, systematically excluding poorer voters who overwhelmingly supported Roosevelt. Roosevelt won in a landslide.
A city conducts a resident satisfaction survey by emailing registered voters. Renters who move frequently, undocumented residents, and people without stable internet access are almost entirely excluded, making satisfaction scores appear higher than they would if the full residential population were captured.
A national health survey is conducted via landline telephone calls during weekday business hours. Working-age adults, people with only mobile phones, and shift workers are systematically underrepresented, so the sample overrepresents retirement-age respondents and the survey underestimates health issues common in younger working populations.
Binary (yes/no) questions an LLM must answer to identify this aspect:
Does the sampling frame exclude identifiable segments of the target population?
Type: binary
Are the excluded groups likely to differ from the included groups on the outcome variable?
Type: binary
Is the sampling method (e.g., phone surveys, online polls) inherently inaccessible to some subgroups?
Type: binary
Is the claim generalized to the full population despite the exclusion?
Type: binary
Sampling frames are often built from whatever lists are convenient and available rather than for completeness. Researchers often underestimate how much excluded groups differ from included ones.
Explicitly define the target population and evaluate how well the sampling frame covers it. Supplement with alternative data sources for underrepresented groups. Use weighting to adjust for known coverage gaps.
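The weighting step can be sketched as simple post-stratification: compute the mean within each stratum, then combine the stratum means using each stratum's known share of the target population rather than its share of the sample. This is a minimal illustration, not a full survey-weighting procedure. The function name `poststratified_mean`, the strata, the responses, and the population shares are all hypothetical, and the sketch assumes the shares come from a trusted external source such as a census.

```python
from collections import defaultdict


def poststratified_mean(responses, population_shares):
    """Weight stratum means by known population shares.

    responses: iterable of (stratum, value) pairs from the sample.
    population_shares: dict mapping stratum -> share of the target
        population (shares must sum to 1).
    """
    if abs(sum(population_shares.values()) - 1.0) > 1e-9:
        raise ValueError("population shares must sum to 1")

    sums = defaultdict(float)
    counts = defaultdict(int)
    for stratum, value in responses:
        sums[stratum] += value
        counts[stratum] += 1

    # A stratum with no respondents cannot be fixed by weighting:
    # there is no data to scale up, so fail loudly instead of guessing.
    missing = [s for s in population_shares if counts.get(s, 0) == 0]
    if missing:
        raise ValueError(f"no responses for strata: {missing}")

    return sum(
        share * sums[s] / counts[s] for s, share in population_shares.items()
    )


# Hypothetical satisfaction survey (1 = satisfied, 0 = not satisfied).
# Renters are 40% of residents but only 2 of the 10 respondents.
responses = [("owner", 1)] * 6 + [("owner", 0)] * 2 + [("renter", 1), ("renter", 0)]
shares = {"owner": 0.6, "renter": 0.4}

unweighted = sum(value for _, value in responses) / len(responses)
weighted = poststratified_mean(responses, shares)
print(round(unweighted, 4), round(weighted, 4))
```

In this made-up data the unweighted satisfaction rate is 0.70 and the weighted rate is 0.65. Weighting only helps for groups that are underrepresented but still present in the sample. A group with zero coverage raises an error here, which is the case where supplementing with alternative data sources is needed. Weighting also assumes that the respondents reached within a stratum resemble the ones who were missed.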
Surveys of internet users, smartphone owners, or social media participants systematically exclude older, poorer, or less tech-savvy populations.