false_positive_paradox
The false positive paradox occurs when a highly accurate test applied to a rare condition produces as many false positives as true positives, or more, in absolute terms. Even a test with 99% sensitivity and 99% specificity produces about one false positive for every true positive at 1% prevalence, and about ten false positives for every true positive at 0.1% prevalence.
A disease affects 1 in 1,000 people, and a test for it has 99% sensitivity and 99% specificity. Among 100,000 people tested, 100 have the disease, and 99 of them test positive. The other 99,900 are healthy, yet 1% of them, 999 people, also test positive (false positives). That is about 10 false positives for every true positive.
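The arithmetic in this example can be reproduced with a short sketch. The function name `expected_counts` and its parameters are illustrative assumptions, not part of any library; exact fractions are used so the counts come out as whole numbers.

```python
from fractions import Fraction


def expected_counts(population, prevalence, sensitivity, specificity):
    """Return expected (true_positives, false_positives) as exact fractions.

    Rates are passed as decimal strings (e.g. "0.99") to avoid float rounding.
    """
    prevalence = Fraction(prevalence)
    sensitivity = Fraction(sensitivity)
    specificity = Fraction(specificity)

    affected = population * prevalence             # people who have the condition
    healthy = population - affected                # people who do not
    true_positives = affected * sensitivity        # affected people who test positive
    false_positives = healthy * (1 - specificity)  # healthy people who test positive
    return true_positives, false_positives


# The 1-in-1,000 disease example: 100,000 people, 99% sensitivity and specificity.
tp, fp = expected_counts(100_000, "0.001", "0.99", "0.99")
print(f"true positives: {tp}, false positives: {fp}, ratio: {float(fp / tp):.1f}")
```

Changing only the prevalence argument shows how quickly the ratio of false to true positives shifts while the test itself stays the same.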
An airport security algorithm detects 99% of genuine threats and has a false positive rate of just 1%. On a day with 10,000 travelers, of whom only 10 are genuine threats, the system is expected to catch about 10 of them (9.9 on average) but also to wrongly flag about 100 innocent passengers (1% of 9,990). For every real threat identified, roughly 10 innocent people are flagged alongside them.
A social media platform deploys an AI to detect bot accounts, claiming 98% accuracy (taken here to mean 98% sensitivity and 98% specificity). If only 0.5% of its 10 million users are bots, that is 50,000 bots. The system correctly identifies 49,000 of them but also falsely flags 199,000 real users (2% of 9,950,000). About 80% of the flagged accounts belong to legitimate human users.
Binary (yes/no) questions an LLM must answer to identify this aspect:
Is the test being applied to a low-prevalence condition?
Type: binary
Is the specificity of the test high enough to prevent false positives from dominating true positives?
Type: binary
Is the positive predictive value (PPV) calculated using the actual population prevalence?
Type: binary
Are absolute counts of true positives versus false positives reported, not just sensitivity and specificity?
Type: binary
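The question about whether specificity is high enough can be checked numerically. Expected false positives stay at or below expected true positives when (1 - specificity) x (1 - prevalence) <= sensitivity x prevalence. The sketch below solves that inequality for specificity; the function name `min_specificity_for_parity` is a hypothetical helper introduced only for illustration.

```python
from fractions import Fraction


def min_specificity_for_parity(prevalence, sensitivity):
    """Smallest specificity at which expected false positives do not exceed
    expected true positives: (1 - spec) * (1 - prev) <= sens * prev."""
    prev = Fraction(prevalence)
    sens = Fraction(sensitivity)
    if not 0 < prev < 1:
        raise ValueError("prevalence must be strictly between 0 and 1")
    threshold = 1 - sens * prev / (1 - prev)
    # Specificity cannot be negative, so clamp at zero for common conditions.
    return max(threshold, Fraction(0))


for prev in ("0.01", "0.001"):
    spec = min_specificity_for_parity(prev, "0.99")
    print(f"prevalence {prev}: specificity must be at least {float(spec):.4%}")
```

Under these assumptions, a 99%-sensitive test needs exactly 99% specificity to break even at 1% prevalence, and more than 99.9% at 0.1% prevalence.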
Sensitivity and specificity are conditional probabilities that seem impressive in isolation. The base rate transforms them into the positive predictive value (PPV), which is what matters for clinical and policy decisions.
Always calculate the positive predictive value: PPV = (sensitivity × prevalence) / [(sensitivity × prevalence) + (1 − specificity) × (1 − prevalence)]. Report absolute numbers, not just rates.
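As a minimal sketch, the PPV formula translates directly into a small function. The name `positive_predictive_value` and its argument order are assumptions made for this example; all three inputs are proportions between 0 and 1.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV: the probability that a positive result is a true positive."""
    # Joint probability of having the condition and testing positive.
    p_true_positive = sensitivity * prevalence
    # Joint probability of not having the condition and testing positive.
    p_false_positive = (1 - specificity) * (1 - prevalence)
    p_positive = p_true_positive + p_false_positive
    if p_positive == 0:
        raise ValueError("no positive results are possible with these inputs")
    return p_true_positive / p_positive


# Same test (99% sensitivity, 99% specificity) at two prevalence levels.
for prevalence in (0.01, 0.001):
    ppv = positive_predictive_value(0.99, 0.99, prevalence)
    print(f"prevalence {prevalence:.1%}: PPV = {ppv:.1%}")
```

With these inputs the PPV comes out at about 50% for 1% prevalence and about 9% for 0.1% prevalence, which matches the one-to-one and ten-to-one ratios of false to true positives.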
Airport security screening, mass COVID testing, and drug testing programs all face the false positive paradox: when the condition or infraction is rare, most positive results can be false positives.