Accuracy Paradox

Also Known As: Accuracy trap
Statistical Error ID: accuracy_paradox

Definition

The Accuracy Paradox occurs when a predictive model with higher overall accuracy performs worse at the task it was designed for than a model with lower accuracy. This typically happens when classes are imbalanced: a model that always predicts the majority class can score very high accuracy while detecting none of the minority class.
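
The effect described above can be reproduced in a few lines. The following is a minimal sketch using made-up labels (1,000 cases with a 1% minority class); the helper names `accuracy` and `recall` are illustrative and not taken from any particular library.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that the model also labels positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)


# Hypothetical data: 1,000 cases, 10 of them positive (a 1% minority class).
y_true = [1] * 10 + [0] * 990

# A "model" that ignores its input and always predicts the majority class.
always_majority = [0] * len(y_true)

baseline_accuracy = accuracy(y_true, always_majority)
baseline_recall = recall(y_true, always_majority)
print(f"accuracy={baseline_accuracy:.3f}, minority recall={baseline_recall:.3f}")
```

The constant predictor is right on every majority case and wrong on every minority case, so its accuracy equals the majority share of the data while its recall on the minority class is zero.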

Examples

A fraud detection system classifies 99.5% of transactions correctly by labeling everything as legitimate, which means it catches no fraud at all. A competing model has only 95% accuracy but catches 80% of fraudulent transactions. Despite the lower score, the second model is far more useful.
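
To see what the competing model's numbers imply, the confusion matrix can be reconstructed from the stated rates. This is a rough sketch: the 0.5% fraud rate is inferred from the first model's 99.5% accuracy, the sample size of 100,000 transactions is an assumption chosen for round numbers, and `confusion_counts` is a hypothetical helper.

```python
def confusion_counts(n, prevalence, accuracy, recall):
    """Recover (TP, FN, TN, FP) counts from summary rates over n cases."""
    positives = round(n * prevalence)
    negatives = n - positives
    tp = round(positives * recall)      # positives the model catches
    fn = positives - tp                 # positives the model misses
    tn = round(n * accuracy) - tp       # correct predictions that are not TPs
    fp = negatives - tn                 # negatives wrongly flagged
    return tp, fn, tn, fp


# Rates from the fraud example; n=100_000 is an assumed sample size.
tp, fn, tn, fp = confusion_counts(n=100_000, prevalence=0.005,
                                  accuracy=0.95, recall=0.80)
precision = tp / (tp + fp)
print(f"TP={tp} FN={fn} TN={tn} FP={fp} precision={precision:.1%}")
```

Under these assumptions the 95% model catches 400 of 500 fraudulent transactions at the cost of 4,900 false alarms, a precision of about 7.5%. That trade-off is invisible in the accuracy figure alone.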

A hospital deploys an AI model to screen chest X-rays for a rare lung condition affecting 1% of patients. The model achieves 99% accuracy simply by flagging nobody as sick. A second, 'less accurate' model at 96% overall accuracy correctly identifies 70% of true cases and is far more clinically useful, yet the first model looks superior on the headline metric.
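
The two X-ray models can be compared with a short worked calculation. Writing the prevalence as π (here 0.01), accuracy is a prevalence-weighted average of recall and specificity. The specificity of the second model is not given in the example; the value below is derived from the stated figures, assuming they are exact.

```latex
\begin{aligned}
\text{accuracy} &= \pi \cdot \text{recall} + (1 - \pi) \cdot \text{specificity} \\[4pt]
\text{Model 1:}\quad 0.99 &= 0.01 \cdot 0 + 0.99 \cdot 1 \\[4pt]
\text{Model 2:}\quad 0.96 &= 0.01 \cdot 0.70 + 0.99 \cdot \text{specificity} \\[4pt]
\text{specificity} &= \frac{0.96 - 0.007}{0.99} \approx 0.963
\end{aligned}
```

Because 99% of the weight sits on the healthy patients, the first model's perfect specificity dominates its score even though its recall is zero, while the second model's roughly 3.7% false positive rate costs it three points of accuracy.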

A content moderation team evaluates two spam filters for their platform, where only 0.5% of posts are spam. Filter A scores 99.5% accuracy by approving every post. Filter B scores 97% accuracy but catches 85% of actual spam. Management almost deploys Filter A after seeing the numbers, not noticing it would let every single piece of spam through.
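
One common way to avoid this mistake is to report a metric that weights both classes equally, such as balanced accuracy (the mean of recall and specificity). The sketch below applies it to the two filters. The specificity values are derived from the stated accuracy, spam catch rate, and 0.5% spam rate rather than given in the example, and the function names are illustrative.

```python
def specificity_from(accuracy, recall, prevalence):
    """Solve accuracy = prevalence*recall + (1-prevalence)*specificity."""
    return (accuracy - prevalence * recall) / (1 - prevalence)


def balanced_accuracy(accuracy, recall, prevalence):
    """Mean of recall and specificity, so each class counts equally."""
    return (recall + specificity_from(accuracy, recall, prevalence)) / 2


SPAM_RATE = 0.005  # 0.5% of posts are spam, as in the example

# Filter A approves every post, so it catches no spam (recall 0).
filter_a = balanced_accuracy(accuracy=0.995, recall=0.0, prevalence=SPAM_RATE)
# Filter B catches 85% of spam at 97% overall accuracy.
filter_b = balanced_accuracy(accuracy=0.97, recall=0.85, prevalence=SPAM_RATE)

print(f"Filter A: {filter_a:.2f}  Filter B: {filter_b:.2f}")
```

On this metric Filter A scores 0.50, no better than chance, while Filter B scores about 0.91, which reverses the ranking suggested by raw accuracy.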

Verification Steps

Binary (yes/no) questions an LLM must answer to identify this aspect:

  1. Is the dataset highly imbalanced, with one class vastly outnumbering the other?
     Type: binary

  2. Could a naive model achieve high accuracy simply by predicting the majority class?
     Type: binary

  3. Does the model with higher accuracy fail to detect the minority class effectively?
     Type: binary

  4. Are metrics like precision, recall, or F1-score being ignored in favor of overall accuracy?
     Type: binary
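
When the labels and predictions behind a reported accuracy figure are available, the four questions above can also be checked numerically. The following is only a sketch for a binary problem where both classes are present: the function name, the result keys, and the three thresholds are all assumptions made for illustration, not values defined by this page.

```python
from collections import Counter


def accuracy_paradox_checks(y_true, y_pred, reported_metrics,
                            imbalance_ratio=10.0, high_accuracy=0.9,
                            low_recall=0.5):
    """Answer the four yes/no questions for one evaluated binary classifier.

    The three thresholds are illustrative assumptions.
    """
    counts = Counter(y_true)
    (majority, n_major), (minority, n_minor) = counts.most_common(2)

    # Accuracy of a naive model that always predicts the majority class.
    baseline_accuracy = n_major / len(y_true)

    # Recall on the minority class for the model under review.
    hits = sum(1 for t, p in zip(y_true, y_pred)
               if t == minority and p == minority)
    minority_recall = hits / n_minor

    class_aware = {"precision", "recall", "f1"}
    return {
        "highly_imbalanced": n_major / n_minor >= imbalance_ratio,
        "naive_model_scores_high": baseline_accuracy >= high_accuracy,
        "minority_class_missed": minority_recall < low_recall,
        "only_accuracy_reported": not (class_aware & set(reported_metrics)),
    }


# Hypothetical evaluation: 0.5% positives, model predicts "negative" for all.
y_true = [1] * 5 + [0] * 995
y_pred = [0] * 1000
checks = accuracy_paradox_checks(y_true, y_pred, reported_metrics={"accuracy"})
print(checks)
```

If all four values are true, the reported accuracy is a likely instance of the paradox. The fourth check depends on what was reported rather than on the data, so it takes the set of metric names as an explicit input.
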
Deep Dive

Hierarchical Context