Multicollinearity

Also Known As: Collinearity, Ill-conditioned design matrix

Statistical Error ID: multicollinearity

Definition

Multicollinearity occurs when two or more independent variables in a regression model are highly correlated, making it difficult to isolate the individual effect of each variable. While the overall model fit may remain good, standard errors become inflated, coefficient estimates become unstable, and significance tests on individual coefficients become unreliable. Under perfect multicollinearity, where one predictor is an exact linear combination of the others, the coefficients cannot be uniquely estimated at all.

Examples

A model predicting house prices includes both square footage and number of rooms as independent variables. Since larger homes typically have more rooms, the two variables are highly correlated, and the model cannot reliably separate their individual contributions to price.

A nutrition study modeling cholesterol levels includes both daily saturated fat intake and daily red meat consumption as predictors. Since people who eat more red meat also consume more saturated fat, the two variables are tightly correlated, and the model cannot reliably determine which one independently drives cholesterol levels.

An economic model predicting consumer spending includes both household income and household wealth as separate independent variables. Because wealthier households also tend to have higher incomes, the two variables move together so closely that neither coefficient is statistically significant, even though spending clearly depends on financial resources.

Verification Steps

Binary (yes/no) questions an LLM must answer to identify this aspect:

1. Are two or more independent variables in the model highly correlated with each other?
   Type: binary
2. Are the standard errors of the coefficients unusually large relative to the coefficient estimates?
   Type: binary
3. Do coefficient estimates change dramatically when a variable is added or removed?
   Type: binary
4. Is the analysis drawing conclusions about individual variable effects despite collinearity?
   Type: binary
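
The first three questions can be checked numerically when the data are available. The following is a minimal sketch for the two-predictor case, using only the Python standard library. The data are synthetic stand-ins for the income and wealth example, and the function name `diagnose` and the noise levels are assumptions made for illustration. What counts as "highly correlated" or "unusually large" remains a judgment call; the sketch only reports the quantities.

```python
import math
import random

def diagnose(x1, x2, y):
    """Fit y ~ b0 + b1*x1 + b2*x2 by OLS and return collinearity diagnostics."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Centered sums of squares and cross-products.
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2  # shrinks toward zero as x1 and x2 align
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    ss_res = sum((c - b0 - b1 * a - b2 * b) ** 2 for a, b, c in zip(x1, x2, y))
    sigma2 = ss_res / (n - 3)  # residual variance with 3 fitted parameters
    return {
        "correlation": s12 / math.sqrt(s11 * s22),   # question 1
        "b1": b1,
        "se_b1": math.sqrt(sigma2 * s22 / det),      # question 2
        "b2": b2,
        "se_b2": math.sqrt(sigma2 * s11 / det),      # question 2
        "b1_without_x2": s1y / s11,                  # question 3
    }

# Synthetic data (assumption): wealth tracks income almost exactly.
rng = random.Random(42)
income = [rng.gauss(0, 1) for _ in range(100)]
wealth = [v + rng.gauss(0, 0.05) for v in income]
spending = [a + b + rng.gauss(0, 1) for a, b in zip(income, wealth)]

report = diagnose(income, wealth, spending)
for key, value in report.items():
    print(f"{key}: {value:.3f}")
```

Question 4 is not numerical: it asks whether the text interprets `b1` or `b2` individually despite what these diagnostics show.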

Description

Why It Works

When predictors share much of the same information, the model cannot determine which variable is responsible for changes in the outcome. Small changes in the data can therefore cause large swings in the estimated coefficients. This instability is real, not apparent: the individual estimates are poorly determined even when the overall fit looks sound.
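
The sensitivity to small changes in the data can be illustrated by refitting the same model on bootstrap resamples (rows drawn with replacement) and measuring how much one coefficient moves. This is a rough sketch on synthetic data, assuming two predictors and a true coefficient of 1 on each. The names `fit_two_predictors` and `slope_spread` and all numeric settings are hypothetical choices for the illustration.

```python
import random
import statistics

def fit_two_predictors(x1, x2, y):
    """OLS slopes for y ~ b0 + b1*x1 + b2*x2 via centered normal equations."""
    m1, m2, my = statistics.fmean(x1), statistics.fmean(x2), statistics.fmean(y)
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2  # near zero when x1 and x2 are almost collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2

def slope_spread(collinear, n=200, n_resamples=200, seed=0):
    """Standard deviation of the fitted b1 across bootstrap resamples."""
    rng = random.Random(seed)
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    if collinear:
        x2 = [a + rng.gauss(0, 0.05) for a in x1]   # x2 nearly duplicates x1
    else:
        x2 = [rng.gauss(0, 1) for _ in range(n)]    # x2 unrelated to x1
    y = [a + b + rng.gauss(0, 1) for a, b in zip(x1, x2)]
    b1_values = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # rows drawn with replacement
        b1 = fit_two_predictors(
            [x1[i] for i in idx], [x2[i] for i in idx], [y[i] for i in idx]
        )[0]
        b1_values.append(b1)
    return statistics.stdev(b1_values)

collinear_spread = slope_spread(collinear=True)
independent_spread = slope_spread(collinear=False)
print(f"spread of b1, collinear predictors:   {collinear_spread:.3f}")
print(f"spread of b1, independent predictors: {independent_spread:.3f}")
```

The data-generating process is identical in both runs except for how `x2` is built, so the difference in spread is attributable to the correlation between the predictors.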

How to Counter

Calculate variance inflation factors (VIF) to detect collinearity. Consider combining correlated variables into a single index, dropping redundant predictors, or using regularization techniques like ridge regression that handle collinearity more gracefully.
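
As a sketch of the VIF calculation: the VIF for predictor j is 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on all the other predictors. The code below implements this with the standard library only, so the linear solver is hand-written; in practice a numerical library would normally be used instead. The house-price data are synthetic, and the variable names, helper names, and noise levels are assumptions for illustration. Cutoffs such as 5 or 10 are common rules of thumb rather than fixed standards.

```python
import random

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def r_squared(columns, target):
    """R^2 from regressing `target` on `columns` plus an intercept."""
    n = len(target)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * t for row, t in zip(design, target)) for p in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean = sum(target) / n
    ss_res = sum((t - f) ** 2 for t, f in zip(target, fitted))
    ss_tot = sum((t - mean) ** 2 for t in target)
    return 1.0 - ss_res / ss_tot

def vif(predictors):
    """Map each predictor name to 1 / (1 - R_j^2)."""
    result = {}
    for name, col in predictors.items():
        others = [c for other, c in predictors.items() if other != name]
        result[name] = 1.0 / (1.0 - r_squared(others, col))
    return result

# Synthetic data (assumption): rooms is largely determined by square footage.
rng = random.Random(1)
sqft = [rng.gauss(1500, 400) for _ in range(300)]
rooms = [s / 250 + rng.gauss(0, 0.5) for s in sqft]
age = [rng.gauss(30, 10) for _ in range(300)]  # unrelated to the other two

vifs = vif({"sqft": sqft, "rooms": rooms, "age": age})
for name, value in vifs.items():
    print(f"VIF({name}) = {value:.2f}")
```

A high VIF on `sqft` and `rooms` alongside a VIF near 1 on `age` would point to the first two as candidates for combining, dropping, or regularizing.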

Real-World Context

Multicollinearity is frequently encountered in social science research, where demographic variables (income, education, occupation) are correlated, and in financial models, where economic indicators move together.

Related Aspects

→ correlates with

Omitted Variable Bias

Excluding a relevant confounding variable from a model biases the estimated effects.

→ correlates with

Attenuation Bias

Measurement error in predictor variables biases effect estimates toward zero.

→ correlates with

Overfitting

A model or analysis fits the noise in the training data so closely that it fails to generalize to new data. The model captures random fluctuations rather than the underlying pattern.

→ correlates with

Ghost Variables

Gathering data on multiple variables but omitting the non-significant ones from the report.

Hierarchical Context

→ is a Statistical Error
