proxy_bias
Proxy bias occurs when an indirect measure is used in place of the true construct of interest, and the gap between the proxy and the true construct is correlated with other variables in the statistical model. Unlike classical random measurement error in a predictor (which tends to attenuate its association with the outcome), proxy bias creates systematic distortions because the mismatch between what is measured and what is meant is not random.
Household income is used as a proxy for socioeconomic status in a model that also includes race. If the income-to-SES gap differs systematically by race (e.g., because of wealth disparities not captured by income), then the race estimate in the model partly reflects the proxy-SES mismatch, biasing both coefficients.
A tech company uses number of GitHub commits as a proxy for software engineer productivity in performance reviews. This disadvantages engineers who do deep architectural work, mentoring, or documentation — contributions that generate few commits but enormous team value. The proxy captures only one visible slice of the true construct.
A public health study uses zip code as a proxy for environmental pollution exposure. In heterogeneous urban zip codes, some residents live next to a highway while others live far from it. The proxy introduces substantial measurement error that is not random — it systematically mismeasures exposure for residents in large, mixed zip codes, biasing effect estimates.
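The household income example can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a model of real data: the variable names, the true SES effect of 2.0, the absence of any true group effect, and the 0.5 income-to-SES gap for one group are all hypothetical choices made for illustration. The helper `ols` is a plain normal-equations solver written here so the block needs only the standard library.

```python
import random

def ols(columns, y):
    """Least-squares coefficients for y ~ 1 + columns, via the normal equations."""
    X = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(X[0])
    # Build the augmented system [X'X | X'y].
    A = [[sum(row[a] * row[b] for row in X) for b in range(k)]
         + [sum(row[a] * yi for row, yi in zip(X, y))] for a in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

random.seed(0)
n = 20000

# Assumed data-generating process (hypothetical):
# the outcome depends on true SES only; group has NO direct effect.
group = [random.randint(0, 1) for _ in range(n)]
ses = [random.gauss(0, 1) for _ in range(n)]
y = [2.0 * s + random.gauss(0, 1) for s in ses]

# The proxy overstates SES by 0.5 for group 1, so the proxy-SES gap
# is correlated with a variable that is also in the model.
income = [s + 0.5 * g + random.gauss(0, 0.3) for s, g in zip(ses, group)]

# Fit once with the true construct and once with the proxy.
_, b_ses, b_group_true = ols([ses, group], y)
_, b_income, b_group_proxy = ols([income, group], y)

print("group coefficient using true SES:", round(b_group_true, 3))
print("group coefficient using income proxy:", round(b_group_proxy, 3))
print("income coefficient:", round(b_income, 3))
```

Under these assumptions, the group coefficient should be close to zero when true SES is in the model, and clearly negative (roughly minus 0.5 times the income coefficient) when the proxy is used instead, even though group has no effect on the outcome. The income coefficient is also pulled below 2.0 by the random part of the proxy noise.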
Binary (yes/no) questions an LLM must answer to identify this aspect:
Is the study using an indirect measure (proxy) rather than directly measuring the construct of interest?
Type: binary
Is the gap between the proxy and the true construct correlated with other variables in the model?
Type: binary
Could the proxy-target measurement gap introduce systematic bias in estimates?
Type: binary
Has the validity of the proxy been assessed against the true construct in the study population?
Type: binary
True constructs are often unmeasurable directly. Researchers use the best available proxy, assuming measurement error is random. When the measurement error is correlated with predictors, the assumption fails silently and estimates are biased.
Validate proxies against the true construct where possible. Use multiple proxies and latent variable models. Consider the direction and sign of potential proxy bias. Conduct sensitivity analyses varying the proxy operationalization.
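One way to act on the validation advice is to check, in a subsample where both the proxy and the true construct were measured, whether the proxy-minus-truth gap differs across levels of a covariate that is also in the model. The sketch below assumes such a validation subsample exists; the function name `gap_by_group` and all numbers are made up for illustration only.

```python
from statistics import mean

def gap_by_group(proxy, truth, group):
    """Mean of (proxy - truth) within each level of a covariate."""
    gaps = {}
    for p, t, g in zip(proxy, truth, group):
        gaps.setdefault(g, []).append(p - t)
    return {g: mean(v) for g, v in gaps.items()}

# Hypothetical validation subsample (illustrative values, not real data).
proxy = [1.2, 0.8, 1.5, 2.1, 1.9, 2.6]
truth = [1.0, 0.9, 1.4, 1.5, 1.4, 2.0]
group = ["a", "a", "a", "b", "b", "b"]

gaps = gap_by_group(proxy, truth, group)

# A large spread between group-level mean gaps suggests the proxy error
# is not random with respect to this covariate.
spread = max(gaps.values()) - min(gaps.values())
print(gaps, spread)
```

A spread near zero is consistent with measurement error that is unrelated to the covariate. A clearly nonzero spread, as in this toy data, is a warning that estimates for that covariate may absorb part of the proxy-construct mismatch. In practice the comparison would also need an uncertainty estimate, which this sketch omits.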
BMI is widely used as a proxy for body fatness in health research, but the BMI-body fat relationship differs by age, sex, and ethnicity, introducing systematic bias in studies that include these variables.