regression_artifact
A regression artifact occurs when individuals are selected for a study or intervention because of extreme scores on a variable that contains measurement error, and subsequent measurements appear to improve simply because extreme scores tend to regress toward the population mean on remeasurement. This regression is a mathematical property of imperfect reliability, not a treatment effect.
Students who score in the bottom 10% on a reading test are enrolled in a remedial reading program. On follow-up testing, their scores improve substantially. However, a control group of equally low-scoring students who received no intervention also improves almost as much, due to regression to the mean.
A corporate wellness program enrolls the 15% of employees who scored highest on a stress screening questionnaire. Three months later, their average stress scores have dropped noticeably, and HR declares the program a success. However, extreme scores on any self-report measure naturally drift toward the mean on retesting, regardless of any intervention.
Athletes who have their worst-ever performance in a qualifying round are selected for an experimental sports psychology coaching program. Most of them perform better in the next competition. Coaches attribute the improvement to the program, not recognizing that an unusually bad performance is statistically likely to be followed by a more typical — and better — one.
Binary (yes/no) questions an LLM must answer to identify this aspect:
Were participants selected or identified based on extreme scores on the outcome variable?
Type: binary
Does performance improve in a follow-up measurement compared to selection?
Type: binary
Is a control group that was not selected based on extreme scores available for comparison?
Type: binary
Would the improvement be expected by regression to the mean alone, independent of any intervention?
Type: binary
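The question of whether an improvement would be expected from regression to the mean alone can be made concrete with a small calculation. The sketch below assumes a classical test theory setting in which the population mean and spread are the same at both measurements, so the expected retest score is the population mean plus the test-retest reliability times the original deviation from that mean. The function name `expected_followup` and the numbers in the usage lines are illustrative assumptions, not part of the original entry.

```python
def expected_followup(score, population_mean, reliability):
    """Expected retest score under regression to the mean alone.

    Assumes (hypothetically) that the retest has the same population mean
    and variance as the first test, and that `reliability` is the
    test-retest correlation, between 0 and 1.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return population_mean + reliability * (score - population_mean)


# Illustrative numbers: a score 20 points below a mean of 50,
# measured with a reliability of 0.8.
baseline = 30.0
predicted = expected_followup(baseline, population_mean=50.0, reliability=0.8)
print(f"expected gain with no intervention: {predicted - baseline:.1f} points")
```

An observed gain close to this predicted value gives little evidence of a treatment effect; only improvement beyond it calls for another explanation.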
Any measurement with less than perfect reliability will show regression to the mean when extreme scorers are remeasured. Selection of extreme cases guarantees that the remeasured scores will be less extreme on average.
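The mechanism described above can be illustrated with a small simulation. This is a minimal sketch under assumed conditions: each observed score is a stable true score plus independent random error, true scores and errors have equal spread (a reliability of about 0.5), and nobody receives any intervention between the two tests. The function name `simulate_retest` and all parameter values are hypothetical.

```python
import random
import statistics


def simulate_retest(n=10_000, true_sd=10.0, error_sd=10.0,
                    cutoff_fraction=0.10, seed=0):
    """Select the lowest scorers on test 1 and remeasure them on test 2.

    No treatment is applied; any change in the selected group's mean
    comes only from the independent measurement error on each test.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(50.0, true_sd) for _ in range(n)]
    test1 = [t + rng.gauss(0.0, error_sd) for t in true_scores]
    test2 = [t + rng.gauss(0.0, error_sd) for t in true_scores]

    # Indices of the bottom `cutoff_fraction` of scorers on test 1.
    k = int(n * cutoff_fraction)
    selected = sorted(range(n), key=lambda i: test1[i])[:k]

    mean_test1 = statistics.fmean([test1[i] for i in selected])
    mean_test2 = statistics.fmean([test2[i] for i in selected])
    return mean_test1, mean_test2


mean_before, mean_after = simulate_retest()
print(f"selected group, test 1 mean: {mean_before:.1f}")
print(f"selected group, test 2 mean: {mean_after:.1f}")
```

The selected group's mean moves back toward the population mean of 50 on the second test even though nothing was done to the participants. Reducing `error_sd` raises the reliability and shrinks this spurious gain.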
Include a control group selected by the same extreme-score criterion, so that both groups are expected to regress by the same amount. Use repeated baseline measurements before treatment. Use analysis of covariance (ANCOVA) with the baseline score as a covariate to adjust for regression to the mean.
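The control-group remedy can be sketched with a simulation in which the true treatment effect is known. The example assumes that low scorers are picked by the same cutoff and then randomly split into treated and control groups, and it compares mean gains between the groups. This is a plain difference in gains, not ANCOVA. The function name `compare_with_control` and all numeric settings are hypothetical.

```python
import random
import statistics


def compare_with_control(n=50_000, true_effect=5.0,
                         cutoff_fraction=0.10, seed=1):
    """Compare gains for treated and control groups chosen by one cutoff."""
    rng = random.Random(seed)
    true_scores = [rng.gauss(50.0, 10.0) for _ in range(n)]
    baseline = [t + rng.gauss(0.0, 10.0) for t in true_scores]

    # Same extreme-score criterion for everyone, then a random split.
    k = int(n * cutoff_fraction)
    selected = sorted(range(n), key=lambda i: baseline[i])[:k]
    rng.shuffle(selected)
    treated, control = selected[: k // 2], selected[k // 2:]

    def mean_gain(group, effect):
        # Follow-up score = true score + effect + fresh measurement error.
        gains = [true_scores[i] + effect + rng.gauss(0.0, 10.0) - baseline[i]
                 for i in group]
        return statistics.fmean(gains)

    treated_gain = mean_gain(treated, true_effect)
    control_gain = mean_gain(control, 0.0)
    return treated_gain, control_gain, treated_gain - control_gain


treated_gain, control_gain, estimate = compare_with_control()
print(f"treated gain (naive estimate): {treated_gain:.1f}")
print(f"control gain (regression only): {control_gain:.1f}")
print(f"treated minus control: {estimate:.1f}")
```

The treated group's raw gain overstates the effect because it includes the regression component. Subtracting the control group's gain removes that component, since both groups were selected the same way and regress by about the same amount.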
Sports coaches who punish poor performance often see performance improve afterward and credit the punishment; regression to the mean, not punishment, is the likely explanation.