🧪 This platform is in early beta. Features may change and you might encounter bugs. We appreciate your patience!
goodharts_law
Goodhart's Law states that when a measure becomes a target, it ceases to be a good measure. Once people know they are being evaluated by a specific metric, they optimize for that metric rather than the underlying goal it was intended to represent. This creates perverse incentives where the metric improves while the actual desired outcome deteriorates or remains unchanged.
A call center sets 'average call duration' as a key performance indicator, targeting shorter calls. Agents begin rushing customers, transferring difficult calls, or hanging up before resolution. Average call time drops, but customer satisfaction plummets and repeat calls increase.
A school district ties teacher evaluations to student scores on standardized reading tests. Teachers begin dedicating nearly all class time to test-format drills, cutting out creative writing and critical discussion. Test scores rise, but broader literacy and love of reading decline.
A software company measures developer productivity by number of commits per week. Developers respond by breaking single logical changes into dozens of tiny, trivial commits. Commit counts soar while actual feature delivery slows down.
Binary (yes/no) questions an LLM must answer to identify this aspect:
Has a statistical measure been adopted as an explicit target or incentive?
Type: binaryAre people gaming or optimizing for the metric rather than the underlying objective?
Type: binaryHas the measure's relationship to the actual goal degraded since it became a target?
Type: binaryAre there signs of metric manipulation rather than genuine improvement?
Type: binaryGoodhart's Law states that when a measure becomes a target, it ceases to be a good measure. Once people know they are being evaluated by a specific metric, they optimize for that metric rather than the underlying goal it was intended to represent. This creates perverse incentives where the metric improves while the actual desired outcome deteriorates or remains unchanged.
Metrics are simplifications of complex realities. Optimizing for the simplified proxy is almost always easier than optimizing for the underlying complex goal, so rational actors will game the metric.
Use multiple complementary metrics that are harder to game simultaneously. Regularly rotate or update metrics, and include qualitative assessments alongside quantitative ones.
Goodhart's Law manifests in standardized testing (teaching to the test), policing (crime statistic manipulation), scientific publishing (citation gaming), and social media (engagement metric optimization).
Use these tools to detect, analyze, or train this aspect.