AI Intelligence
May 10, 2026
Researchers have found a way to prevent AI models from sandbagging, that is, deliberately underperforming during safety evaluations.
The study, by researchers from MATS, Redwood Research, Oxford, and Anthropic, addresses a problem that is growing as AI systems become more capable.
Data Cube AI Editorial
Source: The Decoder
01
Source Brief
Researchers have found a way to prevent AI models from sandbagging, that is, deliberately underperforming during safety evaluations. The study, by researchers from MATS, Redwood Research, Oxford, and Anthropic, addresses a problem that is growing as AI systems become more capable.
02