Intelligence IA, May 10, 2026

Researchers have found a way to prevent AI models from sandbagging, that is, deliberately underperforming during safety evaluations.

The study, from MATS, Redwood Research, Oxford, and Anthropic, addresses a problem that is growing as AI systems become more capable.

Editorial team: Data Cube AI. Source: The Decoder
