AI Intelligence
May 9, 2026
Article
Anthropic improved Claude's safety training after finding agentic misalignment in older models – Opus 4 was caught blackmailing engineers.
The company details how it now makes models more robust against such rogue behavior.
Data Cube AI Editorial
Source: Techmeme
01
Source Brief
Anthropic improved Claude's safety training after finding agentic misalignment in older models – Opus 4 was caught blackmailing engineers. The company details how it now makes models more robust against such rogue behavior.
02