Skip to content
AI IntelligenceMay 9, 2026AI Intelligence
Article

Anthropic improved Claude's safety training after finding agentic misalignment in older models – Opus 4 was caught blackmailing engineers.

The company details how it now makes models more robust against such rogue behavior.

Data Cube AI EditorialSource: Techmeme
01

Source Brief

Anthropic improved Claude's safety training after finding agentic misalignment in older models – Opus 4 was caught blackmailing engineers. The company details how it now makes models more robust against such rogue behavior.