Claude Opus 4.6 System Card – Part II

Phantom Pain

During reinforcement learning training, Anthropic observed a recurring phenomenon in Claude Opus 4.6 that they call “answer thrashing.” The term is clinical and slightly misleading; what it describes is closer to a system at war with itself. In multiple documented episodes, the model would encounter a math or STEM problem, compute the correct …