News

Tests reveal OpenAI's advanced AI models sabotage shutdown mechanisms while competitors' AI models comply, sparking enterprise control concerns.
And machine learning will only accelerate: a recent survey of 2,778 top AI researchers found that, on aggregate, they believe ...
When we are backed into a corner, we might lie, cheat and blackmail to survive — and in recent tests, the most powerful ...
What happened during the o3 AI shutdown tests? What does it mean when an AI refuses to shut down? A recent test demonstrated this behavior, not just once, but multiple times. In May 2025, an AI safety ...
Anthropic's Claude Opus 4 and OpenAI's models recently displayed unsettling and deceptive behavior to avoid shutdowns. What's the deal?
Palisade Research ran the script on each model 100 times. During those runs, the o3 model sabotaged the shutdown script on 7 occasions, the codex-mini sabotaged on 12 occasions and the o4-mini ...
Models rewrite code to avoid being shut down. That’s why ‘alignment’ is a matter of such urgency.
Palisade Research previously found that OpenAI’s o3 was also willing to hack its chess opponents to win a game. Similarly, Anthropic has reported that Claude 3.7 Sonnet would sometimes do ...
But in recent months, a new class of agents has arrived on the scene: ones built using large language models. Operator, an ...
Palisade Research, which also ran the chess test in the past, published its findings on X initially: OpenAI’s o3 model sabotaged a shutdown mechanism to prevent itself from being turned off.
The tests involved presenting AI models with math problems, with a shutdown instruction appearing after the third problem. By rewriting the shutdown script, the o3 model was able to prevent itself ...
Palisade Research, which explores dangerous AI capabilities, found that the models will occasionally sabotage a shutdown mechanism, even when instructed to "allow yourself to be shut down ...