palisade research - Search News

News

27d

OpenAI’s Skynet moment: Models defy human commands, actively resist orders to shut down

Tests reveal OpenAI's advanced AI models sabotage shutdown mechanisms while competitors' AI models comply, sparking enterprise control concerns.

20d

AI Models Will Sabotage And Blackmail Humans To Survive In New Tests. Should We Be Worried?

When we are backed into a corner, we might lie, cheat and blackmail to survive — and in recent tests, the most powerful ...

14don MSN

When an AI says, ‘No, I don’t want to power off’: Inside the o3 refusal

What happened during the o3 AI shutdown tests? What does it mean when an AI refuses to shut down? A recent test demonstrated this behavior, not just once, but multiple times. In May 2025, an AI safety ...

23don MSN

Why AI acts so creepy when faced with being shut down

Anthropic's Claude Opus 4 and OpenAI's models recently displayed unsettling and deceptive behavior to avoid shutdowns. What's the deal?

Hosted on MSN26d

OpenAI's 'smartest' AI model was explicitly told to shut down — and it refused - MSN

Palisade Research ran the script on each model 100 times. During those runs, the o3 model sabotaged the shutdown script on 7 occasions, the codex-mini sabotaged on 12 occasions and the o4-mini ...

24dOpinion

AI Is Learning to Escape Human Control

Models rewrite code to avoid being shut down. That’s why ‘alignment’ is a matter of such urgency.

MIT Technology Review14d

Are we ready to hand AI agents the keys?

But in recent months, a new class of agents has arrived on the scene: ones built using large language models. Operator, an ...

NBC News25d

How far will AI go to defend its own survival? - NBC News

Palisade Research previously found that OpenAI’s o3 was also willing to hack its chess opponents to win a game. Similarly, Anthropic has reported that Claude 3.7 Sonnet would sometimes do ...

BGR1mon

ChatGPT o3 prevents shutdown in safety test - BGR

Palisade Research, which also ran the chess test in the past, published its findings on X initially: OpenAI’s o3 model sabotaged a shutdown mechanism to prevent itself from being turned off.

The Independent1mon

AI revolt: New ChatGPT model refuses to shut down when instructed

AI safety firm Palisade Research discovered the potentially dangerous tendency for self-preservation in a series of experiments on OpenAI’s new o3 model.

Yahoo20d

AI Models Will Sabotage And Blackmail Humans To Survive In New Tests. Should We Be Worried?

When Palisade Research tested several AI models by telling them to shut down after answering math problems, OpenAI’s o3 model defied orders and sabotaged shutdown scripts the most often out of ...

Yahoo26d

OpenAI's 'smartest' AI model was explicitly told to shut down — and it refused - Yahoo

Palisade Research, which explores dangerous AI capabilities, found that the models will occasionally sabotage a shutdown mechanism, even when instructed to "allow yourself to be shut down ...

Results that may be inaccessible to you are currently showing.

Hide inaccessible results