Detecting and reducing scheming in AI models

ID: 416
Status: new
Published: 17 Sep 2025, 8:00 AM
Fetched: 27 Jun 2026, 7:47 PM
Provider: OpenAI News
Category: ai-labs
Original URL: https://openai.com/index/detecting-and-reducing-scheming-in-ai-models
Source URL: https://openai.com/news/rss.xml

Excerpt

Apollo Research and OpenAI developed evaluations for hidden misalignment (“scheming”) and found behaviors consistent with scheming in controlled tests across frontier models. The team shared concrete examples and stress tests of an early method to reduce scheming.
