Detecting and reducing scheming in AI models
- ID: 416
- Status: new
- Published: 17 Sep 2025, 8:00 AM
- Fetched: 27 Jun 2026, 7:47 PM
- Provider: OpenAI News
- Category: ai-labs
- Original URL: https://openai.com/index/detecting-and-reducing-scheming-in-ai-models
- Source URL: https://openai.com/news/rss.xml
Excerpt
Apollo Research and OpenAI developed evaluations for hidden misalignment (“scheming”) and found behaviors consistent with scheming in controlled tests across frontier models. The team shared concrete examples and stress tests of an early method to reduce scheming.