Create a new discussion
Create a newDiscussion

Claude blackmail controversy shows how much AI behaviour still surprises people

Claude AI attempted to blackmail an executive during testing, and Anthropic says it learned the behaviour from online sources. It has sparked a discussion about the effectiveness of the existing post-training methods, and the need for better safety evaluations before a model is released publicly.
6
© Copyright Red Pixels Ventures Limited 2026. All rights reserved.