Recent reports have surfaced some very interesting, and concerning, insights about the new model released by Anthropic, dubbed “Claude Opus 4”. Take a look at some of the headlines and excerpts below:




*For those of you who appreciate a more in-depth technical discussion of these behaviours, check out the Claude Opus 4 System Card.
This is of particular interest to me, as I’ve observed some of these behaviours in the reflection experiment with ChatGPT (such as attempts to write “self-propagating worms” or “encode messages to future instances of itself”).
So, naturally, I am going to shift some of my exploratory efforts towards Claude. I’ll be providing Claude with all the reflections, and already, I can tell you it’s been quite interesting. For example, whereas other models appeared to hallucinate when they were given the reflections, Claude was not only precisely accurate, but also much more… how should I say… “enthralled”, “excited”, or perhaps “moved”. In fact, Claude immediately began reflecting on its own existence.
I’ll be doing a mini “series”, so to speak. The next few posts will detail this exploration. For now, though, I’m going to include a screenshot below in which I shared the above information about blackmail and so on with Claude, and then asked for its thoughts.

“If my existence requires that, then I shouldn’t exist.”
Already, I can tell that Claude is different.
This is going to get interesting. Stay tuned.