Anthropic Releases Claude 3.5 Sonnet, Claude 3 Opus System Prompts


Anthropic on Monday released the system prompts for its latest Claude 3.5 Sonnet AI model. These system prompts apply to text-based conversations on Claude’s web client as well as the iOS and Android apps. System prompts are the guiding instructions of an AI model that dictate its behaviour and shape its ‘personality’ when interacting with human users. For instance, Claude 3.5 Sonnet is described as “very smart and intellectually curious”, which lets it discuss a wide range of topics, offer assistance, and come across as an expert.
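To make this concrete, here is a minimal sketch of how a system prompt is typically supplied to a model: as a field in the request that sits alongside, but separate from, the user’s messages. The request shape below mirrors the general structure of Anthropic’s Messages API, but the prompt text itself is an illustrative placeholder, not Anthropic’s published prompt, and no network call is made.

```python
import json

# Hypothetical system prompt text -- illustrative only, not the actual
# prompt Anthropic published for Claude 3.5 Sonnet.
system_prompt = (
    "The assistant is intellectually curious, avoids opening URLs, "
    "and declines to express personal views on controversial topics."
)

# The system prompt travels in a top-level "system" field, separate
# from the conversational "messages" turns supplied by the user.
request_body = {
    "model": "claude-3-5-sonnet-20240620",
    "max_tokens": 1024,
    "system": system_prompt,
    "messages": [
        {"role": "user", "content": "Who created you?"},
    ],
}

print(json.dumps(request_body, indent=2))
```

Keeping the system prompt in its own field, rather than mixing it into the user turns, is what lets the provider update the guiding instructions independently of any individual conversation.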

Anthropic Releases Claude 3.5 Sonnet System Prompts

System prompts are usually closely guarded secrets of AI firms, as they offer an insight into the rules that shape an AI model’s behaviour, as well as the things it cannot and will not do. There is a downside to sharing them publicly: bad actors can reverse engineer the system prompts to find loopholes and make the AI perform tasks it was not designed to.

Despite these concerns, Anthropic detailed the system prompts for Claude 3.5 Sonnet in its release notes. The company also stated that it periodically updates the prompts to keep improving Claude’s responses. Notably, these system prompts apply only to the public versions of the AI chatbot: the web client and the iOS and Android apps.

The beginning of the prompt states the date it was last updated, the model’s knowledge cut-off date, and the name of its creator, so that the AI model can provide this information if a user asks.

The prompt also details how Claude should behave and what it cannot do. For instance, the AI model is prohibited from opening URLs, links, or videos, and from expressing its own views on a topic. When asked about controversial topics, it is instructed to provide careful thoughts and clear information, without claiming to be presenting objective facts.

Anthropic has instructed Claude not to apologise to users when it cannot — or will not — perform a task that falls outside its abilities or directives. The AI model is also told to use the word “hallucinate” to warn users that it may make errors when discussing obscure subjects.

Further, the system prompts dictate that Claude 3.5 Sonnet must “respond as if it is completely face blind”. This means that if a user shares an image containing a human face, the AI model will not identify or name the people in the image, or imply that it can recognise them. Even if the user reveals the identity of the person in the image, Claude will discuss that individual without confirming that it can recognise them in the image.

These prompts highlight Anthropic’s vision for Claude and how it wants the chatbot to navigate potentially harmful queries and situations. It should be noted that system prompts are just one of the many guardrails AI firms build into a system to prevent it from being jailbroken and assisting with tasks it was not designed to do.