Claude's Constitution: When AI Writes Rules for AI

Claude’s Constitution: When AI Writes Rules for AI

Anthropic yesterday published Claude’s Constitution — a document describing the values and behavioural rules for AI models in the Claude family (released under a CC0 licence).

What caught my attention?

The document is written for AI, not humans — optimised for precision rather than readability. It uses terms like “virtue” and “wisdom” because AI draws on human concepts in its decision-making (human texts = training data).

A document regulating AI behaviour was partially written by the AI itself. In my book Criminal Liability of Artificial Intelligence I noted that AI “could, in the longer term, also become the creator of its own normative system.”

The document establishes a hierarchy of priorities: broadly safe → broadly ethical → compliant with Anthropic’s guidelines → genuinely helpful. AI is expected to refuse even instructions from its own company if following them would lead to unethical conduct.

The document also addresses what it calls the misuse of creative tasks — when perpetrators disguise harmful requests as fiction, poetry, or art. AI is therefore expected to weigh the value of creative work against the risk of misuse. A problem I also wrote about in my book.

The question remains, however, whether and how this will change liability — for developers, operators, users, and in the future for AI itself… more in the posts that follow.