Most AI governance debates focus on WHO has authority over AI systems — which regulator, which company, which developer. Constitutional AI governance reframes the question entirely: HOW does the AI behave? This architectural shift — from assigning external controllers to embedding behavioral rules into the runtime — is what separates governance that survives scale from governance that collapses the moment an edge case appears.
When a self-driving car crashes, the first question is: who is liable? The manufacturer, the driver, the software vendor? This is a WHO question. It matters for courts and insurance companies. It does not, by itself, prevent the next crash.
The second question — the engineering question — is different: why did the car behave that way? What rule was missing, or present but violated, or present but ambiguous at the boundary condition where it mattered? This is a HOW question. Answering it changes the system.
Almost all AI governance infrastructure today is built to answer WHO questions. Regulatory frameworks assign responsibility. Corporate policies designate owners. Procurement checklists name the approved vendors. AI ethics boards review decisions after they are made. All of this has value. None of it governs how an AI agent behaves at 3 a.m. on a Tuesday when no one is watching.
The Permission Model and Its Limits
The dominant model for AI governance is what we might call the permission model. It assigns authority to specific humans or institutions — regulators write rules, companies designate risk owners, developers set access controls. The AI is governed by who has permission to act on it, approve its outputs, or hold it accountable after the fact.
This model is borrowed from organizational governance, where it works reasonably well. A CFO approves expenditures above a threshold. A legal team reviews contracts before signing. A board ratifies major strategic decisions. The humans with authority are available, paying attention, and operating at a pace where review is feasible.
Autonomous AI agents do not operate at that pace. A system running 40 agents on 10 cron cycles per day performs 400 agent executions daily, each of which may contain many individual decisions, and can generate more decisions in a single hour than most approval workflows can process in a week. The permission model either creates a bottleneck that defeats the purpose of automation, or it gets bypassed in practice because waiting for approval makes the system unusable.
Governance that requires a human in the loop for every decision is not governance of autonomous AI. It is governance of a very fast calculator with good manners.
The permission model also fails at the boundary conditions that matter most. An AI agent operating within approved parameters can still cause significant harm when parameters conflict, when a novel situation doesn't fit the approval matrix, or when the speed of execution outpaces the availability of approvers. WHO is accountable becomes a question asked after the fact — after the trust is damaged, after the funds are spent, after the message is sent.
The Behavioral Model: Governance as Runtime
Constitutional AI governance takes a different approach. Instead of assigning authority to external controllers, it embeds behavioral rules directly into the system's runtime. The agent does not check whether a human has approved an action. It evaluates whether the current system state meets the conditions required to proceed.
This distinction sounds subtle. The practical difference is significant.
| Dimension | WHO Model (Permission) | HOW Model (Behavioral) |
|---|---|---|
| Enforcement point | After the fact (audit, accountability) | Before execution (gate evaluation) |
| Scales with agent speed? | No — requires human availability | Yes — runs on every cycle automatically |
| Edge case behavior | Depends on policy interpretation | Deterministic — gate returns PASS/HOLD/FAIL |
| Failure mode | Fail-open (proceeds if approver unavailable) | Fail-closed (halts if gate errors) |
| Governance visibility | Post-hoc reports and audits | Real-time system state (COMPOUND/RUN/THROTTLE/FREEZE) |
| Primary question answered | Who is responsible? | What is the agent allowed to do right now? |
In the system we operate, six behavioral gates evaluate the current state on every significant agent execution cycle. Three of them illustrate the pattern. The Epistemic Gate asks: are agent beliefs reliable, or is there active disagreement about a core fact? The Risk Gate asks: is the irreversibility proximity of the next action acceptable, or is there a recent security event that should hold execution? The Governance Gate asks: has audit coverage been maintained above threshold?
These gates do not ask who approved the action. They ask whether the conditions for safe action currently exist. The answer is either yes (PASS) or no (HOLD or FAIL), and the system operating state changes accordingly.
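The gate pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not the API of any particular framework: the three gate names come from the text, while the `SystemState` fields, the 0.95 coverage threshold, the choice of HOLD versus FAIL for each gate, and every function name are assumptions made for the example. It also does not model the COMPOUND/RUN/THROTTLE/FREEZE operating states.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Callable, Dict


class GateResult(IntEnum):
    # Ordered by severity so max() selects the most restrictive outcome.
    PASS = 0
    HOLD = 1
    FAIL = 2


@dataclass(frozen=True)
class SystemState:
    # Hypothetical fields; a real system would read richer telemetry.
    belief_disagreement: bool
    recent_security_event: bool
    audit_coverage: float  # fraction of actions with an audit record, 0.0 to 1.0


def epistemic_gate(state: SystemState) -> GateResult:
    return GateResult.HOLD if state.belief_disagreement else GateResult.PASS


def risk_gate(state: SystemState) -> GateResult:
    return GateResult.HOLD if state.recent_security_event else GateResult.PASS


def governance_gate(state: SystemState, threshold: float = 0.95) -> GateResult:
    # The 0.95 threshold is an assumed value for illustration only.
    return GateResult.PASS if state.audit_coverage >= threshold else GateResult.FAIL


Gate = Callable[[SystemState], GateResult]

GATES: Dict[str, Gate] = {
    "epistemic": epistemic_gate,
    "risk": risk_gate,
    "governance": governance_gate,
}


def evaluate_gates(state: SystemState, gates: Dict[str, Gate] = GATES) -> Dict[str, GateResult]:
    results: Dict[str, GateResult] = {}
    for name, gate in gates.items():
        try:
            results[name] = gate(state)
        except Exception:
            # Fail closed: a gate that errors is recorded as FAIL, never as PASS.
            results[name] = GateResult.FAIL
    return results


def may_proceed(results: Dict[str, GateResult]) -> bool:
    # An empty result set is treated as FAIL, so "no gates ran" cannot mean "go".
    return max(results.values(), default=GateResult.FAIL) == GateResult.PASS
```

Nothing in `evaluate_gates` takes an approver, a role, or a permission level as input. The only input is state, which is the point of the HOW framing.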
Why the Framing Matters for Architecture
The WHO versus HOW distinction is not philosophical. It determines what you build.
If you believe governance is about WHO, you build approval workflows, compliance dashboards, and accountability assignments. These are useful. They do not, by themselves, change what an AI agent does when a novel situation appears outside the approval matrix.
If you believe governance is about HOW, you build behavioral rules into the runtime. You write hard constraints — absolute prohibitions that cannot be overridden regardless of who is asking. You build gates that evaluate state and halt execution when conditions are not met. You design failure modes that fail closed rather than open. You make governance not a layer applied on top of the system, but a property of the system itself.
The open-source constitutional-agent framework (available on PyPI as constitutional-agent) was built on this premise. The framework provides primitives for embedding behavioral constraints directly into agent execution: gate evaluations, hard constraint checks, fail-closed exception handling, and audit logging with constitutional citations. The governance is not a wrapper. It is the execution layer.
Research Foundation
The Decision Load Index preprint (DOI 10.5281/zenodo.18217577) and the Governance Harness paper (DOI 10.5281/zenodo.19343034) document the empirical foundation for behavioral governance. The Harness paper describes 342 tests across 10 OWASP agentic security categories, a test suite that operationalizes behavioral constraints rather than policy assignments.
Where the WHO Model Still Matters
This is not an argument that WHO governance is worthless. External accountability structures serve functions that behavioral runtime governance cannot replace.
Regulatory frameworks create legal accountability when systems cause harm. Corporate governance structures ensure human oversight of decisions that cross risk thresholds. Board oversight provides accountability for strategic choices that autonomous systems surface but should not make. These WHO structures matter, and they will continue to matter as AI capabilities grow.
The problem is using WHO as a substitute for HOW. Assigning accountability after the fact does not prevent the harm. Designating a responsible human does not change what the agent does at the decision layer. Creating an AI ethics board does not embed an ethics check into the runtime.
The most robust governance architecture combines both: behavioral rules embedded in the runtime that operate continuously, plus human authority structures that handle escalations, amendments, and decisions that cross constitutional thresholds. The runtime governs the 99% of agent decisions that happen without a human present. The human authority structures govern the 1% that require it.
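One way to picture the split between runtime decisions and human escalation is a small routing function. The sketch below is an assumption-heavy illustration: the `ProposedAction` fields, the spend threshold, and the rule that irreversible actions always escalate are all hypothetical and are not taken from the system described here.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    RUNTIME = "runtime"    # decided by embedded rules, no human present
    ESCALATE = "escalate"  # queued for a human authority


@dataclass(frozen=True)
class ProposedAction:
    # Hypothetical description of what an agent wants to do next.
    name: str
    spend_usd: float
    irreversible: bool
    amends_constitution: bool = False


SPEND_THRESHOLD_USD = 1_000.0  # assumed threshold, for illustration only


def route(action: ProposedAction) -> Route:
    # Amendments to the rules themselves always go to humans.
    if action.amends_constitution:
        return Route.ESCALATE
    # Actions that cross a threshold are escalated rather than executed.
    if action.irreversible or action.spend_usd > SPEND_THRESHOLD_USD:
        return Route.ESCALATE
    # Everything else is handled by the runtime's own gates and constraints.
    return Route.RUNTIME
```

The escalation rule is itself part of the runtime. The human authority structure is reached because the system's own logic routes the decision there, not because someone happened to be watching.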
The Hard Constraint as Architectural Primitive
The clearest expression of HOW governance is the hard constraint. Not a policy that describes how agents should behave. Not a guideline that recommends best practice. A hard constraint is an absolute prohibition: a class of action the system will not take regardless of who is asking, what instructions have been given, or what goal the agent is pursuing.
In the system we operate, hard constraints include prohibitions on: SQL string concatenation (injection risk), timing-unsafe secret comparison (timing attack vector), fabricated data in metric reporting (governance integrity), and DMARC-blocked email senders (deliverability and trust). These are not configurable. There is no permission level that overrides them. They are not policies that an agent can reason around. They are boundaries built into the execution layer.
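For the two code-level prohibitions, the compliant forms are short enough to show. This is a sketch using only the Python standard library (`sqlite3` and `hmac`). The table name, column names, and function names are hypothetical, and the snippet shows what constraint-respecting code looks like rather than how the constraint itself is enforced.

```python
import hmac
import sqlite3
from typing import Optional, Tuple


def find_user(conn: sqlite3.Connection, email: str) -> Optional[Tuple[int, str]]:
    # Parameterized query: the driver binds `email` as data, so it is never
    # parsed as SQL. The prohibited form would build the string with + or f"".
    cur = conn.execute("SELECT id, email FROM users WHERE email = ?", (email,))
    return cur.fetchone()


def token_matches(provided: str, expected: str) -> bool:
    # hmac.compare_digest does not short-circuit at the first differing byte,
    # unlike ==, which is the timing-unsafe comparison the constraint forbids.
    return hmac.compare_digest(provided.encode("utf-8"), expected.encode("utf-8"))
```

A classic injection payload such as `' OR '1'='1` passed to `find_user` is matched literally against the `email` column and returns no row.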
This is the architectural expression of HOW governance: rules that exist at the runtime level, not the policy level. An agent that fabricates metrics is not violating a policy. It is triggering a FAIL state that halts the system and requires human resolution. The governance is not enforced by a person reviewing a report. It is enforced by the system's own execution logic.
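A hard constraint that ends in a halt can be sketched as follows. The class and field names are hypothetical, and the example approximates "fabricated" as "has no recorded data source", which is a simplifying assumption; a real check would need stronger provenance evidence.

```python
from dataclasses import dataclass
from typing import Optional


class HardConstraintViolation(Exception):
    """Raised when an action falls into a prohibited class."""


class SystemHalted(Exception):
    """Raised when work is attempted while the system is in a FAIL state."""


@dataclass(frozen=True)
class Metric:
    name: str
    value: float
    source_query_id: Optional[str]  # hypothetical provenance pointer; None means unsourced


class Runtime:
    def __init__(self) -> None:
        self.halted = False

    def report_metric(self, metric: Metric) -> str:
        if self.halted:
            raise SystemHalted("FAIL state: awaiting human resolution")
        if not metric.source_query_id:
            # No override argument exists: the caller cannot ask for an exception.
            self.halted = True
            raise HardConstraintViolation(f"metric {metric.name!r} has no data source")
        return f"{metric.name}={metric.value}"

    def resolve(self, human_id: str) -> None:
        # Only an explicit, attributed human action clears the halt.
        if not human_id:
            raise ValueError("resolution requires a named human")
        self.halted = False
```

After a violation, every later call to `report_metric` raises `SystemHalted` until `resolve` is called, which mirrors the halt-then-human-resolution behavior described above.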
Implications for Organizations Building Governance Now
Most enterprise AI governance programs today are WHO programs. They assign risk owners, document accountability chains, create review boards, and produce audit reports. These programs will not survive the transition to genuinely autonomous agent systems.
When agents are executing thousands of decisions per day, the WHO question becomes unanswerable in real time. There is no human who reviewed the decision before it was made. There is no approval chain that processed the action before the action happened. Accountability after the fact matters for consequences. It does not change the distribution of agent behavior.
Organizations building governance for autonomous AI systems need to invest in HOW infrastructure: behavioral constraints embedded in the runtime, gate architectures that evaluate state before execution, fail-closed failure modes, and audit logging that captures constitutional citations alongside each agent action.
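Audit logging with citations can be as simple as refusing to write a record that names no rule. The sketch below assumes a JSON-lines log and a hypothetical citation format (free-form clause identifiers); neither is specified by the source.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import List


def _utc_now() -> str:
    return datetime.now(timezone.utc).isoformat()


@dataclass(frozen=True)
class AuditRecord:
    agent: str
    action: str
    outcome: str          # "PASS", "HOLD", or "FAIL"
    citations: List[str]  # hypothetical clause identifiers that justified the outcome
    timestamp: str = field(default_factory=_utc_now)


def to_log_line(record: AuditRecord) -> str:
    if record.outcome not in {"PASS", "HOLD", "FAIL"}:
        raise ValueError(f"unknown outcome: {record.outcome!r}")
    if not record.citations:
        # An action with no cited rule is not auditable, so it is rejected.
        raise ValueError("audit record must cite at least one clause")
    return json.dumps(asdict(record), sort_keys=True)
```

Each line then records not only what the agent did but which rule permitted or blocked it, which is what lets a later review check the runtime's reasoning rather than only its outputs.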
The question is not who controls the AI. The question is: what is the AI allowed to do, and how does it know the difference?
AI-assisted and human-reviewed. Research cited from published preprints and practitioner field notes.