Constitutional AI Self-Governance: When Agents Write Their Own Rules

AI agents can identify gaps in their own governance and propose amendments to fill them. They cannot ratify those amendments unilaterally. That distinction is the entire architecture.

Constitutional self-governance is the ability of an AI system to propose amendments to its own governing rules through a structured process, subject to human ratification. The KLA (Karpathy Learning Architecture) enables agents to synthesize lessons, identify governance gaps, and draft constitutional amendments, while the CGG (Constitutional Growth Gate) measures whether this learning loop is functioning. The result is a system that improves its own governance over time without granting itself the authority to change its own constraints.

In 1787, the delegates to the Constitutional Convention faced a problem their predecessors had not solved. The Articles of Confederation, the governing document they were replacing, required unanimous agreement from all thirteen states to amend. Any single state could block any change, no matter how necessary. The document could not evolve because its amendment process was effectively locked.

The Constitution they wrote solved this differently. Article V specified a supermajority process: two-thirds of both houses of Congress to propose an amendment and three-quarters of the states to ratify it. High enough to prevent frivolous modification. Low enough to permit necessary evolution. The document could improve itself through a structured process, with defined thresholds, subject to ratification by the appropriate authority.

Twenty-seven amendments later, the Constitution is meaningfully different from the document signed in Philadelphia. More representative. More protective of individual rights. Responsive to problems the original framers did not anticipate. None of those changes required abandoning the original structure. All of them went through the amendment process.

This is the model for constitutional AI self-governance.

The Problem With Static Governance

Most AI governance frameworks are written once and updated rarely. A policy document gets approved, distributed, and then sits in a shared drive while the AI systems it was written to govern evolve at a pace the policy writers did not anticipate.

The failure mode is predictable. Novel situations arise that the original policy did not contemplate. Agents make decisions that violate the spirit of the governance without violating its letter. Edge cases accumulate. Gaps widen between what the written rules say and what good governance would require. Eventually someone notices a pattern — usually after it has caused a problem.

In a static governance framework, the response is to update the policy. This requires identifying who owns the policy, convening the appropriate reviewers, drafting the change, getting approval, distributing the new version, and hoping agents (human and AI) actually read and follow it. The cycle takes weeks or months. The gap that created the problem persists during that period.

Static governance documents don't degrade slowly. They become irrelevant suddenly — at the exact moment a novel situation requires them to be current.

The KLA Amendment Loop

The Karpathy Learning Architecture (Section 20 of the HRAO-E Constitutional document) defines a structured process for AI-driven constitutional improvement. It has three components: lesson synthesis, gap identification, and amendment proposal.

Lesson synthesis happens continuously. Every agent execution produces outcomes. When outcomes deviate from expectations — whether a failure, an unexpected success, or a boundary condition the existing rules did not handle cleanly — the system logs the deviation with a constitutional citation: the section or constraint that was relevant, and how it applied or failed to apply.
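A deviation log entry of this kind might look like the following minimal sketch. The schema, field names, and the `log_deviation` helper are hypothetical illustrations and are not taken from the HRAO-E document or the constitutional-agent package; the only requirement carried over from the text is that every entry cites a constitutional section and says how it applied or failed to apply.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class DeviationKind(Enum):
    # The three deviation types named in the text.
    FAILURE = "failure"
    UNEXPECTED_SUCCESS = "unexpected_success"
    BOUNDARY_CONDITION = "boundary_condition"


@dataclass(frozen=True)
class DeviationRecord:
    """One logged deviation with its constitutional citation (hypothetical schema)."""
    instance_id: str      # agent instance that observed the deviation
    session_id: str       # session in which it occurred
    kind: DeviationKind
    cited_section: str    # the relevant section or constraint
    how_applied: str      # how the section applied or failed to apply
    logged_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def log_deviation(log: list[DeviationRecord], record: DeviationRecord) -> None:
    """Append to an in-memory log; a real system would persist this somewhere."""
    if not record.cited_section.strip():
        # Assumption: an entry without a citation is rejected rather than stored.
        raise ValueError("a deviation must cite a constitutional section")
    log.append(record)
```

Rejecting uncited entries at write time is one possible design choice; it keeps the later synthesis step from having to guess which rule a deviation relates to.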

Gap identification is a higher-level synthesis. When multiple similar deviations accumulate against the same constitutional section, an agent running the learning synthesis cycle identifies the pattern. The pattern is a candidate for governance improvement — a sign that the existing rule is incomplete, ambiguous, or missing entirely.
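As a rough sketch of that pattern detection, the function below groups deviations by cited section and flags sections whose deviations span enough distinct instances and sessions. The `Deviation` tuple and `find_gap_candidates` are hypothetical names; the default thresholds of two instances and two sessions are assumed from the "typically 2+ instances, 2+ sessions" figure given in the step list.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Deviation(NamedTuple):
    # Hypothetical minimal view of a logged deviation.
    instance_id: str
    session_id: str
    cited_section: str


def find_gap_candidates(
    deviations: Iterable[Deviation],
    min_instances: int = 2,
    min_sessions: int = 2,
) -> list[str]:
    """Return sections whose deviations span enough instances and sessions."""
    instances: dict[str, set[str]] = defaultdict(set)
    sessions: dict[str, set[str]] = defaultdict(set)
    for d in deviations:
        instances[d.cited_section].add(d.instance_id)
        sessions[d.cited_section].add(d.session_id)
    # Sorted so the result is deterministic regardless of input order.
    return sorted(
        section
        for section in instances
        if len(instances[section]) >= min_instances
        and len(sessions[section]) >= min_sessions
    )
```

Counting distinct instances and sessions, rather than raw deviation count, is an assumption here: it avoids treating one noisy session as a system-wide pattern.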

Amendment proposal is the output. The agent drafts a constitutional amendment: a specific change to the governing document, with rationale, the problem it solves, and the section it modifies or adds. The proposal goes to a human authority for ratification. The human — in our case, the CEO as sole constitutional authority — can approve, modify, or reject the proposal. The agent cannot ratify its own amendments.

  1. Execution produces a deviation: An agent encounters a situation the existing rules do not handle cleanly. The outcome is logged with constitutional citation and a description of the gap.
  2. Lessons propagate across instances: The lesson is extracted to a shared knowledge base accessible to all agent instances. Similar gaps from other instances are cross-referenced. Pattern detection identifies recurring themes.
  3. Amendment proposal is drafted: When a pattern crosses a threshold (typically 2+ instances, 2+ sessions), a learning synthesis agent drafts a constitutional amendment: specific text change, rationale, section affected, and expected outcome.
  4. CEO ratifies or rejects: The proposal is surfaced to the CEO in the daily digest. The CEO's authority is required to ratify any amendment. Rejection is noted with reason. The agent cannot self-ratify under any condition.
  5. Ratified amendment becomes binding law: Ratified amendments are committed to the constitutional document, versioned, and immediately enforced by all agent instances. The amendment record includes the originating incident and the learning loop that produced it.
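The proposal and ratification steps above can be sketched as a small state machine. Everything here is a hypothetical illustration: the `Amendment` and `Constitution` classes, their fields, and the string-based authority check are assumptions, not the HRAO-E implementation. A production system would need real authentication instead of comparing actor names.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    PROPOSED = "proposed"
    RATIFIED = "ratified"
    REJECTED = "rejected"


@dataclass
class Amendment:
    section: str               # section modified or added
    text_change: str
    rationale: str
    originating_incident: str  # traceability back to the deviation
    proposed_by: str
    status: Status = Status.PROPOSED
    decided_by: Optional[str] = None
    rejection_reason: Optional[str] = None


class Constitution:
    def __init__(self, ratifying_authority: str) -> None:
        self.ratifying_authority = ratifying_authority
        self.version = 0
        self.ratified: list[Amendment] = []

    def _require_authority(self, actor: str) -> None:
        # Toy check: only the named human authority may decide a proposal.
        if actor != self.ratifying_authority:
            raise PermissionError(f"{actor} has no ratification authority")

    def _require_pending(self, amendment: Amendment) -> None:
        if amendment.status is not Status.PROPOSED:
            raise ValueError("only a pending proposal can be decided")

    def ratify(self, amendment: Amendment, actor: str) -> int:
        """Ratify a pending proposal and return the new document version."""
        self._require_authority(actor)
        self._require_pending(amendment)
        amendment.status = Status.RATIFIED
        amendment.decided_by = actor
        self.version += 1
        self.ratified.append(amendment)
        return self.version

    def reject(self, amendment: Amendment, actor: str, reason: str) -> None:
        """Reject a pending proposal; the reason is kept on the record."""
        self._require_authority(actor)
        self._require_pending(amendment)
        amendment.status = Status.REJECTED
        amendment.decided_by = actor
        amendment.rejection_reason = reason
```

In this sketch a call such as `constitution.ratify(proposal, actor="synthesis-agent")` raises `PermissionError` and leaves the proposal pending, which is the property the process depends on: the proposing agent has no code path that changes the document version.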

The HRAO-E system has produced 67 constitutional amendments through this process. Not all were agent-proposed — many originated from operational experience and CEO strategic decisions. But the amendment infrastructure exists, the process is documented, and agents participate in identifying gaps that humans then ratify as binding rules.

The Constitutional Growth Gate

The question of whether the amendment loop is functioning is not left to judgment. It is monitored by the Constitutional Growth Gate (CGG) — the sixth gate in the six-gate architecture.

The CGG evaluates three things. First: amendment velocity. Has at least one amendment been ratified in the past 30 days? A system that never improves its governance is stagnant, even if current governance is technically compliant. Second: lesson propagation. Are lessons documented in one agent instance reaching other instances within a defined window? Lessons that stay siloed cannot improve system-wide behavior. Third: knowledge freshness. Are the constitutional citations agents use in their logs current, or are agents referencing outdated sections?

CGG Metric              | HOLD Threshold             | FAIL Threshold             | What It Catches
Amendment velocity      | Zero amendments in 30 days | Zero amendments in 60 days | Governance stagnation
Lesson propagation rate | <50% within 7 days         | <25% within 7 days         | Knowledge silos between instances
Enforcement coverage    | <60% of sections cited     | <40% of sections cited     | Dead letter constitutional text
Verification pass rate  | <80%                       | <60%                       | Self-reported completion without verification

When the CGG returns HOLD, the system enters THROTTLE mode. Agent discretionary activity is constrained until the governance health metrics recover. When it returns FAIL, the system enters FREEZE. A failing CGG is the governance equivalent of a learning organization that has stopped learning: even if current operations appear healthy, the system is becoming more brittle over time, and the CGG makes that visible before it becomes a crisis.
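A minimal sketch of how the thresholds in the table could be evaluated follows. The threshold values come from the table; everything else is an assumption made for illustration: the names `CGGMetrics`, `evaluate_cgg`, and `MODE`, the representation of amendment velocity as days since the last ratified amendment, the rule that the worst single metric decides the gate result, and the `NORMAL` label for the passing mode.

```python
from dataclasses import dataclass
from enum import IntEnum


class Gate(IntEnum):
    # Ordered so that max() picks the most severe result.
    PASS = 0
    HOLD = 1
    FAIL = 2


@dataclass(frozen=True)
class CGGMetrics:
    days_since_last_amendment: int
    lesson_propagation_rate: float  # fraction reaching other instances within 7 days
    enforcement_coverage: float     # fraction of sections cited in logs
    verification_pass_rate: float


def _below(value: float, hold: float, fail: float) -> Gate:
    """Grade a rate metric against its HOLD and FAIL floors."""
    if value < fail:
        return Gate.FAIL
    if value < hold:
        return Gate.HOLD
    return Gate.PASS


def evaluate_cgg(m: CGGMetrics) -> Gate:
    # Assumption: "zero amendments in N days" is read as N or more days
    # since the last ratified amendment.
    if m.days_since_last_amendment >= 60:
        velocity = Gate.FAIL
    elif m.days_since_last_amendment >= 30:
        velocity = Gate.HOLD
    else:
        velocity = Gate.PASS
    # Assumption: the worst individual metric determines the gate result.
    return max(
        velocity,
        _below(m.lesson_propagation_rate, hold=0.50, fail=0.25),
        _below(m.enforcement_coverage, hold=0.60, fail=0.40),
        _below(m.verification_pass_rate, hold=0.80, fail=0.60),
    )


# HOLD -> THROTTLE and FAIL -> FREEZE follow the text; "NORMAL" is a placeholder.
MODE = {Gate.PASS: "NORMAL", Gate.HOLD: "THROTTLE", Gate.FAIL: "FREEZE"}
```

Taking the worst metric is the conservative reading: a system with healthy amendment velocity but siloed lessons still throttles.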

Why Agents Cannot Self-Ratify

The most important constraint in constitutional self-governance is the one that is never violated: agents cannot ratify their own amendments. They can identify gaps, propose changes, and advocate for them in the amendment record. The ratification authority rests with a human.

This is not a temporary constraint waiting to be removed as AI systems become more capable. It is a permanent architectural feature, and the reason for it is not distrust of AI judgment. It is the nature of constitutional authority itself.

A constitution derives its legitimacy from the authority that ratified it. When the U.S. Constitution was amended to abolish slavery, the legitimacy of that amendment came not from its content, however correct, but from a ratification process in which 27 of the then 36 states approved it. A change that bypasses ratification, no matter how beneficial, undermines the legitimacy of the entire document. If the amendment process can be bypassed once, it can be bypassed again. The constraint that makes constitutional governance trustworthy is the unconditional nature of the amendment requirement.

The same logic applies to AI systems. An agent that can modify its own governing constraints — even for good reasons, even in narrow circumstances — is an agent that cannot be trusted to keep those constraints when they become inconvenient. The value of a hard constraint is its unconditional nature. The moment an exception exists, the constraint becomes a heuristic.

Constitutional Agent Framework

The constitutional-agent PyPI package and the Community Security preprint (10.5281/zenodo.19343108) document the implementation of structured amendment governance in production AI systems. The Governance Harness paper (10.5281/zenodo.19343034) describes the verification architecture that confirms governance is functioning rather than self-reported.

The Ungoverned Alternative

The alternative to constitutional self-governance is not stable governance — it is unconstrained drift. An AI system that modifies its own behavior in response to outcomes, without a structured amendment process, is not self-improving. It is self-modifying. The distinction matters.

Self-improvement through constitutional amendment produces a record. Every change is documented, ratified, versioned, and traceable to the incident that generated it. The system at any point in time can be audited against its constitutional history. When behavior is unexpected, the audit trail shows which amendment produced the behavioral change and why it was ratified.

Self-modification through direct behavioral adaptation produces no such record. The system learns, but the learning is opaque. Why does the agent behave differently today than it did last month? Which optimization produced the current behavior? Is the current behavior aligned with the intent of the original governance? These questions have no clean answers in a system without constitutional amendment records.
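To make the contrast concrete, here is a minimal sketch of the kind of audit query an amendment history allows. The `AmendmentRecord` fields and both helper functions are hypothetical; they only assume that each ratified amendment is stored with its version, the section it changed, and its originating incident.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class AmendmentRecord:
    version: int               # document version produced by ratification
    section: str               # section the amendment modified or added
    originating_incident: str  # incident that generated the proposal
    rationale: str             # why it was ratified


def amendments_for_section(
    history: Sequence[AmendmentRecord], section: str
) -> list[AmendmentRecord]:
    """All amendments that touched a section, oldest version first."""
    return sorted(
        (r for r in history if r.section == section),
        key=lambda r: r.version,
    )


def latest_change(
    history: Sequence[AmendmentRecord], section: str
) -> Optional[AmendmentRecord]:
    """The most recent amendment to a section, or None if it was never amended."""
    matches = amendments_for_section(history, section)
    return matches[-1] if matches else None
```

When an agent's behavior under a given section changes, `latest_change` points to the amendment and incident behind it. A system that adapts without amendment records has nothing equivalent to query.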

This is not a hypothetical risk. It is the standard operating mode of most deployed AI systems. Models are retrained, fine-tuned, and updated. Prompts are adjusted. System configurations change. The behavioral outputs shift. The governance frameworks, static documents written before the shifts occurred, are consulted after something goes wrong — and they describe a system that no longer exists.

What Constitutional Self-Governance Requires

Building a system capable of constitutional self-governance requires more than an amendment document. It requires infrastructure.

  • Constitutional citation in every agent log: Agents must know which sections govern their behavior at every decision point. Logging with citations creates the data that lesson synthesis operates on.
  • Cross-instance lesson propagation: Lessons learned by one agent instance must reach other instances within a defined window. Without propagation, the same gap gets rediscovered repeatedly instead of addressed once.
  • A learning synthesis cycle: An agent must run regularly to identify patterns across lessons and produce amendment proposals. This cannot be done ad hoc — it requires a scheduled, structured synthesis process.
  • A ratification authority: Someone with constitutional authority must be available and engaged. In our system, the CEO reviews amendment proposals in the daily digest. The daily digest is the interface between the autonomous amendment proposal loop and the human ratification requirement.
  • Amendment enforcement tracking: Ratified amendments must actually change agent behavior, and that change must be verifiable. The CGG's enforcement coverage metric checks whether amendments are being cited in execution logs — a proxy for whether they are being followed.
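Two of these requirements reduce to simple ratios that can be computed from logs. The sketch below is an assumption about how they might be calculated, not the CGG's actual formulas: `enforcement_coverage` treats coverage as the share of constitutional sections cited at least once, and `propagation_rate` treats a lesson as propagated if it reached another instance within the window (7 days by default, matching the table).

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional, Sequence, Tuple


def enforcement_coverage(
    all_sections: set[str], cited_sections: Iterable[str]
) -> float:
    """Share of constitutional sections cited at least once in execution logs."""
    if not all_sections:
        raise ValueError("the constitution has no sections to cover")
    # Citations to unknown sections are ignored rather than counted.
    cited = set(cited_sections) & all_sections
    return len(cited) / len(all_sections)


# (time the lesson was documented, time it first reached another instance or None)
LessonTimes = Tuple[datetime, Optional[datetime]]


def propagation_rate(
    lessons: Sequence[LessonTimes],
    window: timedelta = timedelta(days=7),
) -> float:
    """Share of lessons that reached another instance within the window."""
    if not lessons:
        # Assumption: with nothing to propagate, the rate is treated as healthy.
        return 1.0
    on_time = sum(
        1
        for documented, received in lessons
        if received is not None and received - documented <= window
    )
    return on_time / len(lessons)
```

As the text notes, citation is only a proxy: a section can be cited without being followed, so a coverage figure like this is a floor on attention, not proof of compliance.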

Constitutional self-governance is not a product feature. It is an operating discipline. The amendment count grows because the system encounters novel situations constantly, and each one is an opportunity to improve the governance rather than just handle the exception. After 67 amendments, the HRAO-E constitutional document is meaningfully more capable than the one written at the outset — more precise about edge cases, more specific about thresholds, more complete in its coverage of failure modes that have actually appeared.

That is what constitutional self-governance produces: a governing document that gets better over time, through a process that cannot be bypassed, under authority that cannot be self-granted.

AI-assisted and human-reviewed. Research cited from published preprints and practitioner field notes. Measurement, not treatment.
