In 2020, I was deep in the MuleSoft ecosystem. We were building API management layers for enterprises that had decades of accumulated systems — Salesforce, SAP, Oracle, homegrown databases — none of which talked to each other in any coherent way. The pitch was straightforward: you need a connectivity layer with governance built in. Not just a wire between systems, but contracts, rate limits, circuit breakers, monitoring, and auditability.
I spent a lot of time on the phrase they used internally: the problem isn’t connecting systems — it’s governing the connections.
In 2026, I am building an autonomous AI governance architecture. And I keep coming back to that phrase, because the problem is structurally identical. One abstraction layer up.
What MuleSoft Actually Solved
The surface problem was connectivity. Enterprise A has a CRM. Enterprise B has an ERP. Team C built a custom data warehouse. None of them have APIs designed to interoperate. Someone needs to write the glue.
But the glue was never the hard part. The hard part was what happened after the glue was written:
- Who can call this API, at what rate, and under what authentication terms?
- What happens when the downstream system is degraded — does the caller get a timeout, a fallback, or a queue?
- When something breaks, who gets notified, and is there an audit trail showing exactly what was called and when?
- When the API contract changes, how do you communicate that without breaking every consumer?
MuleSoft’s Anypoint Platform was a centralized console for all of these questions. The three-tier model — Experience APIs for user-facing surfaces, Process APIs for business logic, System APIs for data sources — gave teams a vocabulary for where governance decisions belonged.
The insight I carried away was this: governance always comes after sprawl, never before. Every enterprise we worked with had the same story. Teams had been connecting systems for years, one script at a time, one webhook at a time. By the time someone built a governance layer, there were already 200 undocumented connections, no one knew who owned what, and “changing anything” meant guessing which systems would break.
The business case for governance was not theoretical. It was forensic — what broke, why, and what would have caught it.
The Same Architecture, One Layer Up
In 2026, the enterprise problem has moved up one level of abstraction.
The systems are AI agents now. An enterprise (or in our case, a seven-instance autonomous organization) has a business development agent, a growth agent, a security agent, a QA agent, a strategic agent, and a dozen cron-driven autonomous processes. None of them coordinate natively. Each one makes decisions. Each decision affects the others.
The surface problem is coordination. But the hard part is identical to what MuleSoft faced:
- Who authorizes which agent to take what action, and under what constraints?
- What happens when an agent is in a degraded or looping state — does it keep running, pause, or escalate?
- When an agent makes a decision that causes downstream harm, is there an audit trail?
- When the governance rules change, how do you propagate that without breaking running agents?
The structural parallel is not a metaphor. It is the same systems engineering problem applied to a different resource type. In 2020, the resource was API calls between software systems. In 2026, the resource is decisions passing between AI agents.
The Parallel in Detail
| API-Led Connectivity (2020) | Constitutional AI Governance (2026) |
|---|---|
| System / Process / Experience APIs | Infrastructure / Coordination / User-facing agents |
| Anypoint centralized governance console | Six-Gate + 12 Numbers architecture |
| API contracts (versioning, backward compat) | Hard constraints HC-1 through HC-17 |
| Rate limiting + circuit breakers | RALPH Loop (Signs, circuit breaker, backoff, DLQ) |
| API analytics + health dashboards | The 12 Numbers + gate evaluation metrics |
| API sprawl (200 undocumented connections) | Agent sprawl (agents added with no shared governance) |
API-led connectivity tiers → Constitutional agent tiers. MuleSoft’s three-tier model maps directly to how agents are classified in the HRAO constitutional architecture. System APIs (raw data access) map to infrastructure agents: cron jobs, database agents, health monitors. Process APIs (business logic) map to coordination agents: BDA orchestrator, gate evaluators, execution loggers. Experience APIs (user-facing) map to user-facing agents: email delivery, Decision Load Index (DLI) scoring, onboarding flows. The same separation-of-concerns principle applies: user-facing agents should not reach directly into infrastructure, and coordination agents mediate between the two.
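The tier rule can be expressed as a small call policy. What follows is a minimal sketch, not the HRAO implementation: the names `Tier`, `ALLOWED_CALLS`, and `can_call` are hypothetical, and the policy that each tier may call only the tier directly below it is an assumption made for illustration.

```python
from enum import Enum


class Tier(Enum):
    INFRASTRUCTURE = "infrastructure"  # analogous to System APIs
    COORDINATION = "coordination"      # analogous to Process APIs
    USER_FACING = "user_facing"        # analogous to Experience APIs


# Hypothetical policy (an assumption, not from the source architecture):
# each tier may only call the tier directly below it.
ALLOWED_CALLS = {
    Tier.USER_FACING: {Tier.COORDINATION},
    Tier.COORDINATION: {Tier.INFRASTRUCTURE},
    Tier.INFRASTRUCTURE: set(),
}


def can_call(caller: Tier, callee: Tier) -> bool:
    """Return True if the caller's tier is allowed to invoke the callee's tier."""
    return callee in ALLOWED_CALLS[caller]
```

Under this policy, `can_call(Tier.USER_FACING, Tier.INFRASTRUCTURE)` is rejected, so an email delivery agent would have to go through a coordination agent to reach a database agent.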
API contracts → Hard constraints. In API management, a breaking change to a contract requires a version bump and deprecation notice. You cannot silently change what a contract guarantees. Our 17 hard constraints work the same way: absolute prohibitions that no agent can override, regardless of local optimization pressure. HC-9 (no fabricated data), HC-13 (no silent agent outage >24h), HC-16 (no environment variable access outside config.py) — these are contract terms that the whole system depends on. They do not get bypassed because a particular agent finds them inconvenient.
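One way to make such constraints explicit is to express each one as a named check that every proposed action must pass. The sketch below is illustrative only: the `Action` fields, the `HardConstraint` type, and the specific check logic are assumptions, and only the constraint codes and descriptions (HC-9, HC-16) come from the text above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Action:
    """Hypothetical description of something an agent proposes to do."""
    agent: str
    kind: str             # e.g. "report_metric" or "read_env"
    source_file: str      # file the action originates from
    has_provenance: bool  # whether reported data traces back to a real source


@dataclass(frozen=True)
class HardConstraint:
    code: str
    description: str
    violated_by: Callable[[Action], bool]


# Illustrative checks; how the real constraints are evaluated is not specified.
CONSTRAINTS = [
    HardConstraint(
        "HC-9", "no fabricated data",
        lambda a: a.kind == "report_metric" and not a.has_provenance),
    HardConstraint(
        "HC-16", "no environment variable access outside config.py",
        lambda a: a.kind == "read_env" and a.source_file != "config.py"),
]


def violations(action: Action) -> list[str]:
    """Return the codes of every hard constraint the action would violate."""
    return [c.code for c in CONSTRAINTS if c.violated_by(action)]


def authorize(action: Action) -> bool:
    """An action is allowed only if it violates no hard constraint."""
    return not violations(action)
```

The design point is that `authorize` has no override parameter: an agent cannot pass a flag that skips a constraint, which mirrors the idea that contract terms are not negotiable at call time.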
Rate limiting and circuit breakers → RALPH Loop resilience protocol. Every mature API gateway implements circuit breakers. If a downstream system is consistently failing, stop hammering it and route around it or return a graceful degradation. Our RALPH Loop does this for agents: persistent failure markers (Signs), circuit breaker states (CLOSED/OPEN/HALF_OPEN), exponential backoff from 2s to 60s maximum, and a dead letter queue for unrecoverable failures. The names differ. The pattern is identical.
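The pattern described above can be sketched in a few dozen lines. This is a generic circuit breaker with capped exponential backoff, not the RALPH Loop itself: the class and method names, the failure threshold, the trip limit, and the rule for when a task goes to the dead letter queue are all assumptions. Only the three state names and the 2s to 60s backoff range come from the text, and persistent failure markers (Signs) are left out.

```python
from enum import Enum


class State(Enum):
    CLOSED = "CLOSED"        # calls flow normally
    OPEN = "OPEN"            # calls are blocked until the backoff expires
    HALF_OPEN = "HALF_OPEN"  # one probe call is allowed through


def backoff_seconds(attempt: int, base: float = 2.0, cap: float = 60.0) -> float:
    """Exponential backoff: 2s, 4s, 8s, ... capped at 60s (attempt starts at 0)."""
    return min(cap, base * (2 ** attempt))


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, max_trips: int = 5):
        self.failure_threshold = failure_threshold  # assumed value
        self.max_trips = max_trips                  # assumed value
        self.state = State.CLOSED
        self.failures = 0        # consecutive failures while CLOSED
        self.trips = 0           # how many times the breaker has opened
        self.retry_at = 0.0      # earliest time a probe is allowed
        self.dead_letters = []   # in-memory stand-in for a dead letter queue

    def allow(self, now: float) -> bool:
        """Return True if a call may proceed at time `now` (in seconds)."""
        if self.state is State.OPEN and now >= self.retry_at:
            self.state = State.HALF_OPEN
        return self.state is not State.OPEN

    def record_success(self) -> None:
        """A successful call closes the breaker and resets all counters."""
        self.state = State.CLOSED
        self.failures = 0
        self.trips = 0

    def record_failure(self, now: float, task: str) -> None:
        """Count a failure; open the breaker on threshold or on a failed probe."""
        self.failures += 1
        if self.state is State.HALF_OPEN or self.failures >= self.failure_threshold:
            if self.trips >= self.max_trips:
                self.dead_letters.append(task)  # treated as unrecoverable
            self.retry_at = now + backoff_seconds(self.trips)
            self.trips += 1
            self.state = State.OPEN
            self.failures = 0
```

Time is passed in as `now` rather than read from a clock, which keeps the sketch deterministic. A caller would check `allow(now)` before invoking an agent and report the outcome with `record_success()` or `record_failure(now, task)`.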
The Shift That Matters
In API management, there was a turning point in how teams framed the problem. Early on, the question was: can we connect this? With enough engineering time, you could connect almost anything.
The mature question became: should we connect this, and under what terms? Rate limits, authentication scopes, deprecation schedules, contract versioning — these are not technical problems. They are governance problems that happen to have technical implementations.
The same shift is happening in AI governance now. The early question is: can we get this agent to take this action? With enough prompt engineering, you can get an agent to do almost anything.
The question that matters is: should this agent take this action, and under what constraints? That requires a governance layer — not just a system prompt, but a constitutional architecture with persistent rules, observable metrics, and auditable decisions.
We built ours over 90 days. It has 17 hard constraints, 6 evaluation gates, a resilience protocol, and 64 amendments that document how the rules evolved when reality did not match the initial assumptions. The full account is in the 90-day retrospective.
Agent Sprawl Is Already Happening
The MuleSoft parallel suggests the governance question arrives late in the process. Every enterprise we worked with had accumulated their 200 undocumented connections before anyone thought about a governance layer. The business case for fixing it required the forensic evidence of something breaking badly enough to justify the investment.
The same pattern is visible in AI development today. Teams are adding agents one at a time, each handling a different function, each built by a different team or vendor. The agent that handles customer emails has no formal contract with the agent that handles CRM updates. The agent that manages scheduling has no visibility into the agent that manages resource allocation. When they conflict, someone manually intervenes — if anyone notices the conflict at all.
This is the pre-governance phase. It is not a failure of the teams involved. You cannot build governance for problems you have not encountered yet, and the problems do not become visible until you have enough agents running in parallel that their decisions start interfering with each other.
The practical minimum is not a full constitutional architecture. It is three things: observable decision logs, constraint definitions that are explicit and shared across agents, and a mechanism for propagating rule changes without breaking running agents. Everything else in the governance stack — gates, resilience protocols, amendment processes — is the enterprise version of that minimum, built out as the surface area expands.
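Those three pieces fit in a very small amount of code. The sketch below is one possible shape under stated assumptions, not the architecture described in this article: `RuleSet`, `Governance`, `publish`, and `decide` are hypothetical names, rules are reduced to a set of forbidden action kinds, and the log is an in-memory list rather than durable storage.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleSet:
    """Explicit, shared constraint definitions (simplified to forbidden kinds)."""
    version: int
    forbidden_kinds: frozenset


class Governance:
    """Shared registry: one current RuleSet plus an append-only decision log."""

    def __init__(self, rules: RuleSet):
        self._rules = rules
        self.log = []  # one JSON string per decision

    def publish(self, rules: RuleSet) -> None:
        """Swap in a new rule set; agents pick it up on their next decision."""
        if rules.version <= self._rules.version:
            raise ValueError("rule set version must increase")
        self._rules = rules

    def decide(self, agent: str, kind: str) -> bool:
        """Check an action against the current rules and log the outcome."""
        rules = self._rules  # snapshot so one decision uses exactly one version
        allowed = kind not in rules.forbidden_kinds
        self.log.append(json.dumps(
            {"agent": agent, "kind": kind, "allowed": allowed,
             "rules_version": rules.version},
            sort_keys=True))
        return allowed
```

Because agents consult the registry at decision time instead of holding their own copy of the rules, a rule change does not require restarting them, and every log entry records which rule version produced it.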
What This Means for Teams Building Now
If you are adding AI agents to a system today — one at a time, each handling a different function — you are in the pre-governance phase. The trajectory from here is predictable. At some point, an agent will make a decision that conflicts with another agent’s assumptions. A metric will be reported that turns out to have been calculated wrong, and no one will notice until the downstream decisions based on it have compounded. An agent will take an action that was locally rational but systemically harmful.
When that happens, you will want a governance layer. The question is whether you build it before the incident or after.
Building governance after sprawl costs substantially more than building it alongside the first few agents, when the surface area is still small and the constraint definitions are still being negotiated. After sprawl, you first have to rediscover what exists, who owns it, and what will break if it changes. The API management teams that built governance early avoided much of that rework. The same dynamic applies here.
For more on where MCP fits into this architecture — and why transport infrastructure and governance infrastructure are not the same layer even when they run on the same stack — see the companion article: MCP Is a Transport Layer, Not a Governance Layer.