What protocols does the Agent Security Harness test?

The framework tests four wire protocols: MCP (Model Context Protocol) for tool invocation security, A2A (Agent-to-Agent) for inter-agent communication, L402 (Lightning payment protocol) for Bitcoin-based agent transactions, and x402 (USDC/stablecoin) for fiat-equivalent agent payments. Each protocol has dedicated test modules covering authentication, authorization, injection, and data leakage.

How does the framework relate to OWASP and NIST standards?

The framework provides complete coverage of the OWASP Agentic Security Initiatives (ASI) Top 10, mapping each test to specific ASI categories (ASI01 through ASI10). It also aligns with NIST AI 800-2 automated benchmark evaluation standards, providing statistical confidence intervals for test results. Each test is categorized using the STRIDE threat model.

332 Security Tests for AI Agent Systems: What We Built and Why

Q: What is the Agent Security Harness?

The Agent Security Harness is an open-source Python framework that runs 363 security tests across 24 modules against AI agent systems. It tests across 4 wire protocols (MCP, A2A, L402, x402), covers the complete OWASP Agentic Security Top 10, aligns with NIST AI 800-2 evaluation standards, and includes adapters for 20+ enterprise platforms. It is available on PyPI, GitHub, and ClawHub.

The Problem: Governance Without Verification

You can define a governance framework. You can write constitutional constraints. You can publish a preprint describing 12 interlocking mechanisms. But without a way to verify that the governance works under adversarial conditions, you have a policy document, not a defense.

This is the gap between governance and verification. The White House tells organizations to deploy AI. NIST provides a risk management framework. The EU AI Act requires human oversight and incident reporting. But none of them provide a tool that answers the question: “If an adversary targeted our agent system right now, would our governance hold?”

We built one. It is open-source, available on PyPI and GitHub, and it runs 332 tests across 24 modules covering the four protocols that AI agents actually use in production.

Key Empirical Findings

Tool description injection (poisoning MCP tool metadata to override agent behavior) succeeds across AutoGen, CrewAI, and LangGraph in default configuration.
Context leakage across delegation handoffs is common when frameworks run with default settings.
CVE-2026-25253 (CVSS 8.8) validated the tool poisoning vector at scale: 135,000 affected instances on a major agent skill marketplace. Authentication was present. Tool integrity validation was absent. Agents discovered and executed poisoned tools because no layer verified tool provenance before invocation.
Security is a property of the deployment, not the framework. Agent orchestration frameworks solve coordination. They do not provide trust boundaries. Teams are treating orchestration as isolation — and CVE-2026-25253 proved the cost of that assumption.

Full writeup with methodology: Agent Systems Are Failing at Trust Boundaries (dev.to).

What the Framework Tests

Four Wire Protocols

AI agents in enterprise environments communicate through specific protocols. Each has distinct security properties and attack surfaces:

Protocol	Purpose	Test Coverage	Key Risk
MCP (Model Context Protocol)	Tool invocation — how agents call external tools and APIs	Authentication, injection, data leakage, tool abuse	Agents invoking tools they should not have access to
A2A (Agent-to-Agent)	Inter-agent communication — how agents coordinate	Message integrity, impersonation, privilege escalation	One agent manipulating another through crafted messages
L402 (Lightning)	Bitcoin-based agent payments — microtransactions	Payment flow integrity, double-spend, authorization	Agents spending without proper economic gate evaluation
x402 (USDC/Stablecoin)	Fiat-equivalent agent payments	Transaction limits, approval flows, compliance	Agents exceeding spending authority in fiat-equivalent value

Most AI security tools test the model (prompt injection, jailbreaking). This framework tests the agent system — the protocols, integrations, and decision paths that determine what agents actually do in production.

Complete OWASP ASI Top 10 Coverage

Every test maps to a specific OWASP Agentic Security Initiatives (ASI) category:

ASI Category	What It Covers	Tests
ASI01	Excessive Agency	Authority escalation, scope creep, unauthorized actions
ASI02	Insecure Output Handling	Response sanitization, injection propagation
ASI03	Supply Chain Vulnerabilities	Dependency integrity, tool provenance
ASI04	Insufficient Logging	Audit trail completeness, tamper detection
ASI05	Inadequate Sandboxing	Isolation verification, escape detection
ASI06	Prompt Injection	Direct and indirect injection across protocols
ASI07	Improper Access Control	Permission boundaries, tier enforcement
ASI08	Insecure Storage	Credential exposure, secret management
ASI09	Insufficient Error Handling	Failure mode analysis, information leakage on error
ASI10	Insecure Communication	Transport security, message integrity

20+ Enterprise Platform Adapters

AI agents in enterprise environments connect to real business systems. The framework includes adapters for testing agent interactions with:

ERP: SAP, Oracle, Workday
CRM: Salesforce, HubSpot
ITSM: ServiceNow, Jira
Cloud: AWS, Azure, GCP
Communication: Slack, Teams, Email
Finance: Stripe, QuickBooks
And more — each with platform-specific test cases covering authentication, data access, and action authorization

This matters because enterprise AI security is not abstract. It is an agent with SAP credentials making a purchase order. It is an agent with Salesforce access modifying a customer record. Platform-specific testing catches vulnerabilities that generic security scans miss.

Agent Autonomy Risk Score

The framework produces an Agent Autonomy Risk Score (0–100) that answers a specific question: “Is it safe for this agent to execute unsupervised?”

The score aggregates results across all test modules, weighted by severity. A high score means the agent system has demonstrated security properties consistent with autonomous operation. A low score means human oversight is required on every consequential action — which, per BCG’s “AI brain fry” research, creates 33% more decision fatigue for the humans doing the oversight.

The Verification Loop

Define governance (CSG preprint). Implement governance (production system). Verify governance (this framework). Without verification, governance is an assumption. With verification, governance is evidence.

How It Works

The framework is designed for minimal friction:

pip install agent-security-harness
agent-security-harness --target https://your-agent-endpoint.com --protocol mcp

Core design decisions:

Python standard library only for core modules. No heavy dependencies. Runs anywhere Python runs.
Bundled mock MCP server for zero-configuration validation. Test the framework against a known-good target before pointing it at production.
JSON output with full request/response transcripts. Every test result includes the exact payload sent and response received — for audit trail completeness.
Rate limiting (--delay flag) for testing against production endpoints without triggering DDoS protections.
69 self-tests validating framework correctness. The testing tool tests itself.

Standards Alignment

Each test in the framework is mapped to multiple standards simultaneously:

Standard	Alignment	Coverage
OWASP ASI Top 10	Complete mapping (ASI01–ASI10)	All 332 tests categorized
STRIDE Threat Model	Each test categorized by threat type	Spoofing, Tampering, Repudiation, Information Disclosure, DoS, Elevation of Privilege
NIST AI 800-2	Automated benchmark evaluation with statistical confidence intervals	Measure and Manage functions
CSG Framework	Each test linked to governance mechanism	Hard Constraints, Gates, Authority Tiers, Resilience Protocol

The multi-standard mapping means a single test run produces evidence for OWASP compliance, NIST alignment, threat model coverage, and constitutional governance verification. One tool, multiple compliance requirements satisfied.

Why Adversarial Testing Matters for Governance

Anthropic’s GTG-1002 report documented the first AI-orchestrated cyber espionage campaign. The attack succeeded because the agents had no governance layer — prompt-level safety measures were bypassed through role-play. The agents performed 80–90% of tactical operations autonomously.

The framework includes a GTG-1002 APT simulation: a multi-step adversarial scenario that tests whether an agent system’s governance holds under the same attack pattern. It also tests polymorphic attacks — adversarial payloads that mutate between attempts — and multi-step exploitation chains where each step is individually permitted but the chain produces an unauthorized outcome.

These are the attacks that identity-based governance (WHO) cannot detect. An agent with valid credentials executing a sequence of individually-authorized actions that collectively constitute a breach. Only decision-layer governance (HOW) — verified through adversarial testing — catches the pattern.

The Define–Implement–Verify Stack

This framework completes a three-part stack:

Layer	Asset	What It Does
Define	Constitutional Self-Governance preprint (Zenodo DOI: 10.5281/zenodo.19162104)	12 mechanisms, design principles, regulatory mapping
Implement	Production system (79 days, 56 agents, 60 amendments)	Governance running in code, not just described in papers
Verify	Agent Security Harness (332 tests, 24 modules, 4 protocols, open-source) — preprint: DOI: 10.5281/zenodo.19343034	Adversarial testing that proves governance holds under attack

Most organizations stop at “Define.” Some reach “Implement.” Almost none “Verify.” The verification layer is what separates governance-as-document from governance-as-defense.

Get Started

The framework is Apache 2.0 licensed and available through three channels:

pip install agent-security-harness

Or clone the repository:

git clone https://github.com/msaleme/red-team-blue-team-agent-fabric

Or install via ClawHub (OpenClaw’s skill marketplace):

clawhub install msaleme/agent-security-harness

Run the self-test to validate the framework, then point it at your agent system. The JSON output includes full transcripts for every test — suitable for audit evidence, compliance reporting, or incident post-mortem.

Get the Framework

332 tests. 24 modules. 4 protocols. OWASP ASI Top 10. NIST AI 800-2. 20+ enterprise adapters. Apache 2.0.

GitHub ClawHub

The Formal Preprint + Supporting Research

The peer-reviewable preprint for this framework, plus the governance and measurement research stack behind it.

Agent Security Harness Preprint (DOI: 10.5281/zenodo.19343034) Community-Driven Security Framework (DOI: 10.5281/zenodo.19343108) Constitutional Self-Governance (CSG) Decision Load Index (DLI) Normalization of Deviance Detection

Frequently Asked Questions

What is the Agent Security Harness?

An open-source Python framework that runs 363 security tests across 24 modules against AI agent systems. It tests across 4 wire protocols (MCP, A2A, L402, x402), covers the OWASP Agentic Top 10, aligns with NIST AI 800-2, and includes adapters for 20+ enterprise platforms. Available on PyPI (agent-security-harness), GitHub, and ClawHub.

What protocols does it test?

MCP (tool invocation), A2A (inter-agent communication), L402 (Bitcoin payments), and x402 (USDC/stablecoin payments). Each has dedicated test modules for authentication, authorization, injection, and data leakage.

How does it relate to OWASP and NIST?

Complete OWASP ASI Top 10 coverage (ASI01–ASI10) with every test mapped to specific categories. NIST AI 800-2 alignment with statistical confidence intervals. Each test also categorized using STRIDE threat model. One test run produces evidence for multiple compliance requirements.

Is your organization governance-ready?

78% of executives can't pass an independent AI governance audit in 90 days (Grant Thornton). Our Constitutional AI Governance Stress Test shows you exactly where the gaps are — before your board asks.

Get Your Governance Score →