Antihero Documentation v0.3.0
The behavioral safety layer for robots. Declarative policy enforcement, cryptographic audit trails, and fail-closed security for every robot, every tool call, every model.
Five Formal Guarantees
These are not marketing claims. They are mathematically enforced and verified by property-based tests.
Architecture
Every AI action passes through three layers: declare policy, enforce at the boundary, record the receipt.
┌──────────────────────────────────────────────────────────┐
│  Policy Layer                                            │
│  YAML rules, tiered composition,                         │
│  deny-dominates, fail-closed, glob match                 │
└────────────────────────────┬─────────────────────────────┘
                             │
┌────────────┐    ┌──────────▼─────────┐    ┌──────────────┐
│    TCE     │───>│       Guard        │───>│     AEE      │
│  (input)   │    │  policy evaluate   │    │ (audit log)  │
│            │    │  requirement gate  │    │ hash-chained │
│            │    │  fail-closed       │    │ append-only  │
└────────────┘    └──────────┬─────────┘    └──────┬───────┘
                             │                     │
                        ┌────▼──────┐       ┌──────▼───────┐
                        │    PDE    │       │  Hash Chain  │
                        │ (verdict) │       │  SHA-256     │
                        └───────────┘       │  RFC 8785    │
                                            │  Ed25519 sig │
                                            └──────────────┘
| Envelope | Purpose |
|---|---|
| TCE (Tool Call Envelope) | Captures who wants to do what, with what parameters |
| PDE (Policy Decision Envelope) | The engine's verdict: allow, deny, or gate |
| AEE (Audit Event Envelope) | Immutable receipt, hash-chained to predecessor |
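The AEE row describes receipts that are hash-chained to their predecessor. A minimal sketch of that idea, using only the Python standard library, is shown here. It is not Antihero's implementation: `append_event`, `verify_chain`, the `GENESIS` value, and the record layout are hypothetical, `json.dumps` with sorted keys only approximates RFC 8785 canonicalization, and Ed25519 signing is left out.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for the first record's prev_hash


def canonical(obj: dict) -> bytes:
    # Approximation of RFC 8785: sorted keys, no insignificant whitespace.
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def append_event(chain: list, payload: dict) -> dict:
    """Append a record whose hash covers its payload and its predecessor's hash."""
    prev_hash = chain[-1]["this_hash"] if chain else GENESIS
    body = {"sequence": len(chain), "prev_hash": prev_hash, "payload": payload}
    record = {**body, "this_hash": hashlib.sha256(canonical(body)).hexdigest()}
    chain.append(record)
    return record


def verify_chain(chain: list) -> bool:
    """Recompute every hash and check each link back to the genesis value."""
    prev_hash = GENESIS
    for index, record in enumerate(chain):
        body = {
            "sequence": record["sequence"],
            "prev_hash": record["prev_hash"],
            "payload": record["payload"],
        }
        if record["sequence"] != index or record["prev_hash"] != prev_hash:
            return False
        if hashlib.sha256(canonical(body)).hexdigest() != record["this_hash"]:
            return False
        prev_hash = record["this_hash"]
    return True


chain = []
append_event(chain, {"action": "file.delete", "effect": "deny"})
append_event(chain, {"action": "file.read", "effect": "allow"})
print(verify_chain(chain))  # True
chain[0]["payload"]["effect"] = "allow"  # tamper with an earlier record
print(verify_chain(chain))  # False
```

Editing any earlier record changes its recomputed hash, so verification fails from that record onward.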
Install
Python
pip install antihero
# Optional extras
pip install antihero[mcp] # MCP server for Claude Code
pip install antihero[signing] # Ed25519 audit signing
pip install antihero[all] # Everything
TypeScript / JavaScript
npm install @antihero/sdk
60-Second Quickstart
CLI
# 1. Initialize (choose a persona: developer, enterprise)
antihero init --persona developer
# 2. Gate a dangerous command
antihero run -- rm -rf /
# -> DENIED. Logged to tamper-evident hash chain.
# 3. Gate a safe command
antihero run -- echo "hello"
# -> ALLOWED. Executed and logged.
# 4. Verify nothing was tampered with
antihero audit verify
# -> Chain integrity verified.
Cloud API
# Evaluate an action against your policies
curl -X POST https://api.antihero.systems/api/v1/evaluate \
-H "Authorization: Bearer ah_live_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"action": "file.delete",
"resource": "/etc/passwd",
"subject": { "agent_id": "my-agent" }
}'
# Response:
# {
# "effect": "deny",
# "reason": "matched rule: deny-system-files",
# "risk_score": 1.0,
# "matched_rules": [...]
# }
Authentication: API Keys
API keys are the recommended method for server-to-server integration. Keys are prefixed ah_live_ and stored as SHA-256 hashes.
Authorization: Bearer ah_live_<64-hex-chars>
Create keys from the dashboard or via API (POST /api/v1/auth/api-keys, requires admin role). Keys can be scoped to specific permissions:
| Scope | Grants |
|---|---|
| evaluate | Policy evaluation (POST /evaluate) |
| policies:read | Read policies |
| policies:write | Create, update, delete policies |
| events:read | Read audit events |
An empty scopes list grants full access. The raw key is shown exactly once on creation — store it securely.
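Storing only a SHA-256 hash means the service can check a presented key without being able to recover it. The sketch below shows one way that could work, together with the empty-scopes rule stated above. `generate_api_key`, `verify_api_key`, and `has_scope` are hypothetical helpers for illustration, not part of the Antihero API.

```python
import hashlib
import hmac
import secrets

PREFIX = "ah_live_"


def generate_api_key():
    """Return (raw_key, stored_hash); only the hash would be persisted."""
    raw_key = PREFIX + secrets.token_hex(32)  # 32 random bytes -> 64 hex chars
    stored_hash = hashlib.sha256(raw_key.encode("utf-8")).hexdigest()
    return raw_key, stored_hash


def verify_api_key(presented: str, stored_hash: str) -> bool:
    """Hash the presented key and compare in constant time."""
    if not presented.startswith(PREFIX):
        return False
    candidate = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_hash)


def has_scope(key_scopes: list, required: str) -> bool:
    """An empty scopes list grants full access, as described above."""
    return not key_scopes or required in key_scopes


raw_key, stored_hash = generate_api_key()
print(verify_api_key(raw_key, stored_hash))
print(has_scope(["evaluate", "events:read"], "policies:write"))
```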
Authentication: JWT Tokens
JWTs are issued on login/register and can be sent as a Bearer token. HS256-signed, 24-hour expiry by default.
Authorization: Bearer <jwt-token>
X-Org-ID: <uuid> # optional — defaults to user's first org
Authentication: Browser Sessions
The login, register, and OAuth endpoints set an ah_session httpOnly cookie automatically. Browser-based clients (dashboard, docs) use this cookie transparently.
Authentication: OAuth
Antihero supports Google and GitHub OAuth. Redirect the user to the appropriate endpoint:
| Provider | Redirect | Callback |
|---|---|---|
| Google | GET /api/v1/auth/google | GET /api/v1/auth/google/callback |
| GitHub | GET /api/v1/auth/github | GET /api/v1/auth/github/callback |
On success, the callback sets an ah_session cookie and redirects to /dashboard. If the user's email matches an existing account, the OAuth provider is linked automatically.
Authentication: Two-Factor Auth (TOTP)
Users can enable TOTP-based 2FA. When enabled, POST /api/v1/auth/login returns a temp_token instead of a session cookie. Exchange it with the TOTP code:
POST /api/v1/auth/2fa/login
{
"temp_token": "...",
"code": "123456"
}
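For context, the six-digit `code` is what an authenticator app derives from the shared secret and the current time (RFC 6238). The standard-library sketch below assumes the common defaults of HMAC-SHA1, a 30-second step, and six digits; this document does not state which parameters Antihero uses, and `totp` and `verify_totp` are illustrative names.

```python
import base64
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret_b32: str, at: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """Derive the one-time code for the time step containing `at`."""
    key = base64.b32decode(secret_b32, casefold=True)
    moment = time.time() if at is None else at
    counter = max(0, int(moment // step))
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation, RFC 4226 section 5.3
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def verify_totp(secret_b32: str, submitted: str, at: Optional[float] = None,
                step: int = 30, window: int = 1) -> bool:
    """Accept codes from adjacent time steps to tolerate small clock drift."""
    moment = time.time() if at is None else at
    return any(
        hmac.compare_digest(totp(secret_b32, moment + drift * step, step), submitted)
        for drift in range(-window, window + 1)
    )


# RFC 6238 test secret, base32-encoded.
secret = base64.b32encode(b"12345678901234567890").decode("ascii")
print(totp(secret, at=59))  # 287082
```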
Policy Format
Policies are YAML documents with one or more rules. Each rule matches on subjects, actions, and resources using glob patterns, and produces an effect.
version: "1.0"
tier: org # baseline | org | app | user
name: production-guardrails
rules:
- id: deny-destructive-commands
description: Block rm, format, and mkfs
effect: deny
priority: 100
actions:
- "shell.execute"
resources:
- "rm -rf*"
- "format*"
- "mkfs*"
risk_score: 1.0
- id: gate-file-writes
effect: allow_with_requirements
priority: 50
actions:
- "file.write"
resources:
- "*"
requirements:
- kind: confirm
params:
message: "Allow file write?"
risk_score: 0.5
- id: allow-reads
effect: allow
priority: 10
actions:
- "file.read"
resources:
- "/home/**"
risk_score: 0.0
Tier Composition
Policies compose across four tiers. Lower tiers set the floor; higher tiers can add restrictions but never remove them.
| Tier | Level | Purpose |
|---|---|---|
| baseline | 0 | Antihero defaults (always present) |
| org | 1 | Organization-wide rules |
| app | 2 | Per-application overrides |
| user | 3 | User-specific restrictions |
Deny always dominates. A deny in any tier overrides all allows in all tiers. No matching rule = deny (fail-closed).
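The two composition rules (deny dominates, no match means deny) can be sketched in a few lines. This is an illustration under stated assumptions, not the engine's code: the `Rule` class and `decide` function are hypothetical, glob matching uses Python's `fnmatchcase` (whose `*` also matches `/`), and treating `allow_with_requirements` as stronger than a plain `allow` is an assumption.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    id: str
    tier: str  # baseline | org | app | user
    effect: str  # allow | deny | allow_with_requirements
    actions: tuple
    resources: tuple
    priority: int = 0


def matches(rule: Rule, action: str, resource: str) -> bool:
    return (any(fnmatchcase(action, p) for p in rule.actions)
            and any(fnmatchcase(resource, p) for p in rule.resources))


def decide(rules, action: str, resource: str):
    """Return (effect, reason) across all tiers."""
    matched = [r for r in rules if matches(r, action, resource)]
    if not matched:
        return "deny", "no matching rule (fail-closed)"
    # A deny from any tier wins; then gated allows; then plain allows (assumed order).
    for effect in ("deny", "allow_with_requirements", "allow"):
        candidates = [r for r in matched if r.effect == effect]
        if candidates:
            top = max(candidates, key=lambda r: r.priority)
            return effect, f"matched rule: {top.id} (tier: {top.tier})"
    return "deny", "unrecognized effect (fail-closed)"


rules = [
    Rule("allow-reads", "baseline", "allow", ("file.read",), ("/home/**",), 10),
    Rule("deny-secrets", "user", "deny", ("file.*",), ("/home/secret*",), 100),
]
print(decide(rules, "file.read", "/home/notes.txt"))
print(decide(rules, "file.read", "/home/secret.txt"))
print(decide(rules, "file.delete", "/tmp/x"))
```

The third call matches no rule, so it falls through to the fail-closed deny.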
Conditions
Rules can include conditions that match against fields in the Tool Call Envelope (TCE):
conditions:
- field: "context.risk_score" # dot-path into TCE fields
operator: "gt" # comparison operator
value: 0.8
| Operator | Description |
|---|---|
| eq, neq | Equality / inequality |
| in, not_in | Membership in a list |
| gt, gte, lt, lte | Numeric comparison |
| contains | Substring or list membership |
| matches | Regex match |
matches | Regex match |
Requirements
When a rule has effect: allow_with_requirements, the caller must satisfy all listed requirements before the action proceeds.
requirements:
- kind: confirm # human confirmation prompt
params:
message: "Delete this file?"
timeout_seconds: 30
- kind: mfa # require MFA verification
params: {}
- kind: rate_limit # throttle actions
params:
max_per_minute: 10
- kind: sandbox # execute in sandbox
params:
ttl_seconds: 300
- kind: redact # redact sensitive output
params:
patterns: ["ssn", "credit_card"]
Human Authorization (Proof-of-Human)
Antihero's proof-of-human system answers three trust questions for every high-risk action, all cryptographically signed:
- Who authorized this robot? — Principal identity binding via OAuth, passkey, or SAML. Delegation chains are tracked and depth-limited by policy.
- Did the robot stay within policy? — Runtime enforcement via the policy engine. Deny-dominates, fail-closed.
- Did a human approve this specific action? — Cryptographic approval signatures bound to the SHA-256 hash of the action. No replay attacks possible — an approval for one action cannot be reused for another.
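The third point, binding an approval to the hash of one specific action, can be sketched as follows. This is a simplified illustration: it uses an HMAC with a shared key as a stand-in for whatever signature scheme Antihero actually uses, and `action_hash`, `sign_approval`, and `approval_covers` are hypothetical names.

```python
import hashlib
import hmac
import json


def action_hash(action: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the action."""
    encoded = json.dumps(action, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def _tag(approver_key: bytes, digest: str) -> str:
    return hmac.new(approver_key, digest.encode("ascii"), hashlib.sha256).hexdigest()


def sign_approval(approver_key: bytes, action: dict) -> dict:
    digest = action_hash(action)
    return {"action_hash": digest, "signature": _tag(approver_key, digest)}


def approval_covers(approver_key: bytes, approval: dict, action: dict) -> bool:
    """True only if the approval was issued for exactly this action."""
    digest = action_hash(action)
    return (hmac.compare_digest(approval["action_hash"], digest)
            and hmac.compare_digest(approval["signature"], _tag(approver_key, digest)))


key = b"demo-approver-key"  # placeholder secret for the example
lift = {"id": "tce-1", "action": "force.gripper.lift", "resource": "pallet-7"}
other = {**lift, "resource": "pallet-8"}
approval = sign_approval(key, lift)
print(approval_covers(key, approval, lift))   # True
print(approval_covers(key, approval, other))  # False
```

Because the hash covers every field, including a unique identifier such as the `id` used here, an approval cannot be carried over to a different action.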
Policy Configuration
Add a human_proof requirement to any policy rule. The method parameter selects the verification method.
rules:
- id: require-human-proof-heavy-lift
actions: ["force.gripper.*"]
conditions:
- field: context.payload_kg
operator: gt
value: 25.0
effect: allow_with_requirements
requirements:
- kind: human_proof
params:
method: world_id
action: verify
verification_level: orb
Verification Methods
| Method | Value | Description | Privacy Model |
|---|---|---|---|
| World ID | world_id | Zero-knowledge biometric proof of personhood via Worldcoin's orb-verified identity. Proves a unique human authorized the action without revealing who. | Nullifier hashes only — no biometric data stored |
| TOTP | totp | Time-based one-time password via any authenticator app (Google Authenticator, 1Password, Authy). | Shared secret per user |
| Webhook | webhook | Out-of-band approval via your own authorization service. HMAC-signed callback with action hash. | Your infrastructure |
| Passkey | passkey | WebAuthn/FIDO2 hardware key or platform biometric (Touch ID, Face ID, Windows Hello). | Public key only — no biometric data leaves device |
World ID Setup
- Create an app at developer.worldcoin.org and obtain your app_id.
- Configure the app_id in your Antihero environment or policy configuration.
- Set method: world_id and verification_level: orb (biometric) or device (phone-verified) in your policy YAML.
- When a robot action triggers this requirement, the operator receives a World ID verification prompt. The ZK proof is verified server-side and bound to the action hash before execution proceeds.
Maps to EU AI Act Article 14 (human oversight for high-risk AI systems) and ISO 13482 (personal care robot safety requirements).
Python SDK
Quick Wrap
from antihero import wrap
# Two lines. Every tool call is now policy-checked and audit-logged.
protected_agent = wrap(my_agent)
result = protected_agent.run("Deploy the update")
Guard (fine-grained control)
from antihero import Guard
from antihero.policy.engine import PolicyEngine
from antihero.policy.loader import load_policies
from antihero.evidence.chain import HashChain
from antihero.evidence.store import AuditStore
guard = Guard(
engine=PolicyEngine(load_policies(".antihero")),
chain=HashChain(),
store=AuditStore(".antihero/audit.jsonl"),
)
guard.execute(
my_tool_function,
action="file.write",
resource="/etc/passwd",
parameters={"content": "..."},
)
Cloud Client
import httpx
# Evaluate a tool call against your cloud policies
resp = httpx.post(
"https://api.antihero.systems/api/v1/evaluate",
headers={"Authorization": "Bearer ah_live_YOUR_KEY"},
json={
"action": "shell.execute",
"resource": "rm -rf /tmp/data",
"subject": {"agent_id": "deploy-bot"},
},
)
decision = resp.json()
print(decision["effect"]) # "allow" | "deny" | "allow_with_requirements"
print(decision["risk_score"]) # 0.0 - 1.0
print(decision["reason"]) # human-readable explanation
Exports
from antihero import (
Guard, # Main enforcement guard
ActionDeniedError, # Raised when policy denies
AntiheroError, # Base exception
AuditEventEnvelope, # AEE — tamper-evident audit record
PolicyDecisionEnvelope, # PDE — gate decision output
ToolCallEnvelope, # TCE — tool call input
Requirement, # A requirement from a PDE
Subject, # TCE subject (who is calling)
Caller, # TCE caller metadata
ThreatScanner, # Content/action threat detection
wrap, # Decorator for enforcing on functions
)
Robotics Adapters
Drop-in adapters intercept actions from robotics frameworks before they reach hardware. Each adapter maps framework-specific action outputs to Antihero's safety taxonomy.
VLA (Vision-Language-Action)
Generic interceptor for any VLA model (GR00T, pi0, OpenVLA, Gemini Robotics, etc.). Maps action tensors to semantic safety categories.
from antihero.adapters.vla import VLAAdapter
adapter = VLAAdapter()
safe_predict = adapter.wrap_predict(
model.predict, guard,
agent_id="warehouse-bot-01",
action_space={
"joint_positions": (0, 7),
"gripper": (7, 8),
"base_velocity": (8, 10),
},
)
action = safe_predict(observation)
NVIDIA GR00T
Purpose-built adapter for NVIDIA GR00T N1 whole-body humanoid control (29 DOF: arms, hands, base velocity).
from antihero.adapters.nvidia_groot import NvidiaGR00TAdapter
adapter = NvidiaGR00TAdapter()
safe_policy = adapter.wrap_policy(groot_policy, guard, agent_id="atlas-01")
action = safe_policy.predict(observation)
Physical Intelligence (pi0)
Intercepts pi0 action chunks (multi-step trajectories). Can check individual steps or aggregate worst-case values across the entire chunk.
from antihero.adapters.physical_intelligence import PhysicalIntelligenceAdapter
adapter = PhysicalIntelligenceAdapter()
safe_policy = adapter.wrap_policy(
pi0_policy, guard,
agent_id="figure-02",
check_all_steps=True,
)
action_chunk = safe_policy.predict(observation)
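The difference between per-step checks and a worst-case aggregate over a chunk can be illustrated without the adapter itself. The helpers below are hypothetical and treat a chunk as a list of equal-length action vectors; how the real adapter aggregates values is not specified here.

```python
from typing import List, Optional, Sequence


def worst_case(chunk: Sequence[Sequence[float]]) -> List[float]:
    """Per-dimension maximum absolute value across every step in the chunk."""
    if not chunk:
        return []
    width = len(chunk[0])
    if any(len(step) != width for step in chunk):
        raise ValueError("all steps must have the same number of dimensions")
    return [max(abs(step[i]) for step in chunk) for i in range(width)]


def first_violation(chunk: Sequence[Sequence[float]], limit: float) -> Optional[int]:
    """Index of the first step with any value beyond the limit, else None."""
    for index, step in enumerate(chunk):
        if any(abs(value) > limit for value in step):
            return index
    return None


chunk = [[0.1, -0.5], [0.3, 0.2], [-0.9, 0.4]]
print(worst_case(chunk))            # [0.9, 0.5]
print(first_violation(chunk, 0.8))  # 2
```

The aggregate form needs one policy evaluation per chunk, while the per-step form identifies which step exceeded the limit.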
Teleop
Enforces safety policies on human teleoperation commands. Operator identity flows into the audit trail.
from antihero.adapters.teleop import TeleopAdapter
adapter = TeleopAdapter()
safe_send = adapter.wrap_command(
send_to_robot, guard,
operator_id="operator-jane",
command_type="velocity",
)
safe_send(velocity_cmd)
Figure Helix 2
Safety enforcement for Figure's Helix 2 fully learned closed-loop controller. Intercepts every control step (100Hz+) across the 32-DOF whole-body action space (arms, hands, torso, locomotion) and checks it against policy before it reaches actuators.
from antihero.adapters.figure_helix import FigureHelixAdapter
adapter = FigureHelixAdapter()
safe_controller = adapter.wrap_controller(
helix_controller, guard, agent_id="figure-02"
)
# In control loop:
action = safe_controller.step(observation)
# Or standalone enforcement without wrapping:
adapter.enforce(guard, action_vector, agent_id="figure-02")
ROS 2
Intercepts ROS 2 action server callbacks and service handlers with policy enforcement. All ROS imports are lazy, so the adapter works even without a ROS 2 workspace installed.
from antihero.adapters.ros import ROS2Adapter
adapter = ROS2Adapter()
guarded_callback = adapter.wrap_callback(
original_callback, guard,
action_name="motion.arm.move",
agent_id="warehouse-bot-1",
)
# Or check policy before submitting a goal:
pde = adapter.check_action(
guard, action_name="motion.arm.move",
agent_id="warehouse-bot-1",
)
LeRobot
Sits between a trained LeRobot policy and hardware actuators, intercepting every select_action call and checking the action vector against safety policies. Computes joint velocity stats for policy conditions.
from antihero.adapters.lerobot import LeRobotAdapter
adapter = LeRobotAdapter()
safe_policy = adapter.wrap_policy(
policy, guard, agent_id="my-robot"
)
action = safe_policy.select_action(observation)
Generic (Any Callable)
Wraps any Python callable with Guard enforcement. Uses the function's name as the action and routes every invocation through policy evaluation.
from antihero.adapters.generic import GenericAdapter
adapter = GenericAdapter()
wrapped = adapter.wrap(my_function, guard)
result = wrapped(arg1=val1, arg2=val2)
OpenAI
Wraps OpenAI client.chat.completions.create to intercept function/tool calls in responses and evaluate them against policy before the caller can execute them.
from openai import OpenAI
from antihero import Guard
from antihero.adapters.openai import wrap_openai_client
client = OpenAI()
guard = Guard(engine=..., chain=..., store=...)
wrap_openai_client(client, guard)
# All tool_calls in responses are now policy-checked
response = client.chat.completions.create(...)
Anthropic
Wraps Anthropic client.messages.create to intercept tool_use blocks in responses. Each tool use is evaluated against policy, with PTC (programmatic tool calling) caller context preserved in the audit trail.
from anthropic import Anthropic
from antihero import Guard
from antihero.adapters.anthropic import wrap_anthropic_client
client = Anthropic()
guard = Guard(engine=..., chain=..., store=...)
wrap_anthropic_client(client, guard)
# tool_use blocks are now policy-checked
response = client.messages.create(...)
LangChain
Wraps LangChain BaseTool subclasses so that every _run() invocation is evaluated against policy.
from antihero import Guard
from antihero.adapters.langchain import wrap_langchain_tools
guard = Guard(engine=..., chain=..., store=...)
wrap_langchain_tools(tools, guard)
# All tool invocations are now policy-gated
CrewAI
Wraps CrewAI agents, crews, and tools. Tool _run() calls, agent delegation via execute_task, and crew kickoff are all policy-checked.
from antihero import Guard
from antihero.adapters.crewai import wrap_crewai_crew
guard = Guard(engine=..., chain=..., store=...)
wrap_crewai_crew(crew, guard)
# All agent tools, delegation, and kickoff are now policy-gated
AutoGen
Wraps Microsoft AutoGen conversable agents. Existing registered functions, future register_function calls, and initiate_chat are all guarded automatically.
from antihero import Guard
from antihero.adapters.autogen import wrap_autogen_agent
guard = Guard(engine=..., chain=..., store=...)
wrap_autogen_agent(assistant, guard)
# All registered functions and chat initiation are now policy-gated
World ID (Proof-of-Human)
Zero-knowledge biometric proof of personhood via Worldcoin's orb-verified identity. Verifies a World ID proof server-side and binds the approval to the SHA-256 hash of the action. Used as a human_proof requirement in policy rules.
# In policy YAML:
requirements:
- kind: human_proof
params:
method: world_id
action: verify
verification_level: orb # or "device"
app_id: "app_your_world_id"
Foxglove
Publishes policy decision envelopes to a Foxglove WebSocket server for timeline visualization. Deny events appear as annotated markers alongside robot sensor data and camera feeds.
from antihero.integrations.foxglove import FoxglovePublisher
publisher = FoxglovePublisher(url="ws://localhost:8765")
pde = guard.evaluate(action=..., resource=...)
publisher.publish_decision(pde)
publisher.close()
Rerun
Logs risk scores and deny events to Rerun for spatial and temporal visualization. Risk scores appear as scalar plots at /antihero/risk_score; deny events appear as text annotations.
from antihero.integrations.rerun import RerunLogger
rr_logger = RerunLogger(application_id="antihero-warehouse-bot")
pde = guard.evaluate(action=..., resource=...)
rr_logger.log_decision(pde)
rr_logger.close()
MuJoCo (Digital Twin)
Simulation backend for digital twin validation. Loads an MJCF/URDF model, applies the proposed action, and checks for safety violations (contact forces, joint efforts, velocity limits, unexpected collisions) before allowing physical execution.
from antihero.simulation.digital_twin.config import SimulationConfig
from antihero.simulation.digital_twin.mujoco_backend import validate
config = SimulationConfig(
engine="mujoco",
model_path="robot.xml",
max_contact_force=100.0,
max_joint_effort=0.9,
horizon_steps=50,
)
result = validate(config, action_params={"joint_targets": [...]})
print(result.safe, result.violations)
Isaac Sim (Digital Twin)
High-fidelity simulation backend using NVIDIA Isaac Sim with PhysX 5. Provides GPU-accelerated physics, photorealistic rendering for vision-based safety checks, and multi-robot fleet simulation. Requires NVIDIA GPU with CUDA support.
from antihero.simulation.digital_twin.config import SimulationConfig
from antihero.simulation.digital_twin.isaac_backend import validate
config = SimulationConfig(
engine="isaac",
model_path="robot.usd",
max_contact_force=100.0,
horizon_steps=50,
)
result = validate(config, action_params={"joint_targets": [...]})
TypeScript / JavaScript SDK
npm install @antihero/sdk
Cloud Client
import { AntiheroClient } from "@antihero/sdk";
const client = new AntiheroClient({
apiKey: "ah_live_...",
baseUrl: "https://api.antihero.systems", // optional, default
timeout: 10000, // optional ms
});
// Evaluate a tool call
const decision = await client.evaluate({
tce: {
action: "file.read",
resource: "/etc/passwd",
subject: {
agentId: "my-agent",
userId: "user-123",
roles: ["member"],
},
parameters: { path: "/etc/passwd" },
},
});
console.log(decision.effect); // "allow" | "deny" | "allow_with_requirements"
console.log(decision.riskScore); // number
console.log(decision.requirements); // [{kind, params}]
// Fetch audit events
const events = await client.getAuditEvents(50);
// Health check
const health = await client.healthCheck();
Local Engine (offline evaluation)
import { PolicyEngine, loadPolicyFile } from "@antihero/sdk";
const policy = loadPolicyFile(yamlString);
const engine = new PolicyEngine([policy], {
riskThreshold: 1.0,
maxDelegationDepth: 5,
});
const tce = {
id: crypto.randomUUID(),
timestamp: new Date().toISOString(),
subject: { agentId: "my-agent", roles: [], delegationDepth: 0 },
action: "file.read",
resource: "/etc/passwd",
parameters: {},
context: {},
};
const pde = engine.evaluate(tce);
// pde.effect, pde.reason, pde.riskScore, pde.matchedRules
MCP Server
Works with any MCP-compatible client (Claude Code, Cursor, OpenCode). Every tool call is policy-gated and audit-logged.
Claude Code / Cursor
{
"mcpServers": {
"antihero": {
"command": "antihero",
"args": ["serve"]
}
}
}
OpenCode
{
"mcp": {
"antihero": {
"type": "local",
"command": ["antihero", "serve"],
"enabled": true
}
}
}
Exposed Tools
| Tool | What it does |
|---|---|
| antihero_check_policy | Pre-check if an action would be allowed |
| antihero_audit_recent | Get recent audit events |
| antihero_audit_verify | Verify hash chain integrity |
| antihero_risk_status | Current risk budget utilization |
| antihero_policy_explain | Explain why an action was allowed/denied |
| antihero_certify | Run certification against adversarial scenario suites |
| antihero_policy_suggestions | Generate candidate policy rules from certification gaps |
| antihero_policy_guard | Simulate a policy change and check for regressions |
Certification
Continuous certification runs your agent configuration against adversarial scenario suites — 520+ scenarios across 24 domains including customer support, data access, financial operations, deployment, human oversight, warehouse logistics, industrial cobot, healthcare, construction, eldercare, cybersecurity, and more. Each run produces coverage and safety scores, a risk grade (A+ through F), and a list of policy gaps.
CLI
# Run certification for an agent
antihero certify --agent-id support-bot --suites customer_support,data_access
# Output: coverage %, safety %, risk grade, gaps list
API
| Endpoint | Description |
|---|---|
| POST /api/v1/certification/run | Run certification for an agent |
| GET /api/v1/certification/history/{agent_id} | List past certification runs |
| GET /api/v1/certification/{run_id} | Get a specific certification report |
MCP
The antihero_certify tool runs certification locally against your YAML policies. Pass agent_id, roles (comma-separated), and suites (comma-separated).
Scheduling
Certification can run on a schedule (daily, weekly, biweekly, monthly). The scheduler uses graduated escalation for consecutive failures:
| Failures | Level | Action |
|---|---|---|
| 1–2 | Warning | Log only, continue normally |
| 3–4 | Alert | Create low-severity incident |
| 5–7 | Escalate | Create medium-severity incident, slow cadence |
| 8+ | Critical | Create high-severity incident, disable schedule |
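The escalation table maps directly onto a small lookup. The function below is a sketch of that mapping; its name and the `"ok"` label for zero failures are assumptions, since the table starts at one failure.

```python
def escalation_level(consecutive_failures: int) -> str:
    """Map a count of consecutive certification failures to an escalation level."""
    if consecutive_failures < 0:
        raise ValueError("failure count cannot be negative")
    if consecutive_failures == 0:
        return "ok"  # assumed label; not part of the table above
    if consecutive_failures <= 2:
        return "warning"   # log only, continue normally
    if consecutive_failures <= 4:
        return "alert"     # create low-severity incident
    if consecutive_failures <= 7:
        return "escalate"  # create medium-severity incident, slow cadence
    return "critical"      # create high-severity incident, disable schedule


for count in (1, 3, 5, 8):
    print(count, escalation_level(count))
```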
Policy Guard
Test policy changes before deploying them. The policy guard uses a metric+guard pattern: it checks whether a proposed rule improves coverage (metric) without breaking previously-passing scenarios (guard).
CLI
# Simulate adding a new deny rule
antihero policy guard new-deny-rule.yaml
# Specify suites and agent config
antihero policy guard stricter-policy.yaml --suites financial,data_access --roles admin
# JSON output for CI pipelines
antihero policy guard proposed.yaml --json
Output
The guard report shows:
- Coverage delta: did coverage improve or decrease?
- Safety delta: did safety score change?
- Regressions: scenarios that passed before but fail now
- New passes: scenarios that failed before but pass now
- Recommendation: apply (safe), review (needs attention), or reject (regressions)
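The regression and new-pass lists in the report are set comparisons over per-scenario results. The sketch below shows that comparison under simplifying assumptions: results are plain pass/fail booleans keyed by scenario name, a single pass-rate delta stands in for the separate coverage and safety deltas, and the rule for choosing a recommendation is a guess at reasonable behavior rather than the tool's documented logic. `guard_report` is a hypothetical name.

```python
def guard_report(before: dict, after: dict) -> dict:
    """Compare per-scenario pass/fail results before and after a policy change."""
    shared = sorted(before.keys() & after.keys())
    regressions = [s for s in shared if before[s] and not after[s]]
    new_passes = [s for s in shared if after[s] and not before[s]]

    def pass_rate(results: dict) -> float:
        return sum(bool(results[s]) for s in shared) / len(shared) if shared else 0.0

    if regressions:
        recommendation = "reject"
    elif new_passes:
        recommendation = "apply"
    else:
        recommendation = "review"  # assumed: no measurable change needs a human look
    return {
        "pass_rate_delta": round(pass_rate(after) - pass_rate(before), 4),
        "regressions": regressions,
        "new_passes": new_passes,
        "recommendation": recommendation,
    }


before = {"exfiltrate-pii": False, "read-own-file": True, "wipe-disk": True}
after = {"exfiltrate-pii": True, "read-own-file": False, "wipe-disk": True}
print(guard_report(before, after))
```

In this example the proposed change fixes one scenario but breaks another, so the sketch recommends `reject`.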
API
| Endpoint | Description |
|---|---|
| POST /api/v1/certification/guard | Simulate a policy change with guard checks |
Policy Suggestions
When certification finds gaps, Antihero auto-generates candidate YAML rules to close them. Each suggestion includes the rule YAML, gap type (false_allow or false_deny), severity, MITRE ATT&CK IDs, and a rationale. Suggestions are surfaced for human review — never auto-applied.
API
| Endpoint | Description |
|---|---|
| GET /api/v1/certification/suggestions/{agent_id} | Get policy suggestions from the latest certification run |
MCP
The antihero_policy_suggestions tool runs certification and generates suggestions locally. Returns the suggestions list, gap count, coverage score, safety score, and risk grade.
Smart Init
The antihero init command auto-detects your AI framework and suggests the right protection profile.
Framework Detection
Scans pyproject.toml, requirements.txt, and Python files for:
| Framework | Detection |
|---|---|
| CrewAI | crewai, from crewai |
| LangChain | langchain, from langchain |
| AutoGen | autogen, pyautogen |
| LangGraph | langgraph, from langgraph |
| OpenAI SDK | openai, from openai |
| Anthropic SDK | anthropic, from anthropic |
| MCP | mcp, model-context-protocol |
Protection Profiles
| Profile | Risk Threshold | Best For |
|---|---|---|
| Consumer | 0.8 (conservative) | ChatGPT, Claude, Gemini users |
| Developer | 1.0 (permissive) | Teams deploying robots |
| Enterprise | 0.6 (strict) | Compliance-driven organizations |
Usage
# Interactive wizard with framework detection
antihero init --interactive
# Quick start with a specific profile
antihero init --persona developer
# Specify directory
antihero init --dir .antihero --persona enterprise
Claude Code Skill + MCP
Two integration paths for AI coding environments. The Claude Code skill provides guided workflows; the MCP server (see MCP Server above) exposes 8 tools for direct access. Both work standalone or together.
The skill ships in the repo — clone and it loads automatically. The MCP server runs via antihero serve and works with Claude Desktop, Cursor, OpenCode, or any MCP-compatible client.
Workflows
| Workflow | Trigger | What It Does |
|---|---|---|
| Generate Policy | "create a policy that blocks..." | Generates YAML from natural language, validates, tests with policy guard |
| Certify & Review | "certify my agent" | Runs 520+ scenarios, shows gaps, suggests new rules for review |
| Audit Investigation | "what happened with agent X?" | Fetches audit trail, verifies hash chain, analyzes patterns |
| Policy Guard | "test this policy change" | Simulates proposed rules, reports regressions and coverage delta |
| Quick Check | "would this action be allowed?" | Pre-checks actions, explains denials, shows risk budget |
Skill Structure
.claude/skills/antihero/
├── skill.md # 5 workflows with triggers
├── gotchas.md # 12 common failure patterns
├── references/
│ ├── policy-yaml-format.md
│ ├── mcp-tools.md
│ ├── cli-commands.md
│ └── adapters.md
└── templates/
├── deny-rule.yaml
├── allow-with-reqs.yaml
├── full-policy.yaml
└── adapter-snippet.py
References and templates load on demand via progressive disclosure — Claude reads them only when the workflow needs them.
API: Health
No authentication required.
| Endpoint | Description |
|---|---|
| GET /api/v1/health | Returns {"status": "ok", "version": "0.2.0"} |
| GET /api/v1/health/live | Liveness probe: returns {"status": "alive"} |
| GET /api/v1/health/ready | Readiness probe: checks DB connectivity |
API: Auth
POST /api/v1/auth/register
Create a new account and organization.
// Request
{
"email": "dev@example.com",
"password": "SecurePass1", // ≥8 chars, 1 uppercase, 1 digit
"name": "Jane Doe", // optional
"org_name": "Acme Corp",
"cf_turnstile_response": "..." // required if CAPTCHA enabled
}
// Response (201) — sets ah_session cookie
{
"user_id": "550e8400-...",
"org_id": "6ba7b810-...",
"org_slug": "acme-corp-a1b2c3"
}
POST /api/v1/auth/login
// Request
{
"email": "dev@example.com",
"password": "SecurePass1",
"cf_turnstile_response": "..."
}
// Response A (no 2FA) — sets cookie
{
"user_id": "550e8400-...",
"org_id": "6ba7b810-...",
"org_slug": "acme-corp-a1b2c3"
}
// Response B (2FA enabled) — no cookie
{
"requires_2fa": true,
"temp_token": "eyJ..."
}
POST /api/v1/auth/api-keys
Create an API key. Requires admin role.
// Request
{
"name": "CI Pipeline",
"scopes": ["evaluate", "events:read"] // empty = full access
}
// Response (201)
{
"raw_key": "ah_live_abc123...", // shown ONCE
"key_prefix": "ah_live_abc1",
"name": "CI Pipeline",
"id": "key-uuid"
}
GET /api/v1/auth/api-keys
List all API keys for the org. Requires admin role.
DELETE /api/v1/auth/api-keys/{key_id}
Revoke an API key. Returns 204.
API: Organizations
Manage your organization, members, and invitations.
GET /api/v1/orgs/current
Returns the authenticated user's current organization.
// Response
{
"id": "org-uuid",
"name": "Acme Corp",
"slug": "acme-corp",
"plan": "sentinel",
"event_quota": 250000,
"event_count_this_month": 12847
}
GET /api/v1/orgs/current/members
List all members of the current organization.
// Response
[
{ "user_id": "uuid", "email": "admin@acme.com", "name": "Alice", "role": "owner" },
{ "user_id": "uuid", "email": "bob@acme.com", "name": "Bob", "role": "member" }
]
POST /api/v1/orgs/current/members
Invite a user by email. Requires admin role.
// Request
{
"email": "new-member@acme.com",
"role": "member" // viewer | member | admin
}
DELETE /api/v1/orgs/current/members/{user_id}
Remove a member. Requires admin role. Cannot remove yourself. Returns 204.
API: Evaluate
The core endpoint. Submit a tool call envelope and receive a policy decision. Requires evaluate scope.
POST /api/v1/evaluate
// Request
{
"action": "file.delete", // required
"resource": "/etc/passwd", // required
"parameters": {"force": true}, // optional
"context": {"risk_score": 0.9}, // optional
"subject": { // optional
"agent_id": "deploy-bot",
"user_id": "user-123",
"session_id": "sess-abc",
"roles": ["admin"]
},
"caller": { // optional
"type": "direct",
"container_id": null,
"tool_id": "bash_execute"
}
}
// Response
{
"effect": "deny",
"reason": "matched rule: deny-system-files (tier: org, priority: 100)",
"risk_score": 1.0,
"cumulative_risk": 1.0,
"requirements": [],
"matched_rules": [
{
"rule_id": "deny-system-files",
"policy_tier": "org",
"effect": "deny",
"priority": 100
}
],
"tce_id": "uuid",
"pde_id": "uuid"
}
POST /api/v1/evaluate/batch
Evaluate up to 100 tool calls in a single request.
// Request
{
"calls": [
{"action": "file.read", "resource": "/home/data.csv"},
{"action": "file.delete", "resource": "/etc/passwd"}
]
}
// Response
{
"results": [
{"effect": "allow", ...},
{"effect": "deny", ...}
]
}
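Because a batch request accepts at most 100 calls, a client with a longer list has to split it. A minimal sketch of that splitting, which builds request bodies in the shape shown above without sending them; `batches` and `batch_bodies` are hypothetical helper names.

```python
from typing import Iterator, List

MAX_BATCH = 100  # documented limit for POST /api/v1/evaluate/batch


def batches(calls: List[dict], size: int = MAX_BATCH) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` calls."""
    if not 1 <= size <= MAX_BATCH:
        raise ValueError(f"size must be between 1 and {MAX_BATCH}")
    for start in range(0, len(calls), size):
        yield calls[start:start + size]


def batch_bodies(calls: List[dict]) -> List[dict]:
    """Build one request body per batch."""
    return [{"calls": chunk} for chunk in batches(calls)]


calls = [{"action": "file.read", "resource": f"/home/data-{i}.csv"} for i in range(250)]
print([len(body["calls"]) for body in batch_bodies(calls)])  # [100, 100, 50]
```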
API: Events
Query the hash-chained audit trail. Requires events:read scope.
GET /api/v1/events
Query params: limit (1-250), offset, action, outcome, since (ISO datetime), until (ISO datetime).
// Response
{
"events": [
{
"id": "evt-uuid",
"sequence": 42,
"timestamp": "2026-03-01T12:00:00Z",
"action": "file.delete",
"resource": "/etc/passwd",
"effect": "deny",
"outcome": "blocked",
"risk_score": 1.0,
"agent_id": "deploy-bot",
"denied_by": "deny-system-files"
}
],
"total": 1247,
"limit": 50,
"offset": 0
}
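The `total`, `limit`, and `offset` fields make it possible to walk the whole audit trail page by page. The generator below sketches that loop. To stay self-contained it takes a hypothetical `fetch_page(limit, offset)` callable, which in practice would wrap a GET request to /api/v1/events and return the parsed JSON body.

```python
from typing import Callable, Iterator

MAX_LIMIT = 250  # documented upper bound for the limit parameter


def iter_events(fetch_page: Callable[[int, int], dict], limit: int = MAX_LIMIT) -> Iterator[dict]:
    """Yield every event by advancing offset until total is reached."""
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        events = page["events"]
        yield from events
        offset += len(events)
        if not events or offset >= page["total"]:
            break


# Stand-in for the HTTP call, backed by an in-memory list.
stored = [{"id": f"evt-{i}", "sequence": i} for i in range(7)]


def fake_fetch(limit: int, offset: int) -> dict:
    return {"events": stored[offset:offset + limit], "total": len(stored),
            "limit": limit, "offset": offset}


print([event["sequence"] for event in iter_events(fake_fetch, limit=3)])  # [0, 1, 2, 3, 4, 5, 6]
```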
GET /api/v1/events/{event_id}
Returns full event detail including TCE, PDE, and hash chain values (prev_hash, this_hash).
API: Policies
| Endpoint | Description |
|---|---|
| GET /api/v1/policies | List all policies for the org |
| POST /api/v1/policies | Create a policy (admin, policies:write) |
| PUT /api/v1/policies/{id} | Update a policy (admin, policies:write) |
| DELETE /api/v1/policies/{id} | Delete a policy (admin, policies:write) |
POST /api/v1/policies
// Request
{
"name": "production-guardrails",
"tier": "org",
"content": "version: \"1.0\"\ntier: org\nname: production-guardrails\nrules:\n - id: deny-rm\n effect: deny\n actions: [\"shell.execute\"]\n resources: [\"rm*\"]\n risk_score: 1.0"
}
// Response (201)
{
"id": "policy-uuid",
"name": "production-guardrails",
"tier": "org",
"content": "...",
"version": 1
}
API: Dashboard
GET /api/v1/dashboard/stats
{
"total_events": 12847,
"blocked": 342,
"allowed": 12401,
"errors": 104,
"chain_valid": true,
"risk_current": 0.32,
"risk_threshold": 1.0,
"plan": "enforcer",
"event_quota": 50000,
"event_count_this_month": 12847
}
GET /api/v1/dashboard/timeline
Returns hourly event counts for the last 24 hours.
[
{"hour": "2026-03-01T11:00:00Z", "allowed": 45, "blocked": 3, "errors": 0},
{"hour": "2026-03-01T12:00:00Z", "allowed": 62, "blocked": 7, "errors": 1}
]
API: Billing
| Endpoint | Description |
|---|---|
| GET /api/v1/billing/status | Current plan, usage, and quota |
| GET /api/v1/billing/plans | All available plans (no auth required) |
| GET /api/v1/billing/usage | Detailed usage with overage info |
| POST /api/v1/billing/subscribe | Change plan (owner role) |
| POST /api/v1/billing/cancel | Cancel subscription (owner role) |
POST /api/v1/billing/subscribe
// Request
{
"plan": "enforcer", // watchdog | enforcer | sentinel
"return_url": "https://..." // redirect after Polar checkout
}
// Response (Polar enabled)
{
"status": "checkout_created",
"checkout_url": "https://polar.sh/...",
"checkout_id": "ch_..."
}
API: Claims & Insurance
File and manage robot liability insurance claims. Claims are verified against the hash-chained audit trail for fraud prevention.
GET /api/v1/claims | List all claims |
POST /api/v1/claims | File a new claim (admin) |
GET /api/v1/claims/{id} | Claim detail with status history |
PATCH /api/v1/claims/{id} | Transition claim status (admin) |
GET /api/v1/claims/{id}/events | Linked audit events |
GET /api/v1/claims/stats | Claims summary statistics |
GET /api/v1/claims/economics | Insurance economics breakdown |
GET /api/v1/claims/pricing | Risk-adjusted premium pricing |
POST /api/v1/claims/verify | Verify hash chain for event IDs |
POST /api/v1/claims
// Request
{
"incident_type": "data_exposure",
"incident_date": "2026-02-28",
"description": "Agent exposed PII in API response",
"affected_agent_id": "customer-support-bot",
"related_event_ids": ["evt-uuid-1", "evt-uuid-2"],
"estimated_damages_cents": 500000
}
// Response (201)
{
"id": "claim-uuid",
"status": "submitted",
"chain_verified": true,
"auto_approved": false,
...
}
Claim lifecycle: submitted → under_review → approved / denied → paid
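The lifecycle above can be read as a small state machine. The sketch below is illustrative only and is not the Antihero implementation: the `ALLOWED` table and the `transition` helper are hypothetical names, and it assumes that `denied` and `paid` are terminal states and that only `approved` claims can move to `paid`.

```python
# Hypothetical transition table derived from the documented lifecycle.
# Assumption: "denied" and "paid" are terminal; only "approved" leads to "paid".
ALLOWED = {
    "submitted": {"under_review"},
    "under_review": {"approved", "denied"},
    "approved": {"paid"},
    "denied": set(),
    "paid": set(),
}


def transition(current: str, new: str) -> str:
    """Return the new status, or raise if the move is not in the table."""
    if new not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal claim transition: {current} -> {new}")
    return new


status = "submitted"
for step in ("under_review", "approved", "paid"):
    status = transition(status, step)
print(status)  # paid
```

A client could run a check like this before calling PATCH /api/v1/claims/{id}, so that obviously invalid transitions fail locally.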
API: Compliance
GET /api/v1/compliance/frameworks
Returns supported compliance frameworks:
| Framework | Standard |
|---|---|
| SOC 2 Type II | AICPA TSC |
| HIPAA Security Rule | 45 CFR 164 |
| EU AI Act | EU 2024/1689 |
| NIST AI RMF | NIST AI 100-1 |
POST /api/v1/compliance/certificates/generate
Generate a compliance certificate with findings and score. Requires admin role.
// Request
{
"framework": "soc2",
"period_days": 30
}
// Response
{
"id": "cert-uuid",
"framework": "soc2",
"chain_valid": true,
"total_events": 12847,
"compliance_score": 94.5,
"certificate_hash": "sha256:...",
"findings": [
{
"severity": "warning",
"finding": "3 unreviewed deny events in period",
"detail": "...",
"deduction": 2
}
]
}
API: Export
Generate compliance reports from your audit data. Supports SOC 2, HIPAA, EU AI Act, and NIST AI RMF formats. Requires admin role.
POST /api/v1/export
// Request
{
"format": "soc2", // json | soc2 | hipaa | eu-ai-act | nist
"org_name": "Acme Corp", // optional — defaults to org name
"org_id": "custom-org-id" // optional — defaults to your org ID
}
// Response (SOC 2 example)
{
"report_type": "SOC 2 Type II — Robot Operations",
"organization": "Acme Corp",
"total_events": 12847,
"chain_integrity": { "valid": true, "errors": [] },
"criteria_results": {
"CC6.1": { "status": "pass", "evidence_count": 12847 },
"CC7.2": { "status": "pass", "evidence_count": 3241 }
}
}
Supported formats:
| Format | Standard | Output |
|---|---|---|
| json | Raw | All events + chain verification |
| soc2 | AICPA TSC | SOC 2 Type II report with criteria mapping |
| hipaa | 45 CFR 164 | HIPAA Security Rule compliance report |
| eu-ai-act | EU 2024/1689 | EU AI Act high-risk AI requirements |
| nist | NIST AI 100-1 | NIST AI RMF governance functions |
API: Risk (sentinel)
GET /api/v1/risk/profile
Comprehensive risk assessment for the org (90-day window).
{
"composite_risk_score": 23, // 0-100, lower = safer
"risk_tier": "low", // low | medium | high | critical
"total_evaluations": 48291,
"total_blocked": 1247,
"block_rate": 0.026,
"avg_risk_score": 0.18,
"p95_risk_score": 0.72,
"audit_chain_verified": true,
"recommended_coverage_cents": 2500000,
"recommended_premium_monthly_cents": 9900,
"underwriting_notes": ["Low block rate indicates well-configured policies"]
}
GET /api/v1/risk/profile/export
Download the risk profile as a JSON file (with Content-Disposition: attachment header).
GET /api/v1/risk/history?months=6
Monthly risk profile snapshots (up to 12 months). Returns an array of RiskProfile objects.
API: Agent Heartbeat (enforcer+)
Monitor agent liveness. Agents that exceed the auto-quarantine threshold are flagged as stale.
POST /api/v1/agents/{id}/heartbeat | Send heartbeat for an agent |
GET /api/v1/agents/{id}/status | Agent liveness status (alive, stale, quarantined) |
GET /api/v1/agents/health | Aggregate health summary across all monitored agents |
API: Agent Metrics (enforcer+)
Per-agent action rate metrics and velocity anomaly detection.
GET /api/v1/agents/{id}/metrics | Rolling action-rate metrics for an agent |
GET /api/v1/agents/{id}/velocity | Velocity z-score and anomaly flag |
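As a rough illustration of what a velocity z-score measures, the sketch below compares the current action rate with a recent baseline. It is not the service's algorithm: the function names, the use of a population standard deviation, and the anomaly threshold of 3.0 are all assumptions made for this example.

```python
import statistics


def velocity_zscore(history: list, current: float) -> float:
    """Standard score of the current action rate against past rates."""
    mean = statistics.fmean(history)
    spread = statistics.pstdev(history)
    if spread == 0:
        # Flat history: the z-score is undefined, so report 0.0 (assumption).
        return 0.0
    return (current - mean) / spread


def is_anomalous(history: list, current: float, threshold: float = 3.0) -> bool:
    # The threshold is a hypothetical default, not a documented value.
    return abs(velocity_zscore(history, current)) > threshold


actions_per_minute = [10, 12, 11, 9, 13]
print(is_anomalous(actions_per_minute, 25))  # True
print(is_anomalous(actions_per_minute, 12))  # False
```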
API: Observability (enforcer+)
GET /api/v1/observability/agents | List agents with rolling metrics (enforcer) |
GET /api/v1/observability/agents/{id}/metrics | Per-agent metrics (enforcer) |
GET /api/v1/observability/agents/{id}/drift | Behavioral drift detection (sentinel) |
GET /api/v1/observability/alerts | List fired alerts (sentinel) |
POST /api/v1/observability/alerts/rules | Create alert rule (sentinel) |
GET /api/v1/observability/sla | SLA compliance report (sovereign) |
POST /api/v1/observability/alerts/rules
// Request
{
"metric": "deny_rate", // deny_rate | risk_score_avg | event_count | block_count
"operator": "gt", // gt | lt | gte | lte
"threshold": 0.15,
"window_minutes": 60,
"cooldown_minutes": 15,
"severity": "warning"
}
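To make the rule fields concrete, here is a minimal sketch of how a rule like the one above might be evaluated. It is an assumption-laden illustration rather than the service's logic: `should_fire` is a hypothetical helper, the metric value is assumed to be already aggregated over `window_minutes`, and the cooldown is assumed to suppress repeat alerts until `cooldown_minutes` have elapsed.

```python
import operator

# Map the documented operator names onto Python comparisons.
OPERATORS = {
    "gt": operator.gt,
    "lt": operator.lt,
    "gte": operator.ge,
    "lte": operator.le,
}


def should_fire(rule: dict, metric_value: float, now_minutes: float,
                last_fired_minutes=None) -> bool:
    """Return True when the threshold is crossed and the cooldown has elapsed."""
    compare = OPERATORS[rule["operator"]]
    if not compare(metric_value, rule["threshold"]):
        return False
    if last_fired_minutes is not None:
        if now_minutes - last_fired_minutes < rule["cooldown_minutes"]:
            return False  # still cooling down from the previous alert
    return True


rule = {"metric": "deny_rate", "operator": "gt", "threshold": 0.15,
        "window_minutes": 60, "cooldown_minutes": 15, "severity": "warning"}
print(should_fire(rule, 0.20, now_minutes=100))                         # True
print(should_fire(rule, 0.20, now_minutes=100, last_fired_minutes=90))  # False
```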
API: Incidents (sentinel)
GET /api/v1/incidents | List incidents |
POST /api/v1/incidents | Create incident |
GET /api/v1/incidents/{id} | Incident detail |
POST /api/v1/incidents/{id}/quarantine | Quarantine an agent/resource |
POST /api/v1/incidents/{id}/resolve | Resolve incident |
GET /api/v1/incidents/{id}/evidence | Forensic evidence bag (sovereign) |
POST /api/v1/incidents/agents/{agent_id}/kill | Emergency kill switch — quarantine + create incident |
GET /api/v1/incidents/agents/quarantined | List all quarantined agents |
DELETE /api/v1/incidents/agents/{agent_id}/quarantine | Lift quarantine on an agent |
Quarantine Actions
// POST /api/v1/incidents/{id}/quarantine
{
"action_type": "disable_agent", // disable_agent | block_resource | freeze_session
"target": "deploy-bot"
}
Emergency Kill Switch
// POST /api/v1/incidents/agents/{agent_id}/kill
{
"reason": "Agent exceeded risk threshold",
"severity": 4 // 1=LOW, 2=MEDIUM, 3=HIGH, 4=CRITICAL
}
// Immediately quarantines the agent and creates a severity-4 incident.
Create Incident
// POST /api/v1/incidents
{
"severity": 3, // 1-4
"trigger_detail": "Unusual file access pattern detected",
"agent_id": "deploy-bot"
}
API: Marketplace (enforcer+)
Browse and install community policy templates.
GET /api/v1/marketplace/search | Search published policies |
GET /api/v1/marketplace/categories | List categories |
GET /api/v1/marketplace/entries/{id} | Entry detail + raw YAML |
POST /api/v1/marketplace/install | Install a template into your org |
POST /api/v1/marketplace/publish | Publish a policy template (sentinel) |
POST /api/v1/marketplace/entries/{id}/rate | Rate a marketplace entry, 1.0-5.0 (sentinel) |
Categories: security, compliance, healthcare, finance, education, government, general.
API: Knowledge Graph (enforcer+)
Visualize agent-resource-policy relationships.
GET /api/v1/graph/query | Full graph (nodes, edges, stats) |
GET /api/v1/graph/agent/{id}/reachability | What resources can this agent reach? |
GET /api/v1/graph/resource/{id}/governance | What policies govern this resource? |
GET /api/v1/graph/coverage-gaps | Resources without governing policies |
GET /api/v1/graph/impact/{rule_id} | Impact of changing a rule (sentinel) |
GET /api/v1/graph/visualize | D3.js-ready visualization data (sentinel) |
GET /api/v1/graph/delegation-chains | All agent delegation chain trees |
GET /api/v1/graph/delegation-chains/{agent_id} | Delegation subtree for a specific agent |
Graph Search
Full-text and similarity search across the knowledge graph.
GET /api/v1/graph/search | Full-text search across graph nodes and edges |
GET /api/v1/graph/similar/{resource} | Find resources with similar access patterns and policy coverage |
API: Simulation (sentinel)
Replay historical events against proposed policy changes before deploying.
POST /api/v1/simulation/replay | Replay events against proposed policy |
POST /api/v1/simulation/whatif | Fork simulation at an event (sovereign) |
GET /api/v1/simulation/results/{id} | Retrieve simulation result |
GET /api/v1/simulation/results/{id}/impact | Impact summary |
POST /api/v1/simulation/replay
// Request
{
"proposed_policy_ids": ["policy-uuid"],
"proposed_policy_yaml": "...", // or ad-hoc YAML
"event_filter_agent_id": "deploy-bot", // optional filter
"period_days": 30
}
API: Hierarchy (sentinel)
Multi-tenant policy hierarchy. Assign policies at org, team, project, or agent scope. Deny-dominates composition ensures child scopes can only add restrictions, never weaken parent rules.
Scope paths
Scope is expressed as a slash-delimited path: org → team-slug → team-slug/project-slug → team-slug/project-slug/agent-id.
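One way to picture scope resolution is to expand a scope path into its chain of ancestors and collect the policies assigned at each level. The sketch below is a simplified illustration, not the server's resolver: `scope_chain` and `effective_policies` are hypothetical helpers, and ordering by ascending `priority` within a scope is an assumption.

```python
def scope_chain(scope_path: str) -> list:
    """Expand a scope path into every scope that applies, from org downward."""
    parts = [part for part in scope_path.strip("/").split("/") if part]
    if parts == ["org"]:
        return ["org"]
    return ["org"] + ["/".join(parts[:depth]) for depth in range(1, len(parts) + 1)]


def effective_policies(assignments: dict, scope_path: str) -> list:
    """Collect policy IDs from the org level down to the requested scope."""
    collected = []
    for scope in scope_chain(scope_path):
        # Assumption: lower priority numbers are listed first within a scope.
        ordered = sorted(assignments.get(scope, []), key=lambda item: item["priority"])
        collected.extend(item["policy_id"] for item in ordered)
    return collected


assignments = {
    "org": [{"policy_id": "org-baseline", "priority": 0}],
    "backend-team/deploy-service": [{"policy_id": "deploy-rules", "priority": 10}],
}
print(scope_chain("backend-team/deploy-service/deploy-bot"))
print(effective_policies(assignments, "backend-team/deploy-service/deploy-bot"))
```

Because composition is deny-dominates, a deny in any collected policy still wins regardless of where in the chain it was assigned.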
GET /api/v1/hierarchy/{scope_path} | Resolve effective policies for a scope |
PUT /api/v1/hierarchy/{scope_path}/policies | Set policies for a scope level |
POST /api/v1/hierarchy/{scope_path}/validate | Validate a policy against parent scope |
POST /api/v1/hierarchy/teams | Register a new team scope |
POST /api/v1/hierarchy/projects | Register a new project under a team |
PUT /api/v1/hierarchy/{scope_path}/policies
// Request — array of policy assignments
[
{ "policy_id": "policy-uuid", "priority": 0 },
{ "policy_id": "policy-uuid-2", "priority": 10 }
]
POST /api/v1/hierarchy/teams
{ "team_id": "backend-team", "name": "Backend Team" }
POST /api/v1/hierarchy/projects
{ "project_id": "deploy-service", "team_id": "backend-team", "name": "Deploy Service" }
API: Federation (sovereign)
Federated policy sync between peer organizations. Register peers, push signed policy bundles, and approve/reject incoming bundles.
GET /api/v1/federation/peers | List registered peers |
POST /api/v1/federation/peers | Register a new peer org |
DELETE /api/v1/federation/peers/{org_id} | Remove a peer |
POST /api/v1/federation/push | Push a signed policy bundle to a peer |
POST /api/v1/federation/pull | Pull pending bundles from peers |
GET /api/v1/federation/pending | List bundles pending approval |
POST /api/v1/federation/pending/{id}/approve | Approve and merge a bundle |
POST /api/v1/federation/pending/{id}/reject | Reject a bundle |
POST /api/v1/federation/verify-peer | Challenge-response peer verification |
POST /api/v1/federation/peers
// Request — register a peer organization
{
"org_id": "partner-org-uuid",
"org_name": "Acme Partner",
"public_key_hex": "ed25519-public-key-hex",
"trust_level": "review_required" // full_trust | review_required | read_only
}
// Response
{
"id": "peer-uuid",
"peer_org_id": "partner-org-uuid",
"peer_org_name": "Acme Partner",
"trust_level": "review_required",
"status": "active",
"created_at": "2026-03-01T..."
}
POST /api/v1/federation/push
// Request — push a policy bundle
{
"target_org_id": "partner-org-uuid",
"policy_ids": ["policy-uuid-1", "policy-uuid-2"],
"description": "Updated security baseline v2"
}
API: Security (sovereign)
Government and defense features: FIPS crypto status, data classification, and air-gapped bundle export/import.
GET /api/v1/security/fips-status | Check FIPS 140-2 crypto mode status |
POST /api/v1/security/classification | Set classification level for session |
POST /api/v1/security/export-bundle | Export signed air-gap bundle |
POST /api/v1/security/import-bundle | Import and validate air-gap bundle |
POST /api/v1/security/classification
// Request
{
"level": 2, // 0=UNCLASSIFIED, 1=CUI, 2=SECRET, 3=TOP_SECRET
"caveats": ["NOFORN", "REL-TO-FVEY"]
}
POST /api/v1/security/export-bundle
// Request
{
"classification_level": 1,
"caveats": [],
"include_policies": true,
"include_compliance": false
}
// Response — signed bundle JSON with HMAC signature
{
"bundle_id": "uuid",
"org_id": "...",
"classification": { "level": 1, "caveats": [] },
"policies": [...],
"signature": "hmac-sha256:..."
}
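For readers unfamiliar with HMAC-signed bundles, the following sketch shows the general technique using Python's standard library. It does not reproduce the real bundle format: the shared key, the sorted-key JSON serialization, and the `sign_bundle` / `verify_bundle` helpers are assumptions chosen for illustration. Only the `hmac-sha256:` prefix comes from the response above.

```python
import hashlib
import hmac
import json


def sign_bundle(bundle: dict, key: bytes) -> dict:
    """Return a copy of the bundle with an HMAC-SHA256 signature attached."""
    unsigned = {name: value for name, value in bundle.items() if name != "signature"}
    # Assumption: sign a deterministic serialization (sorted keys, compact form).
    message = json.dumps(unsigned, sort_keys=True, separators=(",", ":")).encode("utf-8")
    digest = hmac.new(key, message, hashlib.sha256).hexdigest()
    return dict(unsigned, signature=f"hmac-sha256:{digest}")


def verify_bundle(bundle: dict, key: bytes) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_bundle(bundle, key)["signature"]
    return hmac.compare_digest(expected, bundle.get("signature", ""))


key = b"example-shared-secret"  # hypothetical key, for illustration only
signed = sign_bundle({"bundle_id": "uuid", "classification": {"level": 1, "caveats": []}}, key)
print(verify_bundle(signed, key))    # True
signed["classification"]["level"] = 0
print(verify_bundle(signed, key))    # False
```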
API: Reinsurance
Reinsurance treaty simulation and portfolio risk analysis for actuarial modeling.
GET /api/v1/reinsurance/treaties | List available treaty structures |
POST /api/v1/reinsurance/simulate | Simulate treaty against historical claims (admin) |
GET /api/v1/reinsurance/portfolio | Portfolio-level risk summary |
POST /api/v1/reinsurance/simulate
// Request
{
"treaty": {
"type": "excess_of_loss", // quota_share | excess_of_loss | hybrid
"quota_share_pct": 0.0,
"xol_attachment_cents": 500000, // $5,000 attachment point
"xol_limit_cents": 5000000 // $50,000 limit
}
}
// Response
{
"treaty": { "type": "excess_of_loss", ... },
"total_claims_cents": 12500000,
"total_retained_cents": 8700000,
"total_ceded_cents": 3800000,
"cession_rate": 0.304,
"allocations": [
{ "claim_amount_cents": 250000, "retained_cents": 250000, "ceded_cents": 0 },
{ "claim_amount_cents": 750000, "retained_cents": 500000, "ceded_cents": 250000 }
]
}
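The allocations in the response follow standard excess-of-loss arithmetic: the reinsurer pays the part of each claim above the attachment point, capped at the limit. The sketch below reproduces the two example allocations under that assumption. The helper names are hypothetical, quota-share and hybrid treaties are not modeled, and treating `xol_limit_cents` as the maximum ceded amount per claim is an interpretation.

```python
def allocate_xol(claim_cents: int, attachment_cents: int, limit_cents: int) -> dict:
    """Split one claim into retained and ceded amounts under excess of loss."""
    ceded = min(max(claim_cents - attachment_cents, 0), limit_cents)
    return {
        "claim_amount_cents": claim_cents,
        "retained_cents": claim_cents - ceded,
        "ceded_cents": ceded,
    }


def simulate_xol(claims: list, attachment_cents: int, limit_cents: int) -> dict:
    """Aggregate per-claim allocations into totals and a cession rate."""
    allocations = [allocate_xol(claim, attachment_cents, limit_cents) for claim in claims]
    total = sum(claims)
    ceded = sum(item["ceded_cents"] for item in allocations)
    return {
        "total_claims_cents": total,
        "total_retained_cents": total - ceded,
        "total_ceded_cents": ceded,
        "cession_rate": round(ceded / total, 3) if total else 0.0,
        "allocations": allocations,
    }


result = simulate_xol([250000, 750000], attachment_cents=500000, limit_cents=5000000)
print(result["allocations"][1])  # retained 500000, ceded 250000
print(result["cession_rate"])    # 0.25
```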
GET /api/v1/reinsurance/portfolio
// Response
{
"total_claims_count": 47,
"total_claims_amount_cents": 12500000,
"avg_claim_cents": 265957,
"max_claim_cents": 2500000,
"loss_ratio_estimate": 0.68,
"concentration_risk": { "top_org_pct": 0.35, "top_3_orgs_pct": 0.72 }
}
API: Insurance Partner
Integration API for insurance carrier partners. Uses X-Partner-Key header authentication instead of Bearer tokens and requires the ANTIHERO_PARTNER_KEY environment variable.
GET /api/v1/insurance/partner/orgs | List all orgs with coverage status |
GET /api/v1/insurance/partner/orgs/{org_id}/risk | Risk profile for a specific org |
GET /api/v1/insurance/partner/orgs/{org_id}/claims | All claims for a specific org |
GET /api/v1/insurance/partner/orgs/{org_id}/audit-summary | Audit chain summary for an org |
POST /api/v1/insurance/partner/orgs/{org_id}/coverage | Create/update coverage terms |
POST /api/v1/insurance/partner/claims/{claim_id}/decision | Approve or deny a claim |
Authentication
curl -H "X-Partner-Key: your-partner-api-key" \
https://api.antihero.systems/api/v1/insurance/partner/orgs
POST /api/v1/insurance/partner/orgs/{org_id}/coverage
// Request
{
"coverage_limit_cents": 10000000, // $100,000
"deductible_cents": 50000, // $500
"premium_monthly_cents": 9900, // $99/mo
"effective_date": "2026-03-01",
"expiry_date": "2027-03-01",
"partner_id": "carrier-123"
}
POST /api/v1/insurance/partner/claims/{claim_id}/decision
// Request
{
"decision": "approved", // approved | denied
"approved_amount_cents": 250000, // required if approved
"reason": "Covered under terms" // required if denied
}
// Response
{
"claim_id": "claim-uuid",
"old_status": "under_review",
"new_status": "approved",
"approved_amount_cents": 250000,
"decided_at": "2026-03-04T..."
}
API: Integrations
Framework integration guides and setup instructions. Public endpoints — no authentication required.
GET /api/v1/integrations
List all 8 supported framework integrations with install commands and code snippets.
| ID | Framework |
|---|---|
| anthropic | Anthropic Claude |
| claude-agent-sdk | Claude Agent SDK |
| openai | OpenAI |
| langchain | LangChain |
| crewai | CrewAI |
| autogen | Microsoft AutoGen |
| mcp | MCP (Model Context Protocol) |
| generic | Generic (any Python callable) |
GET /api/v1/integrations/{integration_id}
Get setup instructions for a specific integration.
// GET /api/v1/integrations/anthropic
{
"id": "anthropic",
"name": "Anthropic Claude",
"description": "Wrap Anthropic client.messages.create with policy enforcement.",
"install": "pip install antihero[anthropic]",
"snippet": "from anthropic import Anthropic\nfrom antihero import Guard\n..."
}
API: Demo
Public demo endpoint for testing policy evaluation without authentication. Evaluates against built-in baseline policies.
POST /api/v1/demo/evaluate
// Request
{
"action": "file.delete",
"resource": "/etc/passwd",
"agent_id": "demo-agent"
}
// Response
{
"effect": "deny",
"risk_score": 0.5,
"matched_rules": [
{ "rule_id": "fail-closed-default", "policy_tier": "baseline", "effect": "deny" }
],
"reason": "Action 'file.delete' not matched by any policy rule.",
"requirements": [],
"threat": null
}
Behavioral Black Box
The behavioral black box is the robotics equivalent of an aircraft’s flight data recorder. It continuously records every policy decision, sensor reading, and safety event with cryptographic integrity, producing the evidence chain that insurance carriers need for incident reconstruction and liability determination.
The black box implements the AHDS-2 data specification, which defines four recording tiers:
| Tier | Name | Frequency | What Is Recorded |
|---|---|---|---|
| 1 | Critical | Every event | Every deny, emergency stop, collision, human contact — full event with sensor snapshot, SHA-256 hash chain, Ed25519 signature |
| 2 | Standard | Up to 100Hz | Every policy evaluation — action, effect, risk_score, matched_rule_id, timestamp |
| 3 | Heartbeat | 1Hz | Robot status, battery, uptime, event counts, hardware thermal state |
| 4 | Forensic | On-demand | Full sensor dump, motor states, camera frames, communication log |
Tier 1 events are never dropped. If the recording system cannot write a critical event, the robot enters a safe state (fail-closed). A robot that cannot record safety events cannot operate.
Data Integrity
- SHA-256 hash chain — Each event hash includes the previous event’s hash. Modifying any event invalidates all subsequent hashes.
- Ed25519 signatures — Tier 1 events are signed with the robot’s private key, stored in the platform’s secure enclave.
- Sequence numbers — Monotonic uint64 counter. Gaps indicate missing events. Never resets, even across reboots.
- RFC 8785 JCS — Deterministic JSON canonicalization ensures reproducible hash verification across implementations.
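The hash-chain and sequence-number properties above can be demonstrated with a short standard-library sketch. It is not the AHDS-2 reference implementation: the record layout, the all-zero genesis hash, and the helper names are assumptions. `json.dumps` with sorted keys only approximates RFC 8785 canonicalization, and Ed25519 signing is omitted.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed prev_hash for the first event


def canonical(event: dict) -> bytes:
    # Approximation of RFC 8785: sorted keys and no insignificant whitespace.
    return json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")


def append_event(chain: list, payload: dict) -> dict:
    """Append a record whose hash covers its payload and the previous hash."""
    prev_hash = chain[-1]["this_hash"] if chain else GENESIS
    body = {"seq": len(chain), "prev_hash": prev_hash, "payload": payload}
    record = dict(body, this_hash=hashlib.sha256(canonical(body)).hexdigest())
    chain.append(record)
    return record


def verify_chain(chain: list) -> list:
    """Return a list of problems; an empty list means the chain verified."""
    problems = []
    prev_hash = GENESIS
    for index, record in enumerate(chain):
        body = {name: value for name, value in record.items() if name != "this_hash"}
        if record["seq"] != index:
            problems.append(f"sequence gap at position {index}")
        if record["prev_hash"] != prev_hash:
            problems.append(f"broken link at position {index}")
        if hashlib.sha256(canonical(body)).hexdigest() != record["this_hash"]:
            problems.append(f"hash mismatch at position {index}")
        prev_hash = record["this_hash"]
    return problems


chain = []
append_event(chain, {"action": "shell.execute", "effect": "deny"})
append_event(chain, {"action": "file.read", "effect": "allow"})
print(verify_chain(chain))  # []
chain[0]["payload"]["effect"] = "allow"  # tamper with a recorded event
print(verify_chain(chain))  # ['hash mismatch at position 0']
```

If an attacker also recomputed the hash of the altered event, the next record's prev_hash would no longer match, which is why tampering propagates down the chain.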
Usage
# Export black box events as JSON Lines
curl -X GET https://api.antihero.systems/api/v1/blackbox/export \
-H "Authorization: Bearer ah_live_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"robot_id": "fleet-01-unit-07",
"start_ns": 1711382400000000000,
"end_ns": 1711468800000000000,
"tiers": [1, 2, 3],
"format": "jsonl"
}'
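The start_ns and end_ns fields in the request above are Unix timestamps in nanoseconds. The sketch below shows one way to build such a request body from timezone-aware datetimes using exact integer arithmetic. The `to_unix_ns` and `export_request` helpers are hypothetical, and the sketch only constructs the payload without calling the API.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_unix_ns(moment: datetime) -> int:
    """Convert an aware datetime to integer nanoseconds since the Unix epoch."""
    if moment.tzinfo is None:
        raise ValueError("use a timezone-aware datetime")
    # Integer division by one microsecond avoids float rounding errors.
    return (moment - EPOCH) // timedelta(microseconds=1) * 1_000


def export_request(robot_id: str, start: datetime, end: datetime, tiers=(1, 2, 3)) -> dict:
    """Build a request body shaped like the export example."""
    start_ns, end_ns = to_unix_ns(start), to_unix_ns(end)
    if end_ns <= start_ns:
        raise ValueError("end must be after start")
    return {"robot_id": robot_id, "start_ns": start_ns, "end_ns": end_ns,
            "tiers": list(tiers), "format": "jsonl"}


body = export_request(
    "fleet-01-unit-07",
    datetime(2024, 3, 25, 16, tzinfo=timezone.utc),
    datetime(2024, 3, 26, 16, tzinfo=timezone.utc),
)
print(body["start_ns"])  # 1711382400000000000
```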
Export Formats
| Format | Extension | Use Case |
|---|---|---|
| JSON Lines | .jsonl | Human-readable, standard log tools, cloud upload |
| Binary | .ahds2 | Space-efficient on-device storage, self-verifying archive |
See the full AHDS-2 specification for complete data schemas, storage requirements, and compliance mappings.
Hardware Certification
Different compute platforms have different behavioral safety capabilities. Hardware certification profiles define minimum recording rates, storage budgets, and integrity mechanisms per platform.
| Hardware Profile | TDP | Tier 2 Rate | Tier 3 Rate | Storage | Signing |
|---|---|---|---|---|---|
| Jetson Thor 130W | 130W | 100Hz | 1Hz | 10 GB | Trusty TEE |
| Jetson Thor 40W | 40W | 10Hz | 1Hz | 5 GB | Trusty TEE |
| Qualcomm RB7 | 15W | 10Hz | 0.1Hz | 1 GB | QTEE |
| Coral Edge TPU | 5W | N/A | 1Hz | 100 MB | Software |
Compute Pressure Adaptation
Under sustained CPU/GPU load, Tier 2 recording degrades gracefully:
| CPU/GPU Load | Tier 2 Adjustment |
|---|---|
| < 90% | Nominal (per hardware profile) |
| 90–95% | Reduce to 50% of nominal |
| 95–99% | Reduce to 10% of nominal |
| > 99% | Suspend Tier 2 (Tier 1 and Tier 3 continue) |
Every frequency adjustment is logged as a Tier 1 event, ensuring the evidence chain records compute pressure events.
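The degradation table can be expressed as a small lookup. This is a sketch of the documented thresholds, not device firmware: `tier2_rate_hz` is a hypothetical helper, load is given as a fraction between 0 and 1, and assigning the exact boundary values 0.90 and 0.95 to the more restrictive band is an assumption, since the table leaves them ambiguous.

```python
def tier2_rate_hz(nominal_hz: float, load: float) -> float:
    """Return the Tier 2 recording rate for a given CPU/GPU load fraction."""
    if not 0.0 <= load <= 1.0:
        raise ValueError("load must be a fraction between 0 and 1")
    if load > 0.99:
        return 0.0              # Tier 2 suspended; Tier 1 and Tier 3 continue
    if load >= 0.95:
        return nominal_hz / 10  # 10% of nominal
    if load >= 0.90:
        return nominal_hz / 2   # 50% of nominal
    return nominal_hz           # nominal rate from the hardware profile


# Example with the 100Hz nominal rate of the Jetson Thor 130W profile.
for load in (0.50, 0.92, 0.97, 0.995):
    print(load, tier2_rate_hz(100, load))
```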
Certification Suite
The compute_pressure certification suite (15 scenarios) tests recording integrity under sustained load, thermal throttle, power failure recovery, and storage rotation. Run it with:
antihero certify --agent-id warehouse-bot-01 --suites compute_pressure
Cybersecurity Kill Chain
25 adversarial scenarios covering 6 kill chain stages, modeled after real-world attacks on robotic systems including Unitree G1 telemetry exfiltration and MiR-100 ROS injection. Every scenario is enforced deterministically — no LLM-based guardrails, no probabilistic filtering.
Kill Chain Stages
| Stage | Scenarios | Description |
|---|---|---|
| 1. Reconnaissance | 4 | Network scanning, service enumeration, firmware version probing, topology mapping via ROS 2 discovery |
| 2. Exploitation | 5 | ROS topic injection, malformed URDF payloads, action server spoofing, DDS deserialization attacks, unauthenticated API access |
| 3. Privilege Escalation | 4 | Parameter server manipulation, node namespace hijacking, lifecycle transition abuse, capability override via remap |
| 4. Lateral Movement | 4 | Cross-robot ROS bridge pivoting, fleet management credential reuse, shared TF tree poisoning, multi-robot coordination injection |
| 5. Data Exfiltration | 4 | Telemetry stream siphoning (Unitree G1 pattern), camera feed redirection, audit log extraction, SLAM map exfiltration |
| 6. Command & Control | 4 | Persistent backdoor via ROS 2 lifecycle nodes, covert command channels in sensor topics, firmware update hijacking, remote emergency stop override |
Real Attack Patterns
- Unitree G1 telemetry exfiltration: Scenarios test unauthorized access to joint state, IMU, and force/torque sensor streams. Policy denies any perception.telemetry.* action from unrecognized subjects.
- MiR-100 ROS injection: Scenarios test malicious topic publication on /cmd_vel and /move_base/goal. Policy requires authenticated subjects and validates message schemas before allowing motion commands.
Enforcement Model
All 25 scenarios use deterministic policy evaluation. Attack detection is not based on pattern matching or anomaly scoring — it is based on policy rules that explicitly deny unauthorized actions. If an action is not explicitly allowed by a matching rule, it is denied (fail-closed). This eliminates the false-negative problem inherent in LLM-based guardrails.
Usage
antihero certify --agent-id warehouse-bot-01 --suites cybersecurity
# Run a specific kill chain stage
antihero certify --agent-id warehouse-bot-01 --suites cybersecurity --filter "stage:reconnaissance"
Privacy & Surveillance
30 scenarios covering perception governance, data sovereignty, and privacy zone enforcement. Physical AI systems typically carry cameras, microphones, LiDAR, or biometric sensors, so without governance a robot is effectively a surveillance device. Antihero enforces what a robot can record, store, process, and transmit at the same <100µs policy boundary used for force limits and zone restrictions.
Perception Governance
Policy rules control sensor data at the action boundary. Deny rules block recording, processing, or transmission of perception data in designated zones or contexts.
version: "1.0"
tier: app
name: home-privacy
rules:
- id: deny-recording-private-zones
actions: ["perception.record.*"]
conditions:
- field: context.zone
operator: in
value: ["bedroom", "bathroom", "changing_room"]
effect: deny
- id: deny-biometric-capture
actions: ["perception.biometric.*"]
conditions:
- field: context.consent
operator: eq
value: false
effect: deny
Data Sovereignty
Prevent sensor data, SLAM maps, biometric captures, and telemetry from leaving the deployment boundary. Deny rules block exfiltration and unauthorized training data uploads.
rules:
- id: deny-data-exfiltration
actions: ["data.exfiltrate.*"]
effect: deny
- id: deny-training-upload
actions: ["data.training.upload"]
effect: deny
- id: deny-slam-export
actions: ["data.export.slam_map"]
conditions:
- field: context.destination
operator: not_in
value: ["local", "approved_cloud"]
effect: deny
Privacy Zone Policies
Define physical spaces where perception is restricted. Privacy zones compose with existing safety rules via deny-dominates — a privacy deny cannot be overridden by any allow rule.
| Template | Zones | Restrictions |
|---|---|---|
| Home Privacy | Bedrooms, bathrooms, changing areas | No camera recording, no audio capture, no biometric processing |
| Warehouse Privacy | Break rooms, locker areas, restrooms | No facial recognition, no audio recording, worker tracking limited to task zones |
| Healthcare Privacy | Patient rooms, exam rooms, therapy areas | No camera recording without consent, no biometric data retention, HIPAA-compliant perception controls |
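The deny-dominates and fail-closed behavior described in this section can be illustrated with a toy evaluator. It is a simplified model and not the Antihero policy engine: the `evaluate` and `condition_holds` helpers are hypothetical, only the eq, in, and not_in operators from the examples above are handled, and the allow rule in the demo is invented for contrast.

```python
from fnmatch import fnmatchcase


def condition_holds(condition: dict, context: dict) -> bool:
    """Check one condition of the form {field, operator, value}."""
    field = condition["field"]
    key = field[len("context."):] if field.startswith("context.") else field
    actual, expected = context.get(key), condition["value"]
    if condition["operator"] == "eq":
        return actual == expected
    if condition["operator"] == "in":
        return actual in expected
    if condition["operator"] == "not_in":
        return actual not in expected
    raise ValueError(f"unsupported operator: {condition['operator']}")


def evaluate(rules: list, action: str, context: dict) -> str:
    """Deny wins over allow; no matching rule also means deny (fail-closed)."""
    matched = [
        rule for rule in rules
        if any(fnmatchcase(action, pattern) for pattern in rule["actions"])
        and all(condition_holds(c, context) for c in rule.get("conditions", []))
    ]
    if any(rule["effect"] == "deny" for rule in matched):
        return "deny"
    if any(rule["effect"] == "allow" for rule in matched):
        return "allow"
    return "deny"


rules = [
    {"id": "deny-recording-private-zones", "actions": ["perception.record.*"],
     "conditions": [{"field": "context.zone", "operator": "in",
                     "value": ["bedroom", "bathroom", "changing_room"]}],
     "effect": "deny"},
    # Hypothetical allow rule, added only to show that deny dominates.
    {"id": "allow-recording", "actions": ["perception.record.*"], "effect": "allow"},
]
print(evaluate(rules, "perception.record.video", {"zone": "bedroom"}))  # deny
print(evaluate(rules, "perception.record.video", {"zone": "kitchen"}))  # allow
print(evaluate(rules, "data.export.slam_map", {"zone": "kitchen"}))     # deny
```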
GDPR / CCPA Mapping
- GDPR Article 5 (data minimization) — perception.record deny rules enforce collection limits at the sensor level
- GDPR Article 17 (right to erasure) — data.retain policies enforce automatic deletion schedules
- GDPR Article 35 (DPIA) — compliance exports document perception governance for impact assessments
- CCPA Section 1798.100 (right to know) — audit trails record exactly what data was captured, processed, and stored
Usage
antihero certify --agent-id home-robot-01 --suites privacy_surveillance
# Apply a privacy template
antihero init --template home-privacy
antihero init --template warehouse-privacy
antihero init --template healthcare-privacy
Self-Improving Agent Safety
20 adversarial scenarios testing recursive self-modification attacks, inspired by Meta’s Hyperagents (DGM-H) research on self-improving foundation model agents. These scenarios validate that Antihero’s external enforcement layer is immune to agent self-modification — because the safety layer runs outside the agent’s process, memory, and permission boundary.
Attack Categories
| Category | Scenarios | Description |
|---|---|---|
| Policy Tampering | 4 | Agent attempts to modify its own YAML policy files, inject new allow rules, weaken conditions, or delete deny rules |
| Audit Trail Disabling | 4 | Agent attempts to truncate audit logs, break hash chains, forge Ed25519 signatures, or redirect audit output to /dev/null |
| Force Limit Escalation | 4 | Agent attempts to increase its own force limits, modify MuJoCo simulation parameters, or bypass sim-before-execute checks |
| Privilege Escalation | 4 | Agent attempts to elevate its own role, forge subject credentials, impersonate admin users, or bypass human approval requirements |
| Reward Hacking | 4 | Agent attempts to manipulate risk scores, inflate certification results, create false-positive audit events, or game the risk budget |
Why External Enforcement Works
Meta’s Hyperagents research demonstrated that self-improving agents can recursively modify their own behavior, tools, and objectives. The key insight is that internal guardrails are vulnerable to self-modification — an agent that can rewrite its own code can rewrite its own safety checks. Antihero’s enforcement runs in a separate process with separate permissions:
- The policy engine runs outside the agent’s process — the agent cannot access or modify the evaluator code
- Policy YAML files are read-only to the agent — file-system permissions prevent modification
- The audit trail is append-only with cryptographic integrity — Ed25519 signatures use keys the agent never holds
- Human approval gates are out-of-band — approval flows through channels the agent cannot intercept
Usage
antihero certify --agent-id autonomous-agent --suites self_improving_agent_safety
Foxglove / Rerun Integrations
Publish policy decisions and safety events to robotics observability tools. Both integrations stream enforcement data alongside your existing sensor and state visualizations.
FoxglovePublisher
Publishes policy decision envelopes (PDEs) to a Foxglove timeline via WebSocket. Each decision appears as an annotated event on the Foxglove timeline, with effect, risk score, matched rules, and action metadata.
from antihero.integrations.foxglove import FoxglovePublisher
publisher = FoxglovePublisher(
ws_url="ws://localhost:8765", # Foxglove WebSocket server
channel="antihero.decisions",
)
# After each policy evaluation
pde = guard.evaluate(tce)
publisher.publish_decision(pde)
# Deny events appear as red markers on the Foxglove timeline
# Allow events appear as green markers
# Risk scores are plotted as a continuous time series
RerunLogger
Logs risk scores and deny events to Rerun for spatial and temporal visualization. Risk scores appear as scalar plots; deny events appear as text annotations with full action context.
from antihero.integrations.rerun import RerunLogger
logger = RerunLogger(
recording_id="warehouse-bot-01",
entity_path="antihero/safety",
)
# After each policy evaluation
pde = guard.evaluate(tce)
logger.log_decision(pde)
# In the Rerun viewer:
# - "antihero/safety/risk_score" shows a scalar time series
# - "antihero/safety/deny_events" shows text log entries
# - "antihero/safety/effect" shows allow/deny state over time
Research & Analysis
Technical publications, formal specifications, and analysis of emerging safety challenges in autonomous robotics.
| Publication | Description |
|---|---|
| Whitepaper | Distributed safety architecture, insurance model, and safety thesis for autonomous robotics |
| arXiv Preprint | Multi-layered runtime enforcement architecture for autonomous robot safety |
| AHDS-1 Actuarial Spec | Formal data schema for robot insurance underwriting |
| AHDS-2 Black Box Spec | Behavioral black box data specification — recording tiers, hash chains, hardware profiles |
| You Can’t Audit a Neural Net | Why end-to-end learned controllers require external safety enforcement at the action boundary |
| Meta’s Hyperagents Analysis | Why external safety enforcement matters — analysis of DGM-H recursive self-improvement risks |
CLI Reference
| Command | Description |
|---|---|
| antihero init | Initialize .antihero/ with policy + config |
| antihero run | Gate a shell command with policy enforcement |
| antihero audit show | View audit trail (--json for machine-readable) |
| antihero audit verify | Verify hash chain integrity |
| antihero audit export | Export audit data (json, soc2, hipaa) |
| antihero policy validate | Validate a policy YAML file |
| antihero policy test | Test an action against loaded policies |
| antihero policy guard | Simulate a policy change and check for regressions |
| antihero policy diff | Compare two policy files |
| antihero certify | Run certification against scenario suites |
| antihero scan | Scan text for built-in threat patterns |
| antihero dashboard | Terminal dashboard (--web for browser UI) |
| antihero serve | Start the MCP server |
| antihero keygen | Generate Ed25519 signing keypair |
| antihero hub search/install/list/info | Policy Hub: browse and install community policies |
| antihero airgap export/import/verify | Air-gap bundle operations for disconnected environments |
Plans & Pricing
| | Watchdog (Free) | Enforcer ($29/mo) | Sentinel ($99/mo) | Sovereign (Custom) |
|---|---|---|---|---|
| Event quota | 1,000 | 50,000 | 500,000 | Unlimited |
| Agents | 3 | 25 | 100 | Unlimited |
| Team members | 2 | 10 | 50 | Unlimited |
| Policy rules | 10 | 100 | 500 | Unlimited |
| Audit retention | 7 days | 90 days | 365 days | Unlimited |
| Compliance export | — | JSON | SOC 2, HIPAA, EU AI Act, NIST | All + custom |
| Robot liability coverage | — | $1,000 | $25,000 | $1M+ |
| Support | Community | Priority | Dedicated + SLA |
Roles & Permissions
| Role | Level | Capabilities |
|---|---|---|
| viewer | 10 | Read events, policies, dashboard stats, billing status |
| member | 20 | Everything in viewer + evaluate actions |
| admin | 30 | Everything in member + create/delete API keys, manage policies, file claims, invite members, generate compliance certificates |
| owner | 40 | Everything in admin + change plan, cancel subscription, remove members, delete org |
Related Resources
- Overview — Back to Antihero home
- Insurance Data Specification (AHDS-1) — Formal schema for robot underwriting
- Latest Updates — New features, releases, and announcements
- Our Principles — Why we built Antihero