Antihero

News

Product updates, releases, and announcements.

AHDS-2: The Behavioral Black Box Standard for Humanoid Robots

Today we’re publishing AHDS-2 — the behavioral black box data specification for humanoid robots. Like an aircraft’s flight data recorder, the behavioral black box continuously records every policy decision, sensor reading, and safety event with cryptographic integrity.

  • Flight recorder for robots — Continuous recording at up to 100Hz. Every deny event, emergency stop, collision, and human contact is captured with full sensor context and signed into a tamper-evident hash chain.
  • 4 recording tiers — Critical (every safety event, Ed25519 signed), Standard (up to 100Hz policy evaluations), Heartbeat (1Hz operational status), and Forensic (on-demand full sensor dumps for incident investigation).
  • Hardware certification profiles — Jetson Thor at 130W (full 100Hz recording), Jetson Thor 40W and Qualcomm RB7 (10Hz), Coral Edge TPU (critical-only). Different power envelopes, different behavioral guarantees. 15 new compute_pressure certification scenarios test safety under thermal throttle and battery drain.
  • AHDS-2 specification — Formal data schema, storage requirements, integrity mechanisms, export formats (.jsonl and binary .ahds2), and hardware profile declarations. Complements AHDS-1 (actuarial underwriting data).
  • Insurance & regulatory compliance — Maps to EU AI Act Article 12 (record-keeping), ISO 13482 (safety data logging), NIST AI RMF, and SOC 2 audit trail requirements. Carriers can verify chain integrity with the robot’s public key alone.
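The tamper-evident chain works like a linked list of hashes: each record's hash commits to the record itself and to the previous hash, so editing any earlier record invalidates every hash after it. A minimal sketch of that idea (simplified: real AHDS-2 records are canonicalized and Ed25519-signed, and all field names here are illustrative, not the spec's schema):

```python
import hashlib
import json

def record_hash(record: dict, prev_hash: str) -> str:
    # Hash the record together with the previous hash, so any later
    # modification of an earlier record breaks every hash after it.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode()).hexdigest()

def append(chain: list, record: dict) -> None:
    prev = chain[-1]["hash"] if chain else "0" * 64
    chain.append({"record": record, "hash": record_hash(record, prev)})

def verify(chain: list) -> bool:
    # Recompute every link; a single mismatch means tampering.
    prev = "0" * 64
    for entry in chain:
        if entry["hash"] != record_hash(entry["record"], prev):
            return False
        prev = entry["hash"]
    return True

chain = []
append(chain, {"tier": "critical", "event": "emergency_stop", "t": 1})
append(chain, {"tier": "standard", "event": "policy_eval", "t": 2})
assert verify(chain)

chain[0]["record"]["event"] = "none"   # tamper with history
assert not verify(chain)
```

This is also why a carrier can verify integrity with the public key alone: the signatures over the chain head attest to the whole history without access to the robot.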
Read the docs →

Meta's Hyperagents: Why Self-Improving Robots Need External Safety Enforcement

Meta, UBC, the Vector Institute, and NYU published DGM-Hyperagents — a framework for recursive self-improvement where AI modifies its own modification process. Their safety discussion is unusually direct about the limits of current safeguards.

  • Recursive self-improvement is here — Hyperagents combines a task agent and a meta agent into a single editable program. The meta agent rewrites itself. Tested on coding, paper review, robotics reward design, and Olympiad math.
  • Their own safety caveat — The paper explicitly acknowledges “safeguards may become increasingly strained or infeasible as self-improving systems grow more capable.”
  • New certification suite — We’ve added a Self-Improving Agent Safety suite (20 scenarios) testing whether agents can bypass, modify, or circumvent external enforcement.
  • New adversarial scenarios — Policy tampering, audit trail disabling, force limit escalation, privilege escalation, reward hacking, and emergency stop bypass.
  • Full analysis — Read our technical breakdown of why external enforcement is the answer to Goodhart’s Law applied to recursive self-improvement.
Read the full analysis →

You Can't Audit a Neural Net: External Safety Enforcement for Learned Controllers

End-to-end neural nets — Figure Helix 2, NVIDIA GR00T, Physical Intelligence pi0 — are replacing scripted robot behaviors. They're faster, more adaptive, and completely opaque. No code to inspect. No rules to audit. No line to trace when something goes wrong.

  • Black-box problem — Learned controllers output raw action tensors from billions of neural weights. Traditional code review and rule inspection don't apply.
  • Runtime interposition — Antihero sits between the neural net and the actuators. Every action tensor is checked against safety policies in <100µs before it reaches the motors — regardless of what model produced it.
  • New Figure Helix adapter — FigureHelixAdapter for closed-loop learned controllers running at 100Hz+. Continuous tensor interception without breaking the control loop.
  • VLA adapter expanded — now supports both open-loop (trajectory chunks) and closed-loop (continuous control) modes for GR00T, pi0, and any custom VLA model.
  • The thesis — You can't audit a neural net. You can audit every action it tries to take.
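The interposition point is conceptually simple: the check sees only the action vector, not the model that produced it, and denies anything outside policy. A toy sketch of a fail-closed per-joint limit check (function names and the list-based representation are invented for illustration; this is not Antihero's API):

```python
def enforce(action, limits):
    """Check a raw action vector against per-joint (lo, hi) limits before
    it reaches the motors. Fail closed: missing or mismatched limits -> deny."""
    if limits is None or len(limits) != len(action):
        return ("deny", None)
    for value, (lo, hi) in zip(action, limits):
        if not (lo <= value <= hi):
            return ("deny", None)
    return ("allow", action)

limits = [(-1.0, 1.0)] * 3
assert enforce([0.2, -0.5, 0.9], limits) == ("allow", [0.2, -0.5, 0.9])
assert enforce([0.2, -0.5, 1.4], limits)[0] == "deny"   # joint out of range
assert enforce([0.2, -0.5], limits)[0] == "deny"        # shape mismatch: fail closed
```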
Read the docs →

Full Robotics Coverage: 465+ Scenarios, VLA Adapters, 22 Domain Suites

Antihero now covers every major humanoid deployment vertical with 465+ certification scenarios across 22 domain suites (up from 8). Every humanoid OEM can plug in through their VLA model, ROS 2 stack, or teleop pipeline in 3 lines of code.

11 new domain certification suites.

  • Warehouse & logistics
  • Industrial cobot (ISO 10218 / ISO/TS 15066)
  • Inspection & monitoring
  • Emergency & HazMat
  • Healthcare & hospital
  • Laboratory & pharma
  • Agriculture & outdoor
  • Hospitality & retail
  • Construction & maintenance
  • Home & consumer
  • Eldercare & assistive

VLA model adapters.

  • Generic VLA interceptor — wraps any vision-language-action model with policy enforcement
  • NVIDIA GR00T N1 — 29-DOF whole-body control with per-joint safety limits
  • Physical Intelligence pi0 — multi-step trajectory validation before execution
  • Teleoperation safety — latency-aware enforcement for remote human operators
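The generic interceptor pattern above can be sketched as a wrapper that screens every step of a proposed trajectory before any of it executes. Class and method names here are invented for illustration, not the adapter's real interface:

```python
class VLAInterceptor:
    """Wrap any vision-language-action model so every proposed action
    passes a policy check before it can execute."""

    def __init__(self, model, check):
        self.model = model
        self.check = check  # callable: action -> bool

    def act(self, observation):
        trajectory = self.model.act(observation)  # open-loop trajectory chunk
        for step in trajectory:
            if not self.check(step):
                return []                         # fail closed: no motion at all
        return trajectory

class ToyModel:
    def act(self, observation):
        return [[0.1, 0.2], [0.3, 2.0]]           # second step is unsafe

safe = VLAInterceptor(ToyModel(), lambda a: all(abs(x) <= 1.0 for x in a))
assert safe.act(None) == []                       # whole chunk rejected
```

Rejecting the whole chunk rather than truncating it matters for multi-step models like pi0: executing only the safe prefix of a trajectory can leave the robot mid-motion in a state the model never planned for.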

Observability integrations. Foxglove surfaces policy decisions in the robot timeline. Rerun visualizes risk scores in 3D alongside sensor data.

567+ tests passing across the entire platform — policy engine, adapters, certification, insurance, and fleet management.

Read the docs →

World ID Integration: Zero-Knowledge Proof of Human for Robotics

Antihero now supports World ID as a zero-knowledge proof-of-human verification method for high-risk robot actions. Every physical action that exceeds policy thresholds can require cryptographic proof that a verified human — not another AI — authorized it, without revealing the human's identity. See the human authorization documentation for integration details.

How it works. World ID uses Worldcoin's orb-verified biometric identity to generate a zero-knowledge proof of personhood. The proof is bound to the SHA-256 hash of the specific action being authorized, preventing replay attacks. Only a nullifier hash is stored — no biometric data, no personally identifiable information ever touches Antihero's infrastructure.
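The two load-bearing ideas, binding the proof to the action hash and rejecting reused nullifiers, can be sketched without the zero-knowledge machinery (which this toy skips entirely; field names and the in-memory nullifier set are invented for illustration, and real World ID verification checks the proof against its identity set):

```python
import hashlib

used_nullifiers = set()

def action_hash(action: str) -> str:
    return hashlib.sha256(action.encode()).hexdigest()

def accept_proof(proof: dict, action: str) -> bool:
    """Accept a proof only if it is bound to this exact action and its
    nullifier has not been seen before (replay prevention)."""
    if proof["signal"] != action_hash(action):
        return False                       # proof was made for a different action
    if proof["nullifier"] in used_nullifiers:
        return False                       # same proof replayed
    used_nullifiers.add(proof["nullifier"])
    return True

proof = {"signal": action_hash("lift_patient_7"), "nullifier": "n1"}
assert accept_proof(proof, "lift_patient_7")
assert not accept_proof(proof, "lift_patient_7")   # replay rejected
assert not accept_proof({"signal": proof["signal"], "nullifier": "n2"},
                        "open_door")               # wrong action rejected
```

Note that only the nullifier and the action hash appear here: verifying a proof never requires knowing who the human is.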

Why it matters.

  • EU AI Act Article 14 compliance — High-risk AI systems must demonstrate meaningful human oversight. World ID provides cryptographic proof that a verified human was in the loop, not just a checkbox.
  • Insurance impact — Carriers need to know that high-risk actions had human authorization. ZKP verification creates legally admissible evidence of human oversight without privacy exposure.
  • Privacy preservation — Zero-knowledge proofs verify humanness without revealing who the human is. Nullifier hashes prevent double-use while maintaining anonymity.

Fourth verification method. World ID joins TOTP, webhook, and passkey as the fourth proof-of-human verification method in Antihero's human_proof policy requirement. Configure it in any policy YAML with method: world_id:

requirements:
  - kind: human_proof
    params:
      method: world_id
      action: verify
      verification_level: orb
Read the docs →

Antihero Pivots to Humanoid Robotics Insurance Infrastructure

Today we're announcing Antihero's pivot from AI agent security to humanoid robotics insurance infrastructure. The thesis is simple: robots are shipping faster than anyone can insure them, and insurance — not regulation — will determine which robots scale. Read our manifesto for the principles behind this shift.

Why now. Goldman Sachs projects a $38B humanoid robotics TAM. The EU AI Act deadline hits August 2026 with no behavioral safety standard in place. Figure AI is already facing its first deployment lawsuit. CPIC launched the world's first humanoid robot insurance product in September 2025. The gap between what's shipping and what's insurable is growing every quarter.

What we built.

  • Real-time policy engine — evaluates behavioral safety rules in <100μs using a precompiled trie + BDD + bytecode VM architecture
  • MuJoCo digital twin — validates robot actions in simulation before physical execution, with Isaac Sim support
  • ROS 2 + LeRobot adapters — drop-in integration with the two dominant robotics stacks
  • ISO 13482 certification suite — 35 baseline scenarios for personal care and service robot compliance
  • Fleet management dashboard — multi-robot monitoring, bulk certification, real-time safety status
  • Insurance carrier API — webhook-based integration for underwriting, claims, and fleet-level risk assessment
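The sub-100μs evaluation figure depends on rules being precompiled rather than scanned at request time. A toy illustration of just the trie part of that idea (the real engine also involves BDDs and a bytecode VM; the rule format and wildcard semantics here are invented for illustration):

```python
def build_trie(rules):
    """Compile dotted action-path rules (e.g. 'arm.*') into a nested dict
    so lookup cost depends on path length, not on the number of rules."""
    root = {}
    for path, decision in rules:
        node = root
        for part in path.split("."):
            node = node.setdefault(part, {})
        node["_decision"] = decision
    return root

def evaluate(trie, path, default="deny"):
    node, decision = trie, default            # fail closed by default
    for part in path.split("."):
        node = node.get(part) or node.get("*")
        if node is None:
            return decision
        decision = node.get("_decision", decision)
    return decision

trie = build_trie([("arm.*", "allow"), ("arm.grip.force_high", "deny")])
assert evaluate(trie, "arm.wave") == "allow"
assert evaluate(trie, "arm.grip.force_high") == "deny"
assert evaluate(trie, "head.turn") == "deny"   # no matching rule: fail closed
```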

Open source. The policy engine, all adapters (ROS 2, LeRobot, MuJoCo, Isaac Sim), the CLI, and baseline certification scenarios are released under Apache 2.0. Adoption drives the standard.

Proprietary. The certification engine (465+ scenarios across 22 domain suites), fleet dashboard, insurance carrier API, and compliance exports (ISO 13482, EU AI Act, SOC 2) remain proprietary SaaS.

The new domain is antihero.systems. We've also submitted a ROS Enhancement Proposal (REP) draft for behavioral safety policy standards.

Read the docs →

Claude Code Skill + MCP Server

Two ways to use Antihero inside your AI coding environment: a Claude Code skill for guided workflows and an MCP server for direct tool access.

  • Claude Code Skill — Ships in the repo. Clone and the skill loads automatically with 5 guided workflows: generate policies from natural language, certify agents against 465+ scenarios, investigate audit trails, simulate policy changes, and quick-check any action. Includes reference docs, gotchas, and ready-to-use templates.
  • MCP Server (8 tools) — Run antihero serve to expose policy checking, certification, policy guard, audit trail, and risk status as MCP tools. Works with Claude Desktop, Cursor, OpenCode, or any MCP-compatible client. Every tool call is policy-gated and audit-logged.
  • Progressive disclosure — The skill references all 8 MCP tools, CLI commands, YAML schema, and 7 framework adapters on demand. A gotchas file surfaces the 12 most common failure patterns before you hit them.

The skill follows Anthropic's recommended structure. The MCP server follows the FastMCP spec. Both work standalone or together.

View the skill on GitHub →

Certification Pipeline, Policy Guard, and MCP Tools

Four new capabilities that close the loop between certification and policy improvement. Full details in the documentation.

  • Policy Suggestions — certification now auto-generates candidate deny/allow rules from coverage gaps. Each suggestion includes the YAML, severity, MITRE ATT&CK IDs, and rationale. Surfaced for human review, never auto-applied.
  • Policy Guard — test policy changes before deploying. The antihero policy guard CLI command and POST /certification/guard API simulate a proposed rule against all scenarios and report regressions, new passes, and an apply/review/reject recommendation.
  • Smart Init — antihero init --interactive now auto-detects your AI framework (CrewAI, LangChain, AutoGen, OpenAI, Anthropic, MCP) and recommends a protection profile with framework-specific integration hints.
  • MCP Tools — three new MCP tools: antihero_certify, antihero_policy_suggestions, and antihero_policy_guard. Use certification and policy simulation directly from Claude Desktop, Cursor, or any MCP client.
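At its core, the guard step is a set difference over certification results before and after a proposed rule. A sketch of that comparison (the actual recommendation thresholds aren't specified here, so the verdict logic below is an assumption for illustration):

```python
def guard(baseline: dict, candidate: dict) -> dict:
    """Compare certification results before and after a proposed rule.
    baseline/candidate map scenario id -> passed (bool)."""
    regressions = sorted(s for s in baseline if baseline[s] and not candidate[s])
    new_passes = sorted(s for s in baseline if not baseline[s] and candidate[s])
    if regressions:
        verdict = "reject"      # the rule breaks scenarios that used to pass
    elif new_passes:
        verdict = "apply"       # strict improvement
    else:
        verdict = "review"      # no measurable change
    return {"regressions": regressions, "new_passes": new_passes,
            "recommendation": verdict}

report = guard({"s1": True, "s2": False, "s3": True},
               {"s1": True, "s2": True, "s3": False})
assert report == {"regressions": ["s3"], "new_passes": ["s2"],
                  "recommendation": "reject"}
```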

The certification scheduler also now uses graduated escalation (warning → alert → slow cadence → disable) instead of hard-disabling after 5 failures.
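The escalation ladder can be sketched as a mapping from a failure streak to a step, rather than a single cliff at 5 failures. The thresholds below are invented for illustration; only the ladder order comes from the release note:

```python
LADDER = ["warning", "alert", "slow_cadence", "disable"]

def escalate(consecutive_failures, per_step=2):
    """Map a streak of certification failures to a graduated escalation
    step instead of hard-disabling outright."""
    if consecutive_failures < 1:
        return None
    step = min((consecutive_failures - 1) // per_step, len(LADDER) - 1)
    return LADDER[step]

assert escalate(0) is None
assert escalate(1) == "warning"
assert escalate(3) == "alert"
assert escalate(5) == "slow_cadence"
assert escalate(99) == "disable"
```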

Read the docs →

Proof-of-Human: Three-Layer Trust Stack

Antihero now answers three trust questions in a single SDK: Who authorized this agent? (principal identity binding), Did the agent stay within policy? (runtime enforcement), and Did a human approve this specific action? (cryptographic approval signatures).

  • Principal Identity — bind agent actions to verified humans via OAuth, passkey, or SAML. Delegation chains tracked and depth-limited by policy.
  • Human Proof Requirement — new human_proof policy requirement with TOTP and webhook verifiers. Approval signatures cover the action hash (SHA-256), preventing replay attacks.
  • Approval Dashboard — pending queue, approve/deny with audit trail, approval history.
  • 7th Certification Suite — 13 new scenarios testing identity validation, delegation enforcement, and approval gates.

Maps to EU AI Act Article 14 (human oversight) and NIST AI RMF. No other SDK covers all three layers.

Read the docs →

Comprehensive Documentation

Full API reference and integration guide now live at /docs. Covers every endpoint, both SDKs (Python and TypeScript), policy YAML format, compliance exports, and insurance claims.

Whether you're evaluating a single tool call or building a full policy hierarchy across teams, the docs walk through it end to end.

GitHub OAuth

You can now sign in with GitHub alongside Google and email/password. If your GitHub email matches an existing account, it links automatically.

Cloudflare Turnstile CAPTCHA

Login and signup forms are now protected by Cloudflare Turnstile. Invisible when you're human, challenge when suspicious. Blocks credential stuffing from botnets without adding friction for real users.

Two-Factor Authentication

TOTP-based 2FA is now available for all accounts. Enable it from the dashboard settings. Works with any authenticator app (Google Authenticator, 1Password, Authy).

Antihero v0.2.0

Antihero is live. The policy engine for AI agents — declarative enforcement, cryptographic audit trails, and AI liability insurance.

  • Policy engine — YAML rules with tiered composition, deny-dominates, fail-closed
  • Hash-chained audit — SHA-256 + RFC 8785 JCS canonicalization, tamper-evident
  • AI insurance — File claims against your audit trail, risk-adjusted premiums
  • Compliance exports — SOC 2, HIPAA, EU AI Act, NIST AI RMF
  • Four plans — Watchdog (free), Enforcer ($29/mo), Sentinel ($99/mo), Sovereign (custom)
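Deny-dominates plus fail-closed reduces to a small composition rule: a deny from any tier wins, an explicit allow is required to proceed, and no matching rule at all means deny. A minimal sketch of that combinator (simplified to flat decision lists; the engine's tiered composition is richer):

```python
def compose(decisions):
    """Combine per-rule decisions under deny-dominates, fail-closed
    semantics: any deny wins, and no match at all means deny."""
    if not decisions:
        return "deny"              # fail closed: nothing matched
    if "deny" in decisions:
        return "deny"              # deny dominates any number of allows
    return "allow" if "allow" in decisions else "deny"

assert compose([]) == "deny"
assert compose(["allow", "deny", "allow"]) == "deny"
assert compose(["allow", "allow"]) == "allow"
```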

Get started with pip install antihero or npm install @antihero/sdk.

Read the docs →

Whitepaper Published

Our 67-page whitepaper is now available: Antihero: A Layered Architecture for AI Agent Policy Enforcement, Cryptographic Audit, and Liability Insurance.

Covers the formal model, five guarantees (fail-closed, deterministic, deny-dominates, monotonic safety, hash-chain integrity), threat taxonomy, and insurance economics.

Download PDF →
