An agentic regulatory attestation pipeline

Four questions every compliance team gets asked.

Attestloop is a multi-agent regulatory attestation pipeline. v2.0.0 ships five LLM-driven agents — Classifier, Extractor, Mapper, Critic, Clarifier — connected by a typed state machine with conditional routing and parallel execution. Currently a research artefact demonstrating the approach against the EU AI Act and NIST AI RMF. Total cost per attestation: $2.09. Total wall-clock: 13 minutes. Open source under Apache 2.0.

A research artefact built by Simon Newton. v2.0.0 — see GitHub for source.

Question 01 / 04

Are we covered?

When a regulator publishes something new, leadership wants to know whether it affects us — and whether our existing controls already handle it. The honest answer is usually "we don't know yet, give us two weeks." Attestloop closes the per-publication slice of that gap to about fifteen minutes: read the document, extract the binding obligations, map them against a named control framework, surface the gaps. Multi-source monitoring across regulators is v3 work.

Source: Commission Guidelines on prohibited AI practices · Mapped against: NIST AI Risk Management Framework 1.0

71

binding obligations identified

61

mapped to existing controls

10

framework gaps surfaced

Run completed 2026-05-01 in 13 minutes 26 seconds · $2.09 total cost

Classifier confirms scope; Clarifier handles ambiguous cases; Extractor identifies binding obligations from the source PDF; Mapper compares each to the active control framework with explicit confidence floors; Critic second-passes any low-confidence mappings.
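The conditional routing described above can be sketched in plain Python. This is a minimal, illustrative sketch — function names, thresholds, and the dict-based state are assumptions for clarity; the real pipeline is a compiled LangGraph state machine with typed state.

```python
# Illustrative sketch of the Classifier -> Clarifier/Extractor routing.
# All names and thresholds here are hypothetical, not the production values.

def classify(state: dict) -> dict:
    # Classifier decides whether the publication is in scope.
    state["in_scope"] = state.get("confidence", 1.0) >= 0.8
    return state

def route_after_classifier(state: dict) -> str:
    # Conditional edge: ambiguous classifications go to the Clarifier,
    # clear in-scope documents go straight to the Extractor.
    if not state["in_scope"] and state.get("confidence", 1.0) > 0.4:
        return "clarifier"
    return "extractor" if state["in_scope"] else "end"

state = classify({"doc": "guidelines.pdf", "confidence": 0.95})
assert route_after_classifier(state) == "extractor"
```

In the real graph the same decision is expressed as a conditional edge rather than an if/else, which is what lets the diagram below be generated from the compiled topology.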

Question 02 / 04

What do we need to do, and by when?

Once a regulation is in scope, compliance teams need a list of concrete actions, owners, and deadlines — not "review your AI governance posture" but actual tasks tied to actual control IDs.

ID · Article · Requirement · Scope · Deadline
EUAIA-OBL-001 · Article 5(1)(a) · Providers and deployers of AI systems that deploy manipulative or deceptive techniques shall comply with relevant appli… · Providers and deployers of AI systems deploying manipulative or deceptive techn… · 2025-02-02
EUAIA-OBL-008 · Section 2.4 / Article 3(3) AI Act; recital context of Article 5 AI Act · Providers must ensure their AI systems meet all relevant requirements before placing them on the market or putting them… · Providers of AI systems placed on the EU market or put into service in the Union · before placing on the market or putting into service
EUAIA-OBL-020 · Section 2.9.1 (paragraph 53) · Member States must designate their competent market surveillance authorities by 2 August 2025. · Member States · 2025-08-02
EUAIA-OBL-071 · Article 113 · Providers and deployers of AI systems shall take necessary measures to ensure that they do not place on the market, put… · Providers and deployers of AI systems subject to Article 5 AI Act prohibitions · 2025-02-02

Excerpt from the v6 canonical run. The full report contains 71 obligations with mapped control IDs, proposed actions, and Critic flags on low-confidence mappings. View the full report on GitHub.

Each obligation extracted with source paragraph, regulator-defined scope, deadline where specified, and evidence required. Mapper produces 1–3 control mappings per obligation, each with confidence score and reasoning anchored in specific control text.
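The mapping record and its confidence floor can be sketched as a small data structure. This is a minimal sketch with illustrative field names and an illustrative control ID; the 0.75 floor is taken from the v3 row of the run-comparison table further down the page.

```python
# Sketch of a per-obligation control mapping and the confidence floor
# that routes weak mappings to the Critic. Field names are illustrative.
from dataclasses import dataclass

@dataclass
class ControlMapping:
    obligation_id: str   # e.g. "EUAIA-OBL-020"
    control_id: str      # a control from the active framework
    confidence: float    # model-reported confidence, 0.0-1.0
    reasoning: str       # anchored in specific control text

CONFIDENCE_FLOOR = 0.75

def needs_critic_review(m: ControlMapping) -> bool:
    # Mappings below the floor get a second pass from the Critic.
    return m.confidence < CONFIDENCE_FLOOR

weak = ControlMapping("EUAIA-OBL-020", "GOVERN 1.1", 0.62, "partial textual match")
assert needs_critic_review(weak)
```

The 1–3 mappings per obligation in the report are simply a list of records like this, each carrying its own reasoning string.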

Question 03 / 04

Prove it.

At audit or board review time, compliance teams need evidence: which obligations were assessed, when, by whom, against which version of the regulation, with what conclusion. This is what most existing tools do badly because they weren't designed for AI-pace regulatory change.

Provenance footer · v6 canonical run

- Regulation: EU Artificial Intelligence Act (Regulation 2024/1689) (`eu_ai_act`, EU)
- Framework: NIST AI Risk Management Framework 1.0 (`nist_ai_rmf`, 72 controls)
- Classifier model: `claude-haiku-4-5-20251001`
- Extractor model: `claude-sonnet-4-6`
- Mapper model: `claude-sonnet-4-6`
- Critic model: `claude-sonnet-4-6`
- Classifier prompt SHA-256: `b59962514c4342fc1d6181fb3964dd366c8f6e450218d4e4ff3b02c50038b099`
- Extractor prompt SHA-256: `0828eebb6dd8ad34d769f36773f14888bb048bcdc5ca02e940509fd42701b7ba`
- Mapper prompt SHA-256: `9090c11e1e4b04f07ab617e765a4d0342497ebdccdc2faa88410b8d2424d9cfd`
- Critic prompt SHA-256: `8de784ba4876b414c22f901c530dd2321c591eeac2c9fd36481bc3d0231979c7`
- Critic decisions: 44 reviewed (15 flagged)
- Started at: 2026-05-01T00:08:08.730468+00:00
- Total cost: $2.0926
- Total tokens: 218,476 input / 62,434 output

Every LLM call also logs input, output, model, prompt version, cost, and latency to immutable JSON. Each run produces a hashed provenance footer that survives audit.

  • Per-call audit trail
  • Hashed prompt versions
  • Sourced control library
  • Reproducible against same source
  • Second-pass review by independent Critic agent

Provenance is a first-class output, not an afterthought. The system is designed so that every claim in the report links back through a chain of inputs, prompts, model versions, and timestamps that an auditor can verify.
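The prompt hashes in the footer above are plain SHA-256 digests, which can be reproduced with the standard library alone. A minimal sketch, with illustrative field names — the real footer carries more fields (models, costs, token counts, timestamps):

```python
# Sketch of computing prompt hashes for a provenance footer.
# Field names are illustrative; only hashlib/json from the stdlib are used.
import hashlib
import json

def prompt_sha256(prompt_text: str) -> str:
    # Hash the exact prompt text so any change to it changes the footer.
    return hashlib.sha256(prompt_text.encode("utf-8")).hexdigest()

def provenance_footer(run: dict, prompts: dict) -> dict:
    return {
        **run,
        "prompt_hashes": {name: prompt_sha256(p) for name, p in prompts.items()},
    }

footer = provenance_footer(
    {"regulation": "eu_ai_act", "framework": "nist_ai_rmf"},
    {"classifier": "You are a scope classifier for regulatory documents."},
)
print(json.dumps(footer, indent=2))
```

An auditor who holds the prompt file can recompute the digest and confirm it matches the footer, which is what makes the hash a verification anchor rather than decoration.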

Question 04 / 04

What's coming next?

Boards ask about the regulatory pipeline, not just last week's publication. What's in flight at the regulator that we should be preparing for? Today (v2.0.0) Attestloop answers this on demand — point it at any regulator URL for ad-hoc triage. Scheduled monitoring of regulator sources is the v3 work.

Today (v2.0.0)

On-demand. Point Attestloop at any regulator URL and it tells you whether the publication is in scope and what obligations it contains. Useful for ad-hoc triage but not for forward monitoring.

Architected, shipping in v3

A Watcher agent polls regulator sources on a schedule — EUR-Lex, FCA Handbook, EBA, ICO, ESMA — deduplicates against run history, and surfaces new in-scope publications before anyone has to find them by hand. Per-regulator scrape adapters are in development; the architecture and registry already support multi-source polling.
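The deduplication step a Watcher could run against historical runs is straightforward to sketch. This is an assumption about how it might work — keyed on a URL hash — not a description of the shipped implementation:

```python
# Illustrative Watcher dedup: keep only publications whose URL hash
# has not been seen in a previous run. Keying scheme is hypothetical.
import hashlib

def pub_key(url: str) -> str:
    return hashlib.sha256(url.encode("utf-8")).hexdigest()

def new_publications(fetched: list[str], history: set[str]) -> list[str]:
    fresh = [u for u in fetched if pub_key(u) not in history]
    history.update(pub_key(u) for u in fresh)
    return fresh

history: set[str] = set()
batch1 = new_publications(
    ["https://eur-lex.europa.eu/a", "https://eur-lex.europa.eu/b"], history)
batch2 = new_publications(
    ["https://eur-lex.europa.eu/b", "https://eur-lex.europa.eu/c"], history)
assert batch2 == ["https://eur-lex.europa.eu/c"]
```

In practice the key would likely include more than the URL (publication date, document version), since regulators sometimes republish at the same address.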

v3 backlog

Forward-looking signals from regulator pipelines: published consultations, draft technical standards, work programmes. Distinguishes between 'binding next month' and 'consultation closing in March' so compliance teams can prioritise. Gives boards real horizon answers, not just last-week summaries.

Today: ad-hoc triage. v3: scheduled monitoring. v3+: regulator pipeline awareness. Being honest about which question Attestloop can answer at each stage matters as much as the answers themselves.

The Watcher agent is the v3 work that converts Attestloop from on-demand triage to scheduled regulatory monitoring. Multi-source polling, per-regulator adapters, deduplication against historical runs, and alerting integration are all tracked in the GitHub backlog.


Pipeline

How the pipeline works

The diagram below is generated directly from the compiled LangGraph state machine — scripts/render_graph.py writes the Mermaid source to docs/orchestration/v6_pipeline.mmd, which the page renders client-side. If the orchestration changes, the diagram regenerates automatically. v2.0.0's five-agent topology with conditional routing is what's drawn here. Click any agent below for role, prompt, sample input and output, and per-call metrics from the canonical v6 run.

Compiled LangGraph state machine · v2.0.0

LLM agents · click to expand
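As a rough illustration of what scripts/render_graph.py does, Mermaid source can be emitted from an edge list with nothing but the standard library. The edge list below is an assumption drawn from the agent descriptions on this page — the real script derives the edges from the compiled LangGraph graph, so the diagram cannot drift from the code:

```python
# Illustrative Mermaid emission from an edge list. The edges shown here
# are inferred from the page's agent descriptions, not from the real graph.
EDGES = [
    ("classifier", "clarifier"),
    ("classifier", "extractor"),
    ("clarifier", "extractor"),
    ("extractor", "mapper"),
    ("mapper", "critic"),
]

def to_mermaid(edges: list[tuple[str, str]]) -> str:
    lines = ["graph TD"] + [f"    {a} --> {b}" for a, b in edges]
    return "\n".join(lines)

mermaid = to_mermaid(EDGES)
print(mermaid)
```

Writing the result to docs/orchestration/v6_pipeline.mmd and rendering it client-side is what keeps the published diagram in lockstep with the orchestration code.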

Output

What it produces

Seven runs against the same source document. v1 through v5 iterate on the original architecture; v5_eq and v6 isolate the orchestration impact of the v2.0.0 rebuild.

Version Approach Obligations Mappings Unmapped Cost (USD) Runtime
v1 Truncated extractor (50 K char cap), mapper unconstrained 18 54 0 $0.62 5m 17s
v2 Chunked extractor (12 chunks), mapper unconstrained 68 203 0 $2.61 21m 22s
v3 Mapper confidence floor 0.75, no slot-filling 72 164 12 $2.78 41m 35s
v4 Anthropic prompt caching on mapper controls list 69 124 24 $1.19 14m 51s
v5 Fuzzy dedup, title fallback, null rendering, mapper nudge 71 154 13 $1.31 17m 17s
v5_eq v6 code, V5_EQUIVALENT config (serial Mapper, no Critic, no Clarifier) 72 157 13 $1.31 12m 38s
v6 LangGraph + Critic + Clarifier + 8-way concurrent Mapper 71 160 10 $2.09 13m 26s

Six iterations from a 50,000-character truncated baseline to a LangGraph state machine with second-pass review and parallel execution, each step changing one variable while holding output quality steady. The v6 canonical run produces the same Commission Guidelines attestation in 13 minutes 26 seconds at $2.09 — Mapper wall-clock is 8.13× faster than v5_eq, while total wall-clock stays essentially flat because the Critic adds roughly three minutes of sequential review work.

v3 → v4 caching delivered a 30× return on the cache write cost. v4 → v5 dedup removed 12 paraphrased duplicates that substring-match missed. v5 → v6 swapped sequential function calls for a typed state machine and 8-way concurrent Mapper. The v5_eq vs v6 pair isolates the orchestration impact under identical code. Read the writeup for the engineering detail.
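The 8-way concurrent Mapper fan-out can be sketched with stdlib primitives. A minimal sketch under assumptions — `map_obligation` is a stand-in for one Mapper LLM call, and the real pipeline parallelises through LangGraph's fan-out rather than a thread pool:

```python
# Sketch of an 8-way concurrent Mapper fan-out over 71 obligations.
# map_obligation is a hypothetical stand-in for one Mapper LLM call.
from concurrent.futures import ThreadPoolExecutor

def map_obligation(obligation_id: str) -> dict:
    # In the real pipeline this is an LLM call that returns 1-3
    # control mappings with confidence scores.
    return {"obligation": obligation_id, "mappings": []}

obligations = [f"EUAIA-OBL-{i:03d}" for i in range(1, 72)]

with ThreadPoolExecutor(max_workers=8) as pool:
    # pool.map preserves input order, so results line up with obligations.
    results = list(pool.map(map_obligation, obligations))

assert len(results) == 71
```

Because each Mapper call is I/O-bound (waiting on the API), eight concurrent workers cut Mapper wall-clock dramatically without increasing token cost.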

Read the writeup →

Reproducibility

Run it yourself

Live re-run capability is tracked for v3. The cached v6 run above shows the full output — every prompt, every LLM response, every cost line, every obligation, every mapping. The Python source is on GitHub; the pipeline runs end-to-end on a single machine with an Anthropic API key.
