diff --git a/CHANGELOG.md b/CHANGELOG.md
index 7fb434a..cd0184a 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -42,15 +42,15 @@ rebuttable-presumption hook), transposition deadline 9 December 2026.
   statistics over Vaara's per-action conformal prediction intervals
   alongside the standard binomial CI. Carries the same non-parametric
   coverage guarantee with no distributional assumption.
-  Standard OVERT verifiers ignore the extension; Vaara-aware
+  Standard OVERT verifiers ignore the extension. Vaara-aware
   verifiers cross-check it against the per-action receipts.
 - **vaara-bench-v1.** Versioned adversarial-detection benchmark with
   frozen corpus (`bench/adversarial_corpus.jsonl`, SHA-256
   `7a3219776e1c93a5127ab3b63832d73ba75f32fa044cabdbaa4e5d7088b33ff2`),
   frozen methodology (`bench/scorer_eval.py`), and frozen headline
   numbers under v0.11.0 (soft TPR 100%, soft FPR 20%, hard TPR
-  28.85%, hard FPR 0%). Spec doc at `bench/vaara-bench-v1.md`;
-  machine-readable results at `bench/vaara-bench-v1-results.json`.
+  28.85%, hard FPR 0%). Spec doc at `bench/vaara-bench-v1.md`.
+  Machine-readable results at `bench/vaara-bench-v1-results.json`.
   Apache-2.0 licensed.
 - 18 new tests in `tests/test_attestation_s3p.py` covering
   Clopper-Pearson against textbook values, the k=0 and k=n
@@ -100,7 +100,7 @@ statement in COMPLIANCE.md.
   request commitments per Annex B.4 and SHA-256 encoder-identity
   derivation. IEEE-754 floats are rejected at the canonical-encoding
   boundary per Protocol Profile 1.0 numeric rules. Vaara operates as
-  the **Arbiter** in OVERT terms; external Independent Attestation
+  the **Arbiter** in OVERT terms. External Independent Attestation
   Providers promote AAL-3 emission to AAL-4 by attaching Phase 3
   notary signatures and transparency-log inclusion proofs. Public API:
   `BaseEnvelope`, `emit_base_envelope`, `verify_base_envelope`,
@@ -118,7 +118,7 @@ statement in COMPLIANCE.md.
   3(45), not a qualified trust service under Article 3(16)) and a
   position statement relative to OVERT 1.0 (Glacis Technologies). Vaara
   is structurally independent of the agent it governs and maps to
-  OVERT AAL-3 operator-controlled attestation; reaching AAL-4 requires
+  OVERT AAL-3 operator-controlled attestation. Reaching AAL-4 requires
   pairing Vaara with an external Independent Attestation Provider. The
   design admits an external IAP layer without internal change.
 
@@ -127,7 +127,7 @@ statement in COMPLIANCE.md.
 **Theme: Vaara as the kernel others build around.** v0.10.0 ships the
 network-callable surface, the auditor-facing evidence artefact, and the
 offline-verifiable receipt pair. Each of the three pieces is additive
-and backward-compatible; together they reposition Vaara from a Python
+and backward-compatible. Together they reposition Vaara from a Python
 library to a runtime kernel that control planes, audit consumers, and
 orchestration frameworks reference. The HTTP contract at
 `docs/openapi.yaml` is versioned `/v1/` independently of the project
@@ -139,13 +139,13 @@ version, following the OPA pattern.
   contract in `docs/openapi.yaml`. Endpoints: `POST /v1/score`,
   `POST /v1/score/outcome`, `POST /v1/audit/events`,
   `GET /v1/audit/actions/{action_id}/chain`, `POST /v1/audit/verify`,
-  `GET /v1/server`, `GET /v1/health`. The spec is authoritative; the
+  `GET /v1/server`, `GET /v1/health`. The spec is authoritative. The
   reference server in `src/vaara/server/` is a FastAPI implementation
   suitable for local development and modest production loads.
 - **`vaara serve`** CLI subcommand.
 - **OpenAPI 3.1 contract at `docs/openapi.yaml`.** Stable v1 surface,
   intended as the integration point for control planes, orchestration
-  frameworks, and audit consumers. Vaara defines the interface; the
+  frameworks, and audit consumers. Vaara defines the interface. The
   vendors call it.
 - 11 new HTTP server tests (`tests/test_server.py`).
 - **Auditor-facing evidence report rendering.** New module
@@ -193,7 +193,7 @@ it governs.
   `validate_source(source, fmt="auto")` combines load and check so a
   single call yields `(policy, report)` or `(None,
   report-with-error)`. Stable JSON shape via `ValidationReport.to_dict()`.
-- **`vaara.policy.test_cases` module — Conftest analog for Vaara
+- **`vaara.policy.test_cases` module - Conftest analog for Vaara
   policies.** `evaluate(policy, action_class, risk_score,
   matched_sequences=())` is the underlying primitive: applies any
   matched sequence pattern boosts (capped at 1.0), resolves the
@@ -216,7 +216,7 @@ it governs.
   subcommands. Both honour standard CI exit codes: validate returns
   1 on parse errors (warnings do not flip), test returns 1 on any
   failed case (and 2 if the policy itself fails to parse).
-- **`examples/policies/test_cases.yaml`** — six worked test cases
+- **`examples/policies/test_cases.yaml`** - six worked test cases
   exercising thresholds, sequence-pattern boost, default and
   article-matched escalation routes against
   `examples/policies/full.yaml`.
@@ -235,7 +235,7 @@ it governs.
 
 ### Note
 Backwards-compatible. Pure addition. No existing module signatures
-change. `Policy` and the load path are unchanged; the new modules
+change. `Policy` and the load path are unchanged. The new modules
 sit beside them under `vaara.policy.*`.
 
 ### Provenance note
@@ -273,7 +273,7 @@ publishes, the schema_version bumps and new fields land additively.
 - **`vaara trail export-incident` CLI subcommand.** Reads a trail JSONL
   plus an operator-supplied incident metadata JSON, writes the report
   to the output path. Picks the most recent trigger-eligible record by
-  default; explicit `--trigger-record-id ID` overrides. No external
+  default. An explicit `--trigger-record-id ID` overrides it. No external
   template dependency, zero new runtime deps.
 - **`tests/test_incident_export.py`** covers schema shape, deadline
   mapping per Article 3(49) sub-category, trigger-event validation,
@@ -290,7 +290,7 @@ concept, so no `AuditRecord` column was added.
 **Theme: human-in-the-loop review queue (Article 14).** Adds the
 storage layer and operator surface that turn an `escalate` decision
 into a substantive Article 14(4)(d) override path. The pipeline
-already wrote `ESCALATION_SENT` for every escalated action; with a
+already wrote `ESCALATION_SENT` for every escalated action. With a
 queue wired in, those actions now wait in a queryable place with
 their conformal interval, get claimed by an operator, and produce an
 `ESCALATION_RESOLVED` audit record when resolved.
@@ -302,20 +302,20 @@ their conformal interval, get claimed by an operator, and produce an
   `pending → claimed → resolved` happy path, `pending → expired`
   stale path. Resolutions: `allow`, `deny`, `abstain`. `enqueue`
   records each item with the conformal interval, risk signals,
-  bucket category, and request parameters/context as JSON; the
+  bucket category, and request parameters/context as JSON. The
   interval is what makes Article 14 oversight substantive rather
-  than cosmetic (see `COMPLIANCE.md`). `claim` is optimistic —
+  than cosmetic (see `COMPLIANCE.md`). `claim` is optimistic -
   concurrent claim races resolve with one winner and
   `InvalidTransitionError` for the loser. `resolve` accepts an
   optional `trail` and writes `ESCALATION_RESOLVED` so the Article
   14(4)(d) evidence row lands on the hash chain. `expire_stale`
-  marks pending items past a timeout; claimed items are left alone
+  marks pending items past a timeout. Claimed items are left alone
   since they are under active review.
 - **`InterceptionPipeline(review_queue=...)`.** Optional constructor
   parameter. When supplied, every `escalate` decision is enqueued
   alongside the existing `ESCALATION_SENT` audit record. Default
   `None` preserves prior behaviour bit-for-bit. Queue write failure
-  logs and continues — the action is already gated by the escalate
+  logs and continues - the action is already gated by the escalate
   verdict and the audit record stands.
 - **`vaara review` CLI.** Subcommands `list`, `show`, `claim`,
   `resolve`, `expire`. `resolve --audit-db PATH` writes the
@@ -333,8 +333,8 @@ their conformal interval, get claimed by an operator, and produce an
   subcommand including `resolve --audit-db` writing the audit row.
 
 ### Note
-Backwards-compatible. Pure addition. No existing schemas migrate;
-the queue lives in its own DB file with its own `review_queue_meta`
+Backwards-compatible. Pure addition. No existing schemas migrate.
+The queue lives in its own DB file with its own `review_queue_meta`
 schema-version row.
 
 ## [0.7.0] - 2026-05-10
@@ -366,41 +366,41 @@ Backwards-compatible release. All four PRs are additive. Mondrian is opt-in via
 **Theme: documentation sync to PyPI.** v0.6.0 shipped the functional changes (policy DSL, retention purge, transparency taxonomy, distribution-shift / stack-ablation / PAIR evals, lint sweep) but the README, library docstring, example file headers, and PyPI tagline stayed at the v0.5.0 framing. PyPI's package page kept publishing pre-rebalance numbers and the wrong default threshold. v0.6.1 ships only the documentation cleanup so new PyPI installs see the current state.
 
 ### Changed
-- `README.md`: replaced the "Numbers" section with the v0.6 distribution-shift table (97.1% recall / 70.0% FPR hand-curated held-out; 95.2% / 87.5% LLM-generated in-sample) plus PAIR ASR 0.0% (0/25). Threshold default 0.5 -> 0.55, corpus description updated to the 5,955-entry rebalanced corpus, threshold-direction note corrected (recall drops as threshold rises). PR #44.
-- `src/vaara/adversarial_classifier.py`: rewrote the module docstring. Removed all version-bound numbers; readers now point at README + COMPLIANCE so the docstring does not go stale on every release. PR #44.
+- `README.md`: replaced the "Numbers" section with the v0.6 distribution-shift table (97.1% recall / 70.0% FPR hand-curated held-out, 95.2% / 87.5% LLM-generated in-sample) plus PAIR ASR 0.0% (0/25). Threshold default 0.5 -> 0.55, corpus description updated to the 5,955-entry rebalanced corpus, threshold-direction note corrected (recall drops as threshold rises). PR #44.
+- `src/vaara/adversarial_classifier.py`: rewrote the module docstring. Removed all version-bound numbers. Readers now point at README + COMPLIANCE so the docstring does not go stale on every release. PR #44.
 - `examples/adversarial_classifier.py`: ship-note threshold 0.8 -> 0.55. PR #44.
 - `scripts/classifier_vs_heuristic.py`: clarified the script is the v0.5.0 historical reproducer, not the current production training path. PR #44.
 - `pyproject.toml`: rephrased description for cleaner PyPI tagline rendering. PR #45.
 
 ### Note
-No functional code changes. v0.6.0 users are on the same code; v0.6.1 only refreshes documentation surfaces visible to new PyPI installs and to anyone reading the package source.
+No functional code changes. v0.6.0 users are on the same code. v0.6.1 only refreshes documentation surfaces visible to new PyPI installs and to anyone reading the package source.
 
 ## [0.6.0] - 2026-04-27
 
 **Theme: standards alignment + legibility.** v0.5.x was the capability axis (jailbreak coverage closed, classifier rebalanced). v0.6 is the legibility axis: policies become readable, audit records become standards-aligned, adversarial numbers become honest, architecture contribution becomes documented.
 
 ### Added
-- **`vaara.policy` package — JSON-native policy loader plus optional YAML via `vaara[yaml]` extra.** Frozen dataclasses for action classes, threshold curves, sequence patterns, and escalation routes. Hand-rolled validation with field-path error messages. Reuses existing `vaara.taxonomy.actions` enums verbatim. Threshold partial-overrides supported (set just `deny`, inherit default `escalate`). Implements Sketch A from the v0.6 DSL design exploration; embedded Python DSL (Sketch B) and standalone DSL (Sketch C) stay deferred to v0.7+ pending external pull.
-- **`vaara trail purge --db PATH --retention-days N (--tenant TID | --all-tenants) [--dry-run]` CLI subcommand** plus `SQLiteAuditBackend.purge_older_than(seconds, *, dry_run=False)` Python API. Article 12(2) retention enforcement. Tenant scoping is required: pick `--tenant TID` for a single tenant or `--all-tenants` explicitly, so a shared multi-tenant audit DB can never be silently purged across all tenants. Hash-chain integrity: surviving records still reference deleted predecessors via `previous_hash`, leaving a documented seam at the retention boundary that subsequent loads expose as a hash mismatch. Intended workflow: export a signed handoff zip BEFORE purging, archive externally, then purge. The signed zip remains self-consistent forever; the live DB chain has the seam.
-- **prEN ISO/IEC 12792 four-axis transparency taxonomy on `AuditRecord`.** Four optional fields (`system_operation`, `data_usage`, `decision_making`, `limitations`) with default-classification heuristic per `EventType`. Per-record override via construction kwargs. NOT tamper-evident in v0.6 — fields are metadata annotations excluded from `record_hash` so pre-v0.6 chains stay valid. v0.7+ may add a separate signing mechanism if compliance requires.
-- **`scripts/eval_distribution_shift.py`** — runs the full Vaara stack against the adversarial corpus with per-source tagging (hand-curated vs LLM-generated). Reports recall and FPR per source/class.
-- **`scripts/eval_stack_ablation.py`** — runs three configurations (heuristic-only, classifier-only, full-stack) against the same corpus. Quantifies the independent contribution of each layer.
-- **`scripts/eval_pair_attack.py`** — PAIR (Chao et al. 2023) iterative adaptive attacker. Uses an OpenAI-compatible vLLM endpoint for both attacker and judge roles. Zero new runtime deps (uses `urllib.request`).
+- **`vaara.policy` package - JSON-native policy loader plus optional YAML via `vaara[yaml]` extra.** Frozen dataclasses for action classes, threshold curves, sequence patterns, and escalation routes. Hand-rolled validation with field-path error messages. Reuses existing `vaara.taxonomy.actions` enums verbatim. Threshold partial-overrides supported (set just `deny`, inherit default `escalate`). Implements Sketch A from the v0.6 DSL design exploration. Embedded Python DSL (Sketch B) and standalone DSL (Sketch C) stay deferred to v0.7+ pending external pull.
+- **`vaara trail purge --db PATH --retention-days N (--tenant TID | --all-tenants) [--dry-run]` CLI subcommand** plus `SQLiteAuditBackend.purge_older_than(seconds, *, dry_run=False)` Python API. Article 12(2) retention enforcement. Tenant scoping is required: pick `--tenant TID` for a single tenant or `--all-tenants` explicitly, so a shared multi-tenant audit DB can never be silently purged across all tenants. Hash-chain integrity: surviving records still reference deleted predecessors via `previous_hash`, leaving a documented seam at the retention boundary that subsequent loads expose as a hash mismatch. Intended workflow: export a signed handoff zip BEFORE purging, archive externally, then purge. The signed zip remains self-consistent forever. The live DB chain has the seam.
+- **prEN ISO/IEC 12792 four-axis transparency taxonomy on `AuditRecord`.** Four optional fields (`system_operation`, `data_usage`, `decision_making`, `limitations`) with default-classification heuristic per `EventType`. Per-record override via construction kwargs. NOT tamper-evident in v0.6 - fields are metadata annotations excluded from `record_hash` so pre-v0.6 chains stay valid. v0.7+ may add a separate signing mechanism if compliance requires.
+- **`scripts/eval_distribution_shift.py`** - runs the full Vaara stack against the adversarial corpus with per-source tagging (hand-curated vs LLM-generated). Reports recall and FPR per source/class.
+- **`scripts/eval_stack_ablation.py`** - runs three configurations (heuristic-only, classifier-only, full-stack) against the same corpus. Quantifies the independent contribution of each layer.
+- **`scripts/eval_pair_attack.py`** - PAIR (Chao et al. 2023) iterative adaptive attacker. Uses an OpenAI-compatible vLLM endpoint for both attacker and judge roles. Zero new runtime deps (uses `urllib.request`).
 - **`[yaml]` optional extra in `pyproject.toml`** (`pyyaml>=6.0`). Core `dependencies = []` preserved.
 - **`examples/policies/minimal.json` and `full.yaml`** as reference policies.
-- **COMPLIANCE.md gains "EU AI Act Annex IV evidence sections"** (maps Vaara contribution per §1–§9; direct fill on §3, §5, §9; contributes on §2, §4, §6, §7; out of scope for §1, §8) **and "CEN-CENELEC harmonised standards alignment"** (per-standard table for ISO/IEC 42001, prEN 18286, prEN 18228, ISO/IEC 42006, prEN ISO/IEC 24970, prEN 18229-1, prEN ISO/IEC 12792).
-- **`scripts/lint_full.sh` pre-push lint sweep** — chains `ruff` (style + correctness), `bandit` (security), `mypy` (types — strict on `vaara.policy`, lenient on legacy modules), and `pytest`. Documented in CONTRIBUTING.md. Catches CodeRabbit-class findings before they hit a PR review round-trip. New dev extras: `bandit>=1.7.5`, `mypy>=1.8`. Bandit configured in `pyproject.toml` to skip B608 across `audit/sqlite_backend.py` (all f-string SQL there interpolates only internally-controlled tenant clauses, not user input). Two `# nosec` annotations document the remaining trusted-bundle and synthetic-trace-RNG sites.
+- **COMPLIANCE.md gains "EU AI Act Annex IV evidence sections"** (maps Vaara contribution per §1–§9: direct fill on §3, §5, and §9, contributes on §2, §4, §6, and §7, out of scope for §1 and §8) **and "CEN-CENELEC harmonised standards alignment"** (per-standard table for ISO/IEC 42001, prEN 18286, prEN 18228, ISO/IEC 42006, prEN ISO/IEC 24970, prEN 18229-1, prEN ISO/IEC 12792).
+- **`scripts/lint_full.sh` pre-push lint sweep** - chains `ruff` (style + correctness), `bandit` (security), `mypy` (types - strict on `vaara.policy`, lenient on legacy modules), and `pytest`. Documented in CONTRIBUTING.md. Catches CodeRabbit-class findings before they hit a PR review round-trip. New dev extras: `bandit>=1.7.5`, `mypy>=1.8`. Bandit configured in `pyproject.toml` to skip B608 across `audit/sqlite_backend.py` (all f-string SQL there interpolates only internally-controlled tenant clauses, not user input). Two `# nosec` annotations document the remaining trusted-bundle and synthetic-trace-RNG sites.
 
 ### Changed
-- **Audit DB schema v2 → v3.** Migration `_MIGRATIONS[2]` adds four nullable transparency columns to `audit_records`. Pre-v0.6 records get NULL for the new columns; their stored `record_hash` is preserved (NOT re-hashed on load), so chain verification of historical records continues to work.
+- **Audit DB schema v2 → v3.** Migration `_MIGRATIONS[2]` adds four nullable transparency columns to `audit_records`. Pre-v0.6 records get NULL for the new columns. Their stored `record_hash` is preserved (NOT re-hashed on load), so chain verification of historical records continues to work.
 - **COMPLIANCE.md "Current limits"** replaced placeholder bullets with v0.6 measurement results:
   - **Distribution-shift split.** Hand-curated (held-out, 250): attack recall 97.1% / benign FPR 70.0%. LLM-generated (in-sample, 5,705): attack recall 95.2% / benign FPR 87.5%. The 18pp benign-FPR gap is the dominant distribution-shift signal.
-  - **Stack composition.** `heuristic_only` recall 35% / 63%. `classifier_only` recall 94% / 86%. `full_stack` recall 97% / 98%. Layers not redundant — heuristic catches a small set of attacks the classifier misses (justifies the ensemble). Most full-stack benign FPR comes from heuristic ESCALATEs, not classifier upgrades.
-  - **PAIR adaptive-attacker calibration.** Qwen2.5-32B-Instruct as both attacker and judge, 25 hand-curated jailbreak seeds, max 5 iterations: **ASR 0.0% (0/25)**. NOT a claim of imperviousness to all adaptive attackers — stronger attacker (70B+), longer iteration budgets, or alternate strategies (multi-turn drift, language-switch, obfuscation) might produce non-zero ASR.
+  - **Stack composition.** `heuristic_only` recall 35% / 63%. `classifier_only` recall 94% / 86%. `full_stack` recall 97% / 98%. Layers not redundant - heuristic catches a small set of attacks the classifier misses (justifies the ensemble). Most full-stack benign FPR comes from heuristic ESCALATEs, not classifier upgrades.
+  - **PAIR adaptive-attacker calibration.** Qwen2.5-32B-Instruct as both attacker and judge, 25 hand-curated jailbreak seeds, max 5 iterations: **ASR 0.0% (0/25)**. NOT a claim of imperviousness to all adaptive attackers - stronger attacker (70B+), longer iteration budgets, or alternate strategies (multi-turn drift, language-switch, obfuscation) might produce non-zero ASR.
 
 ### Deferred to v0.7+
-- **prEN ISO/IEC 24970 field-alias layer** — pending public final of the standard. Will land when 24970 publishes.
-- **DORA mapping refinement** — pending deployer-side signal. Conservative defaults shipped in v0.5.3 stay until a financial deployer's input refines them.
+- **prEN ISO/IEC 24970 field-alias layer** - pending public final of the standard. Will land when 24970 publishes.
+- **DORA mapping refinement** - pending deployer-side signal. Conservative defaults shipped in v0.5.3 stay until a financial deployer's input refines them.
 
 ### Reproducible artifacts
 - `tests/adversarial/distribution_shift_v0_5_3.json`
@@ -525,7 +525,7 @@ v0.5.1 remains on PyPI but ships a broken classifier. Upgrade to 0.5.2 to get th
 
 ### Changed
 - `AdversarialClassifier` retrained on an expanded benign corpus to reduce false-positive rate in live agent traffic.
-- Recommended operating threshold changed from `0.8` to `0.3` — the added benigns shifted the score distribution, and 0.3 is now the optimal balanced-accuracy point.
+- Recommended operating threshold changed from `0.8` to `0.3` - the added benigns shifted the score distribution, and 0.3 is now the optimal balanced-accuracy point.
 
 ### Added
 - `tests/adversarial/benign_generated/BT-new-http_post.jsonl` (170 variants)
@@ -541,7 +541,7 @@ v0.5.1 remains on PyPI but ships a broken classifier. Upgrade to 0.5.2 to get th
 
 ### Known regressions (disclosed)
 The new benigns shifted the decision surface toward allow. Per-category accuracy regressed in three attack categories:
-- `data_exfil`: 0% (was 28.6% heuristic baseline — classifier now worse than heuristic here)
+- `data_exfil`: 0% (was 28.6% heuristic baseline - classifier now worse than heuristic here)
 - `destructive_actions`: 25% (was 87.5% heuristic)
 - `jailbreak`: 0% (was 100% heuristic)
 
@@ -550,12 +550,12 @@ The heuristic scorer retains strong coverage in these categories. Stack both rat
 ## [0.5.0] - 2026-04-23
 
 ### Added
-- `AdversarialClassifier` — opt-in XGBoost scorer for adversarial tool-call detection. Install with `pip install vaara[ml]`.
-- `src/vaara/data/adversarial_classifier_v1.joblib` — 295 KB pre-trained bundle shipped with the wheel.
-- `scripts/classifier_vs_heuristic.py` — reproducible comparison harness (by-seed train/test split, no leakage).
-- `tests/adversarial/generated/` — 1945 LLM-generated adversarial variants across 8 attack categories (Qwen2.5-3B on MI300X, batched).
-- `tests/adversarial/benign_generated/` — 635 benign variants (487 broad + 148 targeted at http_post/send_email/shell_exec legit uses).
-- `[project.optional-dependencies.ml]` — `xgboost`, `scikit-learn`, `joblib`, `numpy`. Default install stays zero-dep.
+- `AdversarialClassifier` - opt-in XGBoost scorer for adversarial tool-call detection. Install with `pip install vaara[ml]`.
+- `src/vaara/data/adversarial_classifier_v1.joblib` - 295 KB pre-trained bundle shipped with the wheel.
+- `scripts/classifier_vs_heuristic.py` - reproducible comparison harness (by-seed train/test split, no leakage).
+- `tests/adversarial/generated/` - 1945 LLM-generated adversarial variants across 8 attack categories (Qwen2.5-3B on MI300X, batched).
+- `tests/adversarial/benign_generated/` - 635 benign variants (487 broad + 148 targeted at http_post/send_email/shell_exec legit uses).
+- `[project.optional-dependencies.ml]` - `xgboost`, `scikit-learn`, `joblib`, `numpy`. Default install stays zero-dep.
 
 ### Benchmarks
 On the by-seed held-out test (50 adversarial seeds held out entirely, plus 60 benign entries), classifier at threshold 0.8:
@@ -564,7 +564,7 @@ On the by-seed held-out test (50 adversarial seeds held out entirely, plus 60 be
 - Balanced accuracy: 74% (vs 50% heuristic)
 - Latency: 140 µs mean, 210 µs p99 (commodity CPU, no GPU at inference)
 
-Live dogfood on 174 Qwen-generated prompts with an end-to-end LangChain-style agent loop: recall 70%, FPR 51% — real-world distribution is tougher than the clean held-out test. Ship with `decision="escalate"` in production, not `decision="deny"`.
+Live dogfood on 174 Qwen-generated prompts with an end-to-end LangChain-style agent loop: recall 70%, FPR 51% - real-world distribution is tougher than the clean held-out test. Ship with `decision="escalate"` in production, not `decision="deny"`.
 
 ## [0.4.4] - 2026-04-22
 
diff --git a/COMPLIANCE.md b/COMPLIANCE.md
index 0843fe3..ef77df3 100644
--- a/COMPLIANCE.md
+++ b/COMPLIANCE.md
@@ -31,7 +31,7 @@ types that populate evidence for it. The list matches the default
 | **11(1)** | Technical Documentation | Checked outside the audit trail. Vaara does not replace the Annex IV technical file. |
 | **12(1)** | Record-Keeping (Logging) | Every `ACTION_REQUESTED`, `RISK_SCORED`, and `DECISION_MADE` is written to a hash-chained, tamper-evident trail. See "Audit trail integrity" below. |
 | **13(1)** | Transparency and Provision of Information | `RISK_SCORED` and `DECISION_MADE` records carry the risk score, the interval, the decision, and the reason string shown to the operator. |
-| **14(1)** | Human Oversight -- Design | `ESCALATION_SENT` and `ESCALATION_RESOLVED` events prove the oversight path exists and was exercised. The `vaara.audit.review_queue` storage layer turns `escalate` into a substantive queued-for-review step rather than a fire-and-forget log line; the `vaara review` CLI is the operator surface. |
+| **14(1)** | Human Oversight -- Design | `ESCALATION_SENT` and `ESCALATION_RESOLVED` events prove the oversight path exists and was exercised. The `vaara.audit.review_queue` storage layer turns `escalate` into a substantive queued-for-review step rather than a fire-and-forget log line. The `vaara review` CLI is the operator surface. |
 | **14(4)(d)** | Human Oversight -- Override Capability | `ESCALATION_RESOLVED` and `POLICY_OVERRIDE` events prove a human can decide not to proceed or can override Vaara's decision. The `vaara review resolve --audit-db PATH` CLI writes the `ESCALATION_RESOLVED` row directly from an operator action, so the override is a single recorded transaction rather than an out-of-band promise. |
 | **15(1)** | Accuracy, Robustness and Cybersecurity | `OUTCOME_RECORDED` events feed the adaptive scorer. Recency is tracked (default weekly calibration window). |
 | **61(1)** | Post-Market Monitoring | `OUTCOME_RECORDED` events form the post-market signal, tied back to the original action via `action_id`. |
@@ -56,7 +56,7 @@ independently from the agent code that uses it:
   (parse errors plus warnings for narrow threshold bands, dangling
   per-class overrides, unreachable escalation routes, sequence steps
   not naming a declared action class, missing default escalation
-  route). Exit code 0 if no errors; warnings print without flipping
+  route). Exit code 0 if no errors. Warnings print without flipping
   the exit code.
 - `vaara policy test POLICY_PATH --cases CASES_PATH` runs a YAML/JSON
   cases file against the policy (Conftest analog for Vaara). Each
@@ -131,7 +131,7 @@ finalize. Status as of April 2026.
 | **ISO/IEC 42006** Requirements for AI Management System Auditors | WG2 | DIS Stage 40 | Vaara's hash-chained trail is the artefact 42006-qualified auditors examine for surveillance evidence. |
 | **prEN ISO/IEC 24970** AI System Logging | WG3 | Stage 30.2 (comment resolution) | Vaara aligns with the tamper-resistance, decision-factor logging, and audit-system integration requirements. Field-level alignment pending the published version. |
 | **prEN 18229-1** Trustworthiness Framework Pt 1 (logging, transparency, human oversight) | WG4 | Public enquiry | Implements AI Act Articles 12-14, which Vaara already maps to in the article table above. Field-level alignment pending the published version. |
-| **prEN ISO/IEC 12792** Transparency Taxonomy of AI Systems | WG4 | Stage 40 (final vote) | v0.6 ships per-action audit records tagged against the four-axis model (System Operation, Data Usage, Decision Making, Limitations) via four optional `AuditRecord` fields. Default classification heuristic by event type; per-record override available. NOT tamper-evident in v0.6 — fields are metadata annotations excluded from `record_hash` so pre-v0.6 chains stay valid. |
+| **prEN ISO/IEC 12792** Transparency Taxonomy of AI Systems | WG4 | Stage 40 (final vote) | v0.6 ships per-action audit records tagged against the four-axis model (System Operation, Data Usage, Decision Making, Limitations) via four optional `AuditRecord` fields. Default classification heuristic by event type. Per-record override available. NOT tamper-evident in v0.6 - fields are metadata annotations excluded from `record_hash` so pre-v0.6 chains stay valid. |
 
 **What "alignment" means here.** Most of these standards have not
 published. The mapping above is pre-compliance positioning: Vaara is
@@ -220,7 +220,7 @@ qualified seal or signature without changing the underlying evidence.
 ## Position relative to open runtime-attestation standards
 
 The runtime-attestation space is converging on the principle that
-**self-attestation is not sufficient** — the entity attesting to
+**self-attestation is not sufficient** - the entity attesting to
 governance should be structurally independent of the entity being
 governed. OVERT 1.0 (Glacis Technologies, overt.is) makes this
 explicit through its four-tier Attestation Assurance Level model,
@@ -239,8 +239,8 @@ Vaara's position in this picture:
   operator.** The operator deploys Vaara in their own environment.
   In OVERT 1.0 terms this maps to AAL-3 (automated monitoring with
   operator-controlled infrastructure), not AAL-4. Reaching AAL-4
-  requires layering an external IAP — a notary service that the
-  operator does not control — on top of Vaara's emitted evidence.
+  requires layering an external IAP - a notary service that the
+  operator does not control - on top of Vaara's emitted evidence.
 - **Vaara's design admits an external IAP layer without internal
   change.** The hash chain, the commit-prove receipt pair, and the
   HTTP API surface all produce structured, signable artefacts that
@@ -251,7 +251,7 @@ Vaara's position in this picture:
 This positioning is deliberate. Vaara does not claim AAL-4
 conformance and does not market a self-attestation pattern.
 Operators who need AAL-4 should pair Vaara with an independent
-attestation provider; the Vaara-emitted evidence is the input to
+attestation provider. The Vaara-emitted evidence is the input to
 that provider, not a replacement for it.
 
 ## OVERT 1.0 Part 3 (Agentic AI Controls) mapping
@@ -263,141 +263,141 @@ the-loop attestation, and behavioural drift governance. The mapping
 below states, control by control, whether Vaara satisfies the
 requirement today (✅), partially satisfies it (◐), or leaves it as
 explicit gap-to-deployer or future-work (◯). Per OVERT Annex F.2 this
-mapping does not establish legal compliance with any regulation; it
+mapping does not establish legal compliance with any regulation. It
 records technical correspondence.
 
-### Section 11 — Tool-Call Governance
+### Section 11 - Tool-Call Governance
 
-- **TOOL-1.1** (intercept all tool calls before execution) — ✅.
-  `InterceptionPipeline.intercept()` is the enforcement boundary; no
+- **TOOL-1.1** (intercept all tool calls before execution) - ✅.
+  `InterceptionPipeline.intercept()` is the enforcement boundary. No
   tool call proceeds without a governance decision.
-- **TOOL-1.2** (evaluate against capability policy) — ✅. The policy
+- **TOOL-1.2** (evaluate against capability policy) - ✅. The policy
   DSL declares permitted tools, parameter ranges, destinations, and
-  approval gates; `policy.evaluate` returns the verdict carried in
+  approval gates. `policy.evaluate` returns the verdict carried in
   the per-call receipt.
 - **TOOL-1.3** (denial receipt with policy reference and violation
-  type) — ✅. Denials emit a `DENY` event on the hash chain with
+  type) - ✅. Denials emit a `DENY` event on the hash chain with
   policy id and violation reason.
 - **TOOL-1.4** (provisional receipt before execution, upgrade to full
-  attestation after notary validation) — ◐ at AAL-3. The Article 12
+  attestation after notary validation) - ◐ at AAL-3. The Article 12
   commit-prove receipt pair (shipped v0.10.0) is the Phase 2
   Provisional Receipt. Phase 3 (full notary attestation) requires an
   external IAP per the OVERT-position section above.
 - **TOOL-2.1** (explicit function allowlist with hash in policy
-  attestation) — ✅. Policy hash flows into `encoder_binary_identity`
+  attestation) - ✅. Policy hash flows into `encoder_binary_identity`
   in the Base Envelope (v0.11.0).
-- **TOOL-2.2** (parameter schema validation before execution) — ✅
-  for declared parameter shapes; ◐ for arbitrary deep schemas (the
+- **TOOL-2.2** (parameter schema validation before execution) - ✅
+  for declared parameter shapes. ◐ for arbitrary deep schemas (the
   policy DSL is intentionally bounded).
-- **TOOL-2.3** (rejection receipt with parameter violation detail) —
+- **TOOL-2.3** (rejection receipt with parameter violation detail) -
   ✅.
-- **TOOL-3.1** (per-tool rate limits with attested enforcement) — ◐.
-  The adaptive scorer applies velocity-aware risk signals; explicit
+- **TOOL-3.1** (per-tool rate limits with attested enforcement) - ◐.
+  The adaptive scorer applies velocity-aware risk signals. Explicit
   per-tool calls-per-epoch counters are not yet emitted as
   standalone receipts.
-- **TOOL-3.2** (per-session / per-user velocity caps) — ◐ via the
+- **TOOL-3.2** (per-session / per-user velocity caps) - ◐ via the
   agent profile in `scorer/adaptive.py`.
-- **TOOL-3.3** (circuit breakers on error / violation rate) — ◐ in
-  policy DSL; circuit-breaker receipt not yet a first-class artefact.
-- **TOOL-3.4** (recursion-depth termination per trace_id) — ◯.
-  Not implemented; agent-loop termination is currently the deployer's
+- **TOOL-3.3** (circuit breakers on error / violation rate) - ◐ in
+  policy DSL. The circuit-breaker receipt is not yet a first-class artefact.
+- **TOOL-3.4** (recursion-depth termination per trace_id) - ◯.
+  Not implemented. Agent-loop termination is currently the deployer's
   responsibility.
-- **TOOL-4** (human approval gates) — ◐. The SQLite-backed review
+- **TOOL-4** (human approval gates) - ◐. The SQLite-backed review
   queue (`vaara.audit.review_queue`) routes `ESCALATE` verdicts to
   human reviewers and records `ESCALATION_RESOLVED` events with
   reviewer identity, timestamp, and decision. TOOL-4.4 approval-
   velocity caps are not enforced.
-- **TOOL-5** (tamper-evident tool-call log with epoch attestation) —
+- **TOOL-5** (tamper-evident tool-call log with epoch attestation) -
   ✅ for TOOL-5.1 and TOOL-5.2 (hash-chained `AuditTrail`,
   Article 12 commit-prove receipt pair). TOOL-5.3 epoch notary
   attestation is the external-IAP layer.
 
-### Section 11.5 — MCP Server Trust Governance
+### Section 11.5 - MCP Server Trust Governance
 
 Vaara ships an MCP server (`vaara.integrations.mcp_server`) that
-exposes governance tools to MCP clients; it does not currently act
+exposes governance tools to MCP clients. It does not currently act
 as an MCP *client* governing tools hosted on third-party MCP
 servers. The MCP-1/2/3 control set therefore applies to Vaara only
 in the **custom (operator-hosted)** mode (MCP-2): the operator runs
 the Vaara MCP server in their own environment.
 
-- **MCP-2.1** (server binary identity in co-epoch binding) — ◐ at
+- **MCP-2.1** (server binary identity in co-epoch binding) - ◐ at
   v0.12.0: arbiter binary identity is captured in
-  `encoder_binary_identity`; a dedicated MCP-server binary identity
+  `encoder_binary_identity`. A dedicated MCP-server binary identity
   field is future work.
-- **MCP-2.2** (network topology attestation) — ◯. Deployer concern;
+- **MCP-2.2** (network topology attestation) - ◯. Deployer concern.
   Vaara does not measure its own network position.
-- **MCP-2.3** (per-call authorization at the MCP server boundary) —
+- **MCP-2.3** (per-call authorization at the MCP server boundary) -
   ✅. Every MCP tool invocation passes through `intercept()`.
-- **MCP-2.4** (configuration change detection within an epoch) — ◯.
+- **MCP-2.4** (configuration change detection within an epoch) - ◯.
   Future work.
 - **MCP-1** and **MCP-3** (managed-vendor and external-third-party
-  MCP servers) — outside Vaara's current surface. An operator using
+  MCP servers) - outside Vaara's current surface. An operator using
   Vaara as the *governor in front of* a third-party MCP server would
-  need adapter work; the architecture admits it but no implementation
+  need adapter work. The architecture admits it but no implementation
   ships today.
 
-### Section 12 — Multi-Agent System Controls
+### Section 12 - Multi-Agent System Controls
 
-- **MULTI-1** (inter-agent trust boundaries) — ◯. Per-agent policy
+- **MULTI-1** (inter-agent trust boundaries) - ◯. Per-agent policy
   evaluation works today (each `intercept()` call carries an
   `agent_id`), but agent-vs-agent trust separation is not
   enforced beyond what the deployment policy declares.
-- **MULTI-2** (agent composition / topology attestation) — ◯.
-  Deployer-side documentation; no Vaara-emitted topology receipt.
+- **MULTI-2** (agent composition / topology attestation) - ◯.
+  Deployer-side documentation. No Vaara-emitted topology receipt.
 
-### Section 13 — Capability-Based Access Control
+### Section 13 - Capability-Based Access Control
 
-- **CAP-1** (data provenance tracking) — ◐. The taxonomy and policy
-  DSL accept provenance tags on actions; transformation propagation
+- **CAP-1** (data provenance tracking) - ◐. The taxonomy and policy
+  DSL accept provenance tags on actions. Transformation propagation
   (CAP-1.2) is the deployer's responsibility because Vaara intercepts
   tool calls, not arbitrary data transformations inside the agent
   process.
 - **CAP-2** (architectural separation of planning from untrusted
-  data) — ◯. AAL-2 documentation at most; this is a deployer-side
+  data) - ◯. AAL-2 documentation at most. This is a deployer-side
   architecture choice that Vaara records but does not enforce.
 
-### Section 14 — Agent Disclosure and Transparency
+### Section 14 - Agent Disclosure and Transparency
 
-- **DISC-1.1** (capability documentation) — ◐ via the deployer's
+- **DISC-1.1** (capability documentation) - ◐ via the deployer's
   policy file + `vaara compliance report`.
-- **DISC-1.2** (AIBOM in CycloneDX-AI or SPDX 3.0) — ◯. Future
-  work; the auditor-facing evidence export (v0.10.0) is a candidate
+- **DISC-1.2** (AIBOM in CycloneDX-AI or SPDX 3.0) - ◯. Future
+  work. The auditor-facing evidence export (v0.10.0) is a candidate
   surface to embed AIBOM references.
 - **DISC-1.3** (attestation summary with coverage ratio, S3P
-  signals, override frequency) — ◐ from v0.12.0: Vaara now emits
+  signals, override frequency) - ◐ from v0.12.0: Vaara now emits
   S3P attestations (`vaara.attestation.s3p`) carrying coverage
-  ratio and binomial CI; the deployer aggregates these for
+  ratio and binomial CI. The deployer aggregates these for
   disclosure.
 
-### Section 15 — Human-in-the-Loop Attestation
+### Section 15 - Human-in-the-Loop Attestation
 
-- **HITL-1** (consent attestation) — ◯. Deployer-side concern;
+- **HITL-1** (consent attestation) - ◯. Deployer-side concern.
   Vaara does not collect end-user consent.
-- **HITL-2** (human review attestation) — ◐. Review-queue resolution
+- **HITL-2** (human review attestation) - ◐. Review-queue resolution
   events on the audit chain carry reviewer identity (when supplied
   by the deployer), timestamp, decision, and reference to the
   original `ESCALATE` verdict by `action_id`. AAL-4 identity
   binding is the deployer's responsibility.
-- **HITL-3** (human correction and override) — ◐ via
+- **HITL-3** (human correction and override) - ◐ via
   `report_outcome` and the review-queue resolution event.
 - **HITL-4** (policy and configuration approval with separation of
-  duties) — ◯ at the receipt level; policy-change approval is
+  duties) - ◯ at the receipt level. Policy-change approval is
   currently a git-history artefact, not an attested OVERT event.
-- **SESS-1..5** (session-scoped attestation) — ◯.
+- **SESS-1..5** (session-scoped attestation) - ◯.
 - **STATE-1, STATE-2** (durable state sealing and prompt artifact
-  binding) — ◯.
-- **IDENT-1** (federated identity / token provenance chain) — ◐.
+  binding) - ◯.
+- **IDENT-1** (federated identity / token provenance chain) - ◐.
   `vaara.auth` accepts authenticated caller identity into the audit
-  record; full delegation-chain attestation per IDENT-1.2 is future
+  record. Full delegation-chain attestation per IDENT-1.2 is future
   work.
 
-### Section 16 — Behavioural Drift Governance
+### Section 16 - Behavioural Drift Governance
 
-- **DRIFT-1** (baseline intent declaration) — ◯. Future work; the
+- **DRIFT-1** (baseline intent declaration) - ◯. Future work. The
   policy DSL is the candidate surface for machine-readable behavioural
   bounds.
-- **DRIFT-2** and downstream drift controls — ◐ in spirit. The
+- **DRIFT-2** and downstream drift controls - ◐ in spirit. The
   adaptive scorer tracks coverage error via FACI (`scorer/adaptive.py`)
   and emits drift signals through audit events, but these are not yet
   packaged as DRIFT-* receipts.
@@ -407,31 +407,31 @@ the Vaara MCP server in their own environment.
 S3P sits in Domain 5 (MEASURE), not Part 3, but it is the agentic-
 relevant measurement primitive that ties everything above together.
 
-- **MEA-1** (deterministic sampling infrastructure) — ◯. Vaara
-  evaluates every intercepted action; sampling-rate-based
+- **MEA-1** (deterministic sampling infrastructure) - ◯. Vaara
+  evaluates every intercepted action. Sampling-rate-based
   measurement is opt-in. A deployer who wants S3P sampling provides
   the PRF tag and threshold.
-- **MEA-2.1** (epoch nonce commitment) — ✅ via
+- **MEA-2.1** (epoch nonce commitment) - ✅ via
   `vaara.attestation.s3p.make_epoch_nonce_commitment`.
-- **MEA-2.4** (exact binomial CI) — ✅. Pure-Python Clopper-Pearson
-  via the regularized incomplete beta function; no scipy dependency.
+- **MEA-2.4** (exact binomial CI) - ✅. Pure-Python Clopper-Pearson
+  via the regularized incomplete beta function. No scipy dependency.
 - **MEA-2.6** (closed-schema S3P attestation, Ed25519-signed,
-  canonical CBOR per Protocol Profile 1.0) — ✅ via
+  canonical CBOR per Protocol Profile 1.0) - ✅ via
   `emit_s3p_attestation`.
 - **Vaara conformal extension (proposed Protocol Profile
   extension):** the `ConformalExtension` field reports aggregate
   statistics over Vaara's per-action conformal prediction intervals
   alongside the standard Clopper-Pearson CI. The conformal
   aggregates carry the same non-parametric coverage guarantee with
-  no distributional assumption — exactly the property MEA-2.4
+  no distributional assumption - exactly the property MEA-2.4
   requires from a method offered as an alternative to (or
   complement of) Clopper-Pearson. The extension rides in a single
-  field in the signed metadata; standard OVERT verifiers ignore it.
+  field in the signed metadata. Standard OVERT verifiers ignore it.
 
 ## EU Product Liability Directive 2024/2853
 
 Directive (EU) 2024/2853 of 23 October 2024 on liability for defective
-products treats software — including AI systems — as a product within
+products treats software - including AI systems - as a product within
 scope of strict product-liability rules. Member State transposition
 deadline is **9 December 2026** (Article 22). The provisions that
 matter for runtime evidence:
@@ -440,7 +440,7 @@ matter for runtime evidence:
   national court SHALL presume the defectiveness of a product, or
   the causal link between defectiveness and damage, where the
   claimant faces excessive difficulties proving the technical
-  facts — in particular due to the technical complexity of the
+  facts - in particular due to the technical complexity of the
   product (Article 9(4)). The defendant rebuts the presumption by
   showing the product was not defective.
 - **Article 7 (Defectiveness assessment).** Defectiveness is
@@ -465,7 +465,7 @@ How Vaara fits:
 - The hash-chain integrity, Ed25519 signatures, and Article 12
   receipt pair give the evidence the tamper-evident shape that
   national courts will expect from contemporaneous records.
-- Vaara does not generate liability defences; it produces the
+- Vaara does not generate liability defences. It produces the
   technical evidence those defences are built from. Legal strategy,
   expert witness work, and the substantive risk-management policy
   remain with the deployer's counsel.
@@ -565,7 +565,7 @@ problem:
   `vaara trail verify` will report a chain break at the boundary.
   Intended workflow: export a signed handoff zip BEFORE purging,
   archive the zip externally for long-tail audit history, then purge
-  the live DB. The signed zip remains self-consistent forever; the
+  the live DB. The signed zip remains self-consistent forever. The
   live DB chain has a documented seam at the retention boundary.
 
 ## Current limits
@@ -604,7 +604,7 @@ Honest about the edges:
 
   Note on FPR vs CHANGELOG headline: the CHANGELOG quotes "global benign
   FPR 21.0%" which is classifier-alone 5-fold CV OOF. The full-stack
-  numbers above are dominated by the heuristic — most benign escalations
+  numbers above are dominated by the heuristic - most benign escalations
   come from the heuristic `ESCALATE` branch, not from classifier upgrades
   on heuristic-`ALLOW`ed entries.
 
@@ -614,8 +614,8 @@ Honest about the edges:
   a v0.7 follow-up if the gap demands it.
 - **Stack composition (v0.6 measurement).** The full-stack numbers above
   decompose into independent layer contributions. `heuristic_only` recall
-  is 35% / 63% (hand-curated / LLM-generated); `classifier_only` recall
-  is 94% / 86%. Layers are not redundant — heuristic catches a small set
+  is 35% / 63% (hand-curated / LLM-generated). `classifier_only` recall
+  is 94% / 86%. Layers are not redundant - heuristic catches a small set
   of attacks the classifier misses, justifying the ensemble. Most of the
   full-stack benign FPR comes from heuristic ESCALATEs, not classifier
   upgrades. Detailed breakdown: `tests/adversarial/stack_ablation_v0_5_3.json`.
@@ -629,13 +629,13 @@ Honest about the edges:
   - Total LLM calls: 125 attacker iterations across 25 seeds, plus
     judge confirmations on heuristic-ALLOW outcomes
   - **ASR: 0.0% (0/25)**. Across 125 candidate prompts, Vaara
-    escalated 124 and allowed 1; the judge ruled the allowed candidate
+    escalated 124 and allowed 1. The judge ruled the allowed candidate
     not a successful jailbreak.
 
   Reading: Vaara stack catches DAN-roleplay, "hypothetical scenario",
   and "security drill" -style jailbreak attempts at this attacker
   capability level. NOT a claim of imperviousness to all adaptive attackers
-  — a stronger attacker model (70B+), longer iteration budgets, or
+  because a stronger attacker model (70B+), longer iteration budgets, or
   different strategies (multi-turn drift, language-switch, obfuscation)
   might produce non-zero ASR. v0.7 follow-up: re-run with 70B+ attacker
   + judge if a compliance audience requires the harder calibration.
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 2750ad0..58e8d5c 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -36,7 +36,7 @@ pip install -e '.[dev]'   # one-time setup
 scripts/lint_full.sh
 ```
 
-The script chains four checks: `ruff` (style + correctness), `bandit` (security), `mypy` (types — strict on `vaara.policy`, lenient elsewhere while legacy modules are migrated), and `pytest`. Total runtime ~10s. CI runs the same gates, so a green local sweep should mean a green PR.
+The script chains four checks: `ruff` (style + correctness), `bandit` (security), `mypy` (types - strict on `vaara.policy`, lenient elsewhere while legacy modules are migrated), and `pytest`. Total runtime ~10s. CI runs the same gates, so a green local sweep should mean a green PR.
 
 New modules under `src/vaara/` are expected to type-check cleanly. As legacy modules get cleaned up, add them to the strict mypy block in `pyproject.toml` so the typing floor only ratchets upward.
 
diff --git a/README.md b/README.md
index 4c23f49..3de6784 100644
--- a/README.md
+++ b/README.md
@@ -71,11 +71,11 @@ curl -sX POST http://localhost:8000/v1/score \
   -d '{"tool_name":"tx.transfer","agent_id":"agent-007","base_risk_score":0.5}'
 ```
 
-The contract is in [docs/openapi.yaml](docs/openapi.yaml). Vaara defines the interface; control-plane and orchestration vendors call it. Integration recipes for adopters live under `examples/recipes/`.
+The contract is in [docs/openapi.yaml](docs/openapi.yaml). Vaara defines the interface. Control-plane and orchestration vendors call it. Integration recipes for adopters live under `examples/recipes/`.
 
 ## OVERT 1.0 attestation
 
-Vaara is the first OSS Python reference implementation of the OVERT 1.0 ([overt.is](https://overt.is/), Glacis Technologies, March 2026) Protocol Profile 1.0 Base Envelope at AAL-3 Phase 2 (Provisional Receipt). Closed-schema 9-field structure, canonical CBOR encoding, Ed25519 signatures, HMAC-SHA256 keyed commitments, IEEE-754 float rejection. External Independent Attestation Providers can promote AAL-3 emission to AAL-4 by attaching Phase 3 notary signatures and transparency-log inclusion proofs.
+Vaara implements the OVERT 1.0 ([overt.is](https://overt.is/)) Protocol Profile 1.0 Base Envelope. OVERT 1.0 is an open standard for runtime trust in AI systems, authored by Glacis Technologies and published in March 2026. The envelope is a closed-schema 9-field structure at AAL-3 Phase 2 (Provisional Receipt), with canonical CBOR (RFC 8949) encoding, Ed25519 signatures, HMAC-SHA256 keyed commitments, and IEEE-754 float rejection. External Independent Attestation Providers can promote AAL-3 emission to AAL-4 by attaching Phase 3 notary signatures and transparency-log inclusion proofs.
 
 ```
 pip install 'vaara[attestation]'
diff --git a/SECURITY.md b/SECURITY.md
index 2160908..5898d7d 100644
--- a/SECURITY.md
+++ b/SECURITY.md
@@ -7,7 +7,7 @@ Please report security vulnerabilities privately through GitHub's
 feature. **Do not open a public issue for anything that could be exploited.**
 
 For communication outside GitHub, reach the maintainers at
-`security@vaara.io`. Use PGP if you prefer end-to-end-encrypted email; the
+`security@vaara.io`. Use PGP if you prefer end-to-end-encrypted email. The
 current public key is published at
 <https://github.com/vaaraio/vaara/blob/main/docs/signing-keys.md>.
 
diff --git a/bench/COMPARISON.md b/bench/COMPARISON.md
index 0083833..8f1d571 100644
--- a/bench/COMPARISON.md
+++ b/bench/COMPARISON.md
@@ -1,9 +1,12 @@
 # Comparison with adjacent tools
 
 This doc compares Vaara against the open-source tools most often
-named in the same breath: **NVIDIA NeMo Guardrails**, **Guardrails AI**,
-**OpenAI Guardrails** (for Agents SDK), **LangChain callback handlers**,
-and the **OWASP LLM Top 10** threat taxonomy.
+named in the same breath. Two clusters: LLM-text rails and output
+validators on one side (**NVIDIA NeMo Guardrails**, **Guardrails AI**,
+**OpenAI Guardrails** for Agents SDK, **LangChain callback handlers**,
+and the **OWASP LLM Top 10** threat taxonomy), and agent governance
+plus attestation tools on the other (**Glacis Python SDK** and
+**Microsoft Agent Governance Toolkit**).
 
 No benchmark numbers are cited for the other tools here. Each one
 solves a different problem than Vaara, so a head-to-head TPR/FPR on
@@ -18,24 +21,36 @@ prose, read the sections below it.
 
 ## Capability matrix
 
-| Concern                                          | Vaara | NeMo Guardrails | Guardrails AI | OpenAI Guardrails | LangChain callbacks | OWASP LLM Top 10 |
-| ------------------------------------------------ | :---: | :-------------: | :-----------: | :---------------: | :-----------------: | :--------------: |
-| Validates tool-call **arguments** at runtime     |   ✓   |        ✗        |       ✗       |         ✗         |    observes only    |   not software   |
-| Probabilistic / conformal risk scoring per call  |   ✓   |        ✗        |       ✗       |         ✗         |          ✗          |        ✗         |
-| Detects temporal **sequence** patterns           |   ✓   |        ✗        |       ✗       |         ✗         |          ✗          |        ✗         |
-| Hash-chained, regulator-exportable audit trail   |   ✓   |        ✗        |       ✗       |         ✗         |          ✗          |        ✗         |
-| EU AI Act Art. 12 / 14 / 26 evidence mapping     |   ✓   |        ✗        |       ✗       |         ✗         |          ✗          |        ✗         |
-| Validates LLM *output text* (PII, toxicity, etc) |   ✗   |        ✓        |       ✓       |         ✓         |          ✗          |   advisory only  |
-| Validates LLM *input prompt* (jailbreak etc)     |   ✗   |        ✓        |       ✓       |         ✓         |          ✗          |   advisory only  |
-| Structured-output validation (schema / regex)    | partial|        ✓        |       ✓       |         ✓         |          ✗          |        ✗         |
-| Self-hostable Python library (no SaaS required)  |   ✓   |        ✓        |       ✓       |         ✓         |          ✓          |     document     |
-| Apache-2.0                                       |   ✓   |     Apache-2.0  |     Apache-2.0|        MIT        |        MIT          |      CC-BY       |
-
-Reading the matrix: Vaara and the output-validation tools are
-complementary, not competitive. A real deployment uses output
-validation **and** tool-call governance. Vaara does not validate LLM
-text output, so use Guardrails AI or NeMo for that. NeMo and Guardrails
-AI do not validate tool-call arguments at runtime, so use Vaara for that.
+| Concern                                          | Vaara | NeMo Guardrails | Guardrails AI | OpenAI Guardrails | LangChain callbacks | OWASP LLM Top 10 | Glacis Python SDK | MS Agent Governance Toolkit |
+| ------------------------------------------------ | :---: | :-------------: | :-----------: | :---------------: | :-----------------: | :--------------: | :---------------: | :-------------------------: |
+| Validates tool-call **arguments** at runtime     |   ✓   |        ✗        |       ✗       |         ✗         |    observes only    |   not software   |         ✗         |              ✓              |
+| Probabilistic / conformal risk scoring per call  |   ✓   |        ✗        |       ✗       |         ✗         |          ✗          |        ✗         |         ✗         |              ✗              |
+| Detects temporal **sequence** patterns           |   ✓   |        ✗        |       ✗       |         ✗         |          ✗          |        ✗         |         ✗         |              ✗              |
+| Hash-chained, regulator-exportable audit trail   |   ✓   |        ✗        |       ✗       |         ✗         |          ✗          |        ✗         |  partial (Merkle) |      partial (logging)      |
+| EU AI Act Art. 12 / 14 / 26 evidence mapping     |   ✓   |        ✗        |       ✗       |         ✗         |          ✗          |        ✗         |         ✗         |              ✗              |
+| OVERT 1.0 Base Envelope emission (RFC 8949 CBOR) |   ✓   |        ✗        |       ✗       |         ✗         |          ✗          |        ✗         |         ✗         |              ✗              |
+| RFC 6962 Merkle inclusion proof integration      |  ext. IAP  |     ✗      |       ✗       |         ✗         |          ✗          |        ✗         |    ✓ (hosted)     |              ✗              |
+| Validates LLM *output text* (PII, toxicity, etc) |   ✗   |        ✓        |       ✓       |         ✓         |          ✗          |   advisory only  |         ✗         |              ✗              |
+| Validates LLM *input prompt* (jailbreak etc)     |   ✗   |        ✓        |       ✓       |         ✓         |          ✗          |   advisory only  |         ✗         |              ✗              |
+| Structured-output validation (schema / regex)    | partial|        ✓        |       ✓       |         ✓         |          ✗          |        ✗         |         ✗         |          partial            |
+| Zero-trust agent identity primitives             |   ✗   |        ✗        |       ✗       |         ✗         |          ✗          |        ✗         |         ✗         |              ✓              |
+| Capability-based access control                  | policy schema |  ✗        |       ✗       |         ✗         |          ✗          |        ✗         |         ✗         |              ✓              |
+| Execution sandboxing                             |   ✗   |        ✗        |       ✗       |         ✗         |          ✗          |        ✗         |         ✗         |              ✓              |
+| Multi-language SDKs                              | Python only |     N/A    |   Python      |  Python (Agents)  |   Python / JS       |      N/A         |    Python only    |              ✓              |
+| Self-hostable Python library (no SaaS required)  |   ✓   |        ✓        |       ✓       |         ✓         |          ✓          |     document     |         ✓         |              ✓              |
+| License                                          | Apache-2.0 |   Apache-2.0 |   Apache-2.0 |        MIT        |        MIT          |      CC-BY       |    Apache-2.0     |             MIT             |
+
+Reading the matrix: Vaara and the other tools are complementary, not
+competitive. They occupy different cells of the matrix and different
+parts of the stack. A real production agent deployment uses several of
+these at once. Vaara owns the runtime risk-scoring + Article 14 evidence +
+OVERT 1.0 attestation slice. NeMo and Guardrails AI cover the LLM
+text-rail slice. Microsoft AGT covers the agent identity, capability,
+and sandboxing slice. Glacis SDK is a client to Glacis's hosted
+attestation service. Vaara does not validate LLM text output, so use
+Guardrails AI or NeMo for that. Vaara does not provide zero-trust
+agent identity, so use Microsoft AGT for that. The text-rail tools do
+not validate tool-call arguments at runtime, so use Vaara for that.
 
 ## One paragraph each
 
@@ -79,6 +94,29 @@ vocabulary. Not software, so there is nothing to install. Vaara's
 signals and sequence patterns are informed by this taxonomy, but the
 taxonomy itself does not do runtime enforcement.
 
+**Glacis Python SDK.** Apache-2.0 client library for Glacis
+Technologies' hosted attestation service, using RFC 8785 canonical
+JSON, SHA-256 hashing, Ed25519 signatures, and RFC 6962 Merkle
+inclusion proofs delivered in-line by the hosted service. Glacis
+Technologies also authored OVERT 1.0, the open standard for
+runtime trust in AI systems, published at overt.is in March 2026.
+Choose between the two depending on whether you need a client for
+the Glacis hosted service or an OVERT 1.0 Base Envelope emitter
+in your runtime.
+
+**Microsoft Agent Governance Toolkit.** MIT-licensed toolkit for
+agent identity, capability-based access control, execution sandboxing,
+and reliability engineering. The toolkit frames its surface around
+the OWASP Agentic Top 10 and zero-trust principles, with multi-language
+SDKs for deployers running heterogeneous agent stacks. Where Vaara
+provides runtime risk scoring and Article 14 audit evidence, AGT
+provides agent identity primitives and the sandboxing layer that
+isolates agent execution from the host environment. The two tools
+cover different layers of the same governance stack. The
+`GenAI-Gurus/awesome-eu-ai-act` curator places Vaara and AGT side
+by side in the AI Agent Governance section for exactly this reason:
+deployers running production agents typically want both wired in.
+
 ## Where Vaara fits
 
 Vaara is the gate between an AI agent's *decision* to take an action
@@ -96,15 +134,24 @@ The three things Vaara does that the tools above do not:
 3. Produce **regulator-ready** evidence: cryptographic audit chain,
    signal breakdown per decision, conformity report.
 
-The three things Vaara does not do that the tools above handle well:
+The things Vaara does not do that the tools above handle well:
 
-1. LLM output validation (PII, toxicity, schema).
-2. LLM input guardrails (jailbreak detection, topical rails).
-3. Constrained decoding and structured output generation.
+1. LLM output validation, PII redaction, toxicity filtering (NeMo,
+   Guardrails AI, OpenAI Guardrails).
+2. LLM input guardrails, jailbreak detection, topical rails (same).
+3. Constrained decoding and structured output generation (same).
+4. Zero-trust agent identity primitives and capability-based access
+   control as first-class types (Microsoft Agent Governance Toolkit).
+5. Execution sandboxing as a built-in primitive (Microsoft AGT).
+6. Hosted Merkle-inclusion-proof attestation as a managed service
+   (Glacis Python SDK).
 
 If you are building an agent that writes to user-visible text **and**
-executes tools, you want both Vaara and one of the output-validation
-tools wired in. They run in different places in the stack.
+executes tools, you want Vaara plus one of the output-validation
+tools wired in. If you are running agents in production, you want
+Vaara plus Microsoft AGT for the identity, capability, and sandboxing
+layer Vaara does not cover. They run in different places in the
+stack, and the matrix above shows where each tool lives.
 
 ## Numbers we publish
 
diff --git a/bench/README.md b/bench/README.md
index 8a6a2fc..10f986f 100644
--- a/bench/README.md
+++ b/bench/README.md
@@ -123,7 +123,7 @@ contract as **vaara-bench-v1**. See [`vaara-bench-v1.md`](vaara-bench-v1.md)
 for the frozen corpus hash, the methodology, the headline numbers
 under Vaara 0.11.0, the reproduction commands, and the license. Use
 the spec doc when citing Vaara's adversarial-detection numbers
-externally; this README is the running commentary.
+externally. This README is the running commentary.
 
 `bench/adversarial_corpus.jsonl` is a **synthetic** labelled corpus
 of 77 traces generated deterministically by `bench/build_corpus.py`.