v0.49.0: decision records + automatic cadence anchoring #181
…y cap
The v0.48.0 release published to PyPI, npm, and the GitHub Release, but the MCP registry job rejected both manifests with 'expected length <= 100' on body.description (the EU-reframe descriptions ran ~150 chars). Trim both to keep the EU AI Act lead within the cap. No code change; version stays 0.48.0 and the registry is published from these manifests via mcp-publisher.
…cution-record SEP
Lands the pre-execution decision record as a signed envelope, closing the one unbacked block in docs/sep/sep-server-execution-record.md. Every normative claim in the SEP now has shipped, tested code behind it.
- vaara.attestation.decision: DecisionRecord / DecisionDerived, emit, verify-signature, back-link verify, and records_paired (the decision-and-outcome join). Reuses the receipt back-link, issuer-block layout, and JCS + HS256/ES256/RS256 signing stack unchanged; only the decisionDerived block is new.
- decision_derived_from_commit bridges the shipped hash-chained CommitPayload onto the wire shape: deny normalizes to block, the float risk basis becomes decimal strings, the epoch decision time becomes ISO 8601 UTC. Lazy import keeps the core audit layer free of the optional attestation extra.
- 18 tests: HS/ES/RS round-trips, tamper and wrong-key rejection, back-link binding, instance-scoped pairing with the execution receipt, the float-on-the-wire ban, and the commit bridge.
SEP updated: reference-implementation modules listed, the "implementation gap" paragraph replaced with the bridge description.
1114 passed / 12 skipped, ruff + mypy clean.
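The three normalizations the bridge performs can be sketched roughly as below. This is a standalone illustration, not the vaara implementation: `bridge_commit_fields` and its parameters are hypothetical names, and the decimal conversion shown here goes through `Decimal(repr(...))` as one possible way to get a plain decimal string.

```python
from datetime import datetime, timezone
from decimal import Decimal
import math


def bridge_commit_fields(decision: str, risk_score: float, decided_epoch: float) -> dict:
    """Hypothetical helper mirroring the normalizations described in the commit."""
    # "deny" on the audit side becomes "block" on the wire.
    verdict = "block" if decision == "deny" else decision
    if not math.isfinite(risk_score):
        raise ValueError("risk score must be finite")
    # Floats never reach the wire: emit a plain decimal string instead.
    risk = format(Decimal(repr(float(risk_score))), "f")
    # Epoch seconds become an ISO 8601 UTC timestamp.
    decided_at = datetime.fromtimestamp(decided_epoch, tz=timezone.utc).isoformat()
    return {"decision": verdict, "riskScore": risk, "decidedAt": decided_at}


wire = bridge_commit_fields("deny", 0.21, 0)
```

The key names (`decision`, `riskScore`, `decidedAt`) follow the wire fields quoted later in this review; everything else is an assumption made for the sketch.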
Closes the replay-substitution case promised on SEP-2787 (modelcontextprotocol#2787, 2026-05-30). A valid executed HS256 receipt is replayed with one signed field swapped (outcome status executed -> refused) while the original signature is kept. Back-link and result-commitment still verify, so only the signature catches the forgery: the signed envelope, not any single sub-check, binds the outcome claim.
HS256 is deterministic, so the case is added without regenerating the ES256/RS256 keys or churning the existing fixtures. Independent walker reports 6/6; guard test bumped to >= 6 cases.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
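A rough way to see why only the signature catches this forgery: the sketch below signs a toy receipt with HMAC-SHA256, swaps one signed field, and re-verifies with the old signature. It is an illustration under stated assumptions, not the shipped vector or verifier. `json.dumps` with sorted keys is a simplified stand-in for JCS, and the `outcome`/`status` field names are hypothetical.

```python
import hashlib
import hmac
import json


def canonical(obj: dict) -> bytes:
    # Simplified stand-in for JCS: sorted keys, no insignificant whitespace.
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(obj: dict, secret: bytes) -> str:
    return hmac.new(secret, canonical(obj), hashlib.sha256).hexdigest()


def verify(obj: dict, signature: str, secret: bytes) -> bool:
    # Constant-time comparison of the recomputed and presented signatures.
    return hmac.compare_digest(sign(obj, secret), signature)


secret = b"demo-shared-secret"
receipt = {
    "backLink": {"attestationDigest": "sha256:abc"},
    "outcome": {"status": "executed"},
}
signature = sign(receipt, secret)

# Replay the receipt with one signed field swapped, keeping the old signature.
forged = {**receipt, "outcome": {"status": "refused"}}

original_ok = verify(receipt, signature, secret)
forged_ok = verify(forged, signature, secret)
back_link_unchanged = forged["backLink"] == receipt["backLink"]
```

Here the back-link is byte-identical in both records, so a check that looks only at the back-link would pass the forgery; the envelope signature is the check that fails.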
Anchors a real AuditTrail chain head against a real public TSA (DigiCert/Sectigo/freeTSA fallback) through anchor_head(), then verifies the token offline bound to the actual chain. Proves the round trip against an authority we do not control, complementing the in-process verifier tests. Skipped unless VAARA_LIVE_TSA=1 so CI and offline runs never depend on a third party.
Verified live against http://timestamp.digicert.com: granted a 5999-byte RFC 3161 token, attested time re-derived offline.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…marker
AuditTrail.enable_auto_anchor(client, every_records=N) anchors the chain head to an external TSA every N records, so a deployment no longer has to call anchor_head() by hand. No TSA is configured by default; this is the opt-in that turns anchoring on.
Fail-open per design: when the authority is unreachable or its token does not verify, the trail records a chained ANCHOR_GAP marker (reason + the head it tried to anchor) instead of raising, so the unanchored window is itself visible and tamper-evident in the chain.
The TSA round trip runs off the hash-chain lock, so it does not block concurrent recording beyond the triggering append; the gap marker appends via _append_chained so it cannot recurse back into anchoring.
EventType.ANCHOR_GAP added with a prEN ISO/IEC 12792 transparency default.
Tests: cadence fires, fail-open records a chained gap and keeps the chain intact, off by default, cadence validated. Plus an opt-in live TSA test (VAARA_LIVE_TSA=1). Suite 1118 passed / 13 skipped, ruff + mypy clean.
Library only; version bump and release are a separate decision.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
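The cadence and fail-open behaviour described in this commit can be sketched with a toy hash chain. This is not the vaara `AuditTrail`: `ToyTrail`, its record layout, and the plain-callable anchor are assumptions for illustration, and the sketch is single-threaded, so it does not model the off-lock TSA round trip.

```python
import hashlib
from typing import Any, Callable, Dict, List, Optional


class ToyTrail:
    """Toy hash chain with cadence anchoring; not the vaara AuditTrail."""

    def __init__(self) -> None:
        self.records: List[Dict[str, Any]] = []
        self.head = "0" * 64
        self._anchor: Optional[Callable[[str], Any]] = None
        self._cadence = 0
        self._since_anchor = 0

    def enable_auto_anchor(self, anchor: Callable[[str], Any], *, every_records: int) -> None:
        if every_records < 1:
            raise ValueError("every_records must be a positive integer")
        self._anchor, self._cadence, self._since_anchor = anchor, every_records, 0

    def _append_chained(self, kind: str, data: str) -> None:
        # Chain each record to the previous head; never triggers anchoring.
        self.head = hashlib.sha256(f"{self.head}|{kind}|{data}".encode()).hexdigest()
        self.records.append({"kind": kind, "data": data, "hash": self.head})

    def record(self, data: str) -> None:
        self._append_chained("EVENT", data)
        if self._anchor is None:
            return  # anchoring is off by default
        self._since_anchor += 1
        if self._since_anchor < self._cadence:
            return
        self._since_anchor = 0
        tried_head = self.head
        try:
            self._anchor(tried_head)
        except Exception as exc:
            # Fail open: record the gap (reason + attempted head) and carry on.
            self._append_chained("ANCHOR_GAP", f"{exc}|{tried_head}")


def failing_tsa(head: str) -> None:
    raise ConnectionError("TSA unreachable")


trail = ToyTrail()
trail.enable_auto_anchor(failing_tsa, every_records=2)
for i in range(4):
    trail.record(f"event-{i}")
kinds = [r["kind"] for r in trail.records]
```

Because the gap marker goes through `_append_chained` rather than `record`, it extends the chain without counting toward the cadence or re-entering the anchoring path.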
… shipped
Removes the named third-party attribution (and the external draft link) from the prior-art and alternatives sections, keeping the technical reasoning for why a content-addressed action_ref is not the default join.
Updates the time anchor from "next shipped step" to its shipped state: RFC 3161 over the chain head, offline-verifiable, with optional automatic cadence anchoring that fails open by writing an ANCHOR_GAP marker. Adds the two precision points (anchoring is opt-in with no bundled TSA; offline verify proves the embedded-cert signature, not eIDAS qualification, which is deployer policy via cert pinning).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…receipt vector
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
📝 Walkthrough
This PR releases v0.49.0 with SEP-2787 decision-record support for pre-execution commitment verification and automatic audit-chain anchoring. New decision-record types, signing, and verification enable issuers to embed policy decisions within attestation envelopes. Audit trails now support fail-open automatic time anchoring via cadence-based triggering, recording chained ANCHOR_GAP markers when an anchor attempt fails.
Changes
Decision Record Implementation
Automatic Audit Chain Anchoring
Testing & Verification
Documentation & Release Metadata
🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
server.json (1)
11-17: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Update manifest version fields to 0.49.0.
The manifest still points to `0.48.0`, which will misrepresent this release in registry metadata.
Proposed fix
- "version": "0.48.0",
+ "version": "0.49.0",
@@
- "version": "0.48.0",
+ "version": "0.49.0",
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.
In `@server.json` around lines 11 - 17, Update the manifest version fields from "0.48.0" to "0.49.0" so registry metadata matches the new release: change the top-level "version" value and the package object "version" field (the package with "identifier": "vaara") to "0.49.0".
docs/sep/sep-server-execution-record.md (1)
527-544: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Clarify shipped-version claim for decision records vs receipts.
Line 527 currently frames the SEP wire schema as shipping in v0.48, but this section now includes `vaara/attestation/decision.py` symbols that are introduced in v0.49.0. Please split the statement so receipt/time-anchor references remain on v0.48 while decision-record availability is explicitly v0.49.0.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.
In `@docs/sep/sep-server-execution-record.md` around lines 527 - 544, Update the paragraph that currently claims the SEP wire schema shipped in v0.48 to differentiate versions: keep the receipt/time-anchor related symbols (vaara/attestation/_receipt_types.py: ExecutionReceipt, OutcomeDerived, BackLink; vaara/attestation/_receipt_emit.py; vaara/attestation/_receipt_verifier.py and their described behaviors) as shipped in v0.48, and explicitly state that the decision-record symbols (vaara/attestation/decision.py: DecisionRecord, DecisionDerived, emit_decision_record, verify_decision_signature, verify_decision_back_link, records_paired) were introduced in v0.49.0 so readers know receipts are v0.48 while decision-record support is v0.49.0.
server-vaara-server.json (1)
11-17: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Bump manifest versions to 0.49.0 to match the release.
This manifest still advertises `0.48.0` at both top-level and package level, which is inconsistent with the 0.49.0 release metadata.
Proposed fix
- "version": "0.48.0",
+ "version": "0.49.0",
@@
- "version": "0.48.0",
+ "version": "0.49.0",
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.
In `@server-vaara-server.json` around lines 11 - 17, Update the manifest version fields from 0.48.0 to 0.49.0: change the top-level "version" property and the nested package object's "version" property (the package with "identifier": "vaara") so both reflect 0.49.0 to match the release metadata; leave other fields (e.g., "registryType", "registryBaseUrl", "identifier") unchanged.
🧹 Nitpick comments (1)
src/vaara/audit/trail.py (1)
1191-1215: 💤 Low value
Consider validating that `client` is not `None`.
If `client=None` is passed, the system will later raise `AttributeError: 'NoneType' object has no attribute 'anchor'` when anchoring, which gets caught and recorded as a gap marker. While fail-open is preserved, explicit validation gives clearer feedback for this programming error.
Proposed validation
 def enable_auto_anchor(self, client: Any, *, every_records: int) -> None:
+    if client is None:
+        raise ValueError("client must not be None")
     if every_records < 1:
         raise ValueError("every_records must be a positive integer")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.
In `@src/vaara/audit/trail.py` around lines 1191 - 1215, The enable_auto_anchor method should validate that the provided client is not None to avoid obscured AttributeError later; update enable_auto_anchor to raise a ValueError (or TypeError) if client is None before assigning to self._anchor_client and related state. Modify the enable_auto_anchor function (referenced by name) to check the client parameter at the start, and keep the existing assignments to self._anchor_client, self._anchor_cadence, and self._records_since_anchor only after the non-None check so callers get an immediate, clear error instead of a later failure during anchor operations (e.g. anchor_head/_anchor_client usage).
ℹ️ Review info
📒 Files selected for processing (20)
CHANGELOG.md
docs/sep/sep-server-execution-record.md
pyproject.toml
scripts/generate_receipt_vectors.py
server-vaara-server.json
server.json
src/vaara/attestation/_decision_emit.py
src/vaara/attestation/_decision_types.py
src/vaara/attestation/_decision_verifier.py
src/vaara/attestation/decision.py
src/vaara/audit/receipts.py
src/vaara/audit/trail.py
tests/test_decision_record.py
tests/test_receipt_vectors.py
tests/test_timeanchor.py
tests/vectors/execution_receipt_v0/README.md
tests/vectors/execution_receipt_v0/normative/neg_replay_substituted_field/attestation.json
tests/vectors/execution_receipt_v0/normative/neg_replay_substituted_field/expected.json
tests/vectors/execution_receipt_v0/normative/neg_replay_substituted_field/receipt.json
tests/vectors/execution_receipt_v0/normative/neg_replay_substituted_field/runtime_result.json
def emit_decision_record(
    *,
    back_link: BackLink,
    decision_derived: DecisionDerived,
    iss: str,
    sub: str,
    secret_version: str,
    alg: Algorithm,
    signing_material: Any,
    nonce: Optional[str] = None,
    iat: Optional[str] = None,
    version: int = 1,
) -> DecisionRecord:
    """Build, JCS-canonicalize, and sign a DecisionRecord envelope.

    ``back_link`` joins the decision to the SEP-2787 attestation it
    governs (build it with ``make_back_link``). ``decision_derived``
    carries the verdict, its risk basis, and the decision time. Any
    float in the risk basis is rejected at the JCS boundary; the risk
    fields MUST be decimal strings.

    ``signing_material`` is either a bytes shared secret (HS256) or a
    private-key object from ``cryptography.hazmat`` (ES256 / RS256).
    """
    if alg not in VALID_ALGS:
        raise AttestationError(f"unsupported alg: {alg!r}")
    if not back_link.attestation_digest.startswith("sha256:"):
        raise AttestationError(
            "backLink.attestationDigest MUST be a 'sha256:' digest"
        )
    if not back_link.attestation_nonce:
        raise AttestationError("backLink.attestationNonce MUST be non-empty")

    issuer_asserted = IssuerAsserted(
        iss=iss,
        sub=sub,
        iat=iat or now_iso8601(),
        nonce=nonce or new_nonce(),
        secret_version=secret_version,
        alg=alg,
    )

    payload = _signing_payload(
        version=version,
        alg=alg,
        back_link=back_link,
        decision_derived=decision_derived,
        issuer_asserted=issuer_asserted,
    )

    if alg == "HS256":
        if not isinstance(signing_material, (bytes, bytearray)):
            raise AttestationError("HS256 requires bytes shared_secret")
        signature_hex = sign_hs256(payload, shared_secret=bytes(signing_material))
    elif alg == "ES256":
        signature_hex = sign_es256(payload, private_key=signing_material)
    elif alg == "RS256":
        signature_hex = sign_rs256(payload, private_key=signing_material)
    else:
        raise AttestationError(f"unreachable alg: {alg!r}")

    return DecisionRecord(
        version=version,
        alg=alg,
        back_link=back_link,
        decision_derived=decision_derived,
        issuer_asserted=issuer_asserted,
        signature=signature_hex,
    )
Validate the verdict before signing.
emit_decision_record() trusts decision_derived.decision, but that Literal[...] is only static typing. Today a caller can sign DecisionDerived(decision="deny", ...), producing a record that parse_decision_record() rejects later. Please enforce the same verdict check here before canonicalization.
Suggested fix
 from vaara.attestation._decision_types import (
     DecisionDerived,
     DecisionRecord,
     IssuerAsserted,
+    VALID_VERDICTS,
     decision_to_dict,
 )
@@
     if alg not in VALID_ALGS:
         raise AttestationError(f"unsupported alg: {alg!r}")
+    if decision_derived.decision not in VALID_VERDICTS:
+        raise AttestationError(
+            f"invalid decision verdict {decision_derived.decision!r}"
+        )
     if not back_link.attestation_digest.startswith("sha256:"):
         raise AttestationError(
             "backLink.attestationDigest MUST be a 'sha256:' digest"
         )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/vaara/attestation/_decision_emit.py` around lines 62 - 130,
emit_decision_record currently signs whatever is in DecisionDerived.decision
without runtime validation; before canonicalizing/signing (in
emit_decision_record) assert that decision_derived.decision is one of the
allowed verdict literals (the same set parse_decision_record accepts, e.g.
"allow" or "block"), and raise AttestationError if not; perform this check right
after the existing back_link/issuer_asserted validations (referencing
emit_decision_record, DecisionDerived.decision, and parse_decision_record) so
malformed decisions are rejected early.
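To illustrate why the static `Literal[...]` annotation is not enough on its own, here is a minimal sketch. The names `ToyDecision` and `check_verdict` are hypothetical, and the verdict set shown is an assumption for the example rather than the library's actual `VALID_VERDICTS`.

```python
from dataclasses import dataclass
from typing import Literal, get_args

# Assumed verdict set, for illustration only.
Verdict = Literal["allow", "block"]
VALID_VERDICTS = frozenset(get_args(Verdict))


@dataclass(frozen=True)
class ToyDecision:
    decision: Verdict


def check_verdict(d: ToyDecision) -> None:
    # Runtime guard: the annotation above is only seen by type checkers.
    if d.decision not in VALID_VERDICTS:
        raise ValueError(f"invalid decision verdict {d.decision!r}")


# Constructing with an out-of-set value succeeds at runtime.
bad = ToyDecision(decision="deny")  # type: ignore[arg-type]
try:
    check_verdict(bad)
    rejected = False
except ValueError:
    rejected = True
```

A type checker would flag the `"deny"` literal, but nothing stops it at runtime, so an emit path that skips the explicit check would go on to sign it.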
def decision_from_dict(d: dict[str, Any]) -> DecisionDerived:
    for required in ("decision", "decidedAt"):
        if required not in d:
            raise AttestationError(
                f"decisionDerived missing required field {required!r}"
            )
    if d["decision"] not in VALID_VERDICTS:
        raise AttestationError(f"invalid decision verdict {d['decision']!r}")
    return DecisionDerived(
        decision=d["decision"],
        decided_at=d["decidedAt"],
        reason=d.get("reason"),
        risk_score=d.get("riskScore"),
        threshold_allow=d.get("thresholdAllow"),
        threshold_block=d.get("thresholdBlock"),
        policy_id=d.get("policyId"),
        client_turn_id=d.get("clientTurnId"),
    )
Reject non-string decimal fields at parse time.
decision_from_dict() currently passes riskScore/thresholdAllow/thresholdBlock through verbatim. A wire payload with riskScore: 0.21 parses successfully even though floats are banned on this envelope, and the failure is deferred until verify_decision_signature() re-canonicalizes the parsed record. Please reject non-string values here so malformed records fail at the boundary.
Suggested fix
 def decision_from_dict(d: dict[str, Any]) -> DecisionDerived:
     for required in ("decision", "decidedAt"):
         if required not in d:
             raise AttestationError(
                 f"decisionDerived missing required field {required!r}"
             )
     if d["decision"] not in VALID_VERDICTS:
         raise AttestationError(f"invalid decision verdict {d['decision']!r}")
+    for field in ("riskScore", "thresholdAllow", "thresholdBlock"):
+        value = d.get(field)
+        if value is not None and not isinstance(value, str):
+            raise AttestationError(f"{field} MUST be a decimal string when present")
     return DecisionDerived(
         decision=d["decision"],
         decided_at=d["decidedAt"],
         reason=d.get("reason"),
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/vaara/attestation/_decision_types.py` around lines 127 - 144, In
decision_from_dict, validate that the decimal-like fields riskScore,
thresholdAllow, and thresholdBlock (from d["riskScore"], d["thresholdAllow"],
d["thresholdBlock"]) are either absent (None) or strings; if any is present and
not a str, raise AttestationError indicating the field is invalid so floats like
0.21 are rejected at parse time; update the construction of DecisionDerived to
only pass these values after the type-checks.
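As a small illustration of the boundary check, the sketch below parses two wire payloads with the standard `json` module and reports which decimal-like fields arrived as non-strings. `non_string_decimal_fields` is a hypothetical helper, not part of vaara; only the three field names come from the code above.

```python
import json
from typing import Any, List, Mapping

DECIMAL_FIELDS = ("riskScore", "thresholdAllow", "thresholdBlock")


def non_string_decimal_fields(d: Mapping[str, Any]) -> List[str]:
    """Return the decimal-like fields that are present but not strings."""
    return [
        name
        for name in DECIMAL_FIELDS
        if d.get(name) is not None and not isinstance(d[name], str)
    ]


# A JSON number decodes to float/int, a JSON string stays a str.
good = json.loads('{"decision": "allow", "riskScore": "0.21"}')
bad = json.loads('{"decision": "allow", "riskScore": 0.21, "thresholdBlock": 1}')

good_offenders = non_string_decimal_fields(good)
bad_offenders = non_string_decimal_fields(bad)
```

Running the check right after decoding means a payload with a bare JSON number is flagged before any record object is built from it.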
def _decimal_str(value: float) -> str:
    """Stable decimal string for a risk score or threshold.

    Floats are banned on the decision-record wire (the JCS boundary
    rejects them) because cross-stack float behaviour is the most common
    source of signature drift. ``repr`` gives the shortest round-tripping
    decimal; scientific notation is expanded so the wire value is always
    a plain decimal.
    """
    if not math.isfinite(value):
        raise ValueError("risk score and thresholds MUST be finite")
    s = repr(float(value))
    if "e" in s or "E" in s:
        s = f"{value:.12f}".rstrip("0").rstrip(".")
    return s
Scientific-notation expansion drops small non-zero values.
f"{value:.12f}" keeps only 12 decimal places, so any magnitude below 5e-13 rounds to "0" and _decimal_str(1e-13) becomes "0". That means decision_derived_from_commit() can silently change finite risk/threshold values while bridging an existing CommitPayload. Please switch to a non-rounding expansion, e.g. via Decimal(repr(value)).
Suggested fix
+from decimal import Decimal
@@
 def _decimal_str(value: float) -> str:
@@
-    s = repr(float(value))
-    if "e" in s or "E" in s:
-        s = f"{value:.12f}".rstrip("0").rstrip(".")
-    return s
+    s = format(Decimal(repr(float(value))), "f")
+    if "." in s:
+        s = s.rstrip("0").rstrip(".")
+    return s
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/vaara/audit/receipts.py` around lines 244 - 258, The current _decimal_str
function expands scientific notation using f"{value:.12f}" which rounds very
small non-zero floats to "0" (e.g. 1e-13), altering finite risk/threshold values
when reconstructed (used by decision_derived_from_commit/CommitPayload). Replace
the scientific-notation expansion with a non-rounding conversion by creating a
Decimal from the float's repr (e.g. Decimal(repr(value))) and use its
string/normalization methods to produce a plain decimal string; keep the
existing finite check and ensure the returned string never contains exponent
notation.
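The difference between the two expansions can be checked side by side. The sketch below copies the quoted logic into two standalone functions; the function names are made up for the comparison, and neither is the shipped code.

```python
from decimal import Decimal
import math


def decimal_str_rounding(value: float) -> str:
    """Expansion as quoted in the review: at most 12 fixed decimals."""
    s = repr(float(value))
    if "e" in s or "E" in s:
        s = f"{value:.12f}".rstrip("0").rstrip(".")
    return s


def decimal_str_exact(value: float) -> str:
    """Non-rounding expansion via Decimal(repr(...)), per the suggestion."""
    if not math.isfinite(value):
        raise ValueError("risk score and thresholds MUST be finite")
    s = format(Decimal(repr(float(value))), "f")
    if "." in s:
        s = s.rstrip("0").rstrip(".")
    return s


tiny = 1e-13
lossy = decimal_str_rounding(tiny)   # the value collapses to zero
exact = decimal_str_exact(tiny)      # plain decimal, no exponent
round_trips = Decimal(exact) == Decimal(repr(tiny))
```

One thing worth noting when comparing them: the two functions also differ for whole numbers (`1.0` stays `"1.0"` in the quoted original but becomes `"1"` with the trailing-zero strip), so swapping one for the other may change the wire string for values that were previously unaffected.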
Summary
- vaara.attestation.decision: signed decision-record envelopes that commit the governing server's policy verdict before a tool call executes. A verifier can now prove allow/block was decided before the side effect ran, closing the loop between SEP-2787 attestation and the execution receipt.
- AuditTrail.enable_auto_anchor() for automatic cadence anchoring. The trail anchors its own chain head every N records; a failed attempt writes a chained ANCHOR_GAP marker so the gap is auditable and the trail continues (fail-open).
- neg_replay_substituted_field: field-substituted receipt replay now fails verification.
Test plan
- pytest: 1118 passed, 13 skipped
- … vaara.attestation.decision)
- enable_auto_anchor + ANCHOR_GAP marker in audit/trail.py
Generated with Claude Code