test(sep2828): decision/outcome pairing conformance vectors#185
Conversation
Six committed cases under tests/vectors/decision_pairing_v0 with a stdlib-only independent walker (no Vaara imports): valid pair, decision- only escalate, substituted attestation backlink, substituted pairing nonce, equal-decidedAt supersession tie (open contract), and fallback request-envelope binding with replay. Pairing follows the shipped shared-attestation model (records_paired). Requested on modelcontextprotocol/modelcontextprotocol#2828. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
📝 WalkthroughWalkthroughThis PR introduces a decision/outcome pairing vector conformance suite consisting of an independent cryptographic checker, a deterministic generator script, a pytest test runner, and six normative test cases validating decision/receipt pairing with SEP-2787 attestation back-links under various scenarios including valid pairings, missing receipts, substituted attestations, nonce tampering, tie resolution, and fallback envelope bindings. ChangesDecision/Outcome Pairing Vector Conformance Suite
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes The PR combines cryptographic logic (signature verification, JCS canonicalization, ECDSA/HMAC validation), fixture generation with multiple parameterized test scenarios, and pytest integration. The scope spans three layers of distinct implementation concern (checker, generator, runner) with moderate file spread and heterogeneous logic density (signature verification is dense; case generation is repetitive; test runner is simple). Reviewers need to understand SEP-2787 back-link semantics, JCS/RFC 8785, and the pairing invariants being validated, but the logic is well-factored and the test cases are deterministic and self-documenting. Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
|
||
| def _write(path: Path, obj: dict) -> None: | ||
| path.parent.mkdir(parents=True, exist_ok=True) | ||
| path.write_text(json.dumps(obj, indent=2, sort_keys=True) + "\n") |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
scripts/generate_decision_pairing_vectors.py (1)
98-107: ⚡ Quick winNon-deterministic ES256 key generation breaks test vector reproducibility.
The script generates a fresh ES256 keypair on each run, which means ES256-signed test vectors (e.g.,
decision_only_escalate) will have different signatures each time the generator runs. For normative conformance vectors that external verifiers will check against committed fixtures, full determinism is essential.🔒 Suggested fix: use a deterministic seed or load existing keys
Option 1: Deterministic key derivation from a fixed seed
+import secrets + def main() -> None: - es = ec.generate_private_key(ec.SECP256R1()) + # Deterministic ES256 key from fixed seed for reproducible test vectors + seed = b"vaara-sep2828-test-vector-seed-v1" + # Derive 32 bytes deterministically (not cryptographically secure, but deterministic) + import hashlib + key_material = hashlib.sha256(seed).digest() + es_int = int.from_bytes(key_material, 'big') % ec.SECP256R1().curve.order + es = ec.derive_private_key(es_int, ec.SECP256R1()) keys = OUT / "keys"Option 2: Load existing keys if present, generate only if missing
def main() -> None: + keys = OUT / "keys" + es_private_path = keys / "es256_private.pem" + + if es_private_path.exists(): + # Load existing key for deterministic vectors + es = serialization.load_pem_private_key( + es_private_path.read_bytes(), + password=None) + else: + # Generate once, then reuse - es = ec.generate_private_key(ec.SECP256R1()) - keys = OUT / "keys" + es = ec.generate_private_key(ec.SECP256R1()) keys.mkdir(parents=True, exist_ok=True)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/generate_decision_pairing_vectors.py` around lines 98 - 107, The ES256 keypair is generated non-deterministically via ec.generate_private_key(ec.SECP256R1()) causing test-vector signatures to change; update the logic in the block that uses es (the variable holding the keypair), the write paths ("es256_private.pem"/"es256_public.pem") and the OUT/"keys" handling to ensure deterministic keys by either (A) checking if OUT/"keys"/"es256_private.pem" exists and loading it (deserialize with serialization.load_pem_private_key) and only generating/writing a new key if missing, or (B) deriving the private key deterministically from a fixed seed (e.g., use an HMAC/seed -> reproducible private scalar) instead of calling ec.generate_private_key each run, then write the private/public PEMs using the same serialization.private_bytes and public_key().public_bytes calls; implement one of these approaches so generated ES256 signatures are reproducible across runs.tests/vectors/decision_pairing_v0/normative/fallback_envelope_binding/expected.json (1)
1-7: 💤 Low valueConsider standardizing the schema across expected.json files.
Different test cases use different field sets in their expected.json files. For example,
decision_only_escalate/expected.jsonincludes fields likedecision_back_link_okandoutcome_required, while this file includesrecords_pairedandreplayed_receipt_signature_ok. A more uniform schema (even if some fields are null/omitted for certain cases) would simplify parsing for test consumers.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/vectors/decision_pairing_v0/normative/fallback_envelope_binding/expected.json` around lines 1 - 7, The expected.json schema across test vectors is inconsistent; unify fields across files by adopting a canonical set (e.g., decision_signature_ok, decision_back_link_ok, outcome_required, receipt_signature_ok, records_paired, replayed_receipt_signature_ok) and ensure each expected.json (including this fallback_envelope_binding/expected.json) contains all canonical keys, using null for irrelevant ones; update the generator or hand-edit this file so decision_signature_ok and receipt_signature_ok remain true, add decision_back_link_ok and outcome_required as null or false as appropriate, and keep records_paired and replayed_receipt_signature_ok set as in the diff so parsers can rely on a stable schema.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/test_decision_pairing_vectors.py`:
- Around line 16-21: Module-level pytest.skip in
tests/test_decision_pairing_vectors.py hides missing rfc8785 in CI; instead fail
loudly unless an explicit opt-out env var is set. Change the module-level guard
that checks importlib.util.find_spec("rfc8785") / "cryptography" so that if
rfc8785 is missing you call pytest.fail(...) or raise a clear RuntimeError (so
CI fails), and only allow skipping when a specific environment variable (e.g.
VAARA_SKIP_ATTESTATION_TESTS or similar) is present and truthy; keep
cryptography handling as before. Update the import-check block (the code
referencing rfc8785, cryptography, and pytest.skip) to implement this behavior
and include a clear message telling users how to opt out via the env var.
In `@tests/vectors/decision_pairing_v0/_check_independent.py`:
- Around line 75-80: verify_back_link currently always hashes _jcs(attestation);
change it to branch on the new fallback path: if the attestation contains a
fallback_envelope_binding field (attestation["fallback_envelope_binding"] or
under attestation["issuerAsserted"] if that's where your vectors place it)
compute expected = _sha256_hex(_jcs(fallback_envelope_binding)) and otherwise
compute expected = _sha256_hex(_jcs(attestation)) as before; keep using
hmac.compare_digest(bl["attestationDigest"], expected) and then return the
existing nonce comparison against attestation["issuerAsserted"]["nonce"] so the
fallback digest is independently verified.
---
Nitpick comments:
In `@scripts/generate_decision_pairing_vectors.py`:
- Around line 98-107: The ES256 keypair is generated non-deterministically via
ec.generate_private_key(ec.SECP256R1()) causing test-vector signatures to
change; update the logic in the block that uses es (the variable holding the
keypair), the write paths ("es256_private.pem"/"es256_public.pem") and the
OUT/"keys" handling to ensure deterministic keys by either (A) checking if
OUT/"keys"/"es256_private.pem" exists and loading it (deserialize with
serialization.load_pem_private_key) and only generating/writing a new key if
missing, or (B) deriving the private key deterministically from a fixed seed
(e.g., use an HMAC/seed -> reproducible private scalar) instead of calling
ec.generate_private_key each run, then write the private/public PEMs using the
same serialization.private_bytes and public_key().public_bytes calls; implement
one of these approaches so generated ES256 signatures are reproducible across
runs.
In
`@tests/vectors/decision_pairing_v0/normative/fallback_envelope_binding/expected.json`:
- Around line 1-7: The expected.json schema across test vectors is inconsistent;
unify fields across files by adopting a canonical set (e.g.,
decision_signature_ok, decision_back_link_ok, outcome_required,
receipt_signature_ok, records_paired, replayed_receipt_signature_ok) and ensure
each expected.json (including this fallback_envelope_binding/expected.json)
contains all canonical keys, using null for irrelevant ones; update the
generator or hand-edit this file so decision_signature_ok and
receipt_signature_ok remain true, add decision_back_link_ok and outcome_required
as null or false as appropriate, and keep records_paired and
replayed_receipt_signature_ok set as in the diff so parsers can rely on a stable
schema.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 383053b0-0e98-4bbb-8aa9-e0b1b60b9c04
📒 Files selected for processing (26)
scripts/generate_decision_pairing_vectors.pytests/test_decision_pairing_vectors.pytests/vectors/decision_pairing_v0/_check_independent.pytests/vectors/decision_pairing_v0/normative/decision_only_escalate/attestation.jsontests/vectors/decision_pairing_v0/normative/decision_only_escalate/decision.jsontests/vectors/decision_pairing_v0/normative/decision_only_escalate/expected.jsontests/vectors/decision_pairing_v0/normative/fallback_envelope_binding/decision.jsontests/vectors/decision_pairing_v0/normative/fallback_envelope_binding/expected.jsontests/vectors/decision_pairing_v0/normative/fallback_envelope_binding/receipt.jsontests/vectors/decision_pairing_v0/normative/fallback_envelope_binding/receipt_replayed.jsontests/vectors/decision_pairing_v0/normative/substituted_attestation_backlink/attestation.jsontests/vectors/decision_pairing_v0/normative/substituted_attestation_backlink/decision.jsontests/vectors/decision_pairing_v0/normative/substituted_attestation_backlink/expected.jsontests/vectors/decision_pairing_v0/normative/substituted_attestation_backlink/receipt.jsontests/vectors/decision_pairing_v0/normative/substituted_pairing_nonce/attestation.jsontests/vectors/decision_pairing_v0/normative/substituted_pairing_nonce/decision.jsontests/vectors/decision_pairing_v0/normative/substituted_pairing_nonce/expected.jsontests/vectors/decision_pairing_v0/normative/substituted_pairing_nonce/receipt.jsontests/vectors/decision_pairing_v0/normative/supersession_equal_decidedat_tie/attestation.jsontests/vectors/decision_pairing_v0/normative/supersession_equal_decidedat_tie/decision_a.jsontests/vectors/decision_pairing_v0/normative/supersession_equal_decidedat_tie/decision_b.jsontests/vectors/decision_pairing_v0/normative/supersession_equal_decidedat_tie/expected.jsontests/vectors/decision_pairing_v0/normative/valid_pair_allow_executed/attestation.jsontests/vectors/decision_pairing_v0/normative/valid_pair_allow_executed/decision.jsontests/vectors/decision_pairing_v0/normative/valid_pair_allow_executed/expected.jsontests/vectors/decision_pairing_v0/normative/valid_pair_allow_executed/receipt.json
| for _mod in ("rfc8785", "cryptography"): | ||
| if importlib.util.find_spec(_mod) is None: | ||
| pytest.skip( | ||
| "attestation extra not installed (pip install 'vaara[attestation]')", | ||
| allow_module_level=True, | ||
| ) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Dependency declarations =="
rg -n --glob 'pyproject.toml' --glob 'setup.cfg' --glob 'requirements*.txt' --glob 'tox.ini' --glob 'noxfile.py' \
'rfc8785|cryptography|vaara\[attestation\]' . || true
echo
echo "== CI workflows that run tests =="
rg -n '.+' .github/workflows || true
echo
echo "== CI references to pytest / dependency install =="
rg -n '.+' .github/workflows | rg 'pytest|pip install|poetry install|uv sync|rfc8785|cryptography|attestation' || trueRepository: vaaraio/vaara
Length of output: 38716
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Search for rfc8785 references =="
rg -n --hidden --glob '!**/.git/**' -S 'rfc8785' .
echo
echo "== Show relevant CI install snippet =="
sed -n '1,120p' .github/workflows/ci.ymlRepository: vaaraio/vaara
Length of output: 5494
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== occurrences of rfc8785 in requirements-dev / CI-related requirement files =="
rg -n 'rfc8785' requirements-dev.txt .github/requirements-build.txt .github/requirements*.txt pyproject.toml setup.cfg tox.ini noxfile.py 2>/dev/null || true
echo
echo "== occurrences of rfc8785 in full repo dependency lock candidates =="
rg -n 'rfc8785' --glob '!**/.git/**' --glob 'poetry.lock' --glob 'uv.lock' --glob '*.lock' . || trueRepository: vaaraio/vaara
Length of output: 294
Fix CI guard: avoid module-level skipping of decision pairing vectors when rfc8785 (JCS) is missing
tests/test_decision_pairing_vectors.py module-level pytest.skip() (lines 16-21) allows the independent walker vector suite to go green when rfc8785 isn’t installed. CI installs requirements-dev.txt + pip install -e . --no-deps but never installs the vaara[attestation] extra; rfc8785 is only declared under that extra in pyproject.toml, and it’s not present in the CI-installed requirements. Fail loudly in CI (or require an explicit opt-out env var) instead of silently skipping.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/test_decision_pairing_vectors.py` around lines 16 - 21, Module-level
pytest.skip in tests/test_decision_pairing_vectors.py hides missing rfc8785 in
CI; instead fail loudly unless an explicit opt-out env var is set. Change the
module-level guard that checks importlib.util.find_spec("rfc8785") /
"cryptography" so that if rfc8785 is missing you call pytest.fail(...) or raise
a clear RuntimeError (so CI fails), and only allow skipping when a specific
environment variable (e.g. VAARA_SKIP_ATTESTATION_TESTS or similar) is present
and truthy; keep cryptography handling as before. Update the import-check block
(the code referencing rfc8785, cryptography, and pytest.skip) to implement this
behavior and include a clear message telling users how to opt out via the env
var.
| def verify_back_link(record: dict, attestation: dict) -> bool: | ||
| expected = _sha256_hex(_jcs(attestation)) | ||
| bl = record["backLink"] | ||
| if not hmac.compare_digest(bl["attestationDigest"], expected): | ||
| return False | ||
| return bl["attestationNonce"] == attestation["issuerAsserted"]["nonce"] |
There was a problem hiding this comment.
Add the request-envelope fallback path to backlink verification.
verify_back_link() only knows how to hash attestation.json. The PR also adds a fallback_envelope_binding vector with no SEP-2787 attestation, so this checker currently has no way to recompute or assert that fallback digest at all. As written, that case can only pass if expected.json avoids asserting backlink validity, which means the fallback contract is not independently verified here.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/vectors/decision_pairing_v0/_check_independent.py` around lines 75 -
80, verify_back_link currently always hashes _jcs(attestation); change it to
branch on the new fallback path: if the attestation contains a
fallback_envelope_binding field (attestation["fallback_envelope_binding"] or
under attestation["issuerAsserted"] if that's where your vectors place it)
compute expected = _sha256_hex(_jcs(fallback_envelope_binding)) and otherwise
compute expected = _sha256_hex(_jcs(attestation)) as before; keep using
hmac.compare_digest(bl["attestationDigest"], expected) and then return the
existing nonce comparison against attestation["issuerAsserted"]["nonce"] so the
fallback digest is independently verified.
SEP-owned decision/outcome pairing conformance vectors, requested on modelcontextprotocol/modelcontextprotocol#2828 so independent verifiers can check the record shapes without sharing Vaara's emitter code.
Six committed cases under
tests/vectors/decision_pairing_v0/normative/, each withexpected.json:valid_pair_allow_executed— allow decision + executed outcome, paireddecision_only_escalate— valid terminal/pending state, no sibling outcomesubstituted_attestation_backlink— receipt binds a different attestation, does not pairsubstituted_pairing_nonce— same attestationDigest, mismatched nonce, does not pairsupersession_equal_decidedat_tie— two conflicting decisions at equaldecidedAt; flags the open tie-break contract (no deterministic ordering in the impl as of v0.50.0)fallback_envelope_binding— no SEP-2787 attestation: bind to a SHA-256 over the JCS-canonical request envelope, plus a replay negativeA stdlib-only walker (
_check_independent.py, no Vaara imports) reproduces the verdicts, guarded in CI bytests/test_decision_pairing_vectors.py. Pairing follows the shipped shared-attestation model (records_paired); the "outcome resolves to decision content digest" join under discussion in #2828 is noted as out of scope and would add a field and a case.🤖 Generated with Claude Code
Summary by CodeRabbit
Release Notes