
release(v0.46.0): multi-tenant hardening, no new features#177

Merged
vaaraio merged 3 commits into main from release/v0.46.0
May 31, 2026

Conversation

@vaaraio
Owner

@vaaraio vaaraio commented May 31, 2026

Hardening and multi-tenant-proof release. No new features. Makes the multi-tenant runtime-governance claim true and safe so the registry-of-record entry can read "multi-tenant runtime governance for MCP fleets".

Security

  • SEP-2787 attestation verification now rejects a future-dated iat. The TTL check had only an upper bound, so a future issuance time held the validity window open indefinitely. Verification now enforces now >= iat - clock_skew. Conformance vectors gain future-dated cases.
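A minimal sketch of the two-sided window check described above. Only the lower bound `now >= iat - clock_skew` comes from the release note; the function name, parameter names, default skew, and the exact shape of the upper (TTL) bound are assumptions for illustration, not Vaara's actual code.

```python
import time
from typing import Optional


def within_validity_window(
    iat: float,
    ttl_seconds: float,
    now: Optional[float] = None,
    clock_skew: float = 30.0,  # hypothetical default, not taken from Vaara
) -> bool:
    """Return True when `now` lies inside [iat - clock_skew, iat + ttl_seconds]."""
    current = time.time() if now is None else now
    # Lower bound: an issuance time further in the future than the allowed
    # skew is rejected, so a future iat cannot stretch the window.
    if current < iat - clock_skew:
        return False
    # Upper bound: assumed shape of the pre-existing TTL check.
    return current <= iat + ttl_seconds
```

With only the upper bound, an envelope whose `iat` lies a year ahead would pass for that whole year plus the TTL; the lower bound closes that gap.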

Fixed

  • Race in the audit trail's action-to-tenant map. The mutate path (length check, eviction, insert) and the read path were unguarded, so concurrent multi-tenant traffic could raise during eviction or hand one lifecycle another tenant's scope. Dedicated leaf lock added. New tests run 16 tenants through full lifecycles concurrently and assert chain integrity plus per-tenant scope.
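The locking pattern and the shape of the concurrency test can be sketched as follows. This is an illustrative stand-in, not the AuditTrail implementation: `TenantMap`, `run_tenants`, and the size limits are hypothetical names and values. The eviction arithmetic mirrors the snippet quoted in the review thread.

```python
import threading
from typing import Dict, List, Tuple


class TenantMap:
    """Bounded action_id -> tenant_id map guarded by a single leaf lock."""

    def __init__(self, max_entries: int = 1024) -> None:
        self._lock = threading.Lock()  # leaf lock: nothing else is acquired while held
        self._max_entries = max_entries
        self._tenants: Dict[str, str] = {}

    def put(self, action_id: str, tenant_id: str) -> None:
        # Length check, eviction, and insert happen under one lock acquisition.
        with self._lock:
            if len(self._tenants) >= self._max_entries:
                evict = max(1, self._max_entries // 8)
                for stale in list(self._tenants)[:evict]:
                    self._tenants.pop(stale, None)
            self._tenants[action_id] = tenant_id

    def get(self, action_id: str) -> str:
        with self._lock:
            return self._tenants.get(action_id, "")


def run_tenants(tenant_count: int = 16, actions_per_tenant: int = 200) -> List[Tuple[str, str]]:
    """Drive concurrent writers and return any cross-tenant reads observed."""
    tenant_map = TenantMap(max_entries=64)
    mismatches: List[Tuple[str, str]] = []

    def worker(tenant: str) -> None:
        for i in range(actions_per_tenant):
            action_id = f"{tenant}-{i}"
            tenant_map.put(action_id, tenant)
            seen = tenant_map.get(action_id)
            # An evicted entry reads back as "", never as another tenant.
            if seen not in (tenant, ""):
                mismatches.append((action_id, seen))

    threads = [threading.Thread(target=worker, args=(f"t{n}",)) for n in range(tenant_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return mismatches
```

A test in this style would assert that `run_tenants()` returns an empty list. The real tests additionally check audit-chain integrity, which this sketch does not model.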

Changed

  • Wheel slimmed from ~8MB to ~0.8MB. include-package-data was pulling all of the v1-v8 model bundles into the wheel; only v9 loads at runtime. Packaging now ships v9 only. Older bundles stay in-repo for bench and cross-eval.

CI / tooling

  • mypy is now a build-failing gate on the strict set (vaara.policy.*), pinned to 1.20.2.
  • .gitignore now excludes SQLite WAL sidecars. scripts/RELEASE.md step 3 corrected to match release_merge_and_tag.sh.

Bench

  • bench/vaara-bench-v0.46.md: sub-2ms p50 governance overhead, flat across 1-8 upstream fan-out.

Verification

  • 1076 passed, 12 skipped. ruff clean. mypy (src/vaara/policy/) clean. Wheel built and confirmed at 770K shipping only the v9 bundle.
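One way to confirm which model bundles a built wheel contains is to list the archive members, since a wheel is a zip file. The sketch below uses only the standard library; the helper name `bundled_models` and the `.joblib` suffix filter are assumptions for illustration, not the check the author actually ran.

```python
import zipfile
from typing import IO, List, Union


def bundled_models(wheel: Union[str, IO[bytes]], suffix: str = ".joblib") -> List[str]:
    """Return the sorted archive members of a wheel that end with `suffix`."""
    with zipfile.ZipFile(wheel) as archive:
        return sorted(name for name in archive.namelist() if name.endswith(suffix))
```

Run against the built wheel, the expectation from this release is a single entry whose filename is adversarial_classifier_v9.joblib.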

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes: Version 0.46.0

  • New Features

    • Added strict type checking to the CI pipeline.
    • Enabled automatic MCP Registry publishing in release workflow.
  • Bug Fixes

    • Resolved concurrency issue in audit trail's tenant scope tracking under concurrent writes.
  • Security

    • Strengthened attestation verification to reject future-dated timestamps and enforce clock-skew bounds.
  • Performance

    • Optimized distribution package to include only the production model bundle.
  • Tests

    • Added conformance test vectors for timestamp validation.
    • Added multi-tenant concurrency tests to verify audit chain integrity.
  • Documentation

    • Updated release process documentation for tag management post-merge.
    • Added v0.46 benchmark documentation with concurrency and governance overhead evidence.

vaaraio and others added 2 commits May 30, 2026 23:06
The MCP registry was the only release surface with no CI automation. PyPI,
npm, and the GitHub Release all ride the version tag; the registry was hand-
published, so it drifted: 0.43.0 and 0.45.0 were never published, and 0.44.0
and 0.45.1 landed late and by hand.

Adds a publish-mcp-registry job that runs after publish-pypi and the GitHub
Release. It installs mcp-publisher, authenticates by GitHub OIDC (no stored
secret; needs id-token: write), publishes both manifests (server.json and
server-vaara-server.json), then asserts the live registry's latest active
version for both listings equals the released tag and fails the job loudly on
any mismatch. mcp-publisher is pinned (MCP_PUBLISHER_VERSION) to match the
repo's pinning convention; bump it when the registry deployment moves forward.
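The post-publish assertion amounts to comparing each listing's live version with the released tag. A hedged sketch of that comparison follows; the function name is hypothetical, the `v` tag prefix is an assumption, and how the live versions are fetched from the registry is deliberately left out because the source does not describe it.

```python
from typing import Dict, List


def find_version_mismatches(tag: str, live_versions: Dict[str, str]) -> List[str]:
    """Return one message per listing whose live version differs from the tag."""
    # Assumption: tags look like "v0.46.0" while manifests carry "0.46.0".
    expected = tag[1:] if tag.startswith("v") else tag
    return [
        f"{listing}: registry has {version}, expected {expected}"
        for listing, version in sorted(live_versions.items())
        if version != expected
    ]
```

A CI step built on this would exit non-zero whenever the returned list is non-empty, which is what makes drift visible instead of silent.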

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Makes the multi-tenant runtime-governance claim true and safe rather than
aspirational. Spine: upgrade the registry-of-record entry to "multi-tenant
runtime governance for MCP fleets".

Security:
- SEP-2787 verify_attestation now rejects a future-dated iat. The TTL check
  had only an upper bound, so a future issuance time kept the validity window
  open indefinitely. Enforces now >= iat - clock_skew. Conformance set gains
  future-dated cases. CLI signature-isolation pass moved off now=0.0 (which
  the new lower bound would reject) to the envelope's own iat.

Fixed:
- Race in AuditTrail's action-to-tenant map: unguarded read/evict/insert under
  concurrent writers could raise during eviction or cross-tenant the scope.
  Dedicated lock added. New tests run 16 tenants through full lifecycles
  concurrently, asserting chain integrity and per-tenant scope.

Changed:
- Wheel slimmed ~8MB to ~0.8MB: include-package-data was pulling all of
  v1-v8 model bundles into the wheel; only v9 loads at runtime. Ship v9 only.

CI/tooling:
- mypy build-failing gate on the strict set (vaara.policy.*), pinned 1.20.2.
- gitignore SQLite WAL sidecars. RELEASE.md step 3 corrected to match the
  tag-origin/main-directly script.

Bench:
- bench/vaara-bench-v0.46.md: sub-2ms p50 governance overhead, flat across
  1-8 upstream fan-out (bench/v046_fanout.json).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@coderabbitai

coderabbitai Bot commented May 31, 2026


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: a9223dcc-f9c8-4488-a89f-e832a446de7f

📥 Commits

Reviewing files that changed from the base of the PR and between 1e45c5c and 9b28de3.

📒 Files selected for processing (2)
  • src/vaara/scorer/adaptive.py
  • tests/test_policy_controller.py
📝 Walkthrough


Version 0.46.0 release hardens SEP-2787 attestation validation with future-dated iat checks, fixes AuditTrail concurrency via dedicated tenant-map locking, slims distribution to one classifier bundle, adds MyPy and MCP Registry CI jobs, and bumps manifests across all packages.

Changes

v0.46.0 Release: Attestation, Audit Concurrency, Package Distribution, and MCP Registry

| Layer / File(s) | Summary |
| --- | --- |
| Attestation future-dated iat lower-bound validation: src/vaara/attestation/_sep2787_emit.py, src/vaara/cli.py, tests/test_attestation_sep2787.py, tests/test_attestation_vectors.py | Rejects attestations with iat timestamps too far in the future beyond configurable clock skew. CLI now evaluates signatures at the attestation's own iat epoch to isolate signature validity from TTL expiration. New tests cover future-dated rejection and clock-skew acceptance edges, and vector tests are updated to evaluate at issuance time. |
| Audit trail multi-tenant concurrency and thread safety: src/vaara/audit/trail.py, tests/test_v040_tenant.py | Fixes concurrent tenant-scope tracking by introducing a dedicated _tenant_map_lock guarding the _tenant_for_action dictionary, preventing iteration hazards and cross-tenant mixups under contention. Two new tests assert chain integrity, tenant_id isolation in concurrent multi-tenant lifecycles, and lock correctness under eviction loads. |
| Package distribution and setuptools configuration: pyproject.toml, src/vaara/policy/modes.py | Restricts wheel distribution to ship only the production adversarial_classifier_v9.joblib bundle by disabling auto-inclusion and explicitly listing package data. Includes minor refactor of emit_yaml to annotate local variable for MyPy strict compliance. |
| CI tooling, workflows, and local hygiene: .github/workflows/ci.yml, .github/workflows/release.yml, .gitignore, scripts/RELEASE.md | Adds new MyPy strict typechecking job for src/vaara/policy/, introduces MCP Registry publishing workflow with OIDC authentication and post-publish version verification, expands .gitignore for SQLite WAL sidecars and local tool directories, and updates release script documentation for post-squash-merge tag management via remote origin/main. |
| Version metadata bumps across manifests: src/vaara/__init__.py, clients/ts/package.json, server.json, server-vaara-server.json | Synchronizes version 0.45.1 → 0.46.0 across Python package constant, TypeScript npm metadata, and MCP server manifests. |
| Release notes, benchmark results, and governance documentation: CHANGELOG.md, bench/v046_fanout.json, bench/vaara-bench-v0.46.md | Adds release notes summarizing all v0.46.0 changes and includes new benchmark data and detailed governance/concurrency overhead analysis for fanout latency across 1–8 upstream slots. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • vaaraio/vaara#139: Introduces verify_attestation for SEP-2787 verification; main PR further tightens it with the future-dated iat lower-bound check reflected across CLI and test updates.
  • vaaraio/vaara#155: Introduces MCP server.json submission shape and ownership metadata; main PR builds on this with publish-mcp-registry job and v0.46.0 version bumps in server manifests.
  • vaaraio/vaara#176: Updates server-vaara-server.json and release.yml publish env; main PR aligns with v0.46.0 version pins and enhanced publish workflow.

Poem

🐰 Locked and loaded, we bounce through the future,
Attestations now checked ere time plays its suitor,
Tenants race safely with locks standing guard,
Wheels slim and swift—no old models scarred!
MCP Registry rings: version bumped, verified true.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 59.09% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly summarizes the main focus: a release that hardens multi-tenant safety without adding new features, matching the PR's core objectives of attestation security enforcement and AuditTrail concurrency fixes. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Comment thread tests/test_v040_tenant.py
)
aid = trail.record_action_requested(req)
trail._tenant_for(aid) # concurrent read against the evicting writer
except BaseException as exc: # noqa: BLE001 — surface any race to the assert

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/vaara/audit/trail.py`:
- Around line 597-602: Eviction of entries from _tenant_for_action causes
_tenant_for(action_id) to return "" for late
record_decision/record_execution/record_outcome calls, dropping tenant scope;
change the logic so when an entry is evicted (or when _tenant_for(action_id)
would return empty) you instead resolve the tenant by consulting the canonical
ACTION_REQUESTED record in _by_action for that action_id (the same structure
used by record_decision/record_execution/record_outcome), or persist the tenant
alongside the _by_action entry and never remove that canonical tenant info on
eviction; update the eviction block around _tenant_for_action and the
_tenant_for() helper to fall back to reading tenant_id from
_by_action[action_id] (or its stored tenant field) so long‑running actions keep
correct tenant scope (also apply the same fix at the other occurrence around
lines 658-659).

In `@src/vaara/cli.py`:
- Around line 1227-1237: The code treats any failure of the live verification as
TTL expiry; instead, separate "future-dated" (not-yet-valid) from "ttl_expired":
keep signature_ok = verify_attestation(envelope, now=isolation_now) and live_ok
= verify_attestation(envelope, now=real_now) (or call verify_attestation without
now but capture the real current time into a variable), then compute
future_dated = signature_ok and not live_ok and (envelope.iat > real_now) and
ttl_expired = signature_ok and not live_ok and not future_dated; update any JSON
output and --enforce-ttl logging to use ttl_expired only and report future_dated
when appropriate (refer to isolation_now, verify_attestation, signature_ok,
live_ok, ttl_expired).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: de00c683-3667-4525-ab69-f1e63072ed76

📥 Commits

Reviewing files that changed from the base of the PR and between d18b7e9 and 1e45c5c.

📒 Files selected for processing (19)
  • .github/workflows/ci.yml
  • .github/workflows/release.yml
  • .gitignore
  • CHANGELOG.md
  • bench/v046_fanout.json
  • bench/vaara-bench-v0.46.md
  • clients/ts/package.json
  • pyproject.toml
  • scripts/RELEASE.md
  • server-vaara-server.json
  • server.json
  • src/vaara/__init__.py
  • src/vaara/attestation/_sep2787_emit.py
  • src/vaara/audit/trail.py
  • src/vaara/cli.py
  • src/vaara/policy/modes.py
  • tests/test_attestation_sep2787.py
  • tests/test_attestation_vectors.py
  • tests/test_v040_tenant.py

Comment thread src/vaara/audit/trail.py
Comment on lines +597 to +602
with self._tenant_map_lock:
    if len(self._tenant_for_action) >= self._MAX_ACTION_TENANT_MAP:
        evict = max(1, self._MAX_ACTION_TENANT_MAP // 8)
        for stale in list(self._tenant_for_action)[:evict]:
            self._tenant_for_action.pop(stale, None)
    self._tenant_for_action[action_id] = tenant_id

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve tenant scope after _tenant_for_action eviction.

Once an action_id is evicted here, _tenant_for() falls back to "", so any late record_decision / record_execution / record_outcome for that still-live action is written as single-tenant even though the original ACTION_REQUESTED record in _by_action still has the real tenant_id. In a multi-tenant release, that silently drops scope on long-running actions under load.

Proposed fix
     def _tenant_for(self, action_id: str) -> str:
         """Resolve the tenant scope for an existing action lifecycle.
 
         Returns the tenant_id captured at record_action_requested time so
         every follow-up record (risk_scored, decision, execution,
         escalation, outcome) carries the same scope automatically.
         """
-        with self._tenant_map_lock:
-            return self._tenant_for_action.get(action_id, "")
+        with self._tenant_map_lock:
+            tenant_id = self._tenant_for_action.get(action_id)
+        if tenant_id is not None:
+            return tenant_id
+        with self._lock:
+            records = self._by_action.get(action_id, [])
+            if records:
+                return records[0].tenant_id
+        return ""

Also applies to: 658-659


Comment thread src/vaara/cli.py
Comment on lines +1227 to 1237
# Evaluating at the envelope's own iat sits inside the validity window
# (both the lower and upper time bounds pass), isolating signature
# validity; a second pass at real time reveals whether the TTL expired.
# Durable evidence files are routinely checked long after exp, so TTL is
# reported but not enforced unless --enforce-ttl is set.
-signature_ok = verify_attestation(envelope, verifying_material=material, now=0.0)
+isolation_now = _attest_isolation_now(envelope)
+signature_ok = verify_attestation(
+    envelope, verifying_material=material, now=isolation_now
+)
live_ok = verify_attestation(envelope, verifying_material=material)
ttl_expired = signature_ok and not live_ok

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't collapse "future-dated" into ttl_expired.

After the lower-bound check in verify_attestation, live_ok == False now covers two different states: expired and not-yet-valid. ttl_expired = signature_ok and not live_ok mislabels future-dated envelopes as expired, so the JSON output and the --enforce-ttl stderr path report the wrong failure mode for one of the new hardening cases.
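The distinction the comment asks for can be sketched as a small pure function. This is an illustration only: `classify_attestation` and its string labels are hypothetical, and the `iat > real_now` test follows the reviewer's suggested condition rather than any confirmed Vaara code.

```python
def classify_attestation(
    signature_ok: bool,
    live_ok: bool,
    iat: float,
    real_now: float,
) -> str:
    """Map the two verification passes onto one distinct outcome label."""
    if not signature_ok:
        return "invalid_signature"
    if live_ok:
        return "valid"
    # The signature verifies but the live check failed: decide which time
    # bound was violated instead of assuming expiry.
    if iat > real_now:
        return "future_dated"
    return "ttl_expired"
```

Reporting `future_dated` separately keeps the `ttl_expired` flag meaningful for durable evidence files that are checked long after expiry.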


The new mypy strict gate surfaced a real latent bug. apply_policy stored
policy-schema SequencePattern objects (.pattern/.window_seconds) directly into
the scorer's _sequences, but the sequence matcher reads the scorer's runtime
fields (.actions/.window_size). A hot policy reload followed by a sequence
match raised AttributeError. Tests passed only because none exercised
reload-then-match.

apply_policy now converts each policy pattern into the scorer's SequencePattern
(actions=pattern, window_size=max(len(pattern),10) since window_size is a
lookback count not a time window). Adds a regression test that reloads a
sequence-bearing policy and asserts the match fires.

Fixes the mypy strict CI failure on adaptive.py:726.
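The conversion described above can be sketched with two stand-in dataclasses. Field names (`pattern`, `window_seconds`, `actions`, `window_size`) and the `max(len(pattern), 10)` rule come from the commit message; the class names, field types, and the helper `to_scorer_pattern` are hypothetical and not the actual Vaara definitions.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PolicySequencePattern:
    """Stand-in for the policy-schema type (time-based window)."""
    pattern: Tuple[str, ...]
    window_seconds: float


@dataclass(frozen=True)
class ScorerSequencePattern:
    """Stand-in for the scorer's runtime type (count-based lookback)."""
    actions: Tuple[str, ...]
    window_size: int


def to_scorer_pattern(policy_pattern: PolicySequencePattern) -> ScorerSequencePattern:
    """Convert a policy pattern into the shape the sequence matcher reads."""
    actions = tuple(policy_pattern.pattern)
    # window_size is a lookback count, not a duration, so window_seconds is
    # not carried over; the floor of 10 follows the commit message.
    return ScorerSequencePattern(actions=actions, window_size=max(len(actions), 10))
```

Keeping the two types distinct lets a strict type checker flag any place where a policy-schema object is stored where the scorer type is expected.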

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@vaaraio vaaraio merged commit 82d3fcf into main May 31, 2026
13 checks passed
@vaaraio vaaraio deleted the release/v0.46.0 branch May 31, 2026 01:19