Skip to content

release(v0.40.0): streamable HTTP, fan-out, multi-tenant policy#154

Merged
vaaraio merged 1 commit into
mainfrom
feat/v040
May 27, 2026
Merged

release(v0.40.0): streamable HTTP, fan-out, multi-tenant policy#154
vaaraio merged 1 commit into
mainfrom
feat/v040

Conversation

@vaaraio
Copy link
Copy Markdown
Owner

@vaaraio vaaraio commented May 27, 2026

Summary

v0.40 is the deployment-shape release. One Vaara process now serves a fleet of upstream MCP servers, with multi-tenant policy + audit + attestation on the same substrate.

The v0.39 sidecar shape (one Vaara process per upstream) never matched the "Vaara is the schema, not a plug-in" positioning the v0.34+ READMEs ship. v0.40 closes that gap.

What lands

Streamable HTTP transport on the proxy. vaara-mcp-proxy --transport http --http-host H --http-port P runs POST /mcp via FastAPI / uvicorn (the vaara[server] extra already shipped in v0.39 for vaara serve). The endpoint reads X-Vaara-Tenant and X-Vaara-Upstream per request, pushes them into ContextVars, and dispatches into the existing _handle_request path so policy, perimeter, OVERT, and progress-notification handling all light up unchanged. Notifications (no JSON-RPC id) return 202. Bodies above 1 MiB return 413. Unknown upstream returns 404.

Fan-out via repeatable --upstream NAME=CMD. One Vaara process holds N UpstreamMCPClient instances in a name -> client map. Bare --upstream CMD keeps the v0.39 single-upstream contract (lands in the "default" slot). Commands that themselves contain = (e.g. python -m foo --bar=baz) stay intact because the name-side regex only matches short alphanumeric slugs. When more than one upstream is configured, a request with no X-Vaara-Upstream header returns 400 with the list of valid slots in the error envelope. Silent fallback to whichever slot won the sort would be the failure mode that surfaces only in production. Single-upstream deployments keep the silent-default contract.

tenant_id end-to-end through the request, decision, audit, and attestation layers.

  • ScoreRequest, AuditEventRequest, PolicyReloadRequest accept a tenant_id body field, with X-Vaara-Tenant as the HTTP-header alternative. Body wins over header.
  • AuditRecord gains a tenant_id field, excluded from compute_hash() so pre-v0.40 chains still re-verify on load.
  • AuditTrail keeps an action_id -> tenant_id map seeded by record_action_requested, so every follow-up record (risk_scored, decision, execution, escalation, outcome, policy_override) inherits the same scope without every caller threading tenant_id through every signature. The map is soft-capped (50k entries, 12.5% eviction on pressure).
  • SQLiteAuditBackend.write_record prefers the per-record tenant_id when set. A single backend instance can now serve a multi-tenant runtime.
  • OVERT envelopes carry tenant_id as a non_content_metadata claim.

Per-tenant policy plane. vaara.policy.registry.PolicyRegistry holds one PolicyController per tenant with the empty string slot reserved as the default fallback. vaara serve --policy-dir DIR loads one YAML/JSON policy per file (filename stem = tenant_id, default.yaml lands in the fallback slot). POST /v1/policy/reload routes per tenant via tenant_id body field or X-Vaara-Tenant header, creates the slot on first reload.

Per-tenant threshold dispatch at evaluate-time. AdaptiveScorer.evaluate consults the registry on every call. A new policy_lookup: Callable[[str], Optional[Policy]] constructor arg (with set_policy_lookup for late binding from ServerState) lets the scorer ask which tenant policy applies right now and use its allow / deny thresholds for this evaluation. Unknown tenant or no lookup configured falls back to the scorer-bound defaults that the default-slot listener keeps fresh on reload. The backend decision dict surfaces the applied threshold_allow and threshold_deny so operators can confirm which tenant's policy ran. MWU expert state, the conformal calibrator, agent profiles, and sequence patterns stay shared — only threshold application is per-tenant in v0.40.

Scope notes (honest)

  • HTTP transport is POST-only. GET-SSE for server-initiated notifications (sampling, server-pushed progress) is v0.41. The audit + OVERT emission path for upstream-originated notifications still works unchanged on stdio.
  • Classifier and conformal-calibrator hot-reload remain a restart operation. Per-tenant policy reload IS hot, which is the deployment-shape need that lands. Classifier reload waits on a shared singleton lifecycle (v0.41 candidate).
  • Cancellation routing across the fan-out map is v0.41 hardening. Current _upstream.notify() goes to the ctxvar-resolved upstream; cancellation tracking per inflight tools/call is a separate piece of plumbing.
  • Fan-out latency bench (p50 / p95 / p99 under N upstreams) is v0.40.1 measurement. A bench bundled with the same PR that introduces the surface being measured produces synthetic numbers, not deployment ones.

Spec lane reference

References modelcontextprotocol/modelcontextprotocol#2787 for the SEP-2787 envelope shape v0.40 builds on top of. The v0.40 fleet shape sits above SEP-2787, not in conflict with it.

Test plan

  • 861 passed, 12 skipped on the full suite (excluding tests/adversarial, with the pre-existing v9 SSRF-floor deselect documented in v0.39.0 STATE).
  • Ruff clean on src/vaara/, tests/test_v040_*.py.
  • 44 new tests across tests/test_v040_tenant.py, tests/test_v040_policy_registry.py, tests/test_v040_mcp_http_transport.py, tests/test_v040_per_tenant_threshold.py covering: ScoreRequest tenant routing (body, header, body-wins-over-header), audit-trail tenant propagation through followup records, map eviction under pressure, tenant length cap, AuditRecord.compute_hash excludes tenant_id so chains re-verify, PolicyRegistry reload-creates-slot, default-slot fallback, directory loader, Policy-instance reload path, /v1/policy/reload per-tenant routing via body and header, single-controller back-compat, unconfigured 409, _parse_upstream_specs edge cases (bare command, named, command containing =, legacy args, multiple named, non-slug-name fallback), streamable-HTTP endpoint dispatch by X-Vaara-Upstream, unknown-upstream 404, notification 202, body-size 413, bad-JSON parse-error path, ambiguous-fan-out-without-header 400 with slot list, single-upstream silent-default preserved, per-tenant threshold dispatch at evaluate-time (tenant override, unknown-tenant fallback, empty-tenant skip, lookup-exception fallback, reload-visible-to-next-call, dry-run honours per-tenant, ServerState auto-wiring, non-Policy return fallback).
  • PyPI install smoke test: pip install vaara==0.40.0 -> import -> vaara-mcp-proxy --help shows --transport, --upstream, --http-host, --http-port.
  • npm install smoke test: npm install @vaara/client@0.40.0.

Files changed

Four logical commits squashed into one ship commit:

  1. v0.40 foundation: tenant_id end-to-end + PolicyRegistry
  2. v0.40 transport: streamable-HTTP + fan-out on the proxy
  3. v0.40 ship: audit fixes (memory cap, tenant length cap, NAME=CMD parser hardening, body-size limit), tests, version bumps, CHANGELOG.
  4. v0.40 UX: require X-Vaara-Upstream when fan-out is ambiguous; 400 with valid-slot list instead of silent-default routing.

Summary by CodeRabbit

Release v0.40.0

  • New Features

    • Multi-tenant deployment support with per-tenant policy configuration and threshold overrides.
    • HTTP streamable transport option for MCP proxy with upstream routing via X-Vaara-Upstream header.
    • Per-tenant policy directory loading (--policy-dir) and request-scoped tenant selection via X-Vaara-Tenant header.
    • New vaara-mcp-proxy CLI entrypoint replacing legacy module invocation.
  • Bug Fixes

    • Audit records now correctly persist and propagate tenant context in multi-tenant scenarios.
  • Chores

    • Version updated to 0.40.0.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 27, 2026

Warning

Review limit reached

@vaaraio, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 36 minutes and 17 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 2e37fe78-aa12-4fd6-97e8-0db9a99ff254

📥 Commits

Reviewing files that changed from the base of the PR and between 63b6fc1 and 6204b54.

📒 Files selected for processing (24)
  • .github/workflows/ci.yml
  • CHANGELOG.md
  • README.md
  • clients/ts/package.json
  • pyproject.toml
  • src/vaara/__init__.py
  • src/vaara/audit/sqlite_backend.py
  • src/vaara/audit/trail.py
  • src/vaara/cli.py
  • src/vaara/integrations/mcp_proxy.py
  • src/vaara/pipeline.py
  • src/vaara/policy/controller.py
  • src/vaara/policy/registry.py
  • src/vaara/scorer/adaptive.py
  • src/vaara/server/app.py
  • src/vaara/server/routes.py
  • src/vaara/server/schemas.py
  • src/vaara/server/state.py
  • src/vaara/taxonomy/actions.py
  • tests/test_integrations_mcp_proxy.py
  • tests/test_v040_mcp_http_transport.py
  • tests/test_v040_per_tenant_threshold.py
  • tests/test_v040_policy_registry.py
  • tests/test_v040_tenant.py
📝 Walkthrough

Walkthrough

Release v0.40.0 introduces comprehensive multi-tenant support: tenant-scoped data model (ActionRequest, AuditRecord, AuditTrail), per-tenant PolicyRegistry for policy management, tenant-aware AdaptiveScorer threshold lookup, HTTP server X-Vaara-Tenant header routing, and MCP proxy multi-upstream fan-out with Streamable HTTP transport. Includes extensive test coverage for all new features.

Changes

v0.40.0 Multi-Tenant Deployment & Policy Registry

Layer / File(s) Summary
Version bumps and CLI entrypoint
pyproject.toml, clients/ts/package.json, src/vaara/__init__.py
Version updated to 0.40.0 across all packages. New vaara-mcp-proxy console script added to pyproject.toml.
Tenant-scoped core data model and audit trail tracking
src/vaara/taxonomy/actions.py, src/vaara/audit/trail.py, src/vaara/audit/sqlite_backend.py
ActionRequest gains tenant_id field. AuditRecord adds tenant_id (excluded from compute_hash() for backward compatibility). AuditTrail maintains internal per-action tenant mapping with size-bounded eviction and capping, propagating tenant scope across all lifecycle record methods.
Per-tenant policy registry and controller updates
src/vaara/policy/registry.py, src/vaara/policy/controller.py
New PolicyRegistry class manages per-tenant PolicyController instances with "" default slot. Supports per-tenant reload and directory-based bulk loading. PolicyController.reload() extended to accept pre-validated Policy instances.
Per-tenant threshold dispatch in AdaptiveScorer
src/vaara/scorer/adaptive.py
AdaptiveScorer gains optional policy_lookup callback for late-binding tenant-specific thresholds. evaluate() and dry_run_evaluate() resolve per-call (threshold_allow, threshold_deny) from context tenant_id with fallback to scorer defaults. Resolved thresholds included in backend decision payload.
HTTP server tenant header handling and routing
src/vaara/server/routes.py, src/vaara/server/schemas.py, src/vaara/server/state.py, src/vaara/server/app.py
Routes and schemas updated with X-Vaara-Tenant header support and tenant_id fields in ScoreRequest, AuditEventRequest, PolicyReloadRequest, and PolicyReloadResponse. ServerState wires PolicyRegistry or single PolicyController with tenant-aware policy lookup callback into scorer. remember_action() persists tenant_id.
Pipeline tenant support and CLI policy-dir option
src/vaara/pipeline.py, src/vaara/cli.py
InterceptionPipeline.intercept() gains tenant_id parameter. CLI serve command adds --policy-dir option (mutually exclusive with --policy) to load per-tenant policies via PolicyRegistry.load_directory().
MCP proxy multi-upstream fan-out and HTTP transport
src/vaara/integrations/mcp_proxy.py
VaaraMCPProxy refactored for multi-upstream routing via context vars. New run_http() method implements FastAPI/Uvicorn MCP proxy with /health and /mcp endpoints, X-Vaara-Upstream header-based routing, 1MiB body size cap, and per-request tenant context threading. _parse_upstream_specs() parses NAME=CMD fan-out or assigns bare CMD to default. CLI main() gains --transport switch and HTTP bind options.
Documentation and release notes
CHANGELOG.md, README.md
CHANGELOG.md adds [Unreleased] marker and comprehensive [0.40.0] release notes covering multi-tenant deployment, tenant routing/scoping, policy registry, and MCP proxy enhancements. README.md documents vaara-mcp-proxy CLI, fleet shape deployment, multi-upstream headers, and policy-dir loading.
MCP proxy HTTP transport test suite
tests/test_v040_mcp_http_transport.py
Comprehensive tests for _parse_upstream_specs() parsing, VaaraMCPProxy constructor validation, FastAPI /health and /mcp endpoint routing, X-Vaara-Upstream header enforcement, error responses (400/404/413), single-upstream silent default, and X-Vaara-Tenant header threading.
Policy registry and per-tenant reload test suite
tests/test_v040_policy_registry.py
Tests validate PolicyRegistry.reload() tenant slot creation, get() fallback vs get_exact(), load_directory(), Policy instance reload, and HTTP /v1/policy/reload per-tenant routing via body and header, backward compatibility with single PolicyController, and 409 error when unconfigured.
Per-tenant scorer threshold evaluation test suite
tests/test_v040_per_tenant_threshold.py
Tests validate AdaptiveScorer per-tenant threshold dispatch, fallback behavior for unknown/empty tenants, lookup exceptions, policy reload visibility, dry_run_evaluate() per-tenant values, ServerState registry wiring, HTTP /v1/score per-tenant response correctness, and robustness to non-policy returns.
End-to-end tenant propagation test suite
tests/test_v040_tenant.py
Tests validate /v1/score tenant_id selection (body precedence), persistence in action info, audit event and trail tenant propagation, internal tenant-map eviction and length capping, and regression assertion that AuditRecord.compute_hash() excludes tenant_id.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • vaaraio/vaara#82: Both PRs extend policy-reload and scorer update paths, but this PR introduces multi-tenant PolicyRegistry/tenant_id selection where #82 modified single-policy reload semantics.
  • vaaraio/vaara#75: Directly related; both modify v0.10 HTTP server code (routes/schemas/state/app) to handle tenant-scoped request/response behavior on the /v1/* endpoints.
  • vaaraio/vaara#100: This PR's multi-upstream MCP proxy changes build directly on #100's initial VaaraMCPProxy implementation, extending the same proxy class with fan-out routing and HTTP transport.

Poem

🐰 In forty-point-oh, tenants now bloom,
Each policy steered to its rightful room,
Context vars dance through the MCP flow,
Registry thresholds tell which way to go,
Multi-upstream feast, no more single fare—
A rabbit's delight: such abundance to share! 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 36.51% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title clearly and specifically summarizes the main changes: versioning to v0.40.0 and three key features (streamable HTTP, fan-out, multi-tenant policy).
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/v040

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Comment thread src/vaara/integrations/mcp_proxy.py Fixed
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (4)
README.md (1)

184-190: ⚡ Quick win

Consider showing more realistic command examples in the multi-upstream section.

The multi-upstream example shows:

--upstream github=github-mcp-server \
--upstream sap=sap-mcp-server

But comparing to the single-upstream example on lines 171-174, which shows the full invocation --upstream npx --upstream-arg -y --upstream-arg @sap/mdk-mcp-server``, users might not understand that the right side of NAME=CMD should be a complete command. Consider using more realistic examples that show the full command pattern, such as:

--upstream github="npx -y `@github/mcp-server`" \
--upstream sap="npx -y `@sap/mdk-mcp-server`"

This would make it clearer that CMD is the full command invocation, not just an executable name.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@README.md` around lines 184 - 190, Update the multi-upstream README example
to show full command invocations so users understand the right-hand side of the
NAME=CMD pair is a complete command (like in the single-upstream example).
Replace the stubbed upstream entries in the vaara-mcp-proxy invocation (the
lines using --upstream github=github-mcp-server and --upstream
sap=sap-mcp-server) with realistic command strings, e.g. using the same pattern
as the single-upstream example (npx -y `@scope/pkg`), so entries read like
--upstream github="npx -y `@github/mcp-server`" and --upstream sap="npx -y
`@sap/mdk-mcp-server`".
src/vaara/scorer/adaptive.py (1)

774-776: 💤 Low value

Misleading comment about lock scope.

The comment states "The lookup callback runs outside the scorer lock" but _thresholds_for is called from _evaluate_locked which runs under self._lock (line 821). The lookup actually executes inside the scorer lock.

This doesn't cause deadlocks in the current design (lock ordering is consistently scorer → registry), but the comment should be corrected to avoid confusion during future maintenance.

📝 Suggested comment fix
-        The lookup callback runs outside the scorer lock —
-        the registry has its own lock, and we want a clean lock
-        ordering (scorer -> registry, never the reverse).
+        Lock ordering is scorer -> registry (the registry has its
+        own lock). This ordering is maintained because _thresholds_for
+        is called from _evaluate_locked which already holds the scorer
+        lock, and the lookup acquires the registry lock.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/vaara/scorer/adaptive.py` around lines 774 - 776, The comment is
incorrect: _thresholds_for is invoked from _evaluate_locked while holding
self._lock, so the lookup callback actually runs inside the scorer lock; update
the comment near the registry/lookup description to state that the lookup
executes under the scorer lock (self._lock) and clarify that the code relies on
a consistent lock ordering (scorer → registry) to avoid deadlocks rather than
implying the callback runs outside the lock; reference _thresholds_for,
_evaluate_locked, and the registry when making this change.
src/vaara/audit/trail.py (1)

586-592: 💤 Low value

Thread-safety note for _tenant_for_action under free-threaded Python.

The _tenant_for_action dict is mutated and read without holding self._lock. Under the GIL this is safe due to atomic dict operations, but under free-threaded Python 3.13t (which the codebase explicitly targets per other comments) there's a theoretical race between concurrent record_action_requested calls mutating the map and _tenant_for reading it.

The documented fail-soft behavior (fallback to "") makes this acceptable, but consider wrapping these accesses under self._lock for full 3.13t safety if tenant isolation becomes stricter in future versions.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/vaara/audit/trail.py` around lines 586 - 592, The code mutates and reads
the shared dict _tenant_for_action in record_action_requested (and other places
like _tenant_for) without using self._lock, which can race under free-threaded
Python; protect all accesses that modify or read _tenant_for_action by acquiring
self._lock (use the same lock already on the object) around the eviction loop,
the assignment self._tenant_for_action[action_id] = tenant_id, and any reads in
_tenant_for so that reads and writes are atomic relative to each other and avoid
races on Python 3.13t.
tests/test_v040_mcp_http_transport.py (1)

78-82: ⚡ Quick win

Add a regression assertion that "default" is an alias, not a second client instance.

Current test only checks existence/non-None, so duplicate upstream process creation can slip through undetected.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_v040_mcp_http_transport.py` around lines 78 - 82, The test
test_multi_upstream_populates_default_slot currently only checks presence of
"default" in http_proxy._upstreams; change it to assert that the "default" entry
is an alias (the exact same upstream client object) rather than a separate
instance. Concretely, after confirming "default" exists, identify the original
upstream key used as the fallback (e.g., the sorted-first upstream name among
http_proxy._upstreams keys excluding "default") and assert
http_proxy._upstreams["default"] is http_proxy._upstreams[that_original_key]
(object identity), ensuring no duplicate client was created.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/vaara/integrations/mcp_proxy.py`:
- Line 753: The callback code reads tenant_id from the ContextVar
_REQUEST_TENANT inside async notification handlers which can run outside the
originating request; instead, capture tenant_id = _REQUEST_TENANT.get() at the
point you schedule/construct the notification and pass that explicit tenant_id
into the async notification callbacks and into the audit/OVERT paths (replace
usages of _REQUEST_TENANT.get() inside those handlers with the passed
tenant_id). Update all related call sites (including the other occurrences
around the noted block at the later lines) so the tenant is threaded through the
notification creation/signaling functions rather than re-reading the ContextVar.
- Around line 157-168: The current code clones the command into
upstream_map["default"] and then constructs a new UpstreamMCPClient, causing a
duplicate subprocess; instead, detect the first lexicographic key (e.g.,
first_name = sorted(upstream_map)[0]) when "default" is missing, do not mutate
upstream_map to add a duplicate command, build self._upstreams by instantiating
UpstreamMCPClient for each original named slot (using
UpstreamMCPClient(command=..., on_notification=self._on_upstream_notification)
for name, command in upstream_map.items()), and finally set
self._upstreams["default"] = self._upstreams[first_name] so the "default" slot
aliases the existing UpstreamMCPClient rather than spawning another one.

In `@src/vaara/server/state.py`:
- Around line 31-61: The constructor currently allows both policy_controller and
policy_registry to be passed which can silently mix default and per-tenant
policies; add a fail-fast check at the start of the initializer that raises a
clear exception (e.g., ValueError) if both policy_controller and policy_registry
are not None, referencing the existing parameters policy_controller and
policy_registry so callers must provide only one source; keep the rest of the
logic unchanged (the existing registration/lookup/listener wiring for
PolicyRegistry and policy_controller can assume exclusivity once this check is
in place).

---

Nitpick comments:
In `@README.md`:
- Around line 184-190: Update the multi-upstream README example to show full
command invocations so users understand the right-hand side of the NAME=CMD pair
is a complete command (like in the single-upstream example). Replace the stubbed
upstream entries in the vaara-mcp-proxy invocation (the lines using --upstream
github=github-mcp-server and --upstream sap=sap-mcp-server) with realistic
command strings, e.g. using the same pattern as the single-upstream example (npx
-y `@scope/pkg`), so entries read like --upstream github="npx -y
`@github/mcp-server`" and --upstream sap="npx -y `@sap/mdk-mcp-server`".

In `@src/vaara/audit/trail.py`:
- Around line 586-592: The code mutates and reads the shared dict
_tenant_for_action in record_action_requested (and other places like
_tenant_for) without using self._lock, which can race under free-threaded
Python; protect all accesses that modify or read _tenant_for_action by acquiring
self._lock (use the same lock already on the object) around the eviction loop,
the assignment self._tenant_for_action[action_id] = tenant_id, and any reads in
_tenant_for so that reads and writes are atomic relative to each other and avoid
races on Python 3.13t.

In `@src/vaara/scorer/adaptive.py`:
- Around line 774-776: The comment is incorrect: _thresholds_for is invoked from
_evaluate_locked while holding self._lock, so the lookup callback actually runs
inside the scorer lock; update the comment near the registry/lookup description
to state that the lookup executes under the scorer lock (self._lock) and clarify
that the code relies on a consistent lock ordering (scorer → registry) to avoid
deadlocks rather than implying the callback runs outside the lock; reference
_thresholds_for, _evaluate_locked, and the registry when making this change.

In `@tests/test_v040_mcp_http_transport.py`:
- Around line 78-82: The test test_multi_upstream_populates_default_slot
currently only checks presence of "default" in http_proxy._upstreams; change it
to assert that the "default" entry is an alias (the exact same upstream client
object) rather than a separate instance. Concretely, after confirming "default"
exists, identify the original upstream key used as the fallback (e.g., the
sorted-first upstream name among http_proxy._upstreams keys excluding "default")
and assert http_proxy._upstreams["default"] is
http_proxy._upstreams[that_original_key] (object identity), ensuring no
duplicate client was created.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 0d2e78a0-ffba-4297-b6a1-2d82fbab0414

📥 Commits

Reviewing files that changed from the base of the PR and between 0f30fb2 and 63b6fc1.

📒 Files selected for processing (22)
  • CHANGELOG.md
  • README.md
  • clients/ts/package.json
  • pyproject.toml
  • src/vaara/__init__.py
  • src/vaara/audit/sqlite_backend.py
  • src/vaara/audit/trail.py
  • src/vaara/cli.py
  • src/vaara/integrations/mcp_proxy.py
  • src/vaara/pipeline.py
  • src/vaara/policy/controller.py
  • src/vaara/policy/registry.py
  • src/vaara/scorer/adaptive.py
  • src/vaara/server/app.py
  • src/vaara/server/routes.py
  • src/vaara/server/schemas.py
  • src/vaara/server/state.py
  • src/vaara/taxonomy/actions.py
  • tests/test_v040_mcp_http_transport.py
  • tests/test_v040_per_tenant_threshold.py
  • tests/test_v040_policy_registry.py
  • tests/test_v040_tenant.py

Comment thread src/vaara/integrations/mcp_proxy.py
Comment thread src/vaara/integrations/mcp_proxy.py Outdated
Comment thread src/vaara/server/state.py
Comment thread tests/test_v040_per_tenant_threshold.py Fixed
Comment thread tests/test_v040_per_tenant_threshold.py Fixed
@vaaraio vaaraio force-pushed the feat/v040 branch 2 times, most recently from 6505fd9 to 05f4808 Compare May 27, 2026 23:29
logger.exception("Failed to forward HTTP notification")
return Response(status_code=202)
response = proxy._handle_request(payload)
return JSONResponse(content=response)
logger.warning(
"tools/call rejected at perimeter (operator filter): %s", tool_name,
"tools/call rejected at perimeter (operator filter): %s",
_safe_log(tool_name),
logger.warning(
"resources/read rejected at perimeter (operator filter): %s", uri,
"resources/read rejected at perimeter (operator filter): %s",
_safe_log(uri),
logger.warning(
"prompts/get rejected at perimeter (operator filter): %s", name,
"prompts/get rejected at perimeter (operator filter): %s",
_safe_log(name),
One Vaara process now serves a fleet of upstream MCP servers over
Streamable HTTP, with multi-tenant policy, audit chain, and OVERT
attestation on the same substrate. v0.39 ran one Vaara process per
upstream; v0.40 collapses that into a single multi-tenant deployment.

Streamable HTTP transport on the proxy. `vaara-mcp-proxy --transport
http --http-host H --http-port P` runs POST /mcp via FastAPI / uvicorn.
The endpoint reads `X-Vaara-Tenant` and `X-Vaara-Upstream` per request,
pushes them into ContextVars, and dispatches into the existing
`_handle_request` path so policy, perimeter, OVERT, and
progress-notification handling all light up unchanged. Notifications
return 202. Bodies above 1 MiB return 413. Unknown upstream returns 404.

Fan-out via repeatable `--upstream NAME=CMD`. One Vaara process holds N
UpstreamMCPClient instances in a name -> client map. Bare `--upstream
CMD` keeps the v0.39 single-upstream contract (lands in the "default"
slot). When more than one upstream is configured, a request with no
`X-Vaara-Upstream` header returns 400 with the list of valid slots in
the error envelope. Single-upstream deployments keep the silent-default
contract.

tenant_id end-to-end. ScoreRequest, AuditEventRequest, PolicyReloadRequest
accept a `tenant_id` body field, with `X-Vaara-Tenant` as the HTTP-header
alternative (body wins over header). AuditRecord gains a `tenant_id`
field, excluded from `compute_hash()` so pre-v0.40 chains still
re-verify. AuditTrail keeps an `action_id -> tenant_id` map seeded by
`record_action_requested`, soft-capped at 50k entries.
SQLiteAuditBackend.write_record prefers per-record tenant. OVERT
envelopes carry `tenant_id` as a `non_content_metadata` claim.

Per-tenant policy plane. `vaara.policy.registry.PolicyRegistry` holds
one PolicyController per tenant with the empty string slot reserved as
the default fallback. `vaara serve --policy-dir DIR` loads one YAML/JSON
policy per file (filename stem = tenant_id). `POST /v1/policy/reload`
routes per tenant via body field or header.

Installs `vaara-mcp-proxy` as a top-level console script so the proxy
CLI matches what every v0.39+ docs surface advertises. Earlier releases
only shipped the proxy as `python -m vaara.integrations.mcp_proxy`;
v0.40 closes that gap. v0.41 will fold the proxy into the main `vaara`
verb tree (`vaara mcp-proxy ...`) and keep `vaara-mcp-proxy` as a thin
alias for one release cycle.

Per-tenant threshold dispatch at evaluate-time. `AdaptiveScorer.evaluate`
consults the registry on every call. A new `policy_lookup` constructor
arg (and `set_policy_lookup` for late binding from ServerState) lets
the scorer ask which tenant policy applies right now and use its
allow/deny thresholds for THIS evaluation. Unknown tenant or no lookup
configured falls back to the scorer-bound defaults that the default-slot
listener keeps fresh on reload. The backend decision dict surfaces the
applied threshold_allow and threshold_deny so operators can confirm
which tenant's policy ran. MWU expert state, the conformal calibrator,
agent profiles, and sequence patterns stay shared across tenants; only
threshold application is per-tenant in v0.40.

Scope notes. HTTP transport is POST-only (GET-SSE is v0.41). Per-tenant
policy reload is hot; classifier hot-reload still restart-only.
Cancellation routing across fan-out is v0.41 hardening. Fan-out latency
bench is v0.40.1 measurement.

862 passed, 12 skipped. 45 new tests across tests/test_v040_tenant.py,
tests/test_v040_policy_registry.py, tests/test_v040_mcp_http_transport.py,
tests/test_v040_per_tenant_threshold.py.

References modelcontextprotocol/modelcontextprotocol#2787 for the
SEP-2787 envelope shape v0.40 builds on top of.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants