Conversation
|
Warning Review limit reached
More reviews will be available in 36 minutes and 17 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (24)
📝 WalkthroughWalkthroughRelease v0.40.0 introduces comprehensive multi-tenant support: tenant-scoped data model (ActionRequest, AuditRecord, AuditTrail), per-tenant PolicyRegistry for policy management, tenant-aware AdaptiveScorer threshold lookup, HTTP server X-Vaara-Tenant header routing, and MCP proxy multi-upstream fan-out with Streamable HTTP transport. Includes extensive test coverage for all new features. Changesv0.40.0 Multi-Tenant Deployment & Policy Registry
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (4)
README.md (1)
184-190: ⚡ Quick winConsider showing more realistic command examples in the multi-upstream section.
The multi-upstream example shows:
--upstream github=github-mcp-server \ --upstream sap=sap-mcp-serverBut comparing to the single-upstream example on lines 171-174, which shows the full invocation
--upstream npx --upstream-arg -y --upstream-arg@sap/mdk-mcp-server``, users might not understand that the right side ofNAME=CMDshould be a complete command. Consider using more realistic examples that show the full command pattern, such as:--upstream github="npx -y `@github/mcp-server`" \ --upstream sap="npx -y `@sap/mdk-mcp-server`"This would make it clearer that
CMDis the full command invocation, not just an executable name.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@README.md` around lines 184 - 190, Update the multi-upstream README example to show full command invocations so users understand the right-hand side of the NAME=CMD pair is a complete command (like in the single-upstream example). Replace the stubbed upstream entries in the vaara-mcp-proxy invocation (the lines using --upstream github=github-mcp-server and --upstream sap=sap-mcp-server) with realistic command strings, e.g. using the same pattern as the single-upstream example (npx -y `@scope/pkg`), so entries read like --upstream github="npx -y `@github/mcp-server`" and --upstream sap="npx -y `@sap/mdk-mcp-server`".src/vaara/scorer/adaptive.py (1)
774-776: 💤 Low valueMisleading comment about lock scope.
The comment states "The lookup callback runs outside the scorer lock" but
_thresholds_foris called from_evaluate_lockedwhich runs underself._lock(line 821). The lookup actually executes inside the scorer lock.This doesn't cause deadlocks in the current design (lock ordering is consistently scorer → registry), but the comment should be corrected to avoid confusion during future maintenance.
📝 Suggested comment fix
- The lookup callback runs outside the scorer lock — - the registry has its own lock, and we want a clean lock - ordering (scorer -> registry, never the reverse). + Lock ordering is scorer -> registry (the registry has its + own lock). This ordering is maintained because _thresholds_for + is called from _evaluate_locked which already holds the scorer + lock, and the lookup acquires the registry lock.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/vaara/scorer/adaptive.py` around lines 774 - 776, The comment is incorrect: _thresholds_for is invoked from _evaluate_locked while holding self._lock, so the lookup callback actually runs inside the scorer lock; update the comment near the registry/lookup description to state that the lookup executes under the scorer lock (self._lock) and clarify that the code relies on a consistent lock ordering (scorer → registry) to avoid deadlocks rather than implying the callback runs outside the lock; reference _thresholds_for, _evaluate_locked, and the registry when making this change.src/vaara/audit/trail.py (1)
586-592: 💤 Low valueThread-safety note for
_tenant_for_actionunder free-threaded Python.The
_tenant_for_actiondict is mutated and read without holdingself._lock. Under the GIL this is safe due to atomic dict operations, but under free-threaded Python 3.13t (which the codebase explicitly targets per other comments) there's a theoretical race between concurrentrecord_action_requestedcalls mutating the map and_tenant_forreading it.The documented fail-soft behavior (fallback to
"") makes this acceptable, but consider wrapping these accesses underself._lockfor full 3.13t safety if tenant isolation becomes stricter in future versions.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/vaara/audit/trail.py` around lines 586 - 592, The code mutates and reads the shared dict _tenant_for_action in record_action_requested (and other places like _tenant_for) without using self._lock, which can race under free-threaded Python; protect all accesses that modify or read _tenant_for_action by acquiring self._lock (use the same lock already on the object) around the eviction loop, the assignment self._tenant_for_action[action_id] = tenant_id, and any reads in _tenant_for so that reads and writes are atomic relative to each other and avoid races on Python 3.13t.tests/test_v040_mcp_http_transport.py (1)
78-82: ⚡ Quick winAdd a regression assertion that
"default"is an alias, not a second client instance.Current test only checks existence/non-None, so duplicate upstream process creation can slip through undetected.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_v040_mcp_http_transport.py` around lines 78 - 82, The test test_multi_upstream_populates_default_slot currently only checks presence of "default" in http_proxy._upstreams; change it to assert that the "default" entry is an alias (the exact same upstream client object) rather than a separate instance. Concretely, after confirming "default" exists, identify the original upstream key used as the fallback (e.g., the sorted-first upstream name among http_proxy._upstreams keys excluding "default") and assert http_proxy._upstreams["default"] is http_proxy._upstreams[that_original_key] (object identity), ensuring no duplicate client was created.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/vaara/integrations/mcp_proxy.py`:
- Line 753: The callback code reads tenant_id from the ContextVar
_REQUEST_TENANT inside async notification handlers which can run outside the
originating request; instead, capture tenant_id = _REQUEST_TENANT.get() at the
point you schedule/construct the notification and pass that explicit tenant_id
into the async notification callbacks and into the audit/OVERT paths (replace
usages of _REQUEST_TENANT.get() inside those handlers with the passed
tenant_id). Update all related call sites (including the other occurrences
around the noted block at the later lines) so the tenant is threaded through the
notification creation/signaling functions rather than re-reading the ContextVar.
- Around line 157-168: The current code clones the command into
upstream_map["default"] and then constructs a new UpstreamMCPClient, causing a
duplicate subprocess; instead, detect the first lexicographic key (e.g.,
first_name = sorted(upstream_map)[0]) when "default" is missing, do not mutate
upstream_map to add a duplicate command, build self._upstreams by instantiating
UpstreamMCPClient for each original named slot (using
UpstreamMCPClient(command=..., on_notification=self._on_upstream_notification)
for name, command in upstream_map.items()), and finally set
self._upstreams["default"] = self._upstreams[first_name] so the "default" slot
aliases the existing UpstreamMCPClient rather than spawning another one.
In `@src/vaara/server/state.py`:
- Around line 31-61: The constructor currently allows both policy_controller and
policy_registry to be passed which can silently mix default and per-tenant
policies; add a fail-fast check at the start of the initializer that raises a
clear exception (e.g., ValueError) if both policy_controller and policy_registry
are not None, referencing the existing parameters policy_controller and
policy_registry so callers must provide only one source; keep the rest of the
logic unchanged (the existing registration/lookup/listener wiring for
PolicyRegistry and policy_controller can assume exclusivity once this check is
in place).
---
Nitpick comments:
In `@README.md`:
- Around line 184-190: Update the multi-upstream README example to show full
command invocations so users understand the right-hand side of the NAME=CMD pair
is a complete command (like in the single-upstream example). Replace the stubbed
upstream entries in the vaara-mcp-proxy invocation (the lines using --upstream
github=github-mcp-server and --upstream sap=sap-mcp-server) with realistic
command strings, e.g. using the same pattern as the single-upstream example (npx
-y `@scope/pkg`), so entries read like --upstream github="npx -y
`@github/mcp-server`" and --upstream sap="npx -y `@sap/mdk-mcp-server`".
In `@src/vaara/audit/trail.py`:
- Around line 586-592: The code mutates and reads the shared dict
_tenant_for_action in record_action_requested (and other places like
_tenant_for) without using self._lock, which can race under free-threaded
Python; protect all accesses that modify or read _tenant_for_action by acquiring
self._lock (use the same lock already on the object) around the eviction loop,
the assignment self._tenant_for_action[action_id] = tenant_id, and any reads in
_tenant_for so that reads and writes are atomic relative to each other and avoid
races on Python 3.13t.
In `@src/vaara/scorer/adaptive.py`:
- Around line 774-776: The comment is incorrect: _thresholds_for is invoked from
_evaluate_locked while holding self._lock, so the lookup callback actually runs
inside the scorer lock; update the comment near the registry/lookup description
to state that the lookup executes under the scorer lock (self._lock) and clarify
that the code relies on a consistent lock ordering (scorer → registry) to avoid
deadlocks rather than implying the callback runs outside the lock; reference
_thresholds_for, _evaluate_locked, and the registry when making this change.
In `@tests/test_v040_mcp_http_transport.py`:
- Around line 78-82: The test test_multi_upstream_populates_default_slot
currently only checks presence of "default" in http_proxy._upstreams; change it
to assert that the "default" entry is an alias (the exact same upstream client
object) rather than a separate instance. Concretely, after confirming "default"
exists, identify the original upstream key used as the fallback (e.g., the
sorted-first upstream name among http_proxy._upstreams keys excluding "default")
and assert http_proxy._upstreams["default"] is
http_proxy._upstreams[that_original_key] (object identity), ensuring no
duplicate client was created.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 0d2e78a0-ffba-4297-b6a1-2d82fbab0414
📒 Files selected for processing (22)
CHANGELOG.mdREADME.mdclients/ts/package.jsonpyproject.tomlsrc/vaara/__init__.pysrc/vaara/audit/sqlite_backend.pysrc/vaara/audit/trail.pysrc/vaara/cli.pysrc/vaara/integrations/mcp_proxy.pysrc/vaara/pipeline.pysrc/vaara/policy/controller.pysrc/vaara/policy/registry.pysrc/vaara/scorer/adaptive.pysrc/vaara/server/app.pysrc/vaara/server/routes.pysrc/vaara/server/schemas.pysrc/vaara/server/state.pysrc/vaara/taxonomy/actions.pytests/test_v040_mcp_http_transport.pytests/test_v040_per_tenant_threshold.pytests/test_v040_policy_registry.pytests/test_v040_tenant.py
6505fd9 to
05f4808
Compare
| logger.exception("Failed to forward HTTP notification") | ||
| return Response(status_code=202) | ||
| response = proxy._handle_request(payload) | ||
| return JSONResponse(content=response) |
| logger.warning( | ||
| "tools/call rejected at perimeter (operator filter): %s", tool_name, | ||
| "tools/call rejected at perimeter (operator filter): %s", | ||
| _safe_log(tool_name), |
| logger.warning( | ||
| "resources/read rejected at perimeter (operator filter): %s", uri, | ||
| "resources/read rejected at perimeter (operator filter): %s", | ||
| _safe_log(uri), |
| logger.warning( | ||
| "prompts/get rejected at perimeter (operator filter): %s", name, | ||
| "prompts/get rejected at perimeter (operator filter): %s", | ||
| _safe_log(name), |
One Vaara process now serves a fleet of upstream MCP servers over Streamable HTTP, with multi-tenant policy, audit chain, and OVERT attestation on the same substrate. v0.39 ran one Vaara process per upstream; v0.40 collapses that into a single multi-tenant deployment. Streamable HTTP transport on the proxy. `vaara-mcp-proxy --transport http --http-host H --http-port P` runs POST /mcp via FastAPI / uvicorn. The endpoint reads `X-Vaara-Tenant` and `X-Vaara-Upstream` per request, pushes them into ContextVars, and dispatches into the existing `_handle_request` path so policy, perimeter, OVERT, and progress-notification handling all light up unchanged. Notifications return 202. Bodies above 1 MiB return 413. Unknown upstream returns 404. Fan-out via repeatable `--upstream NAME=CMD`. One Vaara process holds N UpstreamMCPClient instances in a name -> client map. Bare `--upstream CMD` keeps the v0.39 single-upstream contract (lands in the "default" slot). When more than one upstream is configured, a request with no `X-Vaara-Upstream` header returns 400 with the list of valid slots in the error envelope. Single-upstream deployments keep the silent-default contract. tenant_id end-to-end. ScoreRequest, AuditEventRequest, PolicyReloadRequest accept a `tenant_id` body field, with `X-Vaara-Tenant` as the HTTP-header alternative (body wins over header). AuditRecord gains a `tenant_id` field, excluded from `compute_hash()` so pre-v0.40 chains still re-verify. AuditTrail keeps an `action_id -> tenant_id` map seeded by `record_action_requested`, soft-capped at 50k entries. SQLiteAuditBackend.write_record prefers per-record tenant. OVERT envelopes carry `tenant_id` as a `non_content_metadata` claim. Per-tenant policy plane. `vaara.policy.registry.PolicyRegistry` holds one PolicyController per tenant with the empty string slot reserved as the default fallback. `vaara serve --policy-dir DIR` loads one YAML/JSON policy per file (filename stem = tenant_id). `POST /v1/policy/reload` routes per tenant via body field or header. Installs `vaara-mcp-proxy` as a top-level console script so the proxy CLI matches what every v0.39+ docs surface advertises. Earlier releases only shipped the proxy as `python -m vaara.integrations.mcp_proxy`; v0.40 closes that gap. v0.41 will fold the proxy into the main `vaara` verb tree (`vaara mcp-proxy ...`) and keep `vaara-mcp-proxy` as a thin alias for one release cycle. Per-tenant threshold dispatch at evaluate-time. `AdaptiveScorer.evaluate` consults the registry on every call. A new `policy_lookup` constructor arg (and `set_policy_lookup` for late binding from ServerState) lets the scorer ask which tenant policy applies right now and use its allow/deny thresholds for THIS evaluation. Unknown tenant or no lookup configured falls back to the scorer-bound defaults that the default-slot listener keeps fresh on reload. The backend decision dict surfaces the applied threshold_allow and threshold_deny so operators can confirm which tenant's policy ran. MWU expert state, the conformal calibrator, agent profiles, and sequence patterns stay shared across tenants; only threshold application is per-tenant in v0.40. Scope notes. HTTP transport is POST-only (GET-SSE is v0.41). Per-tenant policy reload is hot; classifier hot-reload still restart-only. Cancellation routing across fan-out is v0.41 hardening. Fan-out latency bench is v0.40.1 measurement. 862 passed, 12 skipped. 45 new tests across tests/test_v040_tenant.py, tests/test_v040_policy_registry.py, tests/test_v040_mcp_http_transport.py, tests/test_v040_per_tenant_threshold.py. References modelcontextprotocol/modelcontextprotocol#2787 for the SEP-2787 envelope shape v0.40 builds on top of.
Summary
v0.40 is the deployment-shape release. One Vaara process now serves a fleet of upstream MCP servers, with multi-tenant policy + audit + attestation on the same substrate.
The v0.39 sidecar shape (one Vaara process per upstream) never matched the "Vaara is the schema, not a plug-in" positioning the v0.34+ READMEs ship. v0.40 closes that gap.
What lands
Streamable HTTP transport on the proxy.
vaara-mcp-proxy --transport http --http-host H --http-port Pruns POST /mcp via FastAPI / uvicorn (thevaara[server]extra already shipped in v0.39 forvaara serve). The endpoint readsX-Vaara-TenantandX-Vaara-Upstreamper request, pushes them into ContextVars, and dispatches into the existing_handle_requestpath so policy, perimeter, OVERT, and progress-notification handling all light up unchanged. Notifications (no JSON-RPCid) return 202. Bodies above 1 MiB return 413. Unknown upstream returns 404.Fan-out via repeatable
--upstream NAME=CMD. One Vaara process holds NUpstreamMCPClientinstances in a name -> client map. Bare--upstream CMDkeeps the v0.39 single-upstream contract (lands in the "default" slot). Commands that themselves contain=(e.g.python -m foo --bar=baz) stay intact because the name-side regex only matches short alphanumeric slugs. When more than one upstream is configured, a request with noX-Vaara-Upstreamheader returns 400 with the list of valid slots in the error envelope. Silent fallback to whichever slot won the sort would be the failure mode that surfaces only in production. Single-upstream deployments keep the silent-default contract.tenant_idend-to-end through the request, decision, audit, and attestation layers.ScoreRequest,AuditEventRequest,PolicyReloadRequestaccept atenant_idbody field, withX-Vaara-Tenantas the HTTP-header alternative. Body wins over header.AuditRecordgains atenant_idfield, excluded fromcompute_hash()so pre-v0.40 chains still re-verify on load.AuditTrailkeeps anaction_id -> tenant_idmap seeded byrecord_action_requested, so every follow-up record (risk_scored,decision,execution,escalation,outcome,policy_override) inherits the same scope without every caller threadingtenant_idthrough every signature. The map is soft-capped (50k entries, 12.5% eviction on pressure).SQLiteAuditBackend.write_recordprefers the per-recordtenant_idwhen set. A single backend instance can now serve a multi-tenant runtime.tenant_idas anon_content_metadataclaim.Per-tenant policy plane.
vaara.policy.registry.PolicyRegistryholds onePolicyControllerper tenant with the empty string slot reserved as the default fallback.vaara serve --policy-dir DIRloads one YAML/JSON policy per file (filename stem =tenant_id,default.yamllands in the fallback slot).POST /v1/policy/reloadroutes per tenant viatenant_idbody field orX-Vaara-Tenantheader, creates the slot on first reload.Per-tenant threshold dispatch at evaluate-time.
AdaptiveScorer.evaluateconsults the registry on every call. A newpolicy_lookup: Callable[[str], Optional[Policy]]constructor arg (withset_policy_lookupfor late binding fromServerState) lets the scorer ask which tenant policy applies right now and use its allow / deny thresholds for this evaluation. Unknown tenant or no lookup configured falls back to the scorer-bound defaults that the default-slot listener keeps fresh on reload. The backend decision dict surfaces the appliedthreshold_allowandthreshold_denyso operators can confirm which tenant's policy ran. MWU expert state, the conformal calibrator, agent profiles, and sequence patterns stay shared — only threshold application is per-tenant in v0.40.Scope notes (honest)
_upstream.notify()goes to the ctxvar-resolved upstream; cancellation tracking per inflight tools/call is a separate piece of plumbing.Spec lane reference
References modelcontextprotocol/modelcontextprotocol#2787 for the SEP-2787 envelope shape v0.40 builds on top of. The v0.40 fleet shape sits above SEP-2787, not in conflict with it.
Test plan
tests/adversarial, with the pre-existing v9 SSRF-floor deselect documented in v0.39.0 STATE).src/vaara/,tests/test_v040_*.py.tests/test_v040_tenant.py,tests/test_v040_policy_registry.py,tests/test_v040_mcp_http_transport.py,tests/test_v040_per_tenant_threshold.pycovering:ScoreRequesttenant routing (body, header, body-wins-over-header), audit-trail tenant propagation through followup records, map eviction under pressure, tenant length cap,AuditRecord.compute_hashexcludestenant_idso chains re-verify,PolicyRegistryreload-creates-slot, default-slot fallback, directory loader,Policy-instance reload path,/v1/policy/reloadper-tenant routing via body and header, single-controller back-compat, unconfigured 409,_parse_upstream_specsedge cases (bare command, named, command containing=, legacy args, multiple named, non-slug-name fallback), streamable-HTTP endpoint dispatch byX-Vaara-Upstream, unknown-upstream 404, notification 202, body-size 413, bad-JSON parse-error path, ambiguous-fan-out-without-header 400 with slot list, single-upstream silent-default preserved, per-tenant threshold dispatch at evaluate-time (tenant override, unknown-tenant fallback, empty-tenant skip, lookup-exception fallback, reload-visible-to-next-call, dry-run honours per-tenant, ServerState auto-wiring, non-Policy return fallback).pip install vaara==0.40.0-> import ->vaara-mcp-proxy --helpshows--transport,--upstream,--http-host,--http-port.npm install @vaara/client@0.40.0.Files changed
Four logical commits squashed into one ship commit:
tenant_idend-to-end +PolicyRegistryX-Vaara-Upstreamwhen fan-out is ambiguous; 400 with valid-slot list instead of silent-default routing.Summary by CodeRabbit
Release v0.40.0
New Features
X-Vaara-Upstreamheader.--policy-dir) and request-scoped tenant selection viaX-Vaara-Tenantheader.vaara-mcp-proxyCLI entrypoint replacing legacy module invocation.Bug Fixes
Chores