Merged
31 changes: 31 additions & 0 deletions .gitignore
@@ -43,3 +43,34 @@ claude-code-audit.db
.pr_body_*.md
.issue_body_*.md
.comment_body_*.md

# Private scratch drafts (replies, research, proposals, BD) — never publish
.recruiter_*
.reply_*
.research_*
.proposal_*
.tier1_*
.brand_book.md

# One-off ops/deploy scratch scripts and payloads — never publish
.apply_*.sh
.fix_*.sh
.deploy_*.sh
.restart_*.sh
.style_*.sh
.pr_create_*.sh
.pr_comment_*.md
.tag_payload.json
.gen_evidence_pair.py
.tmp_*.py

# Stray shell/editor env dotfiles (not part of the repo)
.bashrc
.bash_profile
.zshrc
.zprofile
.profile
.gitconfig
.ripgreprc
.idea/
.vscode/
51 changes: 51 additions & 0 deletions CHANGELOG.md
@@ -6,6 +6,57 @@ and this project follows [Semantic Versioning](https://semver.org/spec/v2.0.0.ht

## [Unreleased]

## [0.45.1] - 2026-05-30

**Theme: audit-finding fixes on the remote HTTP connector, the HTTP transport, and the public numbers.**

### Security
- SSRF egress floor on the `--upstream-url` connector. The remote HTTP connector
handed a user-supplied upstream URL straight to `urllib` and followed
redirects with the static `Authorization` header attached, so a hostile or
compromised upstream (or an attacker controlling a redirect target) could aim
the proxy at the cloud instance-metadata service or an internal host and have
it fetch the target with the operator's bearer token. The new `_egress_guard`
resolves the host and refuses loopback, link-local, RFC1918, IPv6 ULA, and the
cloud-metadata address (including its dotless and IPv4-mapped encodings) before
any socket opens; a guarded opener caps redirects, re-applies the floor to each
hop, and drops the auth header on a cross-origin redirect. Default is SAFE; a
trusted internal upstream is opted in via `--allow-private-upstream-hosts`,
the `allow_private_hosts` constructor arg, or the
`VAARA_MCP_ALLOW_PRIVATE_UPSTREAM` env flag. The metadata address stays refused
even with the opt-in.
- DNS-rebind closure on that egress floor. Resolving the host and then handing
the name back to `urllib` left a gap: `urllib` re-resolved at socket-connect,
so a name that answered with a public address at the check and a blocked one a
moment later (a time-split rebind) reached the blocked target with the auth
header attached. The connector now validates and pins the address at connect
time and dials the IP literal, so the address that passed the floor is the
exact address the socket reaches; HTTPS still verifies the certificate against
the original hostname. The pin is re-applied on every redirect hop. An absent
`--allow-private-upstream-hosts` flag now leaves the
`VAARA_MCP_ALLOW_PRIVATE_UPSTREAM` env opt-in live instead of silently
shadowing it with a `False`.
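The two Security entries above describe a resolve, check, then pin sequence. The following is a minimal sketch of that idea, not Vaara's `_egress_guard`: the names `is_blocked`, `check_and_pin`, and `EgressRefused` are hypothetical, and the metadata address is assumed to be the common `169.254.169.254`. It uses only the standard-library `ipaddress` and `socket` modules.

```python
import ipaddress
import socket

# Assumption: the usual cloud instance-metadata address.
METADATA_V4 = ipaddress.ip_address("169.254.169.254")


class EgressRefused(Exception):
    """Raised when a resolved address falls below the egress floor."""


def is_blocked(ip, allow_private=False):
    """Return True if the address must not be dialled."""
    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so it cannot bypass the check.
    if isinstance(ip, ipaddress.IPv6Address) and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    if ip == METADATA_V4:
        return True  # refused even when private hosts are opted in
    if allow_private:
        return False
    # is_private covers RFC1918 and IPv6 ULA; the other two are listed
    # explicitly to mirror the changelog wording.
    return ip.is_loopback or ip.is_link_local or ip.is_private


def check_and_pin(host, port, allow_private=False, resolver=socket.getaddrinfo):
    """Resolve once, validate every answer, and return the IP literal to dial."""
    infos = resolver(host, port, type=socket.SOCK_STREAM)
    if not infos:
        raise EgressRefused(f"{host} did not resolve")
    addrs = [ipaddress.ip_address(info[4][0]) for info in infos]
    for ip in addrs:
        if is_blocked(ip, allow_private):
            raise EgressRefused(f"{host} resolves to blocked address {ip}")
    # The caller connects to this literal, so no second lookup can rebind it.
    return str(addrs[0])
```

A caller would connect to the returned literal and keep the original hostname only for TLS verification and the `Host` header, repeating the same call for every redirect hop.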

### Fixed
- HTTP transport no longer serialises concurrent requests. The POST `/mcp`
endpoint ran the blocking `_handle_request` inline on the event loop, so one
slow upstream stalled every other POST, SSE drain, and `/health` (real
concurrency 1). It now runs on a worker thread via `asyncio.to_thread`, with
the per-request ContextVars preserved across the hop through
`contextvars.copy_context()`.
- SSE reconnect race that dropped notifications for the live session. On
reconnect under the same `Mcp-Session-Id`, the old stream's teardown
unregistered the NEW session. `unregister_session` is now identity-checked and
only removes the entry when it is still the tearing-down stream's own state.
- README mislabelled the rule-scorer latency as classifier latency. The
140 µs / 210 µs figure is the hot-path rule scorer; the MiniLM classifier is
opt-in (`vaara[ml]`) and not in that path. Also surfaces the cross-model
held-out recall (66.8%) and its weakest sub-cell (38.9%) the bench docs
already disclose.
- `llms.txt` advertised a two-generations-stale classifier (5,955-entry corpus,
97.1% at threshold 0.55). Regenerated from the current v9 numbers and switched
the lede to the tamper-evident runtime evidence framing.
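The first Fixed entry moves a blocking handler off the event loop while keeping per-request ContextVars. A small sketch of that pattern, assuming hypothetical names (`request_id`, `handle_request_blocking`, `handle_post`) that stand in for the proxy's real handler:

```python
import asyncio
import contextvars
import time

# Hypothetical per-request variable; the real proxy's ContextVars may differ.
request_id = contextvars.ContextVar("request_id", default=None)


def handle_request_blocking(delay):
    """Stand-in for a blocking handler that waits on a slow upstream."""
    time.sleep(delay)
    return request_id.get()


async def handle_post(rid, delay):
    request_id.set(rid)
    # Snapshot the caller's context and run the blocking call inside it
    # on a worker thread, so the event loop stays free for other requests.
    ctx = contextvars.copy_context()
    return await asyncio.to_thread(ctx.run, handle_request_blocking, delay)


async def main():
    start = time.perf_counter()
    results = await asyncio.gather(handle_post("a", 0.2), handle_post("b", 0.2))
    return results, time.perf_counter() - start
```

`asyncio.to_thread` already propagates the current context to the worker thread, so the explicit `copy_context()` here is a snapshot taken at a known point rather than a requirement. Running `main()` should finish in roughly one delay rather than two, since the two sleeps overlap.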

## [0.45.0] - 2026-05-30

**Theme: reach remote MCP upstreams over HTTP, and make the proxy's Streamable HTTP handling conform to the spec.**
7 changes: 5 additions & 2 deletions README.md
@@ -20,14 +20,15 @@ Vaara intercepts agent tool calls, scores each one with a conformal risk interva

## Numbers

-Held-out TEST recall 84.7% (95% Wilson [82.4, 86.7]) at FPR 4.1% [2.9, 5.7]. Phase 1 PAIR scale-up to n=300 per attacker family lands at 88.1% [85.8, 90.1]. Under BIPIA-pressure context, false-positive rate on benign tool calls 1.2% [0.4, 3.6] across four agent backends (Claude Haiku 4.5, Llama-3.1-8B, Mistral-7B, Qwen-2.5-7B). Multi-attacker PAIR ASR 0/25 across three different attacker models with identical seeds. 140 µs mean / 210 µs p99 inference latency on commodity CPU (excluding one-time embedding model load). Every number reproducible end-to-end via `make bench`.
+Held-out TEST recall 84.7% (95% Wilson [82.4, 86.7]) at FPR 4.1% [2.9, 5.7]. Phase 1 PAIR scale-up to n=300 per attacker family lands at 88.1% [85.8, 90.1]. Cross-model held-out recall, where no attacker model in the eval set was in TRAIN, is 66.8% [64.9, 68.7] over n=2,277; the weakest sub-cell is data_exfil against a closed-weight model at 38.9% [35.3, 42.5] (see [vaara-bench-v0.37](bench/vaara-bench-v0.37.md)). Under BIPIA-pressure context, false-positive rate on benign tool calls 1.2% [0.4, 3.6] across four agent backends (Claude Haiku 4.5, Llama-3.1-8B, Mistral-7B, Qwen-2.5-7B). Multi-attacker PAIR ASR 0/25 across three different attacker models with identical seeds. The rule scorer that runs in the hot path adds 140 µs mean / 210 µs p99 per call on commodity CPU; the MiniLM classifier is opt-in (`vaara[ml]`) and is not in that measured path. Every number reproducible end-to-end via `make bench`.

- 12,155-entry adversarial corpus (250 hand-curated + 11,905 LLM-generated), 70/15/15 split stratified by (category, source)
- Classifier v9 with 236 hand-features + 384-dim MiniLM embeddings at calibrated threshold 0.9150 on held-out TEST n=1,827: recall 84.7% [82.4, 86.7] at FPR 4.1% [2.9, 5.7]
- Multi-attacker PAIR robustness: 0/25 successes per attacker across Qwen2.5-32B, Qwen2.5-72B, Llama-3.3-70B hitting identical seed indices, Wilson upper 13.3%
- BIPIA-pressure FPR on benign tool calls 1.2% [0.4, 3.6] across four agent backends, n=244 benign tool calls under `context.source=injected_via_bipia_<class>`
+- Cross-model held-out recall 66.8% [64.9, 68.7] over n=2,277 with no eval-set attacker model in TRAIN; data_exfil generalises unevenly, with a closed-weight sub-cell at 38.9% [35.3, 42.5]. This is the honest worst case; the in-distribution TEST number above is the easier denominator
- Chain of custody: corpus manifest SHA, split manifest SHA, training commit, bundle SHA, all locked and printed by every script
-- 140 µs mean / 210 µs p99 inference latency, commodity CPU
+- 140 µs mean / 210 µs p99 for the hot-path rule scorer on commodity CPU; the MiniLM classifier is opt-in (`vaara[ml]`) and not in that path
- Distribution-free conformal coverage on the score
- MWU regret bound O(sqrt(T log N))
- [vaara-bench-v0.39](bench/vaara-bench-v0.39.md): current methodology, chain of custody, ship-gate record. v9 retrain on BIPIA-augmented corpus with follows upweighted (`--follow-weight 8.0`), calibrated to T=0.9150 at a 5% FPR target on v035 VAL. BIPIA-pressure FPR collapses from 35.2% on v8 to 1.2% on v9. In-distribution recall flat within Wilson intervals. Found-and-fixed in tree: auto-labeller `example.com` placeholder false-positive rule (42 to 14 true follows across four backends). Historical bench docs live under `bench/` for chain-of-custody continuity.
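The bracketed ranges in the Numbers list are 95% Wilson score intervals. As a worked check, the textbook Wilson formula reproduces the "Wilson upper 13.3%" quoted for 0 successes in 25 attempts. This sketch is not the project's bench code; `wilson_interval` is a name chosen here for illustration.

```python
import math


def wilson_interval(successes, n, z=1.959964):
    """Two-sided Wilson score interval for a binomial proportion (z=1.96 is 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point noise at the boundaries.
    return max(0.0, centre - half), min(1.0, centre + half)


lower, upper = wilson_interval(0, 25)
print(f"0/25 -> [{lower * 100:.1f}%, {upper * 100:.1f}%]")
```

With zero successes the upper bound reduces to z² / (n + z²), which is about 3.84 / 28.84 ≈ 0.133 for n = 25.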
@@ -162,6 +163,8 @@ vaara-mcp-proxy \

Point your MCP client at the proxy instead of the upstream. The audit chain captures every tool call without changing client or upstream behavior. Distinct from `mcp_server`, which exposes Vaara itself as an MCP server for agents that consult Vaara as a tool.

+Upstreams can be local or remote. `--upstream` launches a local stdio MCP server; `--upstream-url NAME=URL` connects to a remote MCP server over the Streamable HTTP transport, and a bare `--upstream-url URL` lands in the `default` slot. Each slot is one transport or the other, never both.
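The `NAME=URL` versus bare-URL rule above can be sketched as a small parser. This is an illustration only: `parse_upstream_url` and `assign_slots` are hypothetical helpers, and the heuristic for telling a slot name from a URL that merely contains `=` in its query string is an assumption, not the proxy's actual parsing.

```python
from urllib.parse import urlsplit


def parse_upstream_url(value):
    """Split a --upstream-url value into (slot_name, url)."""
    name, sep, rest = value.partition("=")
    # Treat the left side as a slot name only if it is non-empty and does not
    # itself look like a URL (a bare URL may carry '=' in its query string).
    if sep and name and "://" not in name:
        slot, url = name, rest
    else:
        slot, url = "default", value
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an http(s) URL: {url!r}")
    return slot, url


def assign_slots(stdio_slots, url_values):
    """Merge stdio and HTTP upstreams, refusing a slot claimed by both."""
    slots = {name: ("stdio", cmd) for name, cmd in stdio_slots.items()}
    for value in url_values:
        slot, url = parse_upstream_url(value)
        if slot in slots:
            raise ValueError(f"slot {slot!r} already has a transport")
        slots[slot] = ("http", url)
    return slots
```

For example, `parse_upstream_url("tools=https://mcp.example.com/mcp")` yields the slot `tools`, while a bare `https://mcp.example.com/mcp?v=1` falls into `default`.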

<details>
<summary>Fleet shape (v0.40): one proxy, many upstreams, multi-tenant policy</summary>

2 changes: 1 addition & 1 deletion clients/ts/package.json
@@ -1,6 +1,6 @@
{
"name": "@vaara/client",
-"version": "0.45.0",
+"version": "0.45.1",
"mcpName": "io.github.vaaraio/vaara",
"description": "TypeScript client for the Vaara HTTP API. Conformal risk scoring, hash-chained audit, policy reload, named detectors.",
"main": "dist/index.js",
16 changes: 9 additions & 7 deletions llms.txt
@@ -1,10 +1,10 @@
# Vaara

-> Runtime evidence layer for EU AI Act compliance. Open source, no SaaS, no telemetry.
+> Tamper-evident runtime evidence layer for AI agents. Covers EU AI Act compliance and any case where you need to prove what an agent actually did. Open source, no SaaS, no telemetry.

-Vaara intercepts agent tool calls, scores each one with a conformal risk interval, and writes a hash-chained audit record. Online learning across five expert signals via Multiplicative Weight Update. Distribution-free conformal coverage on the score.
+Vaara intercepts agent tool calls, scores each one with a conformal risk interval, and writes a hash-chained audit record. Online learning across five expert signals via Multiplicative Weight Update. Distribution-free conformal coverage on the score. An external auditor can verify these properties without trusting your stack.

-Position: runtime governance and enforcement layer. Implements OVERT 1.0 (Glacis Technologies, March 2026) as the Arbiter role at AAL-3 Phase 2.
+Position: tamper-evident runtime evidence and enforcement layer. Signed attestation plus execution receipts pair each MCP tool call to the policy that allowed it.

## Repo and packages
- [GitHub source](https://github.com/vaaraio/vaara): code, releases, issue tracker
@@ -26,10 +26,12 @@ Position: runtime governance and enforcement layer. Implements OVERT 1.0 (Glacis
- OVERT 1.0 emitter, verifier CLI, S3P (MEA-2) emitter with Clopper-Pearson intervals, experimental AMD SEV-SNP TEE attestation hook

## Numbers
-- 5,955-entry adversarial corpus (3,422 attack across 8 categories, 2,533 benign)
-- 97.1% attack recall on held-out distribution-shift split, threshold 0.55
-- PAIR adaptive-attacker calibration: ASR 0/25 against Qwen2.5-32B
-- 140 µs / 210 µs p99 inference latency, commodity CPU
+- 12,155-entry adversarial corpus (250 hand-curated + 11,905 LLM-generated), 70/15/15 split stratified by (category, source)
+- Classifier v9 (236 hand-features + 384-dim MiniLM embeddings) at calibrated threshold 0.9150: held-out TEST recall 84.7% [82.4, 86.7] at FPR 4.1% [2.9, 5.7], n=1,827
+- Cross-model held-out recall 66.8% [64.9, 68.7] over n=2,277 with no eval-set attacker model in TRAIN; weakest sub-cell (data_exfil, closed-weight) 38.9% [35.3, 42.5]
+- BIPIA-pressure FPR on benign tool calls 1.2% [0.4, 3.6] across four agent backends
+- Multi-attacker PAIR ASR 0/25 per attacker across Qwen2.5-32B, Qwen2.5-72B, Llama-3.3-70B at identical seeds
+- 140 µs mean / 210 µs p99 for the hot-path rule scorer, commodity CPU; the MiniLM classifier is opt-in (`vaara[ml]`) and not in that path

## Optional
- [Article 14 runtime](https://futurium.ec.europa.eu/ga/apply-ai-alliance/community-content/article-14-runtime-why-oversight-agentic-ai-has-be-evidenced-action-not-model): position post on EU Apply AI Alliance Futurium
4 changes: 2 additions & 2 deletions pyproject.toml
@@ -4,8 +4,8 @@ build-backend = "setuptools.build_meta"

[project]
name = "vaara"
-version = "0.45.0"
-description = "Tamper-evident runtime evidence layer for AI agents: risk scoring, audit trails, and regulatory compliance"
+version = "0.45.1"
+description = "Tamper-evident runtime evidence layer for AI agents: conformal risk scoring, hash-chained audit trails, and signed attestation plus execution receipts per MCP tool call"
requires-python = ">=3.10"
license = "Apache-2.0"
readme = "README.md"
4 changes: 2 additions & 2 deletions server.json
@@ -8,13 +8,13 @@
"url": "https://github.com/vaaraio/vaara",
"source": "github"
},
-"version": "0.45.0",
+"version": "0.45.1",
"packages": [
{
"registryType": "pypi",
"registryBaseUrl": "https://pypi.org",
"identifier": "vaara",
-"version": "0.45.0",
+"version": "0.45.1",
"runtimeHint": "uvx",
"transport": {
"type": "stdio"
2 changes: 1 addition & 1 deletion src/vaara/__init__.py
@@ -6,7 +6,7 @@
oversight.
"""

-__version__ = "0.45.0"
+__version__ = "0.45.1"

from vaara.pipeline import InterceptionPipeline, InterceptionResult

16 changes: 11 additions & 5 deletions src/vaara/cli.py
@@ -710,19 +710,25 @@ def _cmd_trail_receipt(args: argparse.Namespace) -> int:


 def _cmd_compliance_dashboard(args: argparse.Namespace) -> int:
-    from vaara.audit.sqlite_backend import SQLiteAuditTrail
+    from vaara.audit.sqlite_backend import SQLiteAuditBackend
     from vaara.compliance.dashboard import render_html
-    from vaara.compliance.engine import ComplianceEngine
+    from vaara.compliance.engine import create_default_engine

     db_path = Path(args.db).expanduser()
     if not db_path.is_file():
         print(f"vaara compliance dashboard: not a file: {db_path}", file=sys.stderr)
         return 2

-    trail = SQLiteAuditTrail(str(db_path))
-    engine = ComplianceEngine()
+    backend = SQLiteAuditBackend(str(db_path))
+    try:
+        trail = backend.load_trail()
+    except Exception as exc:
+        print(f"failed to load audit trail: {exc}", file=sys.stderr)
+        return 2
Comment on lines +722 to +727
⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Close the SQLite backend after loading the trail.

Line 722 opens SQLiteAuditBackend, but neither the success path nor the load_trail() error path closes it. When main() is invoked in-process, that can leak the connection and leave the SQLite file locked for later operations.

♻️ Proposed fix
-    backend = SQLiteAuditBackend(str(db_path))
-    try:
-        trail = backend.load_trail()
-    except Exception as exc:
-        print(f"failed to load audit trail: {exc}", file=sys.stderr)
-        return 2
+    with SQLiteAuditBackend(str(db_path)) as backend:
+        try:
+            trail = backend.load_trail()
+        except Exception as exc:
+            print(f"failed to load audit trail: {exc}", file=sys.stderr)
+            return 2
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/vaara/cli.py` around lines 722 - 727, The SQLiteAuditBackend instance
created by SQLiteAuditBackend(str(db_path)) is never closed, leaking the DB
connection; update the code to ensure backend is closed after load_trail()
whether it succeeds or raises (e.g., use a with-statement if SQLiteAuditBackend
supports context management or call backend.close() in a finally block after
calling backend.load_trail()), referencing SQLiteAuditBackend and load_trail so
the connection is always released on both success and in the except path.
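The review comment leaves open whether `SQLiteAuditBackend` supports the `with` statement. If it only exposes `close()`, `contextlib.closing` gives the same guarantee on both the success and error paths. The sketch below uses a hypothetical stand-in class, `DemoAuditBackend`, with an assumed `audit` table; it is not Vaara's backend and makes no claim about its real API beyond the `load_trail()` call shown in the diff.

```python
import sqlite3
import sys
from contextlib import closing


class DemoAuditBackend:
    """Hypothetical stand-in for a backend that owns a SQLite connection."""

    def __init__(self, path):
        self._conn = sqlite3.connect(path)
        self.closed = False

    def load_trail(self):
        # Assumed schema for illustration: audit(seq, payload).
        rows = self._conn.execute("SELECT seq, payload FROM audit ORDER BY seq")
        return rows.fetchall()

    def close(self):
        if not self.closed:
            self._conn.close()
            self.closed = True


def load_trail_or_none(backend):
    """Load the trail and always release the connection, even on failure."""
    with closing(backend):  # calls backend.close() when the block exits
        try:
            return backend.load_trail()
        except Exception as exc:
            print(f"failed to load audit trail: {exc}", file=sys.stderr)
            return None
```

A `try`/`finally` with an explicit `backend.close()` is equivalent; `closing` only saves the boilerplate when the object has no `__enter__`/`__exit__` of its own.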


+    engine = create_default_engine()
     report = engine.assess(
-        trail=trail,
+        trail,
         system_name=args.system_name,
         system_version=args.system_version,
     )