feat: v0.10.0 — Vaara HTTP API + evidence export + Article 12 receipts#75

Merged
vaaraio merged 5 commits into main from feat/v010-kernel-export-receipt
May 16, 2026

@vaaraio
Owner

@vaaraio vaaraio commented May 16, 2026

Summary

v0.10.0 ships three additive features that reposition Vaara from a Python library to a runtime kernel that control planes, audit consumers, and orchestration frameworks reference.

  • HTTP API v1. Network-callable scorer + audit emitter. OpenAPI 3.1 contract at docs/openapi.yaml. Reference FastAPI server via pip install 'vaara[server]' and vaara serve. Endpoints: /v1/score, /v1/score/outcome, /v1/audit/events, /v1/audit/actions/{id}/chain, /v1/audit/verify, /v1/server, /v1/health. The HTTP contract is versioned /v1/ independently of the project version, following the OPA pattern.
  • Auditor-facing evidence export. vaara.compliance.render with Markdown / JSON / narrative renderers for ConformityReport. CLI: vaara compliance report --db PATH --format md --out report.md. Markdown output has per-domain article tables, per-article detail sections, audit-chain integrity flagging, and a deployer-owns-the-conformity-decision disclaimer.
  • Article 12 commit-prove receipt pair. vaara.audit.receipts derives an offline-verifiable receipt from the existing audit chain. SHA-256 over canonical JSON, no external crypto library, no key infrastructure. Auditor verifies with hashlib alone. CLI: vaara trail receipt --db PATH --action-id ID.

All three are additive: no existing module signatures change. The full test suite passes 499/499, including 27 new tests across the three features.

Test plan

  • CI green on the 5 required status checks
  • pip install 'vaara[server]' works in a fresh env
  • vaara serve starts and curl http://localhost:8000/v1/health returns {"status":"ok"}
  • vaara compliance report --db <existing_db> --format md produces a readable Markdown report
  • vaara trail receipt --db <existing_db> --action-id <id> produces a verifiable JSON receipt

Summary by CodeRabbit

Release Notes – v0.10.0

  • New Features

    • HTTP API server with endpoints for scoring actions, reporting outcomes, and managing audit trails
    • Offline-verifiable audit receipts for action commitments and outcomes
    • Compliance report rendering in Markdown, JSON, and narrative formats
    • New CLI commands: vaara serve, vaara trail receipt, vaara compliance report
  • Documentation

    • Complete OpenAPI specification for HTTP API
    • HTTP API usage guide added to README
  • Tests

    • HTTP server integration tests
    • Audit receipt verification tests
    • Compliance report rendering tests

vaaraio added 4 commits May 16, 2026 17:50
Ship the Vaara scorer and hash-chained audit trail as a network
service following an OpenAPI 3.1 contract. The spec at docs/openapi.yaml
is authoritative. Control planes, orchestration frameworks, and audit
consumers call Vaara; Vaara does not ship as a plug-in inside any one
of them.

Endpoints in v1:
  POST /v1/score
  POST /v1/score/outcome
  POST /v1/audit/events
  GET  /v1/audit/actions/{action_id}/chain
  POST /v1/audit/verify
  GET  /v1/server
  GET  /v1/health

Reference server in src/vaara/server/ is a FastAPI implementation
suitable for local development and modest production loads. Production
deployments with sustained traffic can implement the same spec in any
language.

Install with the optional extra: pip install 'vaara[server]'. Run with
vaara serve.
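
As a rough illustration of calling the service from Python, the sketch below builds a POST request for /v1/score using only the standard library. It assumes a server started with vaara serve on localhost:8000 and reuses the field names from the README curl example (tool_name, agent_id, base_risk_score). The helper names build_score_request and post_score are hypothetical and not part of Vaara.

```python
import json
import urllib.request


def build_score_request(base_url: str, tool_name: str, agent_id: str,
                        base_risk_score: float) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /v1/score endpoint."""
    body = json.dumps({
        "tool_name": tool_name,
        "agent_id": agent_id,
        "base_risk_score": base_risk_score,
    }).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/score",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_score(req: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body (needs a running server)."""
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling post_score(build_score_request("http://localhost:8000", "tx.transfer", "agent-007", 0.5)) should mirror the curl example; the response schema is defined in docs/openapi.yaml and is not assumed here.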

11 new tests covering score, outcome roundtrip, audit append, chain
read, verify, and server identity. Full suite 494/494 pass.

Add vaara.compliance.render with three renderers for ConformityReport:
markdown (the canonical human-shipped format), json (re-exposed from the
existing strict-dict serialization), and narrative (re-exposed plain
text). The Markdown output has per-domain article tables, per-article
detail sections with status badges, audit-chain integrity flagging, and
a deployer-owns-the-conformity-decision disclaimer.

Expose via the vaara compliance report subcommand:

  vaara compliance report --db audit.sqlite --format md --out report.md

PDF is intentionally out of scope for v1. A clean Markdown render can be
piped through pandoc or weasyprint by the deployer pipeline; Vaara
defines the article-evidence content, format conversion is downstream.

5 new tests in tests/test_compliance_render.py. Full suite 499 of 499
pass.

Add vaara.audit.receipts, a derivation layer over the existing
hash-chained audit trail that produces offline-verifiable receipts
binding a gate-time commitment to its post-execution outcome.

  commit_hash  = SHA-256 of canonical JSON over
                 (action_id, decision, risk_score, threshold_allow,
                  threshold_deny, decided_at)
  outcome_hash = SHA-256 of canonical JSON over
                 (action_id, commit_hash, outcome_severity,
                  outcome_payload, recorded_at)

The outcome embeds the commit_hash, so the pair forms a chain of
accountability for one action. An external auditor verifies offline
with hashlib alone, no key infrastructure, no external cryptography
library. The full audit chain still protects integrity in aggregate;
receipts are a structured per-action pairing for handoff.
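
The commit/outcome pairing described above can be sketched with hashlib and json alone. This is a minimal illustration, not the vaara.audit.receipts implementation: the canonical form used here (sorted keys, compact separators, UTF-8) is an assumption, and canonical_json, sha256_hex, make_receipt and verify are hypothetical names. Only the field names come from the commit message.

```python
import hashlib
import json


def canonical_json(obj: dict) -> bytes:
    """Assumed canonical form: sorted keys, compact separators, UTF-8."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def sha256_hex(obj: dict) -> str:
    """SHA-256 hex digest over the canonical JSON encoding of obj."""
    return hashlib.sha256(canonical_json(obj)).hexdigest()


def make_receipt(commit: dict, outcome: dict) -> dict:
    """Hash the commitment, embed that hash in the outcome, then hash the outcome."""
    commit_hash = sha256_hex(commit)
    outcome_body = dict(outcome, commit_hash=commit_hash)
    return {
        "commit": commit,
        "commit_hash": commit_hash,
        "outcome": outcome_body,
        "outcome_hash": sha256_hex(outcome_body),
    }


def verify(receipt: dict) -> bool:
    """Recompute both hashes offline and check that the outcome binds the commit."""
    commit_hash = sha256_hex(receipt["commit"])
    return (commit_hash == receipt["commit_hash"]
            and receipt["outcome"].get("commit_hash") == commit_hash
            and sha256_hex(receipt["outcome"]) == receipt["outcome_hash"])
```

Because the outcome payload carries the commit hash, changing any committed field after the fact invalidates both hashes, which is what lets an auditor check one action without the rest of the chain.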

Module exports: CommitPayload, OutcomePayload, Receipt,
extract_receipt(trail, action_id), verify_receipt(receipt),
verify_receipt_dict(d). CLI access: vaara trail receipt --db PATH
--action-id ID. 11 new tests, full suite 499 of 499 pass.

Bundles three additive feature commits:
- HTTP API v1 (Vaara-as-callable kernel, OPA-style)
- Auditor-facing article-evidence Markdown export
- Article 12 commit-prove receipt pair

The HTTP contract at docs/openapi.yaml is versioned /v1/ independently
of the project version, following the OPA pattern of decoupling the
schema commitment from the project-maturity ceremony. Project version
stays in the 0.x series; 1.0 is reserved for a deliberate API-stability
event.

499 tests pass.
@coderabbitai

coderabbitai Bot commented May 16, 2026

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: ede5f1a9-c66f-4950-a2a5-d9716d6f72fd

📥 Commits

Reviewing files that changed from the base of the PR and between dbbfd9b and a05c60b.

📒 Files selected for processing (2)
  • src/vaara/compliance/render.py
  • tests/test_receipts.py
📝 Walkthrough

Walkthrough

This PR releases Vaara v0.10.0, adding an HTTP API reference server with OpenAPI specification, offline-verifiable audit receipts for handoff, compliance evidence rendering for auditors, and CLI commands for server operation, receipt inspection, and report generation.

Changes

HTTP API and Offline-Verifiable Audit

| Layer / File(s) | Summary |
|---|---|
| API specification and documentation (docs/openapi.yaml, README.md, CHANGELOG.md) | Complete OpenAPI 3.1.0 spec defining /v1 endpoints for scoring, outcomes, audit events, and verification. HTTP API section added to README with install and example usage. Changelog records v0.10.0 features. |
| Server state container (src/vaara/server/state.py) | ServerState holds scorer/audit instances and maintains thread-safe per-action metadata (agent, tool, risk, signals) for correlating HTTP requests to scoring decisions. |
| Wire-format Pydantic schemas (src/vaara/server/schemas.py) | Request/response models for scoring, outcomes, audit events/chains, verification, server metadata, and errors; enforce field constraints and reject unknown fields on input. |
| FastAPI app factory and HTTP endpoints (src/vaara/server/app.py, src/vaara/server/routes.py, src/vaara/server/__init__.py) | create_app() factory initializes ServerState, registers routes with consistent error handling. Routes implement health, server metadata, score request/outcome, audit event ingestion, chain retrieval, and verification operations. |
| Offline-verifiable audit receipts (src/vaara/audit/receipts.py) | CommitPayload and OutcomePayload immutable types with SHA-256 hashing. Receipt binds outcome hash to commitment hash. verify_receipt() and verify_receipt_dict() validate integrity. extract_receipt() derives receipts from audit trails for offline handoff. |
| Compliance evidence rendering (src/vaara/compliance/render.py) | Three renderers for ConformityReport: Markdown with metadata/chain status/domain-grouped articles/evidence freshness, JSON for structured export, and narrative plain-text wrapper. |
| CLI subcommands for server and evidence (src/vaara/cli.py) | New commands: vaara trail receipt extracts and verifies receipts; vaara compliance report renders evidence in md/json/narrative; vaara serve starts HTTP server with uvicorn. |
| Version and dependency updates (pyproject.toml, src/vaara/__init__.py) | Bump version to 0.10.0. Add server optional dependency group (fastapi, uvicorn). |
| Integration and feature tests (tests/test_server.py, tests/test_receipts.py, tests/test_compliance_render.py) | HTTP endpoint integration tests; receipt extraction/verification/round-trip tests; rendering output format and chain-status tests. |

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant POST_score["POST /v1/score"]
  participant ServerState
  participant Scorer
  participant Memory
  
  Client->>POST_score: ScoreRequest
  POST_score->>Scorer: evaluate(context)
  Scorer-->>POST_score: decision, risk, signals
  POST_score->>ServerState: remember_action(action_id, agent, tool, risk, signals)
  ServerState->>Memory: store _ActionInfo
  POST_score-->>Client: ScoreResponse(action_id, decision, risk)
sequenceDiagram
  participant Client
  participant Audit_append["POST /v1/audit/events"]
  participant AuditTrail
  participant Chain
  
  Client->>Audit_append: AuditEventRequest(event_type, payload)
  Audit_append->>AuditTrail: _append(AuditRecord)
  AuditTrail->>Chain: compute chain_position, hash
  Audit_append-->>Client: AuditEventResponse(chain_position, hash, prev_hash)
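
The append step in the diagram above (compute chain_position and hash from the previous record) can be illustrated with a small stand-alone hash chain. This is a sketch under stated assumptions, not Vaara's AuditTrail: the record layout, the all-zero genesis value and the function names record_hash, append and first_break are hypothetical.

```python
import hashlib
import json
from typing import Optional

GENESIS = "0" * 64  # assumed prev_hash for the first record


def record_hash(prev_hash: str, position: int, payload: dict) -> str:
    """Hash a record's position and payload together with its predecessor's hash."""
    body = json.dumps(
        {"prev_hash": prev_hash, "chain_position": position, "payload": payload},
        sort_keys=True, separators=(",", ":"),
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(chain: list, payload: dict) -> dict:
    """Append a payload, linking it to the hash of the last record."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    position = len(chain)
    record = {
        "chain_position": position,
        "prev_hash": prev_hash,
        "payload": payload,
        "hash": record_hash(prev_hash, position, payload),
    }
    chain.append(record)
    return record


def first_break(chain: list) -> Optional[int]:
    """Return the position of the first inconsistent record, or None if intact."""
    prev_hash = GENESIS
    for position, record in enumerate(chain):
        expected = record_hash(prev_hash, position, record["payload"])
        if (record["prev_hash"] != prev_hash
                or record["chain_position"] != position
                or record["hash"] != expected):
            return position
        prev_hash = record["hash"]
    return None
```

Returning the position of the first mismatch, rather than a bare boolean, is the kind of failure detail a verification endpoint can surface to the caller.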

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 A server springs to life, REST-ful and bright,
With audit trails hash-chained and locked tight,
Receipts verify offline, compliance flows free—
The kernel for others, at v0.10.0 we see! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 11.43% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title clearly and concisely summarizes the three main features introduced in v0.10.0: HTTP API, evidence export, and Article 12 receipts. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Comment thread src/vaara/compliance/render.py Fixed
Comment thread src/vaara/compliance/render.py Fixed
Comment thread tests/test_receipts.py Fixed
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 13

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/openapi.yaml`:
- Around line 199-200: The OpenAPI request object schemas (e.g., ScoreRequest)
currently permit undeclared fields but the server forbids extras; update each
request schema to include additionalProperties: false so generated clients match
runtime validation—locate ScoreRequest and the other request object schemas
referenced in the review and add additionalProperties: false under their
definitions to disallow extra fields.

In `@pyproject.toml`:
- Line 45: The dependency pin in pyproject.toml (server = ["fastapi>=0.110",
"uvicorn>=0.27"]) is outdated; update the fastapi and uvicorn requirement
strings to newer, secure versions (minimum fastapi>=0.115.0 and a recent uvicorn
like >=0.47.0, or the latest stable releases) so transitive dependency CVE fixes
(e.g., Starlette) are picked up; edit the server list in pyproject.toml to
replace the version bounds accordingly and run dependency install/lock to verify
compatibility and tests.

In `@README.md`:
- Around line 62-71: The README's new fenced code blocks lack language
identifiers causing MD040; update the two blocks that contain the pip/vaara
serve snippet (pip install 'vaara[server]' and vaara serve --host ...) and the
curl POST snippet (curl -sX POST ... -d
'{"tool_name":"tx.transfer","agent_id":"agent-007","base_risk_score":0.5}') by
adding the bash fence marker (```bash) at the start of each block so they are
properly highlighted and pass the linter.

In `@src/vaara/audit/receipts.py`:
- Around line 209-210: The assignments to ta and td overwrite explicit zero
thresholds because they use truthiness fallback; change them to preserve 0.0 by
using a None check around _coerce_float(data.get("threshold_allow")) and
_coerce_float(data.get("threshold_deny")) (i.e., set ta = result if result is
not None else 0.4 and td = result if result is not None else 0.7) so that
explicit 0.0 values returned by _coerce_float are kept; update the lines that
set ta and td accordingly, referencing the variables ta, td, data,
"threshold_allow", "threshold_deny", and the helper _coerce_float.

In `@src/vaara/cli.py`:
- Around line 562-567: The SQLiteAuditBackend opened before calling load_trail()
is never closed; wrap the usage of SQLiteAuditBackend in a try/except/finally so
backend.close() is always called: create the backend as before, call trail =
backend.load_trail() inside the try, keep the existing except Exception as exc
handling, and add a finally block that calls backend.close() (and guards if
backend is None) for both occurrences where SQLiteAuditBackend is instantiated.
- Around line 586-587: The direct write
Path(args.out).expanduser().write_text(...) can fail for nested paths; update
both occurrences to expand user into a Path object, ensure parent directories
exist via path.parent.mkdir(parents=True, exist_ok=True) before writing, perform
the write inside a try/except catching OSError (or Exception) and surface a
clear error message and non-zero exit (or return) on failure; reference the Path
usage around args.out and the surrounding write_text calls so you update both
spots consistently.
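
The nested-path suggestion for src/vaara/cli.py could look roughly like the helper below. write_report is a hypothetical name, and the exit code 2 simply mirrors the value the existing load-failure branch returns; neither is confirmed by the PR beyond that.

```python
import sys
from pathlib import Path


def write_report(out: str, text: str) -> int:
    """Write text to out, creating parent directories; return a CLI exit code."""
    path = Path(out).expanduser()
    try:
        # Create any missing parent directories before writing.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(text, encoding="utf-8")
    except OSError as exc:
        print(f"failed to write {path}: {exc}", file=sys.stderr)
        return 2
    return 0
```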

In `@src/vaara/compliance/render.py`:
- Around line 23-30: Remove the unused imports causing Ruff F401 by deleting
typing.Any and EvidenceStrength from the import list; update the top of
src/vaara/compliance/render.py so only actually used symbols (e.g.,
ArticleEvidence, ConformityReport, EvidenceStatus) are imported from
vaara.compliance.engine and remove the unused "Any" import from typing.

In `@src/vaara/server/routes.py`:
- Around line 188-201: The handler verify_audit_chain currently ignores the
incoming _req and discards failure details; modify it to read _req.from_event_id
and _req.to_event_id and pass them into state.audit.verify_chain(...) (or call
the appropriate ranged verify method), capture the returned failure information
into a variable (e.g. problem) and populate S.VerifyResponse.first_break with
the failure's identifier/index (e.g. problem.event_id or problem.index) when
problem is not None, while still returning valid=True and
events_checked=state.audit.size when no problem is found; keep the function name
verify_audit_chain and response type S.VerifyResponse unchanged.
- Around line 111-126: The handler score_outcome currently calls
state.scorer.record_outcome unconditionally so repeated POSTs for the same
S.OutcomeRequest.action_id mutate learning state; change it to be a no-op for
duplicates by first determining whether that action already has an outcome
recorded (via state.lookup_action(info) and a persisted flag such as
info.outcome_recorded or by adding a state method like
state.has_outcome(action_id)); only call state.scorer.record_outcome when no
outcome exists, and ensure the check-and-set is atomic (implement a new state
method record_outcome_if_not_recorded(action_id, ...) or similar that performs
the existence check and records the outcome in one operation) so score_outcome
returns 204 without side effects on repeated calls.
- Around line 45-188: The three write-capable handlers (score, score_outcome,
append_audit_event) expose mutation endpoints without auth; add authentication
and authorization checks to each handler (score, score_outcome,
append_audit_event) by requiring an auth dependency (e.g. FastAPI Depends that
validates API key / bearer token and returns a principal with scopes) and
enforce an authorization policy before any state mutation (e.g. ensure principal
has "score:write" or "audit:write" scope), returning proper HTTP 401/403 errors
when missing; apply the same check for any other future endpoints that call
state.remember_action, state.scorer.record_outcome, or state.audit._append to
prevent unauthorized model-poisoning or audit tampering.
- Around line 36-43: Add an exception handler for RequestValidationError to
normalize FastAPI validation failures to the same {"error": {...}} shape as your
HTTPException handler: register a new
`@app.exception_handler`(RequestValidationError) function (similar to
_http_exc_handler) that converts the validation details into a single error
object (e.g., code "validation_error" and a summarized message or the full
str(exc.errors())) and returns JSONResponse(status_code=422, content={"error":
{"code": "...", "message": ...}}); ensure you reference RequestValidationError
and JSONResponse so validation errors follow the same format as
_http_exc_handler for HTTPException.
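
For the duplicate-outcome finding, an atomic check-and-set might be sketched as follows. OutcomeLedger and its method are hypothetical stand-ins for state on ServerState; the callback represents the call to state.scorer.record_outcome.

```python
import threading
from typing import Callable


class OutcomeLedger:
    """Tracks which action ids already have an outcome recorded."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._recorded: set[str] = set()

    def record_outcome_if_not_recorded(
        self, action_id: str, record: Callable[[], None]
    ) -> bool:
        """Run record() once per action_id; return False for duplicates."""
        with self._lock:
            if action_id in self._recorded:
                return False
            # Mark the action only after record() succeeds, so a failed
            # attempt can be retried.
            record()
            self._recorded.add(action_id)
            return True
```

Holding the lock while the callback runs keeps the existence check and the write in one critical section, at the cost of serialising outcome updates.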

In `@src/vaara/server/state.py`:
- Around line 34-54: The in-memory actions dict (self._actions) is unbounded and
must be bounded to avoid memory growth; modify the state class to enforce a
capacity and eviction policy: add a max_actions attribute and change
remember_action to insert into a size-limited structure (e.g., replace dict with
an OrderedDict or use an LRU cache behavior) so when inserting a new _ActionInfo
you evict the oldest/least-recent entry if the capacity is exceeded; ensure
lookup_action still acquires self._lock and returns the _ActionInfo (optionally
update recency on lookup if using LRU) and keep using the existing _lock,
remember_action, lookup_action and _ActionInfo symbols so callers are unchanged.
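
A bounded store along the lines suggested for src/vaara/server/state.py can be sketched with OrderedDict. BoundedActionStore and the default capacity are illustrative assumptions; only the method names remember_action and lookup_action come from the review.

```python
import threading
from collections import OrderedDict
from typing import Any, Optional


class BoundedActionStore:
    """Thread-safe mapping that evicts the least recently used entry when full."""

    def __init__(self, max_actions: int = 10_000) -> None:
        if max_actions < 1:
            raise ValueError("max_actions must be at least 1")
        self._max_actions = max_actions
        self._lock = threading.Lock()
        self._actions: "OrderedDict[str, Any]" = OrderedDict()

    def remember_action(self, action_id: str, info: Any) -> None:
        with self._lock:
            self._actions[action_id] = info
            self._actions.move_to_end(action_id)
            # Evict from the least recently used end until within capacity.
            while len(self._actions) > self._max_actions:
                self._actions.popitem(last=False)

    def lookup_action(self, action_id: str) -> Optional[Any]:
        with self._lock:
            info = self._actions.get(action_id)
            if info is not None:
                # A hit refreshes recency so active actions are kept longer.
                self._actions.move_to_end(action_id)
            return info
```

One consequence of any eviction policy is that an outcome reported for an evicted action can no longer be correlated, so the capacity needs to cover the expected gap between scoring and outcome reporting.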

In `@tests/test_receipts.py`:
- Line 15: The import line brings in EventType which is unused and causes ruff
F401; update the import in tests/test_receipts.py to remove EventType so only
AuditTrail is imported (i.e., change "from vaara.audit.trail import AuditTrail,
EventType" to import just AuditTrail), or if EventType is needed later, use it
where required or mark it as used; ensure no other references to EventType
remain in the test file.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 3f536b3e-1579-4330-959b-3f9079c27f2c

📥 Commits

Reviewing files that changed from the base of the PR and between 131c351 and dbbfd9b.

📒 Files selected for processing (16)
  • CHANGELOG.md
  • README.md
  • docs/openapi.yaml
  • pyproject.toml
  • src/vaara/__init__.py
  • src/vaara/audit/receipts.py
  • src/vaara/cli.py
  • src/vaara/compliance/render.py
  • src/vaara/server/__init__.py
  • src/vaara/server/app.py
  • src/vaara/server/routes.py
  • src/vaara/server/schemas.py
  • src/vaara/server/state.py
  • tests/test_compliance_render.py
  • tests/test_receipts.py
  • tests/test_server.py

Comment thread docs/openapi.yaml
Comment on lines +199 to +200
    ScoreRequest:
      type: object

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Align request schema strictness with runtime validation.

These request objects currently allow undeclared fields in OpenAPI, but the server rejects extras (extra="forbid"). This will break generated clients and contract tests.

Proposed spec fix
    ScoreRequest:
      type: object
+     additionalProperties: false
      required: [tool_name, agent_id]

    OutcomeRequest:
      type: object
+     additionalProperties: false
      required: [action_id, outcome_severity]

    AuditEventRequest:
      type: object
+     additionalProperties: false
      required: [event_type, action_id]

    VerifyRequest:
      type: object
+     additionalProperties: false
      properties:

Also applies to: 296-297, 309-310, 371-373

Comment thread pyproject.toml
export = ["cryptography>=41.0"]
ml = ["xgboost>=2.0", "scikit-learn>=1.3", "joblib>=1.3", "numpy>=1.24"]
yaml = ["pyyaml>=6.0"]
server = ["fastapi>=0.110", "uvicorn>=0.27"]

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

What are the latest stable versions of FastAPI and Uvicorn as of May 2026?

💡 Result:

As of May 16, 2026, the latest stable versions are: FastAPI: 0.136.1 (released April 23, 2026) [1][2] Uvicorn: 0.47.0 (released May 14, 2026) [3]

🌐 Web query:

Are there any known security vulnerabilities in FastAPI 0.110.0 or Uvicorn 0.27.0?

💡 Result:

As of May 16, 2026, there are no direct security vulnerabilities (CVEs) reported specifically for FastAPI 0.110.0 or Uvicorn 0.27.0 [1][2][3]. However, it is important to note that security in these frameworks is often tied to their dependencies [4].

FastAPI Security Context: FastAPI itself has a minimal history of direct CVEs [4]. Most security alerts associated with FastAPI projects are actually vulnerabilities in its core dependencies, such as Starlette or Pydantic [4]. For example, older versions of FastAPI were affected by issues like ReDoS in multipart form data (fixed in 0.109.1) and CSRF vulnerabilities (fixed in 0.65.2) [5][4][6]. If you are using FastAPI 0.110.0, it is recommended to upgrade to a more recent version (such as 0.115.0 or later) to ensure that underlying dependencies like Starlette are updated to versions that patch known vulnerabilities, such as CVE-2024-47874 [4].

Uvicorn Security Context: Uvicorn 0.27.0 does not have direct CVEs listed in major vulnerability databases [3]. While Uvicorn has had historical vulnerabilities (such as CVE-2020-7694 and CVE-2020-7695 in very early versions) [7], these do not apply to the 0.27.0 release.

Recommendation: To maintain a secure environment, it is best practice to regularly update both FastAPI and Uvicorn to their latest stable releases [4]. You can use tools like pip-audit or Snyk to scan your specific project environment for vulnerabilities in your dependency tree, as these tools will identify if any of your installed packages (including dependencies of FastAPI and Uvicorn) have known security issues [2][8][4].

Update FastAPI and Uvicorn to more recent versions.

FastAPI 0.110 and Uvicorn 0.27 are from early 2024 and are substantially outdated. Current stable versions are FastAPI 0.136.1 and Uvicorn 0.47.0 (as of May 2026). While these specific older versions have no direct CVEs, FastAPI 0.110.0 has known vulnerabilities in its dependencies (e.g., Starlette with CVE-2024-47874) that are patched in newer releases. Consider upgrading FastAPI to at least 0.115.0 and using more recent Uvicorn versions to ensure dependency security is up-to-date.

Comment thread README.md
Comment on lines +62 to +71
```
pip install 'vaara[server]'
vaara serve --host 0.0.0.0 --port 8000
```

```
curl -sX POST http://localhost:8000/v1/score \
-H 'content-type: application/json' \
-d '{"tool_name":"tx.transfer","agent_id":"agent-007","base_risk_score":0.5}'
```

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add language identifiers to new fenced code blocks.

The new blocks are missing fence languages (bash), which triggers MD040 and reduces syntax highlighting quality.

🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 62-62: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 67-67: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

Comment thread src/vaara/audit/receipts.py
Comment on lines +209 to +210
ta = _coerce_float(data.get("threshold_allow")) or 0.4
td = _coerce_float(data.get("threshold_deny")) or 0.7

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve explicit zero thresholds during receipt reconstruction.

Line 209 and Line 210 currently use truthiness fallback (or 0.4 / or 0.7), which overwrites valid 0.0 thresholds. That mutates commit payload semantics and can produce incorrect commit hashes.

Proposed fix
 def _thresholds_from_risk_record(
     risk_record: Optional[AuditRecord],
 ) -> tuple[float, float]:
     if risk_record is None:
         return 0.4, 0.7
     data = risk_record.data or {}
-    ta = _coerce_float(data.get("threshold_allow")) or 0.4
-    td = _coerce_float(data.get("threshold_deny")) or 0.7
+    ta_raw = _coerce_float(data.get("threshold_allow"))
+    td_raw = _coerce_float(data.get("threshold_deny"))
+    ta = 0.4 if ta_raw is None else ta_raw
+    td = 0.7 if td_raw is None else td_raw
     return float(ta), float(td)
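
The difference between the two fallbacks is easy to reproduce in isolation. In the sketch below, coerce_float is a hypothetical stand-in for the module's _coerce_float helper, assumed to return None when a value is missing or not convertible; the two threshold functions are illustrative only.

```python
from typing import Any, Optional


def coerce_float(value: Any) -> Optional[float]:
    """Hypothetical stand-in for _coerce_float: None when not convertible."""
    if value is None:
        return None
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def threshold_truthy(value: Any, default: float) -> float:
    """Truthiness fallback: an explicit 0.0 is falsy and gets replaced."""
    return coerce_float(value) or default


def threshold_none_checked(value: Any, default: float) -> float:
    """None check: only a missing or invalid value falls back to the default."""
    raw = coerce_float(value)
    return default if raw is None else raw
```

With an explicit threshold of 0.0, threshold_truthy returns the default while threshold_none_checked keeps 0.0, so a hash computed over the reconstructed thresholds would differ between the two.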
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-    ta = _coerce_float(data.get("threshold_allow")) or 0.4
-    td = _coerce_float(data.get("threshold_deny")) or 0.7
+def _thresholds_from_risk_record(
+    risk_record: Optional[AuditRecord],
+) -> tuple[float, float]:
+    if risk_record is None:
+        return 0.4, 0.7
+    data = risk_record.data or {}
+    ta_raw = _coerce_float(data.get("threshold_allow"))
+    td_raw = _coerce_float(data.get("threshold_deny"))
+    ta = 0.4 if ta_raw is None else ta_raw
+    td = 0.7 if td_raw is None else td_raw
+    return float(ta), float(td)
Comment thread src/vaara/cli.py
Comment on lines +562 to +567
    backend = SQLiteAuditBackend(str(db_path))
    try:
        trail = backend.load_trail()
    except Exception as exc:
        print(f"failed to load audit trail: {exc}", file=sys.stderr)
        return 2

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Close SQLiteAuditBackend in both new DB-backed commands.

Both commands open a backend but never close it. Add a finally to guarantee closure after load_trail().

Proposed fix
     backend = SQLiteAuditBackend(str(db_path))
     try:
         trail = backend.load_trail()
     except Exception as exc:
         print(f"failed to load audit trail: {exc}", file=sys.stderr)
         return 2
+    finally:
+        backend.close()

Also applies to: 606-611
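
The effect of the finally clause can be checked with a stand-in backend. FakeBackend and load_trail_or_exit_code are hypothetical; the sketch only assumes that the real backend exposes load_trail() and close(), as the proposed fix implies.

```python
import sys
from typing import Any, Optional, Tuple


class FakeBackend:
    """Hypothetical stand-in for SQLiteAuditBackend with load_trail/close."""

    def __init__(self, fail: bool = False) -> None:
        self.closed = False
        self._fail = fail

    def load_trail(self) -> list:
        if self._fail:
            raise RuntimeError("corrupt database")
        return ["event"]

    def close(self) -> None:
        self.closed = True


def load_trail_or_exit_code(backend: Any) -> Tuple[Optional[list], int]:
    """Load the trail and always close the backend; return (trail, exit_code)."""
    try:
        return backend.load_trail(), 0
    except Exception as exc:
        print(f"failed to load audit trail: {exc}", file=sys.stderr)
        return None, 2
    finally:
        # Runs on both the success and the failure path.
        backend.close()
```

Closing immediately after load_trail() assumes the returned trail is fully materialised and does not read from the backend lazily; if it does, the close has to move to after the last use of the trail.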

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/vaara/cli.py` around lines 562 - 567, The SQLiteAuditBackend opened
before calling load_trail() is never closed; wrap the usage of
SQLiteAuditBackend in a try/except/finally so backend.close() is always called:
create the backend as before, call trail = backend.load_trail() inside the try,
keep the existing except Exception as exc handling, and add a finally block that
calls backend.close() (and guards if backend is None) for both occurrences where
SQLiteAuditBackend is instantiated.
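
An alternative to an explicit finally is contextlib.closing, which calls close() on every exit path, including the early return. The sketch below assumes only that the backend exposes load_trail() and close(); FakeBackend is a hypothetical stand-in, not the real SQLiteAuditBackend.

```python
import sys
from contextlib import closing


class FakeBackend:
    """Hypothetical stand-in for SQLiteAuditBackend (load_trail/close only)."""

    def __init__(self, fail: bool = False) -> None:
        self.fail = fail
        self.closed = False

    def load_trail(self) -> list:
        if self.fail:
            raise RuntimeError("corrupt database")
        return ["record"]

    def close(self) -> None:
        self.closed = True


def load_or_exit_code(backend: FakeBackend) -> int:
    # closing() guarantees backend.close() runs, even on the early return.
    with closing(backend):
        try:
            backend.load_trail()
        except Exception as exc:
            print(f"failed to load audit trail: {exc}", file=sys.stderr)
            return 2
    return 0
```

Either form works; the context manager just makes it harder to forget the cleanup when a third DB-backed command is added later.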

Comment on lines +45 to +188
@app.get("/v1/health")
async def health():
    return {"status": "ok"}

@app.get("/v1/server", response_model=S.ServerInfo)
async def server_info():
    return S.ServerInfo(
        name=_SERVER_NAME,
        version=_SERVER_VERSION,
        vaara_version=_vaara_version,
        capabilities=S.Capabilities(
            score=True, audit=True, outcome_feedback=True,
        ),
        scorer=S.ScorerInfo(
            type=type(state.scorer).__name__,
            calibration_size=state.scorer._conformal.calibration_size,
            threshold_allow=state.scorer._threshold_allow,
            threshold_deny=state.scorer._threshold_deny,
            alpha=state.scorer._conformal._alpha,
        ),
    )

@app.post("/v1/score", response_model=S.ScoreResponse)
async def score(req: S.ScoreRequest):
    ctx = req.model_dump(exclude_none=True)
    try:
        decision_dict = state.scorer.evaluate(ctx)
    except Exception as exc:
        raise _error(
            "scorer_error", str(exc), status.HTTP_503_SERVICE_UNAVAILABLE,
        )

    raw = decision_dict.get("raw_result", {}) or {}
    lower, upper = (raw.get("conformal_interval") or [0.0, 1.0])
    action_id = str(uuid.uuid4())
    signals = {k: float(v) for k, v in (raw.get("signals") or {}).items()}
    state.remember_action(
        action_id=action_id,
        agent_id=req.agent_id,
        tool_name=req.tool_name,
        predicted_risk=float(raw.get("point_estimate", 0.5) or 0.5),
        signals=signals,
    )

    return S.ScoreResponse(
        action_id=action_id,
        decision=decision_dict.get("action", "escalate"),
        risk=S.RiskBlock(
            point=raw.get("point_estimate", 0.5),
            lower=lower,
            upper=upper,
            alpha=raw.get("effective_alpha", 0.10),
            bucket=raw.get("bucket_category"),
        ),
        signals=signals,
        mwu_weights={k: float(v) for k, v in state.scorer._mwu.weights.items()},
        thresholds=S.Thresholds(
            allow=state.scorer._threshold_allow,
            deny=state.scorer._threshold_deny,
        ),
        sequence_risk=float(raw.get("sequence_risk", 0.0) or 0.0),
        calibration_size=int(raw.get("calibration_size", 0) or 0),
        evaluation_ms=float(decision_dict.get("evaluation_ms", 0.0) or 0.0),
        explanation=decision_dict.get("reason", ""),
    )

@app.post("/v1/score/outcome", status_code=204)
async def score_outcome(req: S.OutcomeRequest):
    info = state.lookup_action(req.action_id)
    if info is None:
        raise _error(
            "unknown_action", f"action_id {req.action_id!r} not found",
            status.HTTP_404_NOT_FOUND,
        )
    state.scorer.record_outcome(
        agent_id=info.agent_id,
        tool_name=info.tool_name,
        predicted_risk=info.predicted_risk,
        actual_outcome=req.outcome_severity,
        signals=info.signals,
    )
    return None

@app.post(
    "/v1/audit/events",
    response_model=S.AuditEventResponse,
    status_code=201,
)
async def append_audit_event(req: S.AuditEventRequest):
    try:
        event_type = EventType(req.event_type)
    except ValueError:
        raise _error(
            "bad_event_type", f"unknown event_type {req.event_type!r}",
            status.HTTP_400_BAD_REQUEST,
        )

    record = AuditRecord(
        record_id=str(uuid.uuid4()),
        action_id=req.action_id,
        event_type=event_type,
        timestamp=time.time(),
        agent_id=req.agent_id or "",
        tool_name=req.tool_name or "",
        data=req.payload or {},
        regulatory_articles=[],
    )
    state.audit._append(record)
    return S.AuditEventResponse(
        event_id=record.record_id,
        chain_position=state.audit.size - 1,
        event_hash=record.record_hash,
        previous_hash=record.previous_hash,
        timestamp=_iso(record.timestamp),
    )

@app.get(
    "/v1/audit/actions/{action_id}/chain",
    response_model=S.AuditChain,
)
async def read_action_chain(action_id: str):
    records = state.audit._by_action.get(action_id, [])
    if not records:
        raise _error(
            "unknown_action", f"no audit records for {action_id!r}",
            status.HTTP_404_NOT_FOUND,
        )
    return S.AuditChain(
        action_id=action_id,
        events=[
            S.AuditChainEvent(
                event_id=r.record_id,
                event_type=r.event_type.value,
                chain_position=state.audit._records.index(r),
                event_hash=r.record_hash,
                previous_hash=r.previous_hash,
                timestamp=_iso(r.timestamp),
                payload=r.data or {},
            )
            for r in records
        ],
    )

@app.post("/v1/audit/verify", response_model=S.VerifyResponse)

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Protect write-capable endpoints with authn/authz.

/v1/score, /v1/score/outcome, and /v1/audit/events are currently callable without authentication. If reachable outside a trusted boundary, this enables model-poisoning and audit-trail tampering.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/vaara/server/routes.py` around lines 45 - 188, The three write-capable
handlers (score, score_outcome, append_audit_event) expose mutation endpoints
without auth; add authentication and authorization checks to each handler
(score, score_outcome, append_audit_event) by requiring an auth dependency (e.g.
FastAPI Depends that validates API key / bearer token and returns a principal
with scopes) and enforce an authorization policy before any state mutation (e.g.
ensure principal has "score:write" or "audit:write" scope), returning proper
HTTP 401/403 errors when missing; apply the same check for any other future
endpoints that call state.remember_action, state.scorer.record_outcome, or
state.audit._append to prevent unauthorized model-poisoning or audit tampering.
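
The scope check itself does not depend on the web framework. The following is a rough sketch under stated assumptions: Principal, AuthError, API_KEYS, and authorize are all hypothetical names, the key table is hard-coded for illustration only, and the scope strings follow the ones suggested above. In the server it would sit behind a dependency that reads the key from a request header and maps AuthError to an HTTP 401 or 403 response.

```python
import hmac
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Principal:
    name: str
    scopes: frozenset


class AuthError(Exception):
    """Carries the HTTP status the handler should return (401 or 403)."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code


# Hypothetical key table; a real deployment would load keys from a secret store.
API_KEYS = {"demo-key": Principal("ci-bot", frozenset({"score:write"}))}


def authorize(api_key: Optional[str], required_scope: str) -> Principal:
    """Return the caller's principal, or raise AuthError before any mutation."""
    if api_key is None:
        raise AuthError(401, "missing API key")
    for known, principal in API_KEYS.items():
        # compare_digest avoids leaking key contents through timing differences.
        if hmac.compare_digest(known.encode(), api_key.encode()):
            if required_scope not in principal.scopes:
                raise AuthError(403, f"missing scope {required_scope!r}")
            return principal
    raise AuthError(401, "unknown API key")
```

Calling authorize as the first statement of score, score_outcome, and append_audit_event keeps the check ahead of remember_action, record_outcome, and the audit append.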

Comment on lines +111 to +126
@app.post("/v1/score/outcome", status_code=204)
async def score_outcome(req: S.OutcomeRequest):
    info = state.lookup_action(req.action_id)
    if info is None:
        raise _error(
            "unknown_action", f"action_id {req.action_id!r} not found",
            status.HTTP_404_NOT_FOUND,
        )
    state.scorer.record_outcome(
        agent_id=info.agent_id,
        tool_name=info.tool_name,
        predicted_risk=info.predicted_risk,
        actual_outcome=req.outcome_severity,
        signals=info.signals,
    )
    return None

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

/v1/score/outcome is not idempotent as documented.

The route updates learning state on every repeated call for the same action_id. The API description states repeat submissions should be no-ops.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/vaara/server/routes.py` around lines 111 - 126, The handler score_outcome
currently calls state.scorer.record_outcome unconditionally so repeated POSTs
for the same S.OutcomeRequest.action_id mutate learning state; change it to be a
no-op for duplicates by first determining whether that action already has an
outcome recorded (via state.lookup_action(req.action_id) and a persisted flag
such as info.outcome_recorded, or by adding a state method like
state.has_outcome(action_id)); only call state.scorer.record_outcome when no
outcome exists, and ensure the check-and-set is atomic (implement a new state
method record_outcome_if_not_recorded(action_id, ...) or similar that performs
the existence check and records the outcome in one operation) so score_outcome
returns 204 without side effects on repeated calls.
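
The atomic check-and-set can be sketched independently of the scorer. OutcomeLedger and record_once are hypothetical names, not part of the existing state class; the apply callback stands in for the call to state.scorer.record_outcome.

```python
import threading
from typing import Callable


class OutcomeLedger:
    """Hypothetical helper that tracks which action_ids already have an outcome."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._recorded: set[str] = set()

    def record_once(self, action_id: str, apply: Callable[[], None]) -> bool:
        """Run apply() only for the first submission; return True if it ran."""
        with self._lock:
            if action_id in self._recorded:
                return False  # duplicate: no side effects
            apply()
            # Mark as recorded only after apply() succeeds, so a failed
            # first attempt can be retried.
            self._recorded.add(action_id)
            return True
```

Holding the lock across apply() serializes outcome updates, which is the simplest way to make the check and the write one operation. The handler would return 204 whether record_once returns True or False.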

Comment on lines +188 to +201
@app.post("/v1/audit/verify", response_model=S.VerifyResponse)
async def verify_audit_chain(_req: Optional[S.VerifyRequest] = None):
    # v1: full-chain verify only. Ranged verify is in the spec but
    # not yet implemented server-side.
    problem = state.audit.verify_chain()
    if problem is None:
        return S.VerifyResponse(
            valid=True, events_checked=state.audit.size,
        )
    return S.VerifyResponse(
        valid=False,
        events_checked=state.audit.size,
        first_break=None,
    )

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

/v1/audit/verify drops failure details and ignores requested range.

_req is ignored, so from_event_id/to_event_id are not honored, and first_break is always None even when verification fails. That diverges from the advertised contract and weakens auditor usability.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/vaara/server/routes.py` around lines 188 - 201, The handler
verify_audit_chain currently ignores the incoming _req and discards failure
details; modify it to read _req.from_event_id and _req.to_event_id and pass them
into state.audit.verify_chain(...) (or call the appropriate ranged verify
method), capture the returned failure information into a variable (e.g. problem)
and populate S.VerifyResponse.first_break with the failure's identifier/index
(e.g. problem.event_id or problem.index) when problem is not None, while still
returning valid=True and events_checked=state.audit.size when no problem is
found; keep the function name verify_audit_chain and response type
S.VerifyResponse unchanged.
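
To make the expected behavior concrete, here is a self-contained sketch of ranged verification that reports the first broken link. It is not the Vaara audit trail: the record layout, the hashing scheme (SHA-256 over canonical JSON of the payload and the previous hash), and the function names are all assumptions chosen for illustration, and it uses list indices where the API uses event ids.

```python
import hashlib
import json
from typing import Optional


def record_hash(payload: dict, previous_hash: str) -> str:
    # Assumed scheme: SHA-256 over canonical JSON of payload plus previous hash.
    body = json.dumps(
        {"payload": payload, "previous_hash": previous_hash},
        sort_keys=True, separators=(",", ":"),
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def build_chain(payloads: list) -> list:
    chain, prev = [], ""
    for payload in payloads:
        digest = record_hash(payload, prev)
        chain.append(
            {"payload": payload, "previous_hash": prev, "record_hash": digest}
        )
        prev = digest
    return chain


def first_break(chain: list, start: int = 0, stop: Optional[int] = None) -> Optional[int]:
    """Return the index of the first bad record in chain[start:stop], or None."""
    stop = len(chain) if stop is None else stop
    for i in range(start, stop):
        rec = chain[i]
        expected_prev = chain[i - 1]["record_hash"] if i > 0 else ""
        if rec["previous_hash"] != expected_prev:
            return i  # link to the predecessor is broken
        if rec["record_hash"] != record_hash(rec["payload"], rec["previous_hash"]):
            return i  # record contents no longer match the stored hash
    return None
```

A handler built on something like this would translate from_event_id and to_event_id into start and stop, and copy the returned index (or the matching event id) into first_break instead of None.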

Comment thread src/vaara/server/state.py
Comment on lines +34 to +54
        self._actions: dict[str, _ActionInfo] = {}

    def remember_action(
        self,
        action_id: str,
        agent_id: str,
        tool_name: str,
        predicted_risk: float,
        signals: dict[str, float],
    ) -> None:
        with self._lock:
            self._actions[action_id] = _ActionInfo(
                agent_id=agent_id,
                tool_name=tool_name,
                predicted_risk=predicted_risk,
                signals=signals,
            )

    def lookup_action(self, action_id: str) -> Optional[_ActionInfo]:
        with self._lock:
            return self._actions.get(action_id)

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Bound the in-memory action cache lifecycle.

self._actions only grows and has no eviction strategy. Under sustained traffic this becomes unbounded memory growth for a process-lifetime singleton.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/vaara/server/state.py` around lines 34 - 54, The in-memory actions dict
(self._actions) is unbounded and must be bounded to avoid memory growth; modify
the state class to enforce a capacity and eviction policy: add a max_actions
attribute and change remember_action to insert into a size-limited structure
(e.g., replace dict with an OrderedDict or use an LRU cache behavior) so when
inserting a new _ActionInfo you evict the oldest/least-recent entry if the
capacity is exceeded; ensure lookup_action still acquires self._lock and returns
the _ActionInfo (optionally update recency on lookup if using LRU) and keep
using the existing _lock, remember_action, lookup_action and _ActionInfo symbols
so callers are unchanged.
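
One way to bound the cache is an OrderedDict used as an LRU. The class below is a sketch, not the existing state class: BoundedActionCache, remember, lookup, and the default capacity are hypothetical, and plain dicts stand in for _ActionInfo.

```python
import threading
from collections import OrderedDict
from typing import Optional


class BoundedActionCache:
    """Hypothetical LRU-bounded replacement for the plain _actions dict."""

    def __init__(self, max_actions: int = 10_000) -> None:
        self._lock = threading.Lock()
        self._max = max_actions
        self._actions: OrderedDict = OrderedDict()

    def remember(self, action_id: str, info: dict) -> None:
        with self._lock:
            self._actions[action_id] = info
            self._actions.move_to_end(action_id)  # newest entry is most recent
            while len(self._actions) > self._max:
                self._actions.popitem(last=False)  # evict least recently used

    def lookup(self, action_id: str) -> Optional[dict]:
        with self._lock:
            info = self._actions.get(action_id)
            if info is not None:
                self._actions.move_to_end(action_id)  # refresh recency on read
            return info
```

The trade-off is that an outcome submitted after its action has been evicted gets a 404 from /v1/score/outcome, so the capacity (or a time-based expiry instead) should be sized against how late outcomes are expected to arrive.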

Comment thread tests/test_receipts.py Outdated
Three unused imports introduced in v0.10.0 feature commits triggered
ruff lint failures across Python 3.10-3.13 CI matrix:

  src/vaara/compliance/render.py: 'typing.Any', 'EvidenceStrength'
  tests/test_receipts.py: 'vaara.audit.trail.EventType'

Local ruff and full pytest suite (499 of 499) clean after the removals.
@vaaraio vaaraio merged commit aa8c46f into main May 16, 2026
10 checks passed
@vaaraio vaaraio deleted the feat/v010-kernel-export-receipt branch May 16, 2026 15:12