
feat(timeout): #224 ledger-query timeout with Claude-hooks context surfacing #323

Merged
Knapp-Kevin merged 1 commit into dev from feat/224-surrealdb-query-timeout
May 14, 2026

Conversation

@Knapp-Kevin

Collaborator

Summary

Two-layer timeout architecture for SurrealDB ledger queries — closes BicameralAI/bicameral-daemon#22 + addresses operator directive 2026-05-13 ("with Claude specifically we need to leverage hooks that make calls to the MCP for relative context at an appropriate gate").

Layer 1 — Deterministic server-side gate (the floor)

  • asyncio.wait_for wrap on every LedgerClient.query / execute / execute_many
  • New LedgerTimeoutError(LedgerError) carrying sql_prefix (200-char cap) / timeout_class / elapsed_seconds / budget_seconds
  • Two budget classes: read (5s default) and drift (30s default), clamped to safe ranges in context.py
  • Fail-closed config readers: NaN / Inf / string / bool / negative / zero all fall back to documented default; out-of-range values clamp (preserving operator intent for "long but bounded")
  • BICAMERAL_QUERY_TIMEOUT_DISABLE=1 env-override for debugging (mirrors the BICAMERAL_INGEST_RATE_LIMIT_DISABLE precedent)
  • Registered in governance-gates.yaml per BicameralAI/bicameral-daemon#34 doctrine — skill text is advisory, the wrap is the truth
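
The Layer 1 wrap can be illustrated with a minimal sketch. It is not the code from this PR: `run_with_budget`, the `BUDGETS` table, and the stand-in `LedgerError` base class are hypothetical names introduced for illustration. Only `asyncio.wait_for`, the error's field names, the 200-character cap, and the 5s / 30s defaults come from the description above.

```python
import asyncio
import time
from typing import Any, Awaitable, Dict, Optional

# Defaults quoted in the PR description; the real values come from config.
BUDGETS: Dict[str, float] = {"read": 5.0, "drift": 30.0}
SQL_PREFIX_CAP = 200  # sql_prefix is capped at 200 characters


class LedgerError(Exception):
    """Stand-in for the project's base ledger error (assumed, not shown in the PR)."""


class LedgerTimeoutError(LedgerError):
    """Carries the four fields named in the PR description."""

    def __init__(self, sql: str, timeout_class: str, elapsed: float, budget: float):
        self.sql_prefix = sql[:SQL_PREFIX_CAP]
        self.timeout_class = timeout_class
        self.elapsed_seconds = elapsed
        self.budget_seconds = budget
        super().__init__(
            f"{timeout_class} query exceeded {budget:.2f}s budget after {elapsed:.2f}s"
        )


async def run_with_budget(
    awaitable: Awaitable[Any],
    sql: str,
    timeout_class: str = "read",
    budgets: Optional[Dict[str, float]] = None,
) -> Any:
    """Hypothetical helper: await a query under the budget for its class."""
    budget = (budgets or BUDGETS)[timeout_class]
    started = time.monotonic()
    try:
        return await asyncio.wait_for(awaitable, timeout=budget)
    except asyncio.TimeoutError:
        elapsed = time.monotonic() - started
        raise LedgerTimeoutError(sql, timeout_class, elapsed, budget) from None
```

A caller that needs the longer budget would pass `timeout_class="drift"`; everything else stays on the `read` default, which matches the routing described under Handler annotations.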

Layer 2 — Claude Code hooks (advisory context, additive)

  • .claude/hooks/session_start_timeout_posture.py — one-line stderr brief at session start (budgets, last-hour timeout counts, env-disable state)
  • .claude/hooks/pre_tool_use_timeout_context.py — stderr warning before bicameral tool calls when recent (<10 min) timeouts exist
  • New recent_timeout_count field on PreflightResponse (additive, default {"read":0, "drift":0}); backed by in-memory ring buffer (cap 1000) in ledger/timeout_telemetry.py
  • Hooks always exit 0; graceful degrade if bicameral isn't importable; deterministic wrap is unaffected if hooks don't run

Handler annotations

Narrowed during Phase B based on evidence: only handlers/history.py::_fetch_all_decisions_enriched is a single graph-traversal query needing the 30s budget. Other workflows (preflight, sync_middleware, link_commit) chain many individually-fast queries; each carries its own 5s read budget. The plan-doc records the evidence-based narrowing.

Test plan

  • tests/test_query_timeout_unit.py — 25 tests covering wrap behavior, fail-closed config (negative/string/bool/NaN/Inf/clamp/zero), env-disable bypass, sql-prefix truncation, telemetry ring-buffer recording, execute() parity, BicameralContext default construction
  • tests/test_query_timeout_handler_routing.py — 13 static-grep pins (exactly one handler uses drift annotation; governance-gates.yaml entry present; LedgerClient signature exposes Literal["read","drift"] kwarg)
  • tests/test_claude_hooks_timeout_context.py — 9 subprocess invocations of both hook scripts (exit 0; env-disable surfacing; missing-import graceful exit; ring-buffer cap; window filtering)
  • Existing handler regression (tests/test_history_input_span_id.py, tests/test_preflight_attribution_redaction.py) — 15 tests still pass with the new annotation + recent_timeout_count field
  • Config-knob regression (tests/test_context_ingest_max_bytes.py, tests/test_context_ingest_rate_limit.py) — 13 tests pass; the new _read_query_timeout_*_seconds readers mirror the existing pattern
  • ruff format + ruff check clean on all 13 modified/new source files
  • scripts/lint_skill_governance.py parses the new gate entry without error
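
The fail-closed reader behavior that the unit tests pin down (negative/string/bool/NaN/Inf/zero fall back, out-of-range clamps) might look roughly like the sketch below. `read_timeout_seconds` is a hypothetical name, and the clamp bounds used in the usage note are placeholders, since the PR does not state the actual safe ranges.

```python
import math
from typing import Any


def read_timeout_seconds(raw: Any, default: float, lo: float, hi: float) -> float:
    """Hypothetical fail-closed reader: bad input -> default, out of range -> clamp."""
    # bool is a subclass of int, so it has to be rejected explicitly.
    if isinstance(raw, bool) or not isinstance(raw, (int, float)):
        return default
    try:
        value = float(raw)
    except OverflowError:
        # An int too large for a float is treated like any other invalid value.
        return default
    if math.isnan(value) or math.isinf(value) or value <= 0:
        return default
    # A finite positive value is honored as far as the safe range allows.
    return min(max(value, lo), hi)
```

With placeholder bounds of 1s to 60s, `read_timeout_seconds(600, 5.0, 1.0, 60.0)` clamps to the upper bound, while `read_timeout_seconds("600", 5.0, 1.0, 60.0)` falls back to the 5s default because strings are rejected rather than parsed.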

Docs

  • docs/policies/query-timeouts.md — operator-facing reference (defaults, config, env override, error shape, telemetry, governance)
  • docs/policies/claude-hooks-mcp-integration.md — design rationale, wiring instructions for .claude/settings.json, constraints, pattern for adding future hooks at other gates
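
The "always exit 0, degrade gracefully" constraint on the hooks can be sketched as follows. This is an illustration of the pattern only: `run_hook`, its `get_counts` callback, and the message wording are hypothetical, and the sketch says nothing about how the real scripts are wired into .claude/settings.json.

```python
import sys
from typing import Callable, Dict, Optional, TextIO


def run_hook(get_counts: Callable[[], Dict[str, int]], err: Optional[TextIO] = None) -> int:
    """Hypothetical advisory hook body: warn on stderr, never block, always return 0."""
    stream = sys.stderr if err is None else err
    try:
        counts = get_counts()
    except Exception:
        # Graceful degrade: for example, bicameral is not importable here.
        return 0
    total = sum(counts.values())
    if total:
        print(f"bicameral: {total} ledger timeout(s) in the last 10 min {counts}", file=stream)
    return 0
```

A script entry point would end with `sys.exit(run_hook(...))`, so a failure inside the callback can never turn into a non-zero exit. The deterministic wrap from Layer 1 does not depend on this code running at all.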

Plan + audit

  • plan-224-surrealdb-query-timeout.md (in this PR's diff)
  • qor-judge PASS at L2 — two non-blocking observations resolved during implementation

Review Boundary

This PR is the explicit-authorization push of locally-committed work. Per the persistent Review Boundary, neither this PR nor the underlying commit was created without operator authorization.

🤖 Generated with Claude Code

…rfacing

Two-layer gate architecture for SurrealDB ledger queries:

1. Deterministic server-side gate (the floor):
   - asyncio.wait_for wrap on LedgerClient.query/execute/execute_many
   - New LedgerTimeoutError(LedgerError) with sql_prefix/timeout_class/
     elapsed_seconds/budget_seconds (sql truncated to 200 chars)
   - Two budget classes: read (5s default) and drift (30s default)
   - Fail-closed config readers in context.py: clamp on out-of-range,
     fall back to default on NaN/Inf/string/bool/zero/negative
   - BICAMERAL_QUERY_TIMEOUT_DISABLE env-override for debugging
     (mirrors BICAMERAL_INGEST_RATE_LIMIT_DISABLE precedent)
   - Registered in governance-gates.yaml per #205 doctrine

2. Claude-Code hook layer (advisory, additive, never blocking):
   - .claude/hooks/session_start_timeout_posture.py: one-line stderr
     brief at session start (budgets, last-hour timeout counts,
     env-disable state)
   - .claude/hooks/pre_tool_use_timeout_context.py: stderr warning
     before bicameral tool calls when recent timeouts (<10min) exist
   - New recent_timeout_count field on PreflightResponse (additive,
     default {"read":0,"drift":0}); backed by in-memory ring buffer
     (cap 1000) in ledger/timeout_telemetry.py
   - Both hooks exit 0 unconditionally; graceful degrade if MCP
     unreachable

Handler annotations: handlers/history.py:_fetch_all_decisions_enriched
takes timeout_class="drift" (the only single-query graph-traversal
site). All other handlers stay on the 5s read default — composite
workflows like preflight/sync_middleware/link_commit chain many
individually-fast queries; each carries its own read budget.

Tests (47 new, all passing):
- tests/test_query_timeout_unit.py (25): wrap behavior, fail-closed
  config across negative/string/bool/NaN/Inf/clamp paths, env-disable
  bypass, sql-prefix truncation, telemetry ring-buffer recording,
  execute() parity, BicameralContext default construction.
- tests/test_query_timeout_handler_routing.py (13): static-grep pins
  on drift-class annotation; sanity-check that exactly one handler
  site uses drift; governance-gates.yaml entry present; LedgerClient
  signature exposes Literal["read","drift"] kwarg.
- tests/test_claude_hooks_timeout_context.py (9): subprocess
  invocation of both hook scripts, env-disable surfacing, missing-
  import graceful exit, ring-buffer cap at 1000, window filtering.

Docs:
- docs/policies/query-timeouts.md
- docs/policies/claude-hooks-mcp-integration.md

Plan: plan-224-surrealdb-query-timeout.md (audited PASS at L2)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented May 14, 2026


Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 24d34d2b-5ffb-449d-8ccc-fff085f7c01e

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


@Knapp-Kevin added the feat (Feature work or user-visible capability) and P1 (High: ship this milestone; user-impacting bug or committed feature) labels on May 14, 2026
@Knapp-Kevin merged commit fd6059f into dev on May 14, 2026
10 of 11 checks passed
@Knapp-Kevin deleted the feat/224-surrealdb-query-timeout branch on May 14, 2026 20:28
