perf(server): denormalize runs.agent_id+conversation_id + reserve-connection cap by buremba · Pull Request #870 · lobu-ai/lobu

buremba · 2026-05-18T13:07:16Z

Summary

Two perf prerequisites for flipping LOBU_SESSION_STORE=snapshot to default (Phase 5 of the PR #865 snapshot work). Both surfaced by the 6-agent review on #865.

Fix 1 — `isRunOwnedByJwtScope` JSONB full scan

isRunOwnedByJwtScope (called from every snapshot POST under the snapshot path) was running:

WHERE id=$1 AND organization_id=$2
  AND action_input->>'agentId' = $3
  AND action_input->>'conversationId' = $4

No functional/expression index covered the JSONB shape. On the prod runs table it's a 10–100ms seq scan per worker completion.

Changes:

New migration 20260518050000_runs_denormalize_agent_conversation.sql adds nullable agent_id + conversation_id columns, backfills from action_input for historical rows, and creates a partial index runs_agent_conv_idx (id, organization_id, agent_id, conversation_id) WHERE both NOT NULL.
RunsQueue.send extracts the keys from the payload at insert time and populates the new columns alongside action_input.
isRunOwnedByJwtScope reads the scalar columns directly so the planner resolves an index-only scan.

CREATE INDEX is non-CONCURRENT following the precedent in 20260426130001_db_integrity_cleanup_concurrent.sql (dbmate + pq still present CONCURRENTLY statements as in-transaction). Defensive lock_timeout=30s for the DDL; bumped to 0 around the backfill so row-locks can wait under live traffic.

Fix 2 — `acquireConversationLock` reserve() pressure

Each spawned worker under snapshot mode pinned one sql.reserve() connection for the full subprocess lifetime (10s–1h+). Multi-pod × multi-conversation pressure exhausts the postgres-js pool (default max 20).

Changes:

LOBU_MAX_RESERVED_LOCKS env (default 50). Acquirer checks the in-process counter before calling sql.reserve(); when the cap is reached, returns null so spawnDeployment re-queues the run via the existing DEPLOYMENT_CREATE_FAILED retry path.
getReservedLockCount() helper exposes the counter for metrics / health probes.
One-shot WARN log when the counter crosses 80% of cap (re-armed when it falls back below).
Counter increment moves BEFORE await sql.reserve() to gate concurrent acquirers; decrement is idempotent and runs on every release path (lock-busy, throw, normal release).

Cap value 50 is a conservative starting point; real tuning happens by env once we observe the metric in prod.

Reproducer

Fix 1 — index lookup vs seq scan (on local PGlite):

cd packages/server && bun test src/gateway/__tests__/runs-denormalize-and-reserve-cap.test.ts
# → 8 pass, including "partial index is present and covers the verifier predicate"

Fix 2 — cap rejection:

# Same file; cap=2 + staged counter=2 returns null, counter unchanged.

Validation

make typecheck → clean
make build-packages → clean
bun test packages/server → 1498 pass / 117 fail / 64 errors. Baseline on origin/main from same harness: 1490 pass / 118 fail / 65 errors. Delta: +8 passes (new tests), -1 fail, -1 error. All remaining failures are pre-existing and unrelated to this PR (admin auth routes, isolated-vm sandbox, settings-auth cookie tests).
schema_migrations.version confirmed still varchar(128) after schema.sql regeneration.

Test plan

Insert-then-read round trip for agent_id / conversation_id columns
isRunOwnedByJwtScope correctness: true for matching scope; false for wrong agent / conv / org
Backfill SQL populates columns on rows with NULL columns + populated action_input
Partial index runs_agent_conv_idx exists with the expected shape
Cap exhaustion returns null and does not increment counter
Embedded-mode (LOBU_DISABLE_PREPARE=1) returns no-op sentinel without touching counter
Counter resets between tests
Cap re-read fresh from env every call

Notes for review

Pre-existing rows from connector lanes (sync/action/watcher/auth) carry NULL in the new columns. The partial index's WHERE NOT NULL predicate keeps it small.
The crossover window between migration deploy and new-code deploy is harmless: only LOBU_SESSION_STORE=snapshot mode reads through isRunOwnedByJwtScope, and that flag isn't default until Phase 5.

Summary by CodeRabbit

New Features
- Added /health/orchestrator endpoint to monitor reserved conversation lock utilization and capacity metrics.
Tests
- Added integration tests validating denormalization logic and reserved lock cap enforcement.

…nection cap for snapshot prereqs Two perf prerequisites required before flipping LOBU_SESSION_STORE=snapshot to default: 1. isRunOwnedByJwtScope JSONB seq scan The verifier ran 'action_input->>agentId' / '... ->>conversationId' against runs on every snapshot POST. On a multi-million-row runs table that's 10-100ms per worker completion. Add nullable agent_id + conversation_id columns to runs, populate them on insert via RunsQueue.send, backfill historical rows in migration, and add a partial index over (id, organization_id, agent_id, conversation_id) WHERE both non-null. isRunOwnedByJwtScope now reads the columns directly so the planner resolves an index-only seek. 2. acquireConversationLock unbounded pool pressure Every active worker holds a sql.reserve() connection for its entire lifetime (10s-1h+). Multi-pod x multi-worker exhausts the postgres-js pool. Add LOBU_MAX_RESERVED_LOCKS (default 50) cap before reservation, an in-process counter exposed via getReservedLockCount(), and an 80% threshold WARN. Reaching the cap returns null so spawnDeployment re-queues the run. Migration ships chunked-friendly UPDATE for the backfill with lock_timeout relaxation, and a non-CONCURRENTLY CREATE INDEX following the precedent in 20260426130001_db_integrity_cleanup_concurrent.sql (dbmate + pq still present these as in-transaction).

coderabbitai · 2026-05-18T13:07:32Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

This pull request denormalizes run-scoped identifiers (agent_id, conversation_id) into scalar columns in public.runs, updates the queue producer to write these values and the verifier to read them with COALESCE fallback during migration, and adds an in-process reserved-connection cap for embedded advisory locks with health metrics.

Changes

Runs denormalization and reserved locks cap

Layer / File(s)	Summary
Database schema and migration `db/migrations/20260518050000_runs_denormalize_agent_conversation.sql`, `db/schema.sql`	Adds nullable `agent_id` and `conversation_id` columns to `public.runs`; migration applies columns on up and removes on down with no backfill logic, relying on application behavior to populate values.
Queue producer enqueue updates `packages/server/src/gateway/infrastructure/queue/runs-queue.ts`	Extracts `agentId`, `conversationId`, and `organizationIdFromPayload` from payload when present as non-empty strings; writes them into `public.runs` scalar columns alongside `action_input` JSONB; parameter bindings extended to match new column positions.
Transcript route verifier query update `packages/server/src/gateway/gateway/transcript-routes.ts`	`isRunOwnedByJwtScope` uses `COALESCE(agent_id, action_input ->> 'agentId')` and `COALESCE(conversation_id, action_input ->> 'conversationId')` to handle migration ordering and enable partial-index optimization.
Test harness and helper functions `packages/server/src/gateway/__tests__/runs-denormalize-and-reserve-cap.test.ts` (lines 1–93)	Sets up Bun integration test lifecycle with PGlite, database reset before each test; introduces `ensureOrg` and `insertChatRun` helpers to insert runs using production payload shape.
Denormalization validation tests `packages/server/src/gateway/__tests__/agent-transcript-snapshot.test.ts`, `packages/server/src/gateway/__tests__/runs-denormalize-and-reserve-cap.test.ts` (lines 95–241)	Validates scalar column population and round-trip; verifies ownership checks on denormalized columns; covers COALESCE fallback for rows with only `action_input` populated and historical rows with `NULL` scalar columns.
Reserved locks cap implementation `packages/server/src/gateway/orchestration/impl/embedded-deployment.ts`	Adds configurable `LOBU_MAX_RESERVED_LOCKS` cap; `acquireConversationLock` enforces cap and returns `null` when reached, increments counter before reservation, decrements idempotently on failure; `release()` decrements counter after releasing connection.
Reserved locks cap tests `packages/server/src/gateway/__tests__/runs-denormalize-and-reserve-cap.test.ts` (lines 243–335)	Tests cap exhaustion without incrementing counter; embedded-mode sentinel behavior; cap-reject when counter at/above cap; counter reset and metric timing.
Orchestrator health endpoint `packages/server/src/index.ts`	Adds `GET /health/orchestrator` returning reserved lock count, cap, and an 80% `near_cap` boolean.

Sequence Diagram

sequenceDiagram
  participant RunsQueue
  participant Database
  participant Verifier
  RunsQueue->>Database: INSERT run (agent_id, conversation_id, action_input, ...)
  Verifier->>Database: SELECT ... WHERE COALESCE(agent_id, action_input->>'agentId')=$1 AND COALESCE(conversation_id, action_input->>'conversationId')=$2
  Database->>Verifier: row / null (with fallback for pre-backfill rows)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

lobu-ai/lobu#865: Updates isRunOwnedByJwtScope ownership logic in transcript routes to use the newly denormalized public.runs scalar columns with COALESCE fallback, directly supporting per-run JWT binding checks.

Poem

🐰 Two columns slip into runs so neat,
Agent and chat now scalar, complete!
A cap upon locks keeps resources fleet,
Health checks report the balance they meet—
Denormalization makes the snapshot sweet! 🎯

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main changes: denormalizing two columns to fix JSONB scanning, and adding a reserve-connection cap for performance prerequisites.
Description check	✅ Passed	The description includes all required template sections: detailed Summary explaining both fixes with context, comprehensive Test plan with all items checked, and Notes section addressing pre-existing rows and crossover window concerns.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch perf/runs-jsonb-scan-and-reserve-cap

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

codecov-commenter · 2026-05-18T13:09:19Z

⚠️ Please install the to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

…back + metric endpoint Addresses 4 findings from codex review of PR #870: 1. Migration was wrapping ALTER + UPDATE + CREATE INDEX in a single transaction → AccessExclusive on runs held until commit, blocking writes for the duration of a million-row backfill. Now transaction:false with chunked backfill in a DO block (5000 rows / batch, tunable via lobu.backfill_chunk GUC), each step committing independently. 2. Deploy-order race: an old gateway still inserting rows post-migration would write only action_input, no scalar columns. isRunOwnedByJwtScope now COALESCEs the scalar columns onto the JSONB extraction so the crossover window is safe. Tested. 3. Index shape: dropped the over-engineered cover index leading with id (PK already serves single-row lookups). New index is (agent_id, conversation_id, id DESC) WHERE both non-null — useful for queries that look up runs by the (agent, conversation) pair, which is the genuine value of denormalization. 4. Metric exposed: new /health/orchestrator route returns reserved_conversation_locks + cap + near_cap so operators can scrape without code changes.

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

packages/server/src/gateway/__tests__/runs-denormalize-and-reserve-cap.test.ts (1)

289-304: ⚡ Quick win

Test intent and assertion diverge in the “below cap” branch.

The comment says this path should stop hitting cap rejection, but the code sets count=1 and cap=1 and still asserts rejection. Either adjust the narrative or split this into a true “below cap” case so the branch claim is actually validated.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@packages/server/src/gateway/__tests__/runs-denormalize-and-reserve-cap.test.ts`
around lines 289 - 304, The test's comment says it should verify that dropping
the reserved count "below the cap" stops cap rejections, but the code sets
setReservedLockCountForTests(1) and process.env.LOBU_MAX_RESERVED_LOCKS = "1"
then expects a rejection; to fix, make the assertion and setup consistent:
either change the setup to truly go below cap (e.g.,
setReservedLockCountForTests(0) and keep LOBU_MAX_RESERVED_LOCKS="1" then expect
acquireConversationLock("org-a","agent-a","conv-b") not to be null and assert
getReservedLockCount() reflects the new value), or keep the current numeric
setup but update the comment to explain this is a re-bump/check that still
rejects; update the test around acquireConversationLock,
setReservedLockCountForTests, process.env.LOBU_MAX_RESERVED_LOCKS, and
getReservedLockCount accordingly.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@db/migrations/20260518050000_runs_denormalize_agent_conversation.sql`:
- Around line 58-59: The WHERE clause for the backfill currently only checks
action_input ? 'agentId' and agent_id IS NULL, allowing conversation_id to
remain NULL; update the predicate in the backfill (the SQL that references
agent_id, conversation_id and action_input) to require both JSON keys
(action_input ? 'agentId' AND action_input ? 'conversationId') and both target
columns be NULL (agent_id IS NULL AND conversation_id IS NULL) so rows are only
updated when both denormalized values will be populated.

In `@packages/server/src/gateway/orchestration/impl/embedded-deployment.ts`:
- Around line 292-300: The cleanup path may skip calling decrementOnce() if
reserved.release() throws, leaking reservedLockCount; change the two early-exit
blocks that call reserved.release() then decrementOnce() (the branches around
rows[0]?.acquired check and the catch block) to mirror the pattern used later:
call reserved.release() inside its own try/catch and always call decrementOnce()
in a finally (or ensure decrementOnce() runs after a guarded release) so that
decrementOnce() cannot be skipped even if reserved.release() throws; update the
code paths referencing reserved.release() and decrementOnce() accordingly.

---

Nitpick comments:
In
`@packages/server/src/gateway/__tests__/runs-denormalize-and-reserve-cap.test.ts`:
- Around line 289-304: The test's comment says it should verify that dropping
the reserved count "below the cap" stops cap rejections, but the code sets
setReservedLockCountForTests(1) and process.env.LOBU_MAX_RESERVED_LOCKS = "1"
then expects a rejection; to fix, make the assertion and setup consistent:
either change the setup to truly go below cap (e.g.,
setReservedLockCountForTests(0) and keep LOBU_MAX_RESERVED_LOCKS="1" then expect
acquireConversationLock("org-a","agent-a","conv-b") not to be null and assert
getReservedLockCount() reflects the new value), or keep the current numeric
setup but update the comment to explain this is a re-bump/check that still
rejects; update the test around acquireConversationLock,
setReservedLockCountForTests, process.env.LOBU_MAX_RESERVED_LOCKS, and
getReservedLockCount accordingly.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: c0da1fbf-f2c3-49b7-9034-507b7382edbe

📥 Commits

Reviewing files that changed from the base of the PR and between 8d1beee and 45feb6c.

📒 Files selected for processing (7)

db/migrations/20260518050000_runs_denormalize_agent_conversation.sql
db/schema.sql
packages/server/src/gateway/__tests__/agent-transcript-snapshot.test.ts
packages/server/src/gateway/__tests__/runs-denormalize-and-reserve-cap.test.ts
packages/server/src/gateway/gateway/transcript-routes.ts
packages/server/src/gateway/infrastructure/queue/runs-queue.ts
packages/server/src/gateway/orchestration/impl/embedded-deployment.ts

coderabbitai · 2026-05-18T13:14:12Z

    if (!rows[0]?.acquired) {
      reserved.release();
+      decrementOnce();
      return null;
    }
  } catch (err) {
    reserved.release();
+    decrementOnce();
    throw err;


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Verify all reserved release paths around lock acquisition are guarded so decrement always happens. rg -n -C3 'reserved\.release|decrementOnce\(' packages/server/src/gateway/orchestration/impl/embedded-deployment.ts

Repository: lobu-ai/lobu

Length of output: 1164

Ensure counter decrement cannot be skipped when release throws.

On lines 293-294 and 298-299, reserved.release() runs before decrementOnce(). If release() throws, decrementOnce() won't execute, leaking reservedLockCount and eventually causing false cap rejections. The cleanup path (lines 337-345) already implements the correct pattern with try-catch around release and a finally-block decrement.

Proposed fix

+ const releaseReservedAndDecrement = (): void => { + try { + reserved.release(); + } catch (releaseErr) { + logger.warn( + `Failed to release reserved DB connection for ${organizationId}/${agentId}/${conversationId}: ${ + releaseErr instanceof Error ? releaseErr.message : String(releaseErr) + }` + ); + } finally { + decrementOnce(); + } + }; + try { const rows = (await reserved`SELECT pg_try_advisory_lock(${CONV_LOCK_KEY1}, ${key2}) AS acquired`) as Array<{ acquired: boolean }>; if (!rows[0]?.acquired) { - reserved.release(); - decrementOnce(); + releaseReservedAndDecrement(); return null; } } catch (err) { - reserved.release(); - decrementOnce(); + releaseReservedAndDecrement(); throw err; }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if (!rows[0]?.acquired) {

reserved.release();

decrementOnce();

return null;

}

} catch (err) {

reserved.release();

decrementOnce();

throw err;

const releaseReservedAndDecrement = (): void => {

try {

reserved.release();

} catch (releaseErr) {

logger.warn(

`Failed to release reserved DB connection for ${organizationId}/${agentId}/${conversationId}: ${

releaseErr instanceof Error ? releaseErr.message : String(releaseErr)

}`

);

} finally {

decrementOnce();

}

};

try {

const rows = (await reserved`SELECT pg_try_advisory_lock(${CONV_LOCK_KEY1}, ${key2}) AS acquired`) as Array<{ acquired: boolean }>;

if (!rows[0]?.acquired) {

releaseReservedAndDecrement();

return null;

}

} catch (err) {

releaseReservedAndDecrement();

throw err;

}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/server/src/gateway/orchestration/impl/embedded-deployment.ts` around lines 292 - 300, The cleanup path may skip calling decrementOnce() if reserved.release() throws, leaking reservedLockCount; change the two early-exit blocks that call reserved.release() then decrementOnce() (the branches around rows[0]?.acquired check and the catch block) to mirror the pattern used later: call reserved.release() inside its own try/catch and always call decrementOnce() in a finally (or ensure decrementOnce() runs after a guarded release) so that decrementOnce() cannot be skipped even if reserved.release() throws; update the code paths referencing reserved.release() and decrementOnce() accordingly.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/server/src/index.ts`:
- Around line 450-454: The parsedCap/cap logic allows 0 but clamps negatives to
50 and also calls Number.parseInt(rawCap, 10) twice; update the code that
computes parsedCap (using rawCap) to parse once into a local value (e.g., const
parsed = Number.parseInt(rawCap, 10)) and then treat parsed <= 0 as the fallback
50 (or explicitly clamp zero to 50) so that both negative and zero map to 50,
finally assign cap from that clamped parsed value (keep variable names parsedCap
and cap for clarity).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 816dac00-e486-4fbd-95d5-ec37bbe545f9

📥 Commits

Reviewing files that changed from the base of the PR and between 45feb6c and 6538ecd.

📒 Files selected for processing (5)

db/migrations/20260518050000_runs_denormalize_agent_conversation.sql
db/schema.sql
packages/server/src/gateway/__tests__/runs-denormalize-and-reserve-cap.test.ts
packages/server/src/gateway/gateway/transcript-routes.ts
packages/server/src/index.ts

🚧 Files skipped from review as they are similar to previous changes (2)

packages/server/src/gateway/gateway/transcript-routes.ts
packages/server/src/gateway/tests/runs-denormalize-and-reserve-cap.test.ts

…ool size Round 2 review of PR #870: 1. Migration backfill is now actually chunk-committed. Previous DO block ran as a single implicit transaction so the LOOP didn't commit between iterations. Switched to a temp PROCEDURE (PG11+ feature) which DOES support COMMIT inside the loop body — each 5000-row chunk releases its row locks before the next batch. 2. LOBU_MAX_RESERVED_LOCKS default is now derived from DB_POOL_MAX with a POOL_HEADROOM of 5 (default cap = 15 against a 20-connection pool). The prior fixed default of 50 against a 20-connection pool meant callers 21-50 would block inside sql.reserve() instead of hitting the cap at all — defeating the cap's purpose. /health/orchestrator now reports the derived cap, not a hardcoded one. 3. Comment in runs-queue.ts no longer overclaims that isRunOwnedByJwtScope gets an index-only seek win from the new index — runs_pkey already serves single-row lookups on id. The index earns its keep on (agent_id, conversation_id) lookups that don't specify id.

…igration Round 3 review of PR #870: 1. Backfill predicate now requires action_input->>'agentId' IS NOT NULL instead of '? agentId'. A row with {"agentId": null} would have ? agentId TRUE but ->>'agentId' = NULL → SET agent_id = NULL is a no-op and the chunk row count never falls to 0. One bad row could wedge the migration LOOP forever. 2. Index build is now in its own transaction:false migration (20260518050001) so CREATE INDEX CONCURRENTLY can run without an enclosing transaction block. The hot multi-million-row runs table no longer eats a write-blocking AccessExclusive during the build. Test loader handles the second migration correctly because postgres.js runs a single-statement unsafe() via the simple query protocol — no implicit BEGIN/COMMIT wrap.

… index Round 4 review of PR #870: 1. CREATE INDEX CONCURRENTLY does not work in this repo's dbmate+pq setup even under transaction:false (per the comments in 20260426130001_db_integrity_cleanup_concurrent.sql:145-148 and 20260430151215). Plain CREATE INDEX would block writes for the duration of the build on the multi-million-row runs table. Since runs_pkey already serves the verifier's hot path, the index has no required user — dropped entirely. 2. The chunked backfill scans WHERE agent_id IS NULL with LIMIT, no keyset cursor; on millions of rows this degenerates to O(N²) repeated scans of the already-updated prefix. Dropped the backfill from the migration entirely. The COALESCE fallback in isRunOwnedByJwtScope already handles historical NULL rows correctly. Operators who want the columns populated for diagnostic queries can run a keyset-anchored backfill via a runbook. Migration now does just one thing: ADD COLUMN (metadata-only on PG11+, brief AccessExclusive). All complexity moved into runtime fallback. Updated test renames the backfill test to verify the COALESCE protects legacy NULL-column rows + drops the index-shape assertion.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@packages/server/src/gateway/__tests__/runs-denormalize-and-reserve-cap.test.ts`:
- Around line 6-7: Update the stale header comment in the test file
runs-denormalize-and-reserve-cap.test.ts to remove the claim that a partial
index exists and that the verifier is index-only; instead state that index
creation was removed by the migration and any queries may require a full scan or
appropriate new index if needed. Locate the comment block containing the phrase
"`... ->> 'conversationId'`. A partial index covers the predicate so the
verifier is index-only" and replace it with wording that reflects the current
migration state (e.g., note that the partial index was removed and performance
assumptions no longer hold).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: f28e8ddc-4b46-47f5-8556-d1c5516f6ff9

📥 Commits

Reviewing files that changed from the base of the PR and between ca7c687 and 3d28b25.

📒 Files selected for processing (4)

db/migrations/20260518050000_runs_denormalize_agent_conversation.sql
db/schema.sql
packages/server/src/gateway/__tests__/runs-denormalize-and-reserve-cap.test.ts
packages/server/src/gateway/infrastructure/queue/runs-queue.ts

🚧 Files skipped from review as they are similar to previous changes (1)

packages/server/src/gateway/infrastructure/queue/runs-queue.ts

…ate output CI's migration check regenerates db/schema.sql via dbmate and compares. The pre-existing schema in main has the constraint as agent_transcript_snapshot_org_agent_conv_run_key (likely renamed by hand or in an older dbmate version) while dbmate now emits the auto-named agent_transcript_snapshot_organization_id_agent_id_conversa_key. Align the snapshot file with dbmate's output to clear the drift check.

… job The 'embedded mode returns no-op sentinel' and 'lock-cross-conv-parallelism (embedded sentinel)' tests assert behavior specific to LOBU_DISABLE_PREPARE=1 (PGlite). On the real-PG integration job they'd hit the cap+reserve path and the embedded-sentinel assertions don't apply. Gated both with skipIf.

…unused) (#873) PR #870 added two columns (`runs.agent_id`, `runs.conversation_id`) on the premise that the snapshot verifier (`isRunOwnedByJwtScope`) was doing a JSONB full-scan on `runs`. That premise was wrong: the verifier query is a PK lookup (`WHERE id = $1`) and the JSONB extraction on the single matched row is microseconds. The columns landed with a COALESCE fallback to JSONB on every read, have zero downstream consumers (claim loop doesn't filter on them, reaper doesn't either), and add INSERT-path overhead in `RunsQueue.send` for no win. Remove the slop. Kept from PR #870: the `acquireConversationLock` reserved-connection cap, the counter helper, and `/health/orchestrator`. Those are genuine multi-replica plumbing and unrelated to the JSONB perf claim. * Add migration `20260518060000_revert_runs_denormalize.sql` to drop both columns (DROP COLUMN IF EXISTS — metadata-only on PG11+). * Update `db/schema.sql` to match; `schema_migrations.version` preserved as `varchar(128)` per the known CI footgun. * Revert `isRunOwnedByJwtScope` to JSONB-only. * Revert `RunsQueue.send` INSERT shape (drop agent_id / conversation_id / organization_id from the column list). * Rename `runs-denormalize-and-reserve-cap.test.ts` → `reserve-cap.test.ts` and remove the denormalize describe block. Reserve-cap tests stay verbatim. * Strip the new-columns from the `insertRun` helper in `agent-transcript-snapshot.test.ts`. Snapshot suite still passes via the JSONB-only verifier (all 858 gateway tests green). Validation: - `make typecheck`: same 18 pre-existing errors on origin/main, no new errors introduced. - `make build-packages`: clean. - `bun test packages/server/src/gateway/__tests__/`: 858 pass / 0 fail.

coderabbitai Bot reviewed May 18, 2026

View reviewed changes

Comment thread packages/server/src/index.ts Outdated

buremba added 3 commits May 18, 2026 14:18

coderabbitai Bot reviewed May 18, 2026

View reviewed changes

Comment thread packages/server/src/gateway/__tests__/runs-denormalize-and-reserve-cap.test.ts

buremba added 2 commits May 18, 2026 14:32

buremba merged commit d4691a7 into main May 18, 2026
3 of 4 checks passed

buremba deleted the perf/runs-jsonb-scan-and-reserve-cap branch May 18, 2026 13:37

This was referenced May 18, 2026

chore(main): release lobu 7.2.0 #863

Merged

feat(server,chart): flip snapshot default + drop workspaces PVC (Phase 5) #871

Merged

chore(server): revert runs.agent_id+conversation_id denormalization (unused) #873

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(server): denormalize runs.agent_id+conversation_id + reserve-connection cap#870

perf(server): denormalize runs.agent_id+conversation_id + reserve-connection cap#870
buremba merged 7 commits into
mainfrom
perf/runs-jsonb-scan-and-reserve-cap

buremba commented May 18, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 18, 2026 •

edited

Loading

Reviews paused

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

codecov-commenter commented May 18, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot May 18, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

buremba commented May 18, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Fix 1 — isRunOwnedByJwtScope JSONB full scan

Fix 2 — acquireConversationLock reserve() pressure

Reproducer

Validation

Test plan

Notes for review

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

codecov-commenter commented May 18, 2026

Codecov Report

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

buremba commented May 18, 2026 •

edited by coderabbitai Bot

Loading

Fix 1 — `isRunOwnedByJwtScope` JSONB full scan

Fix 2 — `acquireConversationLock` reserve() pressure

coderabbitai Bot commented May 18, 2026 •

edited

Loading