refactor(server): extract createServerLifecycle to eliminate Postgres/PGlite drift (#948)#951
Conversation
…/PGlite drift (#948) Both entries (server.ts, start-local.ts) now call a shared lifecycle spine in server-lifecycle.ts. Middleware ordering, route mounts, httpServer timeouts, shutdown sequence, and signal wiring are identical by construction. Mode differences are confined to four named hooks (databaseReadiness, preListenHooks, postListenHooks, extraTeardown) and per-mode resource construction in the entry files. Closes #948.
📝 WalkthroughWalkthroughThis pull request consolidates server startup/shutdown, middleware, and HTTP configuration into a shared ChangesShared Server Lifecycle Consolidation
Sequence Diagram(s)sequenceDiagram
participant Client
participant WrapperApp
participant Database
participant WorkspaceProvider
participant LobuGateway
participant Scheduler
participant Reaper
participant HTTPServer
participant EmbeddedWorker
Client->>WrapperApp: HTTP request
WrapperApp->>HTTPServer: dispatch to mounted apps (/ or /lobu)
Note over WrapperApp,HTTPServer: wrapper middleware stashes peer addr, injects env, handles errors
WrapperApp->>LobuGateway: route to gateway (when mounted)
alt Startup flow (createServerLifecycle.start)
createServerLifecycle->>Database: run databaseReadiness checks
createServerLifecycle->>WorkspaceProvider: init
createServerLifecycle->>LobuGateway: init/start
createServerLifecycle->>Scheduler: start
createServerLifecycle->>Reaper: start
createServerLifecycle->>HTTPServer: create + listen (keep-alive/header timeouts)
createServerLifecycle->>EmbeddedWorker: start after listen/postListenHooks
end
alt Shutdown flow (SIGTERM/SIGINT)
createServerLifecycle->>EmbeddedWorker: stop
createServerLifecycle->>Reaper: stop
createServerLifecycle->>Scheduler: stop
createServerLifecycle->>LobuGateway: stop
createServerLifecycle->>Database: close
createServerLifecycle->>HTTPServer: close
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ESLint
ESLint skipped: no ESLint configuration detected in root package.json. To enable, add Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
packages/server/src/server-lifecycle.ts (1)
407-412: ⚡ Quick winAdd re-entrancy guard to prevent concurrent shutdown.
If SIGINT arrives while a SIGTERM-triggered shutdown is in progress,
shutdown()will be invoked a second time. This could cause double-closes on resources (logging confusing errors) or race conditions in the teardown sequence. Whileprocess.exit(0)terminates before the second call completes, a guard improves shutdown clarity.Proposed fix
// 9. Shutdown wiring — declared once, called from both SIGTERM and SIGINT. // Embedded worker handle is captured in the listen callback below. let embeddedWorker: ReturnType<typeof startEmbeddedConnectorWorker> = null; + let shuttingDown = false; const shutdown = async (signal: string): Promise<void> => { + if (shuttingDown) { + logger.info({ signal }, "Shutdown already in progress, ignoring duplicate signal"); + return; + } + shuttingDown = true; logger.info( { signal, mode }, "Received shutdown signal, stopping gracefully...", );🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/server/src/server-lifecycle.ts` around lines 407 - 412, Add a re-entrancy guard so concurrent signal handlers cannot invoke shutdown() twice: introduce a module-level boolean/atomic flag (e.g., isShuttingDown) checked at the start of shutdown() and set to true before any async teardown begins; update the process.on("SIGTERM") and process.on("SIGINT") handlers to call shutdown() as before but rely on the guard to no-op if shutdown is already in progress; ensure the flag is set synchronously before awaiting any async operations and left true until teardown completes (or until you explicitly log a skipped duplicate call).packages/server/src/start-local.ts (1)
202-202: 💤 Low value
personalOrgIdis declared but only used within a single hook closure.The variable
personalOrgIdat line 202 is declared in the outer scope but is only assigned and used within the secondpreListenHooksclosure (lines 230-248). It could be scoped locally within that hook.♻️ Suggested refactor
- // Personal-org id for default-agent provisioning. Resolved once during the - // pre-listen phase rather than per-call, so the dynamic postgres import - // happens with a hot DATABASE_URL. - let personalOrgId: string | null = null; - const lifecycle = createServerLifecycle({ mode: "pglite", env, host: HOST, port: PORT, databaseReadiness: () => runMigrations(dbUrl), preListenHooks: [ // ... first hook unchanged ... async () => { try { const personalOrgRows = (await import("postgres")).default(dbUrl, { max: 1, }); try { const rows = (await personalOrgRows`...`) as unknown as Array<{ id: string }>; - personalOrgId = rows[0]?.id ?? null; + const personalOrgId = rows[0]?.id ?? null; if (personalOrgId) await ensureDefaultAgent(personalOrgId);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/server/src/start-local.ts` at line 202, The variable personalOrgId is declared in outer scope but only assigned/used inside the second preListenHooks closure; move its declaration into that closure (remove the outer let personalOrgId) so personalOrgId is a local variable inside the function passed to preListenHooks where it is set and referenced, updating any references inside that closure to the new local variable.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@packages/server/src/server-lifecycle.ts`:
- Around line 407-412: Add a re-entrancy guard so concurrent signal handlers
cannot invoke shutdown() twice: introduce a module-level boolean/atomic flag
(e.g., isShuttingDown) checked at the start of shutdown() and set to true before
any async teardown begins; update the process.on("SIGTERM") and
process.on("SIGINT") handlers to call shutdown() as before but rely on the guard
to no-op if shutdown is already in progress; ensure the flag is set
synchronously before awaiting any async operations and left true until teardown
completes (or until you explicitly log a skipped duplicate call).
In `@packages/server/src/start-local.ts`:
- Line 202: The variable personalOrgId is declared in outer scope but only
assigned/used inside the second preListenHooks closure; move its declaration
into that closure (remove the outer let personalOrgId) so personalOrgId is a
local variable inside the function passed to preListenHooks where it is set and
referenced, updating any references inside that closure to the new local
variable.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 105a8fd3-d304-46d2-8079-ece13dcaec8f
📒 Files selected for processing (4)
packages/server/src/__tests__/server-lifecycle.test.tspackages/server/src/server-lifecycle.tspackages/server/src/server.tspackages/server/src/start-local.ts
…dings:Env}> Per-package tsc (cd packages/server && bunx tsc --noEmit) rejects the contravariant assignment of a typed mainApp to a parameter declared as plain Hono. The root bun run typecheck is permissive enough to miss this; CI runs both.
|
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
packages/server/src/server-lifecycle.ts (1)
373-406: 💤 Low valueConsider defensive error handling in shutdown sequence.
If any step in the shutdown sequence throws (e.g.,
stopLobuGateway()orembeddedWorker.wait()), subsequent cleanup won't run—closeDbSingleton(),extraTeardownhooks, andhttpServer.close()could all be skipped. Since signal handlers usevoid shutdown(...), errors become unhandled rejections.Given this is a shutdown path, the process will likely exit anyway, but wrapping each step or using a best-effort cleanup pattern would improve resilience.
🔧 Optional: best-effort shutdown pattern
const shutdown = async (signal: string): Promise<void> => { logger.info( { signal, mode }, "Received shutdown signal, stopping gracefully...", ); - // Order matters: - // a. Stop accepting new work from the embedded connector worker. - if (embeddedWorker) { - embeddedWorker.stop(); - await embeddedWorker.wait(15_000); + const safeRun = async (label: string, fn: () => Promise<void> | void) => { + try { + await fn(); + } catch (err) { + logger.error({ err, label }, "Shutdown step failed, continuing..."); + } + }; + + if (embeddedWorker) { + await safeRun("embeddedWorker.stop", () => { + embeddedWorker!.stop(); + return embeddedWorker!.wait(15_000); + }); } - // b. Close Vite (HMR sockets) before tearing down the http server - // so dev-mode listeners detach cleanly. - await vite?.close(); - // c. Stop the reaper poll loop. - stopReaper(); - // d. Stop the task scheduler dispatch loop. - taskScheduler.stop(); - // e. Drain MCP sessions / DB listeners / secret-proxy. Gateway - // holds postgres.js connections that must be released before - // mode-specific db teardown runs. - await stopLobuGateway(); - // f. Close the postgres.js singleton pool. - await closeDbSingleton(); - // g. Mode-specific teardown (PGlite kills embeddings child, stops - // socket server, closes the in-process db). + await safeRun("vite.close", () => vite?.close() ?? Promise.resolve()); + safeRun("stopReaper", stopReaper); + safeRun("taskScheduler.stop", () => taskScheduler.stop()); + await safeRun("stopLobuGateway", stopLobuGateway); + await safeRun("closeDbSingleton", closeDbSingleton); for (const teardown of extraTeardown) { - await teardown(); + await safeRun("extraTeardown", teardown); } - // h. Finally, close the listener. Matches the historical behavior of - // not awaiting (server.ts:260; start-local.ts:322). httpServer.close(); process.exit(0); };🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/server/src/server-lifecycle.ts` around lines 373 - 406, The shutdown sequence in shutdown(...) can abort if any awaited step throws, skipping later cleanup and creating unhandled rejections; change it to a best-effort pattern: wrap each major step (embeddedWorker.stop()/embeddedWorker.wait(...), vite?.close(), stopReaper(), taskScheduler.stop(), stopLobuGateway(), closeDbSingleton(), and each extraTeardown() call) in its own try/catch that logs the error (use logger.error with context) and continues, ensuring httpServer.close() and process.exit(0) always run in a finally-style guarantee so the listener is closed and the process exits even if earlier steps fail.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@packages/server/src/server-lifecycle.ts`:
- Around line 373-406: The shutdown sequence in shutdown(...) can abort if any
awaited step throws, skipping later cleanup and creating unhandled rejections;
change it to a best-effort pattern: wrap each major step
(embeddedWorker.stop()/embeddedWorker.wait(...), vite?.close(), stopReaper(),
taskScheduler.stop(), stopLobuGateway(), closeDbSingleton(), and each
extraTeardown() call) in its own try/catch that logs the error (use logger.error
with context) and continues, ensuring httpServer.close() and process.exit(0)
always run in a finally-style guarantee so the listener is closed and the
process exits even if earlier steps fail.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: cfbc57ff-3d28-49cb-ab11-baf27c25a357
📒 Files selected for processing (1)
packages/server/src/server-lifecycle.ts
…down safety 1. PGlite was missing assertExternalDepsResolvable() in postListenHooks; adding it back closes the exact drift this refactor exists to prevent. Caught by pi review on #951. 2. Wrap every shutdown step in a safe() helper so one rejecting teardown (e.g. stopLobuGateway) can no longer block the rest from running and leave the process stuck with a bound listener. 3. Single-flight guard on the shutdown handler: SIGTERM+SIGINT (or a double-tap on either) used to race gateway-stop / extraTeardown / process.exit. The guard short-circuits the second entry. Strengthens the lifecycle contract test with assertions for the safe() wrapper count and the single-flight guard so future refactors can't silently re-introduce either gap.
pi review applied
Applied as fixes (commit 4b76d89):
Contract test strengthened with:
Rejected:
Validation: 19/19 lifecycle contract tests pass; per-package |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/server/src/server-lifecycle.ts (1)
449-465:⚠️ Potential issue | 🟠 Major | ⚡ Quick winReject listen-phase failures instead of bypassing boot error handling.
start()only resolves here. A bind failure fromhttpServer.listen()(e.g.,EADDRINUSE) will emit an unhandled 'error' event and crash the process, bypassingmain().catch(reportBootFailure). Similarly, synchronous throws frompostListenHooksorstartEmbeddedConnectorWorker()inside the callback will not reject the Promise.Suggested fix
- await new Promise<void>((resolve) => { - httpServer.listen(port, host, () => { - logger.info( - { host, port, mode }, - `Lobu running at http://${host}:${port}`, - ); - for (const hook of postListenHooks) { - hook(); - } - const daemonHost = host === "0.0.0.0" ? "127.0.0.1" : host; - embeddedWorker = startEmbeddedConnectorWorker( - env, - `http://${daemonHost}:${port}`, - ); - resolve(); - }); - }); + await new Promise<void>((resolve, reject) => { + const onListenError = (err: Error) => { + httpServer.off("error", onListenError); + reject(err); + }; + httpServer.once("error", onListenError); + httpServer.listen(port, host, () => { + httpServer.off("error", onListenError); + try { + logger.info( + { host, port, mode }, + `Lobu running at http://${host}:${port}`, + ); + for (const hook of postListenHooks) { + hook(); + } + const daemonHost = host === "0.0.0.0" ? "127.0.0.1" : host; + embeddedWorker = startEmbeddedConnectorWorker( + env, + `http://${daemonHost}:${port}`, + ); + resolve(); + } catch (err) { + httpServer.close(); + reject(err); + } + }); + });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/server/src/server-lifecycle.ts` around lines 449 - 465, The current Promise around httpServer.listen never rejects on listen-phase failures or synchronous throws inside the listen callback; add rejection handling by attaching a one-time httpServer.once('error', (err) => reject(err)) before calling httpServer.listen, and remove that listener (or use httpServer.off) when the listen callback runs successfully; also wrap the postListenHooks loop and the call to startEmbeddedConnectorWorker(...) in a try/catch and call reject(err) on any thrown error (or resolve only after successful completion and assignment to embeddedWorker) so that start() rejects on bind or startup failures instead of letting errors become unhandled.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/server/src/start-local.ts`:
- Line 55: The post-listen dependency check currently calls process.exit(1)
directly (see assertExternalDepsResolvable usage), which bypasses the shared
shutdown lifecycle and skips extraTeardown; instead, make
assertExternalDepsResolvable throw an Error (or propagate its thrown error) and
remove any direct process.exit(1) calls in the postListenHook so that
createServerLifecycle() can detect the thrown error and run the centralized
teardown path. Update the postListenHook that calls assertExternalDepsResolvable
and the lifecycle boot handling in createServerLifecycle() to treat a thrown
error from postListenHook as a fatal boot error and invoke the shared
shutdown/extraTeardown rather than exiting inline.
---
Outside diff comments:
In `@packages/server/src/server-lifecycle.ts`:
- Around line 449-465: The current Promise around httpServer.listen never
rejects on listen-phase failures or synchronous throws inside the listen
callback; add rejection handling by attaching a one-time
httpServer.once('error', (err) => reject(err)) before calling httpServer.listen,
and remove that listener (or use httpServer.off) when the listen callback runs
successfully; also wrap the postListenHooks loop and the call to
startEmbeddedConnectorWorker(...) in a try/catch and call reject(err) on any
thrown error (or resolve only after successful completion and assignment to
embeddedWorker) so that start() rejects on bind or startup failures instead of
letting errors become unhandled.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 0a3d21f5-2732-42d0-aa09-f0210387f60c
📒 Files selected for processing (3)
packages/server/src/__tests__/server-lifecycle.test.tspackages/server/src/server-lifecycle.tspackages/server/src/start-local.ts
| import { pg_trgm } from "@electric-sql/pglite/contrib/pg_trgm"; | ||
| import { vector } from "@electric-sql/pglite/vector"; | ||
| import { PGLiteSocketServer } from "@electric-sql/pglite-socket"; | ||
| import { assertExternalDepsResolvable } from "@lobu/connector-worker/compile"; |
There was a problem hiding this comment.
Don't process.exit(1) from the post-listen dependency check.
Line 264 bypasses the shared shutdown path, so extraTeardown never runs when this check fails. In PGlite mode that can orphan the embeddings child and skip socket/DB cleanup on a failed boot. Let the lifecycle own the fatal path instead of exiting inline.
Suggested direction
postListenHooks: [
() => {
try {
assertExternalDepsResolvable(require.resolve);
} catch (err) {
logger.error({ err }, "Connector external dependency check failed");
- process.exit(1);
+ throw err;
}
},
],If createServerLifecycle() does not already treat a thrown postListenHook as a fatal boot error with teardown, that behavior should live there rather than in this hook.
Also applies to: 252-267
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/server/src/start-local.ts` at line 55, The post-listen dependency
check currently calls process.exit(1) directly (see assertExternalDepsResolvable
usage), which bypasses the shared shutdown lifecycle and skips extraTeardown;
instead, make assertExternalDepsResolvable throw an Error (or propagate its
thrown error) and remove any direct process.exit(1) calls in the postListenHook
so that createServerLifecycle() can detect the thrown error and run the
centralized teardown path. Update the postListenHook that calls
assertExternalDepsResolvable and the lifecycle boot handling in
createServerLifecycle() to treat a thrown error from postListenHook as a fatal
boot error and invoke the shared shutdown/extraTeardown rather than exiting
inline.
Summary
Extracts a shared
createServerLifecyclespine topackages/server/src/server-lifecycle.ts. Both entry files (server.tsPostgres,start-local.tsPGlite) call it with mode-specific hooks. Middleware ordering, route mounts, httpServer timeouts, shutdown sequence, and signal wiring are now identical by construction — drift is structurally impossible.Mode differences confined to:
databaseReadinesshook (Postgres asserts schema; PGlite runs migrations)preListenHooks(PGlite runs install-operator + default-agent provisioning)postListenHooks(Postgres runs connector dep resolvability check)extraTeardown(PGlite kills embeddings child, stops socket, closes PGlite db)Closes #948.
Reproducer (red→fix→green)
Red: server.ts and start-local.ts diverged twice in one week — #940 mount fix added
/lobumount + Sentry-aware error wiring only to one entry; #943 was a 7-hygiene catch-up patching start-local against server. The same fix was independently written by two parallel agents in #940 and #944.Green: with the new spine, every middleware/route/timeout/shutdown step lives in one file. The four named hooks are the only escape valves; anything else added to an entry that already has a home in the spine fails the comment-block contract at the top of
server.tsandstart-local.ts.Boot smoke verified end-to-end against the bundled PGlite mode:
Validation
make build-packages— green (server.bundle.mjs + start-local.bundle.mjs both rebuilt, 0 errors)bun run typecheck— greenbunx biome checkon the four touched files — greenbunx vitest run src/__tests__/server-lifecycle.test.ts— 17/17 passing (covers middleware ordering, env-inject preserving adapter fields, peer-address stash, Sentry 5xx capture vs onError, no double-report,serializeBootErrorcause chains, plus source-level shutdown-ordering and SIGTERM/SIGINT contracts)start-local.bundle.mjsreaches "Lobu running at ..." and SIGINT-shuts down cleanly through the documented orderOut of scope
Worker spawn refactor, signup-hang bug (#947), schema/migration changes.
Summary by CodeRabbit
New Features
Improvements
Tests