fix(lbug): prevent DuckDB extension install hangs#1129

Merged
magyargergo merged 2 commits into main from fix/lbug-extension-install-timeout-1128
Apr 27, 2026

@magyargergo

Collaborator

Summary

  • Prevent gitnexus analyze from hanging when DuckDB fts / VECTOR extension downloads are blocked by moving INSTALL into a bounded child process.
  • Centralize optional extension lifecycle in ExtensionManager with explicit auto, load-only, and never policies so analyze can install while serve/MCP read paths only load preinstalled extensions.
  • Add a stable debuggable installer script at gitnexus/scripts/install-duckdb-extension.mjs, update vector/FTS callers to degrade gracefully, and document the new environment controls.

Closes #1128.

Test plan

  • npx tsc --noEmit
  • npx vitest run test/unit/lbug-extension-loader.test.ts --pool=threads
  • npx vitest run test/unit/embedding-pipeline.test.ts --pool=threads
  • node scripts/install-duckdb-extension.mjs "bad;name" (expected non-zero, verifies debuggable script stack trace)
  • npx vitest run --project=lbug-db test/integration/lbug-core-adapter.test.ts --pool=forks --maxWorkers=1
  • npx vitest run --project=lbug-db test/integration/lbug-vector-extension.test.ts --pool=forks --maxWorkers=1 (assertions pass; Windows may still emit the known native Vitest worker-exit noise after teardown)
  • npm run build
  • npx tsx src/cli/index.ts detect-changes -r GitNexus

Notes

  • Full npm test still shows unrelated pre-existing Swift failures in method-extraction, type-env, and Swift resolver coverage.

…#1128)

`gitnexus analyze` could hang indefinitely (at 60% or 85% on Windows) when
DuckDB's `INSTALL fts` or `INSTALL VECTOR` was unable to reach
`extensions.duckdb.org`. The DuckDB driver's INSTALL is a synchronous
network call, so any blocked egress would block the Node event loop
forever.

Replace the ad-hoc, in-process INSTALL/LOAD scattered across
`lbug-adapter.ts` and `pool-adapter.ts` with a single
`ExtensionManager` that owns the lifecycle of optional DuckDB
extensions:

* `LOAD` is always tried first — per-connection, idempotent, no network.
* If `LOAD` fails and policy permits, INSTALL runs in a short-lived
  child Node process bounded by `GITNEXUS_LBUG_EXTENSION_INSTALL_TIMEOUT_MS`
  (default 15s). The parent loop keeps spinning; on timeout the child is
  killed with SIGKILL and the capability is flagged unavailable.
* Capabilities and install attempts are cached per process, so a single
  bounded install per extension covers every subsequent call.
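
The bounded child-process step described above can be sketched roughly as follows. This is an illustrative sketch, not the PR's code: the helper name `runNodeBounded` and the `{ ok, timedOut }` result shape are assumptions, and only `spawn`, `process.execPath`, and `child.kill` are real Node.js APIs.

```typescript
import { spawn } from "node:child_process";

// Hypothetical result shape for this sketch.
interface BoundedResult {
  ok: boolean;
  timedOut: boolean;
}

// Run `node <args>` in a child process and give up after `timeoutMs`.
// The parent's event loop stays free while the child runs.
function runNodeBounded(args: string[], timeoutMs: number): Promise<BoundedResult> {
  return new Promise((resolve) => {
    let settled = false;
    const finish = (result: BoundedResult): void => {
      if (settled) return; // timeout and exit can race; resolve only once
      settled = true;
      clearTimeout(timer);
      resolve(result);
    };

    // process.execPath plus an argv array: no shell is involved,
    // so arguments are never interpolated into a command line.
    const child = spawn(process.execPath, args, { stdio: "ignore" });

    const timer = setTimeout(() => {
      child.kill("SIGKILL");
      finish({ ok: false, timedOut: true });
    }, timeoutMs);

    child.on("error", () => finish({ ok: false, timedOut: false }));
    child.on("exit", (code) => finish({ ok: code === 0, timedOut: false }));
  });
}
```

A caller would pass the installer script path and the extension name as `args` and treat `timedOut: true` as "capability unavailable" rather than as an error.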

Install policy is now an explicit, per-context decision:

* `auto` (default for analyze) — try LOAD, fall back to bounded INSTALL.
* `load-only` — used by `pool-adapter` (serve / MCP read paths) so user
  queries never block on a network install.
* `never` — operator escape hatch for offline / airgapped environments.
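
A minimal sketch of how the LOAD-first flow might combine with the three policies, assuming hypothetical `ExtensionOps` callbacks in place of the real DuckDB connection and installer. With respect to INSTALL, this sketch treats `load-only` and `never` the same; whatever else distinguishes them in the PR is not modeled here.

```typescript
type InstallPolicy = "auto" | "load-only" | "never";

// Hypothetical seams for this sketch.
interface ExtensionOps {
  load(name: string): Promise<boolean>; // per-connection LOAD, no network
  install(name: string): Promise<boolean>; // bounded out-of-process INSTALL
}

// Returns true when the extension is usable on this connection.
async function ensureExtension(
  name: string,
  policy: InstallPolicy,
  ops: ExtensionOps,
): Promise<boolean> {
  if (await ops.load(name)) return true; // LOAD is always tried first
  if (policy !== "auto") return false; // load-only / never: no INSTALL
  if (!(await ops.install(name))) return false; // blocked or timed out
  return ops.load(name); // INSTALL succeeded, so LOAD again
}
```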

`createFTSIndex` and `createVectorIndex` now check the boolean return
value before issuing the index DDL, so missing extensions degrade BM25
and semantic search gracefully without ever throwing during analyze.
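
The caller-side gating could look roughly like this. The `Conn` interface, the function name, and the `ddl` parameter are placeholders for illustration; the real `createVectorIndex` signature is not shown in this PR text.

```typescript
// Hypothetical minimal connection interface for this sketch.
interface Conn {
  query(sql: string): Promise<void>;
}

// Issue the index DDL only when the extension reported itself available.
// Returns false (without throwing) when the index was skipped.
async function createVectorIndexIfAvailable(
  conn: Conn,
  loadVectorExtension: () => Promise<boolean>,
  ddl: string,
): Promise<boolean> {
  const available = await loadVectorExtension();
  if (!available) return false; // degrade: analyze continues without vector search
  await conn.query(ddl);
  return true;
}
```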

Tests:
- New unit suite for `ExtensionManager` covering LOAD-first behavior,
  all three policies, install caching, observability, and warn dedup.
- Existing vector-extension integration tests pass against the new
  boolean return type.
- Existing embedding-pipeline mocks updated to return `true`.

Docs: `gitnexus/README.md` documents `GITNEXUS_LBUG_EXTENSION_INSTALL`
and `GITNEXUS_LBUG_EXTENSION_INSTALL_TIMEOUT_MS` with examples for
offline and slow-network environments.
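
For illustration, reading the timeout variable with its 15 s default might look like the sketch below. The function name `installTimeoutMs` and the fallback rule for non-numeric or non-positive values are assumptions; the PR text only states the variable name and the default.

```typescript
type Env = Record<string, string | undefined>;

const DEFAULT_INSTALL_TIMEOUT_MS = 15000; // default stated in the commit message

// Parse GITNEXUS_LBUG_EXTENSION_INSTALL_TIMEOUT_MS from an env-like object.
// Assumption: unset, non-numeric, or non-positive values use the default.
function installTimeoutMs(env: Env): number {
  const raw = env["GITNEXUS_LBUG_EXTENSION_INSTALL_TIMEOUT_MS"];
  if (raw === undefined) return DEFAULT_INSTALL_TIMEOUT_MS;
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_INSTALL_TIMEOUT_MS;
}
```

On a slow network an operator would set the variable to a larger value (for example `60000`) before running `gitnexus analyze`; in production code the argument would be `process.env`.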

Made-with: Cursor
Keep the bounded out-of-process INSTALL behavior, but replace the inline child code with a stable packaged ESM script. This makes the child process directly runnable and gives debuggable stack traces without source-vs-dist branching or a runtime transpiler.

Made-with: Cursor
@vercel

vercel Bot commented Apr 27, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| gitnexus | Ready | Preview, Comment | Apr 27, 2026 9:14pm |

@github-actions

github-actions Bot commented Apr 27, 2026

Contributor

Claude finished @magyargergo's task in 5m 1s


PR #1129 Adversarial Production Readiness Review

Tasks:

  • Initialize review
  • Read DoD.md, AGENTS.md, GUARDRAILS.md, CONTRIBUTING.md, TESTING.md, ARCHITECTURE.md
  • Read extension-loader.ts (ExtensionManager)
  • Read install-duckdb-extension.mjs (out-of-process installer)
  • Read lbug-adapter.ts (core adapter lifecycle)
  • Read pool-adapter.ts (read-path behavior)
  • Read embedding-pipeline.ts (vector behavior)
  • Read test files (unit + integration)
  • Read README.md (env docs)
  • Check CI status
  • Produce review

Review bar for this PR

  • Analyze may attempt bounded extension installation but must never hang indefinitely when DuckDB extension downloads are blocked.
  • Serve/MCP/read-pool paths must never run INSTALL; they must only LOAD preinstalled extensions and degrade gracefully.
  • ExtensionManager policy precedence: per-call opts.policy → constructor options.policy → env → auto.
  • auto, load-only, and never policies behave distinctly and are tested.
  • loadFTSExtension() and loadVectorExtension() return accurate capability booleans and all callers handle false safely.
  • Out-of-process installer is safe (no shell interpolation), bounded (SIGKILL on timeout), and validates extension names in both parent and child.
  • Capability/install-attempt caching is correct across repeated calls, DB reconnects, and pool sessions.
  • Documentation accurately describes implemented behavior and env controls.
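
The precedence rule in the review bar (per-call, then constructor, then env, then `auto`) can be expressed as a simple fallback chain. The function name and parameter names below are hypothetical; only the ordering comes from the review bar.

```typescript
type InstallPolicy = "auto" | "load-only" | "never";

// First defined value wins: per-call > constructor > env > "auto".
function resolvePolicy(
  perCall: InstallPolicy | undefined,
  fromConstructor: InstallPolicy | undefined,
  fromEnv: InstallPolicy | undefined,
): InstallPolicy {
  return perCall ?? fromConstructor ?? fromEnv ?? "auto";
}
```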

Understanding of the change

What hang #1128 was: gitnexus analyze blocked indefinitely because the old loadVectorExtension() called INSTALL VECTOR synchronously on the DuckDB connection. DuckDB's INSTALL makes a synchronous network request; in firewalled/airgapped environments this hangs the entire Node.js event loop with no timeout.

Why DuckDB INSTALL can block analyze: INSTALL is a synchronous blocking call inside the DuckDB native module. There is no timeout knob. Once entered, the event loop does not return until the network request completes or is killed at the OS level.

What lifecycle boundary this PR introduces: The installer now runs in a short-lived child node process with process.execPath (no shell interpolation), bounded by SIGKILL after a configurable timeout (default 15 s). The parent resolves with { timedOut: true } and logs a warning; analyze proceeds without the extension.

Which paths may install: Only auto policy (default for analyze / writable singleton connection). The ExtensionManager prevents re-installation after the first attempt within a process via installAttempted cache.
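
One way to get the "at most one install attempt per extension per process" behavior is to cache the attempt's promise by name, as in this sketch. The class name `InstallOnce` is hypothetical and the PR's actual `installAttempted` cache may be shaped differently; the sketch also assumes the injected installer resolves to a boolean rather than rejecting.

```typescript
// Caches the first install attempt per extension name, so repeated or
// concurrent callers share one attempt instead of spawning more children.
class InstallOnce {
  private readonly attempts = new Map<string, Promise<boolean>>();
  private readonly install: (name: string) => Promise<boolean>;

  constructor(install: (name: string) => Promise<boolean>) {
    this.install = install;
  }

  attempt(name: string): Promise<boolean> {
    let pending = this.attempts.get(name);
    if (pending === undefined) {
      pending = this.install(name); // first caller triggers the install
      this.attempts.set(name, pending);
    }
    return pending; // later callers reuse the same result, success or failure
  }
}
```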

Which paths must only load: Pool adapter (initLbug, initLbugWithDb) explicitly passes { policy: 'load-only' } to loadFTSExtension / loadVectorExtension. The CYPHER_WRITE_RE regex in pool-adapter.ts also blocks any user-submitted LOAD or INSTALL Cypher as a defense-in-depth layer.

What must remain unchanged for BM25/vector users: Pre-installed extensions still load normally and search works as before. createFTSIndex and createVectorIndex skip SQL gracefully when the extension isn't available, so analyze still writes the graph DB — only search features degrade.


Findings

[high] No automated test proving pool adapter enforces load-only

  • Category: Tests / safety invariant
  • Files: gitnexus/test/ (gap), gitnexus/src/core/lbug/pool-adapter.ts
  • Issue: The PR's central production-safety claim — that serve/MCP/read paths never trigger a network install — is verified only by reading code. Neither doInitLbug nor initLbugWithDb in pool-adapter.ts has a test that confirms loadFTSExtension and loadVectorExtension are called with { policy: 'load-only' }. If the argument is removed in a future change, no test fails.
  • Why it matters here: GitNexus MCP paths are invoked by AI agents continuously in live editor sessions. A regression here would silently turn every MCP tool call into a potential network-install attempt, causing hangs for all serve/MCP users — exactly the scenario this PR is meant to prevent.
  • Recommended fix: Add a unit test for pool-adapter.ts's doInitLbug that injects a spy on loadFTSExtension/loadVectorExtension (or on extensionManager.ensure) and asserts the resolved policy is 'load-only' for both calls. This can be done without the native module by injecting a pre-built SharedDB or using initLbugWithDb with a mock connection.
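
The shape of such a test, sketched without Vitest or the native module. `initPoolConnection` is a hypothetical stand-in for `doInitLbug` / `initLbugWithDb`, and the recording loader plays the role a `vi.fn()` spy would play in the real suite.

```typescript
type InstallPolicy = "auto" | "load-only" | "never";
interface LoadOpts {
  policy?: InstallPolicy;
}
type Loader = (opts?: LoadOpts) => Promise<boolean>;

// Hypothetical stand-in for the pool adapter's init path.
async function initPoolConnection(loadFTS: Loader, loadVector: Loader): Promise<void> {
  await loadFTS({ policy: "load-only" });
  await loadVector({ policy: "load-only" });
}

// Minimal spy: records the policy passed on each call.
function makeRecordingLoader(): { loader: Loader; seen: Array<InstallPolicy | undefined> } {
  const seen: Array<InstallPolicy | undefined> = [];
  const loader: Loader = async (opts) => {
    seen.push(opts?.policy);
    return false; // simulate "extension not preinstalled"
  };
  return { loader, seen };
}

// The invariant under test: every loader call used 'load-only'.
function usedOnlyLoadOnly(seen: Array<InstallPolicy | undefined>): boolean {
  return seen.length > 0 && seen.every((policy) => policy === "load-only");
}
```

If the `{ policy: 'load-only' }` argument were dropped from either call, the recorded policy would be `undefined` and the invariant check would fail.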

[medium] GITNEXUS_LBUG_EXTENSION_INSTALL env var policy parsing is untested

  • Category: Tests / operator control
  • Files: gitnexus/src/core/lbug/extension-loader.ts, gitnexus/test/unit/lbug-extension-loader.test.ts
  • Issue: resolvePolicyFromEnv() has no unit test. Three behaviors are untested: valid values (auto, load-only, never) are parsed correctly; invalid values fall back to auto silently; and env var policy is overridden by per-call opts.policy or constructor options.policy. The timeout env var test exists (getExtensionInstallTimeoutMs) but the policy env var has no parallel test.
  • Why it matters here: This is the primary operator control for offline/airgapped deployments (GITNEXUS_LBUG_EXTENSION_INSTALL=load-only). If the parsing or precedence logic is wrong, operators who set it for firewall safety would get auto (install-attempting) behavior silently.
  • Recommended fix: Add a test group for resolvePolicyFromEnv mirroring the getExtensionInstallTimeoutMs tests: valid values, invalid fallback, and override by opts.policy in ensure().
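
A sketch of the parsing behavior such tests would pin down. The name `policyFromEnv` is hypothetical, and exact-match comparison (no trimming or case folding) is an assumption; the silent fallback to `auto` is the behavior described in the issue above.

```typescript
type InstallPolicy = "auto" | "load-only" | "never";
type Env = Record<string, string | undefined>;

const POLICIES: readonly InstallPolicy[] = ["auto", "load-only", "never"];

// Map GITNEXUS_LBUG_EXTENSION_INSTALL to a policy; anything else becomes "auto".
function policyFromEnv(env: Env): InstallPolicy {
  const raw = env["GITNEXUS_LBUG_EXTENSION_INSTALL"];
  const match = POLICIES.find((policy) => policy === raw);
  return match ?? "auto";
}
```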

[medium] createVectorIndex skip-on-false path not covered by unit tests

  • Category: Tests / changed contract
  • Files: gitnexus/test/unit/embedding-pipeline.test.ts, gitnexus/src/core/embeddings/embedding-pipeline.ts
  • Issue: All embedding pipeline unit tests mock loadVectorExtension to return true. The new behaviour — createVectorIndex returns early without issuing CREATE_VECTOR_INDEX_QUERY when loadVectorExtension() returns false — has no coverage. The change from void to boolean in loadVectorExtension's return type is the core contract change here, and only the happy path is tested.
  • Why it matters here: Silent degradation is the stated design goal for unavailable extensions. Without a test for the false path, a regression (e.g., the early return being accidentally removed) would not be caught — vector SQL would be issued against a connection that hasn't loaded the extension, producing a runtime error during analyze on any system where VECTOR isn't pre-installed.
  • Recommended fix: Add one test: mock loadVectorExtension to return false, run the pipeline with nodes that would normally trigger createVectorIndex, and assert no CREATE_VECTOR_INDEX query is present in queryCalls.

Definition of Done check

| DoD item | Status |
| --- | --- |
| Analyze never hangs indefinitely when extension downloads are blocked | satisfied: bounded child process with SIGKILL + timedOut resolution |
| Serve/MCP/read-pool paths never run INSTALL | satisfied: load-only policy enforced in both initLbug and initLbugWithDb in pool-adapter.ts (code correct; test coverage gap noted in Findings) |
| ExtensionManager policy precedence correct: per-call → constructor → env → default | satisfied |
| auto, load-only, never policies behave distinctly and are tested | partially satisfied: never policy test is present; load-only and auto are covered as manager-level unit tests; env var parsing for policy is untested |
| loadFTSExtension() / loadVectorExtension() return accurate booleans; all callers handle false | partially satisfied: callers handle false correctly in production code; the false path in createVectorIndex lacks a unit test |
| Out-of-process installer: no shell interpolation, validated names, bounded timeout, timer cleared, scratch DB cleaned | satisfied: parent uses process.execPath + argv array; validation in both parent and child; settled flag prevents double-resolve; finally block in child cleans tmpdir |
| Capability/install-attempt caching correct across repeated calls and reconnects | satisfied: installAttempted prevents re-install; module-level flags and shared.ftsLoaded handle reconnect; closeLbug resets flags correctly |
| Documentation accurately describes env controls and defaults | satisfied: README table matches implementation defaults (15000 ms, auto, correct policy descriptions, offline guidance) |

Validation assessment

Tests added/updated:

  • test/unit/lbug-extension-loader.test.ts (new, 252 lines): covers LOAD-first, already-loaded, auto/load-only/never policies, per-call override, timeout, caching, warning deduplication, name validation, child process args, env timeout parsing.
  • test/unit/embedding-pipeline.test.ts (updated, +4/-4): mock loadVectorExtension updated to mockResolvedValue(true) to match new boolean return; no new degradation path tests.
  • test/integration/lbug-vector-extension.test.ts (updated, +4/-4): assertions updated to resolves.toBe(true) matching new return type; idempotency and close/reinit tested against real native module.

Validation claimed by PR:

  • npx tsc --noEmit — typecheck passes.
  • Unit and integration test runs for affected files.
  • node scripts/install-duckdb-extension.mjs "bad;name" — non-zero exit verifies name validation in child.
  • npm run build — distribution build.
  • npx tsx src/cli/index.ts detect-changes -r GitNexus — smoke test.
  • Full npm test has pre-existing unrelated Swift failures.
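
The `"bad;name"` check above exercises extension-name validation. A sketch of what such a validator might look like; the exact pattern used in the PR is not shown here, so the regular expression and function name below are assumptions.

```typescript
// Assumed allow-list: letters, digits, and underscores only.
const EXTENSION_NAME_RE = /^[A-Za-z0-9_]+$/;

// Returns the name unchanged when valid, otherwise throws.
function assertValidExtensionName(name: string): string {
  if (!EXTENSION_NAME_RE.test(name)) {
    throw new Error("Invalid extension name: " + JSON.stringify(name));
  }
  return name;
}
```

Running the same check in both parent and child means a malformed name is rejected before a process is spawned, and again if the script is invoked directly.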

Verified from diff/code:

  • SIGKILL on timeout: confirmed (child.kill('SIGKILL') in timer callback).
  • Timer cleared on success/error: confirmed (clearTimeout(timer) in both handlers).
  • No shell interpolation: confirmed (process.execPath + argv array).
  • Name validated in both parent (line 80 extension-loader.ts) and child (line 10 install-duckdb-extension.mjs).
  • scripts/ is in package.json files array — distribution path covered.
  • Path from compiled dist/core/lbug/extension-loader.js: the relative path ../../../scripts/install-duckdb-extension.mjs resolves to <package-root>/scripts/install-duckdb-extension.mjs. ✓
  • In the pool adapter, both doInitLbug and initLbugWithDb use { policy: 'load-only' }.
  • CYPHER_WRITE_RE in pool-adapter.ts blocks INSTALL and LOAD in user Cypher as defense-in-depth.
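
The path check in the list above can be reproduced with standard URL resolution. In this sketch `installerScriptUrl` is a hypothetical helper and `/pkg` stands in for the package root; in an ES module the base would typically be `import.meta.url`.

```typescript
// Resolve the installer script relative to the compiled loader module.
function installerScriptUrl(moduleUrl: string): URL {
  return new URL("../../../scripts/install-duckdb-extension.mjs", moduleUrl);
}

// Three "../" segments climb from dist/core/lbug/ to the package root.
const resolved = installerScriptUrl("file:///pkg/dist/core/lbug/extension-loader.js");
console.log(resolved.pathname); // prints "/pkg/scripts/install-duckdb-extension.mjs"
```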

Missing or not-verifiable validation:

  • No blocked-install / timeout scenario tested end-to-end (only mocked installer). The real installDuckDbExtensionOutOfProcess timeout/kill path is not exercised.
  • No test proving pool init cannot trigger a child process spawn under any condition.
  • No serve/mcp smoke test showing no installer is spawned under query load.
  • Policy env var parsing behavior not verified in any test.
  • Windows SIGKILL / cleanup behavior not verified (noted by PR as known noise, not blocking).

Does missing validation block production readiness? The high-severity test gap (pool load-only invariant) is not blocking in the sense that the code is clearly correct, but it is a meaningful regression risk for the core safety claim. The two medium gaps are genuine coverage holes but not showstoppers individually.


Final verdict

production-ready with minor follow-ups

The core mechanism is sound: the out-of-process installer is correctly bounded, the parent kill-on-timeout path has no timer leak, extension names are validated in both parent and child, the pool adapter enforces load-only in both init paths, and graceful degradation is wired through the whole stack. The documentation accurately reflects the implementation. The three test gaps (pool load-only invariant, policy env var parsing, vector skip-on-false path) do not make the current code incorrect — they make future regressions in the key safety properties undetectable by CI. The pool load-only invariant test is the most important follow-up given it covers the central claim of this PR; the other two are straightforward additions.

@github-actions

Contributor

CI Report

All checks passed

Pipeline Status

| Stage | Status | Details |
| --- | --- | --- |
| ✅ Typecheck | success | tsc --noEmit |
| ✅ Tests | success | unit tests, 3 platforms |
| ✅ E2E | success | gitnexus-web changes only |

Test Results

| Tests | Passed | Failed | Skipped | Duration |
| --- | --- | --- | --- | --- |
| 7642 | 7541 | 0 | 97 | 295s |

✅ All 7541 tests passed

97 test(s) skipped — expand for details
  • withTestLbugDB(lock-retry) > withLbugDb retry behaviour > returns the operation result on success
  • withTestLbugDB(lock-retry) > withLbugDb retry behaviour > retries on BUSY error and succeeds on later attempt
  • withTestLbugDB(lock-retry) > withLbugDb retry behaviour > propagates non-BUSY errors immediately without retrying
  • withTestLbugDB(lock-retry) > withLbugDb retry behaviour > throws after max retry attempts
  • Swift MethodExtractor > isTypeDeclaration > recognizes class_declaration
  • Swift MethodExtractor > isTypeDeclaration > recognizes protocol_declaration
  • Swift MethodExtractor > isTypeDeclaration > rejects import_declaration
  • Swift MethodExtractor > visibility > extracts public method
  • Swift MethodExtractor > visibility > extracts private method
  • Swift MethodExtractor > visibility > defaults to internal when no modifier
  • Swift MethodExtractor > protocol methods > marks protocol method as abstract
  • Swift MethodExtractor > static and class methods > detects static func as isStatic
  • Swift MethodExtractor > static and class methods > detects class func as isStatic
  • Swift MethodExtractor > parameters > extracts parameters with types and default values
  • Swift MethodExtractor > return type > extracts return type from -> annotation
  • Swift MethodExtractor > annotations > extracts @objc attribute
  • Swift MethodExtractor > isFinal > detects final func
  • Swift MethodExtractor > isFinal > is false when not final
  • Swift MethodExtractor > isAsync > detects async func
  • Swift MethodExtractor > isOverride > detects override method
  • buildTypeEnv > constructor inference (Tier 1 fallback) > lookupClassByName regression coverage > Swift lookupClassByName regression coverage > Swift cross-file constructor inference uses lookupClassByName
  • buildTypeEnv > constructor inference (Tier 1 fallback) > lookupClassByName regression coverage > Swift lookupClassByName regression coverage > Swift explicit init inference uses lookupClassByName
  • buildTypeEnv > constructor inference (Tier 1 fallback) > lookupClassByName regression coverage > Swift lookupClassByName regression coverage > Swift cross-file constructor inference does not bind plain functions
  • buildTypeEnv > known limitations (documented skip tests) > Ruby block parameter: users.each { |user| } — closure param inference, different feature
  • Swift constructor-inferred type resolution > detects User and Repo classes, both with save methods
  • Swift constructor-inferred type resolution > resolves user.save() to Models/User.swift via constructor-inferred type
  • Swift constructor-inferred type resolution > resolves repo.save() to Models/Repo.swift via constructor-inferred type
  • Swift constructor-inferred type resolution > emits exactly 2 save() CALLS edges (one per receiver type)
  • Swift self resolution > detects User and Repo classes, each with a save function
  • Swift self resolution > resolves self.save() inside User.process to User.save, not Repo.save
  • Swift parent resolution > detects BaseModel and User classes plus Serializable protocol
  • Swift parent resolution > emits EXTENDS edge: User → BaseModel
  • Swift parent resolution > emits IMPLEMENTS edge: User → Serializable (protocol conformance)
  • Swift cross-file User.init() inference > resolves user.save() via User.init(name:) inference
  • Swift cross-file User.init() inference > resolves user.greet() via User.init(name:) inference
  • Swift return type inference > detects User class and getUser function
  • Swift return type inference > detects save function on User (Swift class methods are Function nodes)
  • Swift return type inference > resolves user.save() to User#save via return type of getUser() -> User
  • Swift return-type inference via function return type > resolves user.save() to User#save via return type of getUser()
  • Swift return-type inference via function return type > user.save() does NOT resolve to Repo#save
  • Swift return-type inference via function return type > resolves repo.save() to Repo#save via return type of getRepo()
  • Swift implicit imports (cross-file visibility) > detects UserService class in Models.swift
  • Swift implicit imports (cross-file visibility) > resolves UserService() constructor call across files (no explicit import)
  • Swift implicit imports (cross-file visibility) > resolves service.fetchUser() member call across files
  • Swift implicit imports (cross-file visibility) > creates IMPORTS edges between files in the same module
  • Swift extension deduplication > detects Product class
  • Swift extension deduplication > resolves Product() constructor despite extension creating duplicate class node
  • Swift extension deduplication > resolves product.save() to Product.swift (primary definition)
  • Swift constructor call fallback (no new keyword) > resolves OCRService() as constructor call across files
  • Swift constructor call fallback (no new keyword) > resolves ocr.recognize() member call via constructor-inferred type
  • Swift export visibility (internal vs private) > resolves PublicService() constructor across files
  • Swift export visibility (internal vs private) > resolves internalHelper() across files (internal = module-scoped)
  • Swift if let / guard let binding resolution > detects User and Repo classes
  • Swift if let / guard let binding resolution > resolves user.save() inside if-let to User#save
  • Swift if let / guard let binding resolution > resolves repo.save() inside guard-let to Repo#save
  • Swift if let / guard let binding resolution > user.save() in if-let does NOT resolve to Repo#save
  • Swift await / try expression unwrapping > resolves user.save() via await fetchUser() return type
  • Swift await / try expression unwrapping > resolves repo.save() via try parseRepo() return type
  • Swift await / try expression unwrapping > detects fetchUser and parseRepo as functions
  • Swift for-in loop element type inference > detects User and Repo classes
  • Swift for-in loop element type inference > creates implicit import edges between files
  • Swift field-type resolution > detects classes and their properties
  • Swift field-type resolution > emits HAS_PROPERTY edges from class to field
  • Swift field-type resolution > resolves field-chain call user.address.save() → Address#save
  • Swift field-type resolution > emits ACCESSES edges for field reads in chains
  • Swift field-type resolution > populates field metadata (visibility, declaredType) on Property nodes
  • Swift call-result binding > resolves call-result-bound method call user.save() → User#save
  • Swift call-result binding > getUser() is present as a defined function
  • Swift call-result binding > emits processUser -> getUser CALLS edge for let-assigned free function call
  • Swift method enrichment > detects Animal protocol and Dog class
  • Swift method enrichment > emits IMPLEMENTS edge Dog -> Animal
  • Swift method enrichment > emits HAS_METHOD edges for Dog methods
  • Swift method enrichment > marks protocol Animal.speak as isAbstract
  • Swift method enrichment > marks Dog.speak as NOT isAbstract
  • Swift method enrichment > marks breathe as isFinal
  • Swift method enrichment > marks classify as isStatic
  • Swift method enrichment > captures @objc annotation on breathe
  • Swift method enrichment > populates parameterTypes for classify(_ name: String)
  • Swift method enrichment > records parameterCount for classify
  • Swift method enrichment > records returnType for speak
  • Swift method enrichment > resolves dog.speak() CALLS edge
  • Swift method enrichment > resolves Dog.classify("dog") CALLS edge
  • Swift abstract dispatch > detects Repository protocol and SqlRepository class
  • Swift abstract dispatch > emits IMPLEMENTS edge SqlRepository -> Repository
  • Swift abstract dispatch > emits HAS_METHOD edges for Repository.find and Repository.save
  • Swift abstract dispatch > emits HAS_METHOD edges for SqlRepository.find and SqlRepository.save
  • Swift abstract dispatch > marks base Repository.find as isAbstract
  • Swift abstract dispatch > marks base Repository.save as isAbstract
  • Swift abstract dispatch > marks concrete SqlRepository.find as NOT isAbstract
  • Swift abstract dispatch > resolves repo.find(id: 42) CALLS edge
  • Swift abstract dispatch > resolves repo.save(entity: user) CALLS edge
  • Swift abstract dispatch > populates parameterTypes for Repository.find
  • Swift abstract dispatch > populates parameterTypes for Repository.save
  • Swift abstract dispatch > records returnType for SqlRepository.find
  • Swift abstract dispatch > emits METHOD_IMPLEMENTS edges from SqlRepository methods → Repository protocol methods
  • Swift overloaded method disambiguation > detects 2 distinct find Method nodes on SqlRepository
  • Swift overloaded method disambiguation > emits METHOD_IMPLEMENTS edges for both find overloads
  • Swift overloaded method disambiguation > emits METHOD_IMPLEMENTS edge for save
  • Swift overloaded method disambiguation > emits exactly 3 METHOD_IMPLEMENTS edges total
  • Swift Child extends Parent — inherited method resolution (SM-9) > detects Parent and Child classes
  • Swift Child extends Parent — inherited method resolution (SM-9) > resolves c.parentMethod() to Parent.parentMethod via first-wins MRO walk

Code Coverage

Tests

| Metric | Coverage | Covered | Base | Delta | Status |
| --- | --- | --- | --- | --- | --- |
| Statements | 75.83% | 22202/29276 | 75.89% | 📉 -0.1 | 🔴 ███████████████░░░░░ |
| Branches | 64.73% | 14224/21974 | 64.7% | 📈 +0.0 | 🟢 ████████████░░░░░░░░ |
| Functions | 80.68% | 2151/2666 | 80.95% | 📉 -0.3 | 🔴 ████████████████░░░░ |
| Lines | 78.71% | 20048/25468 | 78.75% | 📉 -0.0 | 🔴 ███████████████░░░░░ |

📋 View full run · Generated by CI

Development

Successfully merging this pull request may close these issues.

gitnexus analyze hangs indefinitely at 60% or 85% on Windows when DuckDB extensions can't be downloaded (INSTALL VECTOR / INSTALL fts blocks event loop)
