fix(search): load FTS during core DB init #1123

Merged: magyargergo merged 2 commits into main from fix/http-search-hybrid-init-1071 on Apr 27, 2026

@magyargergo

Collaborator

Summary

  • Load the FTS extension during core LadybugDB initialization so HTTP/Web search sessions can query existing BM25 indexes after reopening the DB.
  • Add a regression test for the reopened core-adapter path used by HTTP-style search.
  • Route every withTestLbugDB integration suite through the lbug-db Vitest project so LadybugDB tests use the serialized project configuration.

Closes #1071.

Test plan

  • npx vitest run test/integration/lbug-core-adapter.test.ts --pool=forks
  • npx vitest run test/unit/bm25-search.test.ts --pool=threads
  • npx vitest run test/integration/lbug-core-adapter.test.ts test/integration/lbug-vector-extension.test.ts test/integration/lbug-pool.test.ts test/integration/lbug-pool-stability.test.ts test/integration/lbug-lock-retry.test.ts --pool=forks --maxWorkers=1
  • npx vitest run --project lbug-db --maxWorkers=1
  • npx prettier --check src/core/lbug/lbug-adapter.ts test/integration/lbug-core-adapter.test.ts vitest.config.ts
  • npx tsc --noEmit
  • git diff --check
  • npx tsx src/cli/index.ts detect-changes -r GitNexus

Note: the same lbug category run without --maxWorkers=1 surfaced a Windows native fork-worker exit after 49/51 tests; all files passed individually and the serialized project run passed cleanly.

@vercel

vercel Bot commented Apr 27, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| gitnexus | Ready | Preview, Comment | Apr 27, 2026 6:19pm |


@github-actions

github-actions Bot commented Apr 27, 2026

Contributor

Claude finished @magyargergo's task in 3m 36s.


PR #1123 Review: fix(search): load FTS during core DB init

Tasks:

  • Post initial comment
  • Read DoD.md and core docs
  • Read lbug-adapter.ts (init lifecycle, loadFTSExtension, loadVectorExtension, closeLbug, queryFTS)
  • Read regression test (lbug-core-adapter.test.ts)
  • Read vitest.config.ts and diff
  • Read withTestLbugDB helper
  • Read HTTP/Web search path (api.ts → withLbugDb, bm25-index.ts → queryFTS)
  • Inspect moved test files (lbug-vector-extension, java-class-impact, class-impact-all-languages)
  • Produce structured review

Review bar for this PR

  • initLbug() loads the FTS extension for every reopened DB session that needs BM25/FTS queries.
  • Existing vector extension initialization unchanged and not regressed by adding FTS loading.
  • queryFTS() works against existing indexes after closeLbug() + initLbug(dbPath).
  • Extension loading is idempotent and safe across repeated init/close cycles.
  • The regression test exercises the real reopened core-adapter path with an asserting result check, not just a no-throw.
  • vitest.config.ts correctly routes all LadybugDB-backed suites through serialized lbug-db execution with in-sync include/exclude lists.
  • Validation is proportionate for a DB lifecycle fix.

Understanding of the change

Issue #1071: HTTP/Web-style search sessions were calling withLbugDb(lbugPath, ...) to query existing BM25 indexes on a reopened database, but queryFTS() was failing silently because the FTS extension was never loaded for that session. The FTS extension had only been loaded via the loadFTSExtension() call in the withTestLbugDB helper and in createFTSIndex, not in the core doInitLbug path that withLbugDb ultimately calls.

Root cause: doInitLbug loaded only the VECTOR extension (await loadVectorExtension()) and left FTS extension loading to be done separately. The HTTP API path calls withLbugDb(lbugPath, operation) → ensureLbugInitialized → doInitLbug on path change or first open. Without FTS loaded, CALL QUERY_FTS_INDEX(...) would fail.

Fix: One line added to doInitLbug at line 334: await loadFTSExtension() before await loadVectorExtension(). Comment updated from "Load VECTOR extension for semantic search support" to "Load query extensions once per core adapter session."
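The lifecycle described above can be sketched with a small in-memory model. This is only an illustration under stated assumptions: FakeSession, the vectorLoaded flag, and every function body here are hypothetical stand-ins, not the real lbug-adapter.ts code. Only the call order (FTS first, then VECTOR), the ftsLoaded guard, and the reset in closeLbug come from the review.

```typescript
// Hypothetical, simplified model of the adapter's extension lifecycle.
// None of these bodies are the real LadybugDB adapter code.
type Extension = "FTS" | "VECTOR";

class FakeSession {
  readonly loaded: Extension[] = [];
  async load(ext: Extension): Promise<void> {
    this.loaded.push(ext);
  }
}

let session: FakeSession | null = null;
let ftsLoaded = false;
let vectorLoaded = false; // assumption: vector loading uses a similar guard

async function loadFTSExtension(): Promise<boolean> {
  if (ftsLoaded) return true; // idempotent: repeated calls are no-ops
  const s = session;
  if (!s) throw new Error("DB not initialized");
  await s.load("FTS");
  ftsLoaded = true;
  return true;
}

async function loadVectorExtension(): Promise<boolean> {
  if (vectorLoaded) return true;
  const s = session;
  if (!s) throw new Error("DB not initialized");
  await s.load("VECTOR");
  vectorLoaded = true;
  return true;
}

// Load query extensions once per core adapter session.
async function doInitLbug(): Promise<FakeSession> {
  const s = new FakeSession();
  session = s;
  await loadFTSExtension(); // the line this PR adds
  await loadVectorExtension(); // already present before the PR
  return s;
}

// Closing resets the guards so the next session reloads both extensions.
function closeLbug(): void {
  session = null;
  ftsLoaded = false;
  vectorLoaded = false;
}
```

In this model, calling loadFTSExtension() again after doInitLbug() changes nothing, which is why an explicit call in a test helper becomes redundant once init owns the loading.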

Affected paths:

  • withLbugDb (HTTP server, per-request DB switch) — now loads FTS on every new session
  • initLbug (direct init, analyze path) — same doInitLbug code path
  • withTestLbugDB helper's explicit loadFTSExtension() call becomes redundant (harmless, idempotent)

Must remain unchanged: VECTOR extension loading, existing FTS/BM25 index creation, closeLbug reset behavior, query contract for queryFTS.


Findings

[low] withTestLbugDB helper retains now-redundant loadFTSExtension() call

  • Category: Test maintainability / clarity
  • Files: gitnexus/test/helpers/test-indexed-db.ts (not in this diff)
  • Issue: The helper calls await adapter.loadFTSExtension() explicitly in step 2 after initLbug. Since doInitLbug now calls loadFTSExtension() internally, this step is a guaranteed no-op on every execution (the ftsLoaded guard returns true immediately). The comment "idempotent — skips if already loaded" is accurate, but the call now implies that the helper is still responsible for FTS loading when it no longer is.
  • Why it matters here: Future contributors reading withTestLbugDB may not realize FTS is already loaded by doInitLbug, leading to confusion about whether removing the explicit call would break something. The helper's comment describes the redundancy accurately but the call itself is misleading about ownership.
  • Recommended fix: Either (a) remove the explicit loadFTSExtension() call from the helper and update its step comment, or (b) add a one-line comment explaining it is retained for visibility only and is a no-op after this PR. Option (a) is cleaner. This is a follow-up; it does not block merge.

Definition of Done check

| DoD item | Status |
| --- | --- |
| initLbug() loads FTS for every reopened session | ✅ satisfied: doInitLbug now calls await loadFTSExtension() unconditionally before await loadVectorExtension() |
| Vector extension init unchanged / not regressed | ✅ satisfied: loadVectorExtension() is still called immediately after; order is FTS-first, vector-second; both still run; existing behavior preserved |
| queryFTS() works after closeLbug() + initLbug(dbPath) | ✅ satisfied: regression test at line 73 creates an index, closes, reopens, asserts arrayContaining([objectContaining({ filePath: 'src/index.ts' })]), which is a result assertion, not just no-throw |
| Extension loading idempotent across repeated cycles | ✅ satisfied: loadFTSExtension() guards on ftsLoaded; closeLbug resets it; withLbugDb retry path resets it; "already loaded" catch branch handles duplicates |
| Regression test exercises real reopened core-adapter path | ✅ satisfied: test uses the real adapter.initLbug/closeLbug/queryFTS exports; no mocks on the DB path; goes through runWithSessionLock → ensureLbugInitialized → doInitLbug, which is the same code path as withLbugDb |
| vitest.config.ts routes all lbug-backed suites through serialized lbug-db | ✅ satisfied: all three moved files (lbug-vector-extension.test.ts, java-class-impact.test.ts, class-impact-all-languages.test.ts) use withTestLbugDB; include/exclude lists are in sync; duplicate lbug-lock-retry.test.ts exclusion removed |
| Include/exclude lists consistent | ✅ satisfied: every entry added to lbug-db.include is also added to default.exclude; diff confirms symmetry |
| Validation sufficient for a DB/search lifecycle fix | ⚠️ partially satisfied: see validation assessment below |
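The include/exclude symmetry checked above could also be verified mechanically. The helpers below are a hypothetical sketch, not part of the repository: they assume both lists are plain arrays of exact path strings (the real vitest.config.ts may use globs, which this does not handle), and the function names are invented for illustration.

```typescript
// Hypothetical helpers; names and list shapes are assumptions for illustration.

// Entries in the serialized project's include list that the default project
// does not exclude. Any such file would run in both projects.
function findUnsyncedEntries(
  include: readonly string[],
  exclude: readonly string[],
): string[] {
  const excluded = new Set(exclude);
  return include.filter((entry) => !excluded.has(entry));
}

// Entries listed more than once, e.g. a duplicated exclusion.
function findDuplicates(entries: readonly string[]): string[] {
  const seen = new Set<string>();
  const duplicates = new Set<string>();
  for (const entry of entries) {
    if (seen.has(entry)) duplicates.add(entry);
    else seen.add(entry);
  }
  return Array.from(duplicates);
}

// Example lists built from file names mentioned in this PR.
const lbugDbInclude = [
  "test/integration/lbug-core-adapter.test.ts",
  "test/integration/lbug-vector-extension.test.ts",
  "test/integration/lbug-lock-retry.test.ts",
];
const defaultExclude = [
  "test/integration/lbug-core-adapter.test.ts",
  "test/integration/lbug-lock-retry.test.ts",
  "test/integration/lbug-lock-retry.test.ts",
];

const unsynced = findUnsyncedEntries(lbugDbInclude, defaultExclude);
const duplicated = findDuplicates(defaultExclude);
```

With these example lists, unsynced contains the vector-extension test (included but never excluded) and duplicated contains the lock-retry test, the two kinds of drift the config change cleans up.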

Validation assessment

Tests added / updated:

  • lbug-core-adapter.test.ts +14 lines: regression test initLbug loads FTS so reopened HTTP-style sessions can query existing indexes — creates index, closes DB, reopens, asserts queryFTS returns expected results.
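The shape of that regression (create index, close, reopen, query, assert on the result) can be illustrated with an in-memory stand-in. Everything below is hypothetical: initDb, closeDb, createFtsIndex, queryFts, and the loadFtsOnInit switch are invented for this sketch, and the real test runs against a real LadybugDB file through the adapter exports. The switch exists only to contrast pre-fix and post-fix behavior.

```typescript
// Hypothetical in-memory stand-in; not the real adapter or the real test.
interface Row {
  name: string;
  filePath: string;
}

// "On-disk" state: survives close/reopen, keyed by database path.
const indexedRowsByPath = new Map<string, Row[]>();

// Per-session state: lost on close.
let openPath: string | null = null;
let ftsLoaded = false;

function initDb(dbPath: string, loadFtsOnInit: boolean): void {
  openPath = dbPath;
  ftsLoaded = loadFtsOnInit; // true models the post-fix init path
}

function closeDb(): void {
  openPath = null;
  ftsLoaded = false;
}

function createFtsIndex(rows: Row[]): void {
  if (openPath === null) throw new Error("DB not open");
  ftsLoaded = true; // index creation loads the extension on its own
  indexedRowsByPath.set(openPath, rows);
}

function queryFts(term: string): Row[] {
  if (openPath === null) throw new Error("DB not open");
  if (!ftsLoaded) throw new Error("FTS extension not loaded");
  const rows = indexedRowsByPath.get(openPath) ?? [];
  return rows.filter((row) => row.name.includes(term));
}

// Mirrors the regression: index, close, reopen, then query the existing index.
function reopenAndQuery(loadFtsOnInit: boolean): Row[] {
  const dbPath = "/tmp/example-db"; // hypothetical path
  initDb(dbPath, loadFtsOnInit);
  createFtsIndex([{ name: "main", filePath: "src/index.ts" }]);
  closeDb();
  initDb(dbPath, loadFtsOnInit);
  try {
    return queryFts("main");
  } finally {
    closeDb();
  }
}
```

In this model, reopenAndQuery(true) returns the seeded row, while reopenAndQuery(false) throws because the reopened session never loaded FTS. Asserting on the returned row, rather than on the absence of an exception, is what makes the check meaningful.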

Validation claimed by the PR:

  1. npx vitest run test/integration/lbug-core-adapter.test.ts --pool=forks
  2. npx vitest run test/unit/bm25-search.test.ts --pool=threads
  3. Selected lbug files with --pool=forks --maxWorkers=1
  4. npx vitest run --project lbug-db --maxWorkers=1
  5. Prettier check on the three changed files
  6. npx tsc --noEmit
  7. git diff --check
  8. npx tsx src/cli/index.ts detect-changes -r GitNexus

Validation verified from diff / CI evidence:

  • CI check results are not available in this environment (no runs returned). Verifying against code only.
  • The regression test is correctly structured: it creates an FTS index, calls closeLbug() (which sets ftsLoaded = false), reopens via initLbug(handle.dbPath) (which triggers doInitLbug → loadFTSExtension()), then queries. The assertion arrayContaining([objectContaining({ filePath: 'src/index.ts' })]) is a meaningful result check against the createMinimalTestGraph fixture (which seeds Function nodes with name='main' and filePath='src/index.ts').
  • The test does not just probe no-throw; it proves an actual search result is returned.
  • bm25-search.test.ts would cover the searchFTSFromLbug integration path (FTS_INDEXES enumeration, result merging).
  • TypeScript is unchanged in the adapter signature — tsc --noEmit should pass.
  • The closeLbug + initLbug test call is isolated to a dedicated fork (lbug-db, fileParallelism: false), so no cross-file singleton corruption.
  • Windows fork-worker exit: The PR reports a native fork-worker exit on Windows after 49/51 tests when the lbug category runs without --maxWorkers=1. This is consistent with the pre-existing dangerouslyIgnoreUnhandledErrors: true comment in vitest.config.ts, which explicitly documents "vitest forks + native addon interaction where destructors run in arbitrary order at exit." Serializing with --maxWorkers=1 is the documented mitigation, not a correctness hack.

Missing or not-verifiable:

  • No CI run results to confirm the serialized lbug-db project passes reliably end-to-end.
  • The HTTP/Web search path (api.ts → withLbugDb → BM25 mode) is not directly exercised in a test; search-core.test.ts was listed in lbug-db.include before this PR and covers that path indirectly.
  • No direct exercise of searchFTSFromLbug → queryFTS after a DB reopen in the new test (it tests at the queryFTS level, which is the innermost call site; the layer above only delegates).

Does missing validation block production readiness? No. The initLbug → doInitLbug → loadFTSExtension path is the same code path used by withLbugDb, and the regression test covers it at the right level of abstraction. The HTTP search path calling withLbugDb adds only session-lock and retry logic on top; it does not bypass or alter extension loading.
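To make the layering concrete, here is a minimal sketch of a session-locked wrapper that funnels every operation through the same init step. It is an assumption-heavy illustration: the bodies are invented, the retry logic mentioned above is omitted, and the real withLbugDb, runWithSessionLock, and ensureLbugInitialized may differ substantially. The names are reused from the review only to show where each layer sits.

```typescript
// Hypothetical sketch; not the real adapter code. Retry logic is omitted.
let sessionLock: Promise<void> = Promise.resolve();
let currentPath: string | null = null;
let initCount = 0;

// Serialize operations: each one starts only after the previous one settles.
function runWithSessionLock<T>(operation: () => Promise<T>): Promise<T> {
  const result = sessionLock.then(operation);
  // Keep the chain usable even if this operation rejects.
  sessionLock = result.then(
    () => undefined,
    () => undefined,
  );
  return result;
}

// Re-initialize only on first open or when the DB path changes.
async function ensureLbugInitialized(dbPath: string): Promise<void> {
  if (currentPath === dbPath) return;
  // Stand-in for doInitLbug: this is where FTS and VECTOR would be loaded.
  initCount += 1;
  currentPath = dbPath;
}

function withLbugDb<T>(dbPath: string, operation: () => Promise<T>): Promise<T> {
  return runWithSessionLock(async () => {
    await ensureLbugInitialized(dbPath);
    return operation();
  });
}
```

Because the wrapper only adds ordering around ensureLbugInitialized, anything loaded in the init step is available to every operation, which is the reason a test at the initLbug level also covers the HTTP path in this model.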


Final verdict

production-ready with minor follow-ups

The core fix is a single, correct line addition in doInitLbug that closes a clear gap: the FTS extension was never loaded during core adapter initialization, so HTTP-style sessions reopening a DB could not query existing BM25 indexes. The regression test proves the fix at the right level (real adapter, real DB file, asserting result not just no-throw), and the closeLbug + reopen cycle exactly mirrors the production failure condition. The Vitest config changes are correctly scoped, in sync, and justified by LadybugDB's native file-lock behavior — the three newly serialized tests (lbug-vector-extension, java-class-impact, class-impact-all-languages) all use withTestLbugDB and genuinely belong in the serialized project. The Windows fork-worker exit is consistent with the pre-existing documented N-API destructor behavior, not a new correctness bug. The only follow-up worth noting is cleaning up the now-redundant loadFTSExtension() call in the withTestLbugDB helper, which is harmless but misleading about ownership — that's a one-liner cleanup that does not block merge.

@magyargergo magyargergo merged commit aa7bacd into main Apr 27, 2026
20 checks passed
@magyargergo magyargergo deleted the fix/http-search-hybrid-init-1071 branch April 27, 2026 18:43
caork added a commit to caork/GitNexus that referenced this pull request May 18, 2026
Bring in upstream fixes including:
- fix(search): create FTS indexes during analyze (abhigyanpatwari#1107) — ROOT CAUSE of
  query() returning 0 results (FTS indexes were never created because
  lazy creation failed on read-only MCP pool connection)
- fix(search): load FTS during core DB init (abhigyanpatwari#1123)
- fix(search): surface warning when FTS indexes missing (abhigyanpatwari#1418)
- fix(augment): add CONTAINS fallback when FTS unavailable (abhigyanpatwari#1476)
- fix(search): guard against undefined bm25Results (abhigyanpatwari#1489)
- feat(cpp): C++ ADL V2 overload resolution improvements
- feat(detect-changes): support git worktrees (abhigyanpatwari#1654)
- feat(cpp): parameter type class sidecar, SFINAE filter
- Various CI, security, and infrastructure improvements

AscendC provider updated to match upstream naming:
  sourcePreprocessor → preprocessSource

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Development

Successfully merging this pull request may close these issues.

hybrid RAG cannot fully execute on current query entrypoints (MCP loses semantic, Web search path loses BM25)