refactor: migrate from KuzuDB to LadybugDB v0.15 #275

Merged
magyargergo merged 11 commits into abhigyanpatwari:main from candidosales:ladybug-db
Mar 15, 2026

@candidosales candidosales commented Mar 14, 2026

Contributor

Summary

  • KuzuDB was archived (Oct 2025). LadybugDB is the community fork with full API compatibility.
  • Swapped packages: kuzu → @ladybugdb/core, kuzu-wasm → @ladybugdb/wasm-core
  • Renamed all internal paths from kuzu to lbug (adapters, schema, storage directory)
  • Storage path changed: .gitnexus/kuzu → .gitnexus/lbug with automatic cleanup of stale KuzuDB files
  • Added explicit INSTALL VECTOR; LOAD EXTENSION VECTOR; (required in LadybugDB v0.15, was implicit in KuzuDB v0.11)
  • Updated CI workflow (kuzu-db → lbug-db test group), all documentation, and all tests

Test plan

  • 1151 unit tests passing
  • 27 integration tests passing (core adapter + pool adapter)
  • tsc build succeeds with no errors
  • npm install succeeds in both gitnexus/ and gitnexus-web/
  • No stale kuzu references in source code (only intentional: cleanupOldKuzuFiles, historical changelog)
  • CI matrix passes on all 3 OS (ubuntu, windows, macos)
  • Manual smoke test: npx gitnexus analyze on a repo, verify .gitnexus/lbug created

How to test it?

# Build (all commands run from the repository root)
(cd gitnexus && npm run build)

# Analyze the codebase
node gitnexus/dist/cli/index.js analyze --force ../fizzy

# Run the server
node gitnexus/dist/cli/index.js serve

# Open the Web UI
(cd gitnexus-web && npm run dev)

KuzuDB was archived (Apple acquisition, Oct 2025). LadybugDB is the
community fork with full API compatibility.

- Package swap: kuzu → @ladybugdb/core, kuzu-wasm → @ladybugdb/wasm-core
- Rename all internal paths: kuzu → lbug (adapters, schema, storage)
- Storage path: .gitnexus/kuzu → .gitnexus/lbug (with auto-cleanup)
- Add explicit VECTOR extension loading (required in v0.15)
- Update CI workflow, documentation, and all tests
- 1151 unit + 27 integration tests passing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@vercel

vercel Bot commented Mar 14, 2026

@candidosales is attempting to deploy a commit to the NexusCore Team on Vercel.

A member of the Team first needs to authorize it.

@candidosales candidosales marked this pull request as draft March 14, 2026 00:54
@github-actions

github-actions Bot commented Mar 14, 2026

Contributor

CI Report

All checks passed

Pipeline Status

| Stage | Status | Details |
| --- | --- | --- |
| ✅ Typecheck | success | tsc --noEmit |
| ✅ Unit Tests | success | 3 platforms |
| ✅ Integration | success | 3 OS x 4 groups = 12 jobs |

Test Results

| Suite | Tests | Passed | Failed | Skipped | Duration |
| --- | --- | --- | --- | --- | --- |
| Unit | 1255 | 1255 | 0 | 0 | 8s |
| Integration | 708 | 697 | 0 | 11 | 49s |
| Total | 1963 | 1952 | 0 | 11 | 57s |

✅ All 1952 tests passed

11 test(s) skipped

Integration:

  • Swift constructor-inferred type resolution > detects User and Repo classes, both with save methods
  • Swift constructor-inferred type resolution > resolves user.save() to Models/User.swift via constructor-inferred type
  • Swift constructor-inferred type resolution > resolves repo.save() to Models/Repo.swift via constructor-inferred type
  • Swift constructor-inferred type resolution > emits exactly 2 save() CALLS edges (one per receiver type)
  • Swift self resolution > detects User and Repo classes, each with a save function
  • Swift self resolution > resolves self.save() inside User.process to User.save, not Repo.save
  • Swift parent resolution > detects BaseModel and User classes plus Serializable protocol
  • Swift parent resolution > emits EXTENDS edge: User → BaseModel
  • Swift parent resolution > emits IMPLEMENTS edge: User → Serializable (protocol conformance)
  • Swift cross-file User.init() inference > resolves user.save() via User.init(name:) inference
  • Swift cross-file User.init() inference > resolves user.greet() via User.init(name:) inference

Code Coverage

Combined (Unit + Integration)

| Metric | Coverage | Covered | Base | Delta | Status |
| --- | --- | --- | --- | --- | --- |
| Statements | 50.45% | 4356/8633 | 37.07% | 📈 +13.4 | 🟢 ██████████░░░░░░░░░░ |
| Branches | 43.53% | 2669/6131 | 33.79% | 📈 +9.7 | 🟢 ████████░░░░░░░░░░░░ |
| Functions | 52.81% | 459/869 | 38.26% | 📈 +14.6 | 🟢 ██████████░░░░░░░░░░ |
| Lines | 52.02% | 3999/7687 | 38.16% | 📈 +13.9 | 🟢 ██████████░░░░░░░░░░ |

Coverage breakdown by test suite

Unit Tests

| Metric | Coverage | Covered | Base | Delta | Status |
| --- | --- | --- | --- | --- | --- |
| Statements | 40.51% | 3498/8633 | 37.07% | 📈 +3.4 | 🟢 ████████░░░░░░░░░░░░ |
| Branches | 35.89% | 2201/6131 | 33.79% | 📈 +2.1 | 🟢 ███████░░░░░░░░░░░░░ |
| Functions | 41.65% | 362/869 | 38.26% | 📈 +3.4 | 🟢 ████████░░░░░░░░░░░░ |
| Lines | 41.86% | 3218/7687 | 38.16% | 📈 +3.7 | 🟢 ████████░░░░░░░░░░░░ |

Integration Tests

| Metric | Coverage | Covered | Base | Delta | Status |
| --- | --- | --- | --- | --- | --- |
| Statements | 23.03% | 1989/8633 | 37.07% | 📉 -14.0 | 🔴 ████░░░░░░░░░░░░░░░░ |
| Branches | 17.66% | 1083/6131 | 33.79% | 📉 -16.1 | 🔴 ███░░░░░░░░░░░░░░░░░ |
| Functions | 26.12% | 227/869 | 38.26% | 📉 -12.1 | 🔴 █████░░░░░░░░░░░░░░░ |
| Lines | 23.93% | 1840/7687 | 38.16% | 📉 -14.2 | 🔴 ████░░░░░░░░░░░░░░░░ |

📋 View full run · Generated by CI

fix: address code review findings (P1-P3)

P1: Fix WASM adapter to use getAll() API, wire cleanupOldKuzuFiles
into analyze command, add symlink path traversal protection.
P2: Cache VECTOR extension load state, batch augmentation engine
queries (20→4), fix web getCopyQuery for multi-language tables,
fix stale KuzuDB references, correct brainstorm package names.
P3: Complete lbug-wasm.d.ts type declarations, batch semantic
search per-label, update stale BM25 comment.

fix: load FTS extension in MCP pool adapter on init

The read-only pool adapter never loaded the FTS extension, so all
QUERY_FTS_INDEX calls failed silently. This broke search-pool and
augmentation integration tests, and caused empty results in the
web UI server mode.

@xkonjin xkonjin left a comment

Code Review

Substantial migration PR replacing KuzuDB with LadybugDB v0.15. The mechanical rename is thorough and the CHANGELOG is clear. A few things stand out for review.

Migration correctness

Storage path migration: The CHANGELOG notes the index path changes from .gitnexus/kuzu → .gitnexus/lbug with automatic cleanup of stale KuzuDB files. This cleanup logic isn't visible in the diff. Where does it live? If it's handled elsewhere, a note in the CHANGELOG entry pointing to the cleanup location would help future debugging. More importantly: is there a migration guard that prevents gitnexus analyze from silently treating a repo as unindexed if it finds only a .gitnexus/kuzu directory? A user upgrading mid-project will lose their index unless there's an explicit migration step or detection.
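
One way to make the upgrade case explicit is to classify what is on disk before deciding whether a repository counts as indexed. The sketch below is illustrative only: the two directory names come from the PR description, while detectIndexState, describeIndexState, and the IndexState values are hypothetical names and not code from this PR.

```typescript
import * as fs from "fs";
import * as path from "path";

// Hypothetical classification of what exists under `<repo>/.gitnexus`.
type IndexState = "current" | "legacy-only" | "both" | "none";

// Directory names as given in the PR description.
const LEGACY_DIR = "kuzu";
const CURRENT_DIR = "lbug";

// Report which index locations exist without modifying anything on disk.
function detectIndexState(repoRoot: string): IndexState {
  const base = path.join(repoRoot, ".gitnexus");
  const hasLegacy = fs.existsSync(path.join(base, LEGACY_DIR));
  const hasCurrent = fs.existsSync(path.join(base, CURRENT_DIR));
  if (hasLegacy && hasCurrent) return "both";
  if (hasCurrent) return "current";
  if (hasLegacy) return "legacy-only";
  return "none";
}

// Map each state to a user-facing message so that "legacy-only"
// is never reported the same way as "none".
function describeIndexState(state: IndexState): string {
  switch (state) {
    case "current":
      return "index is present";
    case "both":
      return "index is present; stale KuzuDB files can be removed";
    case "legacy-only":
      return "KuzuDB index found; re-run analyze to rebuild it for LadybugDB";
    case "none":
      return "repository has not been indexed";
  }
}
```

Keeping detection separate from cleanup means a status command can report the "legacy-only" case without deleting anything.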

VECTOR extension loading: The CHANGELOG mentions LadybugDB v0.15 requires explicit VECTOR extension loading for semantic search. This is a behavioral change that can silently break semantic search if the extension isn't loaded at startup. Where is this loading happening? It should be verified that it runs before any vector query, not lazily on first use — lazy loading could produce confusing empty results without errors.
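
A minimal sketch of loading the extension once, up front, and caching the result per connection. The QueryRunner interface and both function names are assumptions made for illustration; only the two statements are taken from the PR description, and the real @ladybugdb/core connection API may differ.

```typescript
// Hypothetical minimal shape of a connection; not the real @ladybugdb/core type.
interface QueryRunner {
  query(statement: string): Promise<unknown>;
}

// One cached load per connection, so repeated calls do not re-run the statements.
const vectorLoads = new WeakMap<QueryRunner, Promise<void>>();

function ensureVectorExtension(conn: QueryRunner): Promise<void> {
  let pending = vectorLoads.get(conn);
  if (!pending) {
    pending = (async () => {
      // Statements quoted from the PR description.
      await conn.query("INSTALL VECTOR;");
      await conn.query("LOAD EXTENSION VECTOR;");
    })();
    // On failure, forget the cached promise so a later call can retry.
    pending.catch(() => vectorLoads.delete(conn));
    vectorLoads.set(conn, pending);
  }
  return pending;
}

// Eager variant: the connection is only handed out after the extension is loaded,
// so a load failure surfaces at startup instead of as empty search results.
async function openForSearch(conn: QueryRunner): Promise<QueryRunner> {
  await ensureVectorExtension(conn);
  return conn;
}
```

Because a failed load rejects the promise returned by openForSearch, the error is visible to the caller rather than being swallowed on first use.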

CI changes

The rename from kuzu-db to lbug-db test group is clean. One concern: the test files referenced in the run step (test/integration/lbug-core-adapter.test.ts, lbug-pool.test.ts) — do these exist in this PR, or are they being added separately? If they don't exist, the CI job will silently succeed on a missing file in some configurations (no glob expansion = no tests run = green).
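
A small pre-flight check can turn the "no files matched, job is green" case into a hard failure. This is a sketch under assumptions: findMissingTestFiles and assertTestFilesExist are hypothetical helpers, and the list of paths would have to come from wherever the CI group is defined.

```typescript
import * as fs from "fs";

// Return the subset of paths that do not exist on disk.
function findMissingTestFiles(files: string[]): string[] {
  return files.filter((file) => !fs.existsSync(file));
}

// Fail loudly when a test group is empty or references files that are absent.
function assertTestFilesExist(files: string[]): void {
  if (files.length === 0) {
    throw new Error("No test files configured for this group");
  }
  const missing = findMissingTestFiles(files);
  if (missing.length > 0) {
    throw new Error(`Missing test files: ${missing.join(", ")}`);
  }
}
```

Running such a check before the test runner starts makes a renamed or deleted test file show up as a red job instead of a silent no-op.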

package-lock.json

The kuzu-wasm package is marked as deprecated in npm — good that it's being replaced. Worth verifying the @ladybugdb/wasm-core@0.15.1 package integrity hash is what you'd get from a fresh npm install (it's unusually long in the diff, which sometimes indicates a hand-edited lockfile).

Minor

  • README table alignment after the LadybugDB name change is slightly off (extra spaces in the native storage row). Not blocking but worth a tidy.
  • The acknowledgements link to ladybugdb.com — worth confirming that domain is live and resolves correctly.

Overall: the migration looks mechanically complete. The main concerns are the index migration path for existing users, and confirming the VECTOR extension loading.

@candidosales candidosales marked this pull request as ready for review March 14, 2026 01:54
Repository owner deleted a comment from claude Bot Mar 14, 2026
@magyargergo

Collaborator

Can you please add an automatic migration and cleanup from the KuzuDB? 🙏

Repository owner deleted a comment from claude Bot Mar 14, 2026
@magyargergo

magyargergo commented Mar 14, 2026

Collaborator

Can you please also look into the forking issue while you're at it?

We ran into a nasty problem with KuzuDB where vitest fork workers would hang on Linux (our CI). Here's what was happening:

When a vitest fork worker finishes and calls process.exit(0), Node's N-API cleanup hooks fire. Those hooks call KuzuDB's C++ destructors: specifically, NodeDatabase::Close() calls database.reset() on a shared_ptr<Database>, which synchronously runs the Database destructor. That destructor does heavyweight work (buffer pool flushing, file handle cleanup, WAL checkpointing), and it would block or deadlock during fork worker shutdown on Linux. We'd get "Timeout terminating forks worker" errors. Interestingly, this only happened on Linux, never on Windows.

We tried a few things:

  • SIGKILL on Linux to skip the destructors entirely, but that broke test result reporting ("Worker exited unexpectedly")
  • Just nulling the JS references without calling .close(), but the GC would later finalize them and trigger the same destructors anyway

What ended up working: we changed detachKuzu() to explicitly call .close() on all native Database/Connection objects before nulling JS references. Calling .close() resets the internal shared_ptr to null, so when process.exit(0) later triggers the N-API cleanup hooks, the destructors find null pointers and just return immediately (no-ops). Read-only pool connections are fast to close since there's no WAL flush.

That mostly solved it, but we still split tests into individual vitest runs as a safety measure: multiple fork workers each holding native KuzuDB objects meant non-deterministic teardown ordering, and one slow destructor could cascade into timeouts for the others.

Would be great to know if LadybugDB has the same behavior in their C++ destructors or if they've addressed this. If you can test with the forked workers setup and see if the hang is still there, that would save us from carrying the workaround.
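
For reference, the close-before-null pattern described above for KuzuDB can be sketched roughly as follows. The Closable and NativeHandles types and the detachHandles name are hypothetical stand-ins, not the project's detachKuzu() implementation; closing connections before the database and swallowing close errors are assumptions of this sketch.

```typescript
// Hypothetical shape of a native Database or Connection wrapper.
interface Closable {
  close(): void;
}

// Hypothetical container for the native objects a worker holds.
interface NativeHandles {
  connections: Closable[];
  database: Closable | null;
}

// Explicitly close every native object, then drop the JS references,
// so that later exit-time cleanup has nothing heavy left to tear down.
function detachHandles(handles: NativeHandles): void {
  for (const conn of handles.connections) {
    try {
      conn.close();
    } catch {
      // Keep going: one failing handle should not block the rest of teardown.
    }
  }
  handles.connections = [];

  if (handles.database) {
    try {
      handles.database.close();
    } catch {
      // Same reasoning as above.
    }
    handles.database = null;
  }
}
```

This mirrors the KuzuDB workaround only; whether an early .close() is safe on another engine has to be verified separately.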

@candidosales

Contributor Author

@magyargergo I added an automatic migration in this commit: 9bf1fdb

@candidosales

Contributor Author

@magyargergo, about the second question: LadybugDB has the same class of problem, but it's actually worse, and the fix that worked for KuzuDB actively aggravates it on LadybugDB. We've already landed the adapted workarounds for this.

The KuzuDB hang (blocked destructor during N-API cleanup) and the KuzuDB fix (detachKuzu() pre-calling .close() to null the shared_ptr<Database>) don't port over cleanly. The LadybugDB Node.js addon has a different failure mode:

  • KuzuDB: process.exit(0) → N-API cleanup hooks → NodeDatabase::Close() → destructor does heavy sync work (WAL flush, buffer pool, file handles) → hangs/deadlocks on Linux
  • LadybugDB: process.exit(0) OR explicit .close() call → N-API cleanup hooks → destructor accesses inconsistent internal state → SIGSEGV on Linux and macOS

The critical difference: calling .close() early (the KuzuDB fix) itself triggers the segfault, so the "pre-empty the shared_ptr" pattern is unavailable here. We verified this was an unreported issue: we searched all issues in the tracker and found nothing covering N-API cleanup lifecycle or fork worker teardown. This could be an opportunity to report it.

@magyargergo

magyargergo commented Mar 15, 2026

Collaborator

I submitted my findings here

@candidosales

Contributor Author

@magyargergo Regardless of this N-API issue, I think we can merge this improvement since Ladybug continues to receive new contributions. What do you think?

@magyargergo magyargergo merged commit 5a58508 into abhigyanpatwari:main Mar 15, 2026
20 of 21 checks passed
dp-web4 added a commit to dp-web4/GitNexus that referenced this pull request Mar 20, 2026
…→lbug migration

The kuzu→lbug migration (abhigyanpatwari#275) didn't carry forward three pieces from
the markdown indexing PR (abhigyanpatwari#399):

1. 'Section' missing from NODE_TABLES constant — LadybugDB type system
   doesn't recognize Section as a valid node type
2. SECTION_SCHEMA missing from NODE_SCHEMA_QUERIES — Section table never
   created in the database (already fixed in abhigyanpatwari#399 merge, confirming)
3. getCopyQuery falls through to 7-column multi-lang default for Section,
   but Section CSV has 8 columns (includes 'level'). Causes:
   "Binder exception: Number of columns mismatch. Expected 7 but got 8"

Reproduces on any repo with .md files. Tested fix against a 2K+ markdown
file repo (40K nodes, 37K edges) — indexes in 38s with no crashes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
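
The column mismatch in item 3 is the kind of error a pre-flight header check can catch before the COPY statement runs. The sketch below assumes a comma-separated header line without quoted commas; expectedColumnCount and checkCsvHeader are hypothetical helpers, and the only figures taken from the commit message are the 7-column default and the 8 columns for Section.

```typescript
// Figures from the commit message: 7 columns by default, 8 for Section.
const DEFAULT_COLUMN_COUNT = 7;
const COLUMN_COUNT_OVERRIDES: Record<string, number> = { Section: 8 };

// Look up how many columns the loader expects for a given node table.
function expectedColumnCount(table: string): number {
  return COLUMN_COUNT_OVERRIDES[table] ?? DEFAULT_COLUMN_COUNT;
}

// Compare the CSV header against the expected count and fail with a
// message that names the table, unlike the raw binder exception.
function checkCsvHeader(table: string, headerLine: string): void {
  // Naive split: assumes header names contain no quoted commas.
  const actual = headerLine.split(",").length;
  const expected = expectedColumnCount(table);
  if (actual !== expected) {
    throw new Error(`${table}: expected ${expected} columns but got ${actual}`);
  }
}
```

A table-specific entry in the override map plays the role of the dedicated Section branch that the fix adds, while unknown tables still fall back to the default.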
motolese pushed a commit to motolese/datamoto-gitnexus that referenced this pull request Apr 23, 2026
* refactor: migrate from KuzuDB to LadybugDB v0.15

KuzuDB was archived (Apple acquisition, Oct 2025). LadybugDB is the
community fork with full API compatibility.

- Package swap: kuzu → @ladybugdb/core, kuzu-wasm → @ladybugdb/wasm-core
- Rename all internal paths: kuzu → lbug (adapters, schema, storage)
- Storage path: .gitnexus/kuzu → .gitnexus/lbug (with auto-cleanup)
- Add explicit VECTOR extension loading (required in v0.15)
- Update CI workflow, documentation, and all tests
- 1151 unit + 27 integration tests passing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address code review findings (P1-P3)

P1: Fix WASM adapter to use getAll() API, wire cleanupOldKuzuFiles
into analyze command, add symlink path traversal protection.
P2: Cache VECTOR extension load state, batch augmentation engine
queries (20→4), fix web getCopyQuery for multi-language tables,
fix stale KuzuDB references, correct brainstorm package names.
P3: Complete lbug-wasm.d.ts type declarations, batch semantic
search per-label, update stale BM25 comment.

* chore: remove outdated KuzuDB migration brainstorming document

* fix: load FTS extension in MCP pool adapter on init

The read-only pool adapter never loaded the FTS extension, so all
QUERY_FTS_INDEX calls failed silently. This broke search-pool and
augmentation integration tests, and caused empty results in the
web UI server mode.

* feat: implement shared Database caching and connection reference counting

* feat: enhance KuzuDB migration handling and status reporting

* fix: mock cleanupOldKuzuFiles in local backend callTool tests

* fix: update mock for cleanupOldKuzuFiles and adjust imports in callTool tests

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

3 participants