Skip to content

fix(lbug): robust Windows lock acquisition for CI integration tests#1430

Merged
magyargergo merged 6 commits into
mainfrom
fix/windows-lbug-lock-retry
May 8, 2026
Merged

fix(lbug): robust Windows lock acquisition for CI integration tests#1430
magyargergo merged 6 commits into
mainfrom
fix/windows-lbug-lock-retry

Conversation

@magyargergo

Copy link
Copy Markdown
Collaborator

Summary

  • Wraps new lbug.Database() in a bounded busy-retry inside lbug-config.ts — closes the gap where Windows CI fails with Could not set lock on file raised synchronously from LadybugDB's local_file_system.cpp before any query is issued (so withLbugDb's query-time retry never sees it).
  • Adds a Windows-only post-close handle-release probe in safeClose to absorb libuv's native handle-release lag; logs a warning if the probe exhausts so operators can spot AV/Defender interference.
  • Adds a guarded stale-.wal/.lock sweep — fires only when the immediate-parent dir matches a known test prefix AND the path resolves under os.tmpdir(). Production paths never enter this branch.
  • Tags exhausted open-time errors with LBUG_OPEN_RETRY_EXHAUSTED so withLbugDb's outer 3× retry skips re-retrying — eliminates a 3×5=15-attempt / ~6s tail latency.
  • Single source of truth: isDbBusyError lives in lbug-config.ts and is re-exported from lbug-adapter.ts. waitForWindowsHandleRelease and all retry knobs are co-located in lbug-config.ts with a one-grep cross-reference comment.

Why this exists

The lock IO exception is thrown synchronously from new lbug.Database() (LadybugDB v0.16.1 local_file_system.cpp:126). The existing withLbugDb query-time retry cannot intercept it because no query has been issued yet. On Windows CI this surfaces as flaky integration tests due to AV-scanner holds (≤200 ms typical), libuv handle-release lag, and stale .wal sidecars from aborted prior runs. LadybugDB's concurrency model is preserved — single-writable-Database-per-process, multi-Connection.

Test plan

  • cd gitnexus && npx tsc --noEmit — passes
  • cd gitnexus && npx vitest run test/integration/lbug-open-retry.test.ts test/integration/lbug-close-handle-release.test.ts test/integration/lbug-lock-retry.test.ts — 38/38 pass (12 new open-retry scenarios + 11 new isTestFixturePath/probe scenarios + 2 close-handle-release + 13 pre-existing lock-retry regressions)
  • npx prettier --check — clean
  • Pre-commit hook (typecheck + format on diff) — passes
  • Windows CI run — confirm chronic lbug-touching integration test flakes resolved
  • Linux/macOS CI runs — confirm no regression (Windows-only branches are gated on process.platform === 'win32')

Scope

  • No public API surface changed (CLI flags, MCP tools, HTTP routes, graph schema unchanged).
  • No persisted-data migration required.
  • No CHANGELOG.md edit (owned by release process per DoD.md § 4.5).
  • The two new test files are added to vitest's existing serialized lbug-db project (fileParallelism: false).

Residual P3 follow-ups (intentional, non-blocking)

  • Symlink-defense lstat guard before sweep unlink (mitigated by POSIX semantics + tightened guards)
  • macOS /var/folders symlink prefix mismatch if caller realpaths first (not exercised today)
  • Move isReadOnlyDbError into lbug-config.ts to model the new convention (out of scope here)

🤖 Generated with Claude Code

LadybugDB's `new Database()` raises `Could not set lock on file` from
local_file_system.cpp synchronously inside the constructor — before any
query is issued, so `withLbugDb`'s query-time retry never sees it. On
Windows CI this surfaces as flaky integration tests due to AV-scanner
holds, libuv handle-release lag, and stale `.wal` sidecars from aborted
prior runs.

This change closes the gap at *open time*:

- `openLbugConnection` now wraps `new lbug.Database()` in a bounded
  busy-retry (5x100ms back-off) inside `lbug-config.ts`. Errors that
  exhaust the budget are tagged via `LBUG_OPEN_RETRY_EXHAUSTED` so
  `withLbugDb`'s outer 3x retry skips re-retrying a freshly-exhausted
  path (eliminates the 3x5=15-attempt / ~6s tail latency).
- For recognized test fixtures only (immediate-parent dir matches a
  known prefix AND resolves under `os.tmpdir()`), one final stale-
  sidecar sweep removes `.wal`/`.lock` and retries once. Production
  paths never enter this branch.
- `safeClose` on Windows runs a bounded `fs.open` probe to absorb
  native handle-release lag; logs a warning if the probe exhausts so
  operators can spot AV interference.
- `isDbBusyError` is now defined in `lbug-config.ts` as the single
  source of truth, re-exported from `lbug-adapter.ts` for compatibility.
- New tests cover open-time retry (happy/retry/exhaust/non-busy/tag),
  stale-sidecar sweep (test-fixture-only, production-rejection,
  preserves-original-error), `isTestFixturePath` direct unit suite
  (accept/reject/traversal/nested/trailing-sep), and
  `waitForWindowsHandleRelease` (openable/ENOENT/no-leak).
- The two new test files are added to vitest's existing serialized
  `lbug-db` project (already `fileParallelism: false`).

Closes the chronic Windows CI flake on lbug-touching integration tests
while preserving the existing single-writable-Database-per-process
LadybugDB contract. No public API surface changed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vercel

vercel Bot commented May 8, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
gitnexus Ready Ready Preview, Comment May 8, 2026 10:34am

Request Review

… directly

The re-export from lbug-adapter.ts was a transitional convenience — with
the matcher now living in lbug-config.ts, having two import paths for the
same symbol invites future drift. Updated the two real consumers
(lbug-lock-retry.test.ts, lbug-open-retry.test.ts) to import from
lbug-config directly, removed the re-export equality test (now vacuous),
and refreshed the explanatory comment so it no longer references a
re-export pattern that doesn't exist.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

github-actions Bot commented May 8, 2026

Copy link
Copy Markdown
Contributor

CI Report

All checks passed

Pipeline Status

Stage Status Details
✅ Typecheck success tsc --noEmit
✅ Tests success unit tests, 3 platforms
✅ E2E success gitnexus-web changes only

Test Results

Tests Passed Failed Skipped Duration
8293 8292 0 1 390s

✅ All 8292 tests passed

1 test(s) skipped — expand for details
  • buildTypeEnv > known limitations (documented skip tests) > Ruby block parameter: users.each { |user| } — closure param inference, different feature

Code Coverage

Tests

Metric Coverage Covered Base Delta Status
Statements 77.95% 25153/32268 N/A% 🟢 ███████████████░░░░░
Branches 66.46% 15892/23909 N/A% 🟢 █████████████░░░░░░░
Functions 83.1% 2528/3042 N/A% 🟢 ████████████████░░░░
Lines 80.99% 22728/28062 N/A% 🟢 ████████████████░░░░

📋 View full run · Generated by CI

… on Windows

doInitLbug logs "⚠️ Schema creation warning: ... Could not set lock on
file" on every CREATE NODE TABLE call after the first init on a given
dbPath, on Windows. The lock is internal to LadybugDB v0.16.1 and is
resolved before the table is created — same tolerance pattern as the
existing "already exists" filter. Genuine cross-process lock contention
still surfaces on the next operation through withLbugDb's retry, so
filtering at the schema-init catch only suppresses noise, not signal.

Also extend the safeClose Windows handle-release probe to cover the
.wal sidecar (the previous Database's WAL handle was the slowest to
release, surfacing as the schema-query lock contention) and switch the
probe back to 'r+' so it actually detects exclusive locks.

Test loop in lbug-close-handle-release.test.ts simplified to 10 plain
iterations now that the underlying noise is filtered upstream.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

github-actions Bot commented May 8, 2026

Copy link
Copy Markdown
Contributor

Claude finished @magyargergo's task in 6m 58s —— View job


PR #1430 Review — fix(lbug): robust Windows lock acquisition for CI integration tests

Todo List

  • Post initial comment
  • Read repo context files (DoD.md, AGENTS.md, GUARDRAILS.md, CONTRIBUTING.md, TESTING.md, ARCHITECTURE.md)
  • Inspect changed source files (lbug-config.ts, lbug-adapter.ts)
  • Inspect changed test files (lbug-open-retry.test.ts, lbug-close-handle-release.test.ts, lbug-lock-retry.test.ts, test-db.ts, vitest.config.ts)
  • Check branch hygiene and diff scope
  • Run hidden Unicode grep checks
  • Search for key symbols across the codebase
  • Analyze CI status
  • Compile and post full review

Review bar for this PR

  • Constructor-time new lbug.Database() lock errors are retried with bounded, deterministic backoff — distinct from the existing query-time retry which can never see them.
  • LBUG_OPEN_RETRY_EXHAUSTED must prevent 3×5 retry multiplication without suppressing legitimate query-time lock recovery.
  • isDbBusyError is a single source of truth; its matcher must not classify non-transient errors as retryable.
  • The stale .wal/.lock sweep is unconditionally test-fixture-only; a guard bug is a production data-loss risk.
  • safeClose remains idempotent and cannot leave conn, db, or currentDbPath in inconsistent state.
  • Windows handle-release probe handles must be closed unconditionally; the probe must be a pure no-op on Linux/macOS.
  • No duplicate lbug.Database instances may be created for the same path in the same process.

Current PR state

Field Value
Head SHA 48b01fb
Status Open
Commits 4 (3 authored + 1 merge-from-main)
Changed files 7 (all lbug/test/vitest)
Net diff +639/-19
CI ✅ All checks passed — 8264/8265 tests, 3 platforms
Typecheck tsc --noEmit passes
Windows CI ✅ Included in "unit tests, 3 platforms"
Package versions / lockfile Unchanged by PR
Hidden Unicode No bidi controls found (see § Unicode below)
Vercel preview Irrelevant to this core/lbug change

Branch hygiene assessment

Verdict: merge-from-main commit present but harmless and merge-safe.

The branch contains 3 authored commits (e6924f3, a96455a, 11affdd) plus 48b01fb (merge-from-main). The merge commit brought in 927a172 (perf/MCP: parallelize staleness checks) which touches git-staleness.ts, local-backend.ts, and two unit tests — all unrelated to lbug. These files are already on main and will be in the result regardless. The PR's own diff is narrowly scoped to the 7 lbug/test/vitest files listed.


Understanding of the change

new lbug.Database() calls into LadybugDB's native layer (local_file_system.cpp:126) which acquires an exclusive OS lock synchronously. On Windows, two phenomena cause the constructor to fail with "Could not set lock on file" even when no other process holds the lock at the JS level:

  1. AV/Defender hold — Windows Defender briefly holds newly-created files for scanning (≤200ms typical).
  2. libuv handle-release lagdb.close() resolves in JS before the kernel fully releases the native file handle, so a rapid reopen on the same path races the release.

The existing withLbugDb retry loop operates at query time — it catches busy errors from queries executed against an already-open connection. It has no visibility into constructor-time failures because new Database() throws before withLbugDb ever issues a query.

The PR introduces three coordinated mitigations:

  1. Open-time retry (openWithLockRetry): wraps the constructor call in a 5-attempt linear-backoff loop, then (for test fixtures only) sweeps stale sidecars and makes one final attempt.
  2. Post-close handle probe (waitForWindowsHandleRelease): after safeClose, probes both dbPath and dbPath + '.wal' with fs.open(path, 'r+') on Windows only, absorbing handle-release lag before the next open.
  3. Exhaustion marker (LBUG_OPEN_RETRY_EXHAUSTED): tags errors thrown after the open-time budget exhausts so withLbugDb's outer 3× retry skips re-retrying a path that just spent ~1s in the open loop.

Findings

[minor] isDbBusyError negative test coverage is thin

  • Category: Error classification / test coverage
  • Files: gitnexus/src/core/lbug/lbug-config.ts:70-78, gitnexus/test/integration/lbug-lock-retry.test.ts:41-54
  • Issue: msg.includes('lock') matches any message containing "lock" — including hypothetical "lock file missing", "deadlock detected", or "unlock failed". Existing negative tests only cover "Table not found", "Connection refused", "Syntax error", null, and undefined. No test exercises lock-looking messages that should NOT be retried.
  • Why it matters: If LadybugDB ever emits a non-transient error containing "lock" (e.g., a schema-level lock-file-missing during database recovery), it would be silently retried up to 5 times and then mis-tagged with LBUG_OPEN_RETRY_EXHAUSTED.
  • Recommended fix: Add negative test cases for "Could not open lock file", "deadlock", "unlock failed" in lbug-lock-retry.test.ts. If any of these should NOT be retried, tighten the matcher. The redundant msg.includes('could not set lock') term (already covered by msg.includes('lock')) can be removed as a drive-by cleanup.
  • Blocks merge: No — LadybugDB's actual non-transient error messages don't appear to match the broad matcher, and the retry is bounded at 5 attempts regardless.

[minor] Backoff comment overstates worst-case latency

  • Category: Documentation
  • Files: gitnexus/src/core/lbug/lbug-config.ts:110-112
  • Issue: The comment reads "5 attempts × 100ms linear back-off (~1.5s worst case)". The actual maximum sleep is 100+200+300+400 = 1000ms (no sleep after the final failed attempt). Adding 5 constructor call RTTs of ~10–50ms each gives ~1.0–1.2s — not 1.5s.
  • Why it matters: Misleads an SRE tuning the budget under a tight timeout SLA.
  • Recommended fix: Update comment to "~1.0s worst-case sleep" or "~1s worst case".
  • Blocks merge: No.

[minor] TEST_FIXTURE_PREFIXES is a silent maintenance trap

  • Category: Maintainability
  • Files: gitnexus/src/core/lbug/lbug-config.ts:135, gitnexus/test/helpers/test-db.ts:48
  • Issue: Any new test helper that calls createTempDir with a custom prefix that is NOT in TEST_FIXTURE_PREFIXES will silently not receive the stale-sidecar sweep. The comments in both files warn about this, but the link is fragile — there's no compile-time or test-time enforcement.
  • Why it matters: The failure mode is silent: the sweep doesn't fire, Windows CI flakes return, and the root cause is a forgotten list entry rather than a code bug.
  • Recommended fix: Consider a test that reads the prefix list from lbug-config.ts and cross-references against all createTempDir call sites. Alternatively, accept the two-file comment contract as-is since the list has two entries today and growth is rare.
  • Blocks merge: No.

[info] Windows handle-release probe cannot be validated on Linux/macOS CI

  • Category: Testing limitation
  • Files: gitnexus/test/integration/lbug-close-handle-release.test.ts, gitnexus/src/core/lbug/lbug-config.ts:302-324
  • Issue: The close-handle-release tests verify that 10 open/close cycles don't throw and that safeClose is idempotent. On Linux/macOS (where CI runs), process.platform !== 'win32', so waitForWindowsHandleRelease is never exercised in the cycle test. The waitForWindowsHandleRelease unit tests in lbug-open-retry.test.ts test ENOENT and openable-file paths but cannot simulate EBUSY/EPERM behavior without a real Windows lock.
  • Why it matters: The probe's most important branch (lock-code retry loop) is untested on all three CI platforms. The PR body's Windows CI checkbox was originally unchecked; the CI report shows "3 platforms" which covers this.
  • Recommended fix: Accept as a known limitation. The probe's EBUSY/EPERM branch is structurally identical to the open-time retry and has the same finally-based handle close; the risk is bounded.
  • Blocks merge: No.

LadybugDB open/retry assessment

openLbugConnection is clean. The feared "Likely Issue 1" (eager createLbugDatabase before openWithLockRetry) is not present in the code. The db variable is assigned exclusively from openWithLockRetry's return value:

db = await openWithLockRetry(
  () => createLbugDatabase(lbugModule, databasePath, options),
  databasePath,
);

If new lbugModule.Connection(db) throws after a successful open, the outer catch correctly closes db before rethrowing. No handle leak path exists.

openWithLockRetry is correct:

  • Failed constructor calls throw synchronously before returning a handle → nothing to leak.
  • Non-busy errors throw immediately on the first attempt (if (!isDbBusyError(err)) throw err).
  • Busy errors are retried up to OPEN_LOCK_RETRY_ATTEMPTS = 5.
  • Delays: 100ms, 200ms, 300ms, 400ms (linear, no sleep after last failed attempt) — bounded and deterministic.
  • The loop never calls itself recursively.

The post-exhaustion sidecar sweep fires only when isTestFixturePath returns true, makes one additional constructor call, and if that also fails, preserves originalLockError (not the post-sweep error) before tagging and rethrowing. The "original error wins" invariant is tested explicitly (lbug-open-retry.test.ts:208–229). ✓


Busy/lock classification assessment

isDbBusyError is a single source of truth at lbug-config.ts:70-78. All callers (openWithLockRetry, withLbugDb, doInitLbug's schema warning filter) import from lbug-config directly — the re-export from lbug-adapter.ts was removed in a96455a. ✓

The matcher handles null/undefined/non-Error values safely: err instanceof Error ? err.message : String(err). The String(null) = "null" path doesn't match any lock term. ✓

The could not set lock term is redundant with lock but harmless.

Breadth concern (see Findings above): msg.includes('lock') is broad. In practice, LadybugDB's non-transient errors don't contain bare "lock" based on the codebase evidence. Bounded retry (5 max) limits damage if a mis-match occurs.


Retry marker / outer retry assessment

LBUG_OPEN_RETRY_EXHAUSTED = Symbol.for('gitnexus.lbug.openRetryExhausted') — global symbol appropriate for cross-module propagation. The symbol is attached as a property on the error object, which is robust to re-throws. ✓

tagOpenRetryExhausted guards against primitives (if (err && typeof err === 'object')), so a string error thrown from the constructor won't panic. ✓

withLbugDb skip condition at line 247:

if (!isDbBusyError(err) || isOpenRetryExhausted(err) || attempt === DB_LOCK_RETRY_ATTEMPTS) {
  throw err;
}

Logic table:

  • Non-busy: throw (no retry) ✓
  • Busy + exhausted marker: throw (skip outer 3× loop) ✓
  • Busy + not exhausted + not last attempt: fall through to safeClose + sleep + retry ✓
  • Busy + not exhausted + last attempt: throw ✓

The marker is NOT attached to non-lock errors. Tests verify isOpenRetryExhausted(caught) === true after exhaustion and isOpenRetryExhausted(new Error('plain')) === false. ✓

Test at lbug-lock-retry.test.ts:92-104 verifies the outer retry fires exactly 3 times for an unmarked query-time busy error. ✓


Windows handle-release assessment

safeClose ordering (lines 1066–1107):

  1. flushWAL() — CHECKPOINT while conn is live ✓
  2. Capture closingDbPath = currentDbPath before nulling ✓
  3. conn.close() in try/catch → conn = null (inside if(conn), outside try/catch — nulled even if close throws) ✓
  4. db.close() in try/catch → db = null
  5. Windows probe: waitForWindowsHandleRelease(closingDbPath) — runs only if process.platform === 'win32' && closingDbPath
  6. currentDbPath is NOT reset here — closeLbug (line 1111) and withLbugDb retry (line 254) reset it after calling safeClose. This is the intended contract. ✓

Idempotence: second call sees conn = null and db = null, skips close calls, closingDbPath = currentDbPath (still set from first call if closeLbug hasn't cleared it). Probe may run again on already-released file → fs.open('r+') succeeds → returns true → harmless. safeClose is idempotent test confirms this. ✓

Probe design (probeSinglePath):

  • 'r+' flag: correctly surfaces exclusive write locks on Windows (changed from 'r' in 11affdd; 'r' would grant read access while write locks are in flight). ✓
  • ENOENT → return true (file absent means handle certainly released). ✓
  • EBUSY/EPERM/EACCES → retry loop with linear delay (50ms, 100ms, 150ms, 200ms). ✓
  • Non-lock error → return true (not our problem). ✓
  • try/finally around handle.close() — no fd leak. ✓
  • Both dbPath and dbPath + '.wal' probed; && semantics means both must release. ✓
  • Warning message includes dbPath in structured log field — operator-visible without raw message leak. ✓

Test-fixture sidecar sweep assessment

isTestFixturePath guards (lines 175–181):

Attack vector Defense Verdict
<tmp>/gitnexus-lbug-x/../../etc/passwd.wal path.resolve normalizes .. Rejected ✓
/var/data/gitnexus-lbug-x/lbug resolved.startsWith(tmpRoot + path.sep) Rejected ✓
<tmp>/gitnexus-lbug-x/inner/deep/lbug Immediate-parent check only Rejected ✓
Trailing separator from Windows TMP tmpRoot.replace(sep+$, '') strip Handled ✓
Prefix in ancestor not immediate parent path.basename(path.dirname(resolved)) Rejected ✓

Symlink P3 risk (acknowledged in PR body): If a test creates a symlink named lbug under <tmp>/gitnexus-lbug-x/, path.resolve will NOT follow symlinks — it only resolves ./.. segments. So isTestFixturePath would check the symlink's path (under tmp, correct prefix) and approve the sweep. fs.unlink(dbPath + '.wal') then deletes <tmp>/gitnexus-lbug-x/lbug.wal — a sibling file, not the symlink target. In practice, no test helper creates sidecar symlinks. Accepting P3 deferral as non-blocking is appropriate.

The sweep deletes only dbPath + '.wal' and dbPath + '.lock' — never the DB file itself. ENOENT is swallowed. Permission errors are swallowed. originalLockError is preserved through both cases. ✓

Production paths fail both the startsWith(tmpRoot) check and the prefix check independently. ✓


Pool / concurrency / data safety assessment

  • openWithLockRetry is strictly sequential — no parallel constructor calls. ✓
  • The module-level singleton (db, conn, currentDbPath) is single-valued; sessionLock prevents concurrent operations. ✓
  • openLbugConnection returns exactly one (db, conn) pair per successful call. ✓
  • No production code path reaches sweepStaleSidecars. ✓
  • No schema migrations, no public CLI/MCP/API surface changes. ✓
  • closeLbugConnection (used by MCP pool for read-only connections) is unchanged and does not call waitForWindowsHandleRelease. ✓
  • LadybugDB's single-writable-Database-per-process contract is preserved throughout. ✓

Test assessment

Scenario Covered File
Constructor succeeds on first try ✓ (callCount=1) lbug-open-retry.test.ts:65
Retry on busy, success on later attempt :78
Exhausts retry budget, rethrows ✓ (callCount=5) :90
Exhausted error tagged with marker :105
Non-busy error not retried ✓ (callCount=1) :126
withLbugDb skips outer retry on marked error ✓ (implied by exhaustion marker test) :105
withLbugDb retries 3× on unmarked busy lbug-lock-retry.test.ts:92
Sidecar sweep fires for test fixture ✓ (callCount=6, files deleted) lbug-open-retry.test.ts:154
Sweep rejected for non-tmp production path ✓ (callCount=5) :176
Sweep handles ENOENT gracefully :193
Post-sweep failure preserves original error :208
isTestFixturePath traversal rejected :244
isTestFixturePath immediate-parent only :250
isTestFixturePath trailing-sep handled :258
waitForWindowsHandleRelease, file exists :288
waitForWindowsHandleRelease, ENOENT = true :294
Probe no fd leak (50 sequential) :301
safeClose idempotent (double close) lbug-close-handle-release.test.ts:29
10 open/close/reopen cycles :15
isDbBusyError positive/negative lbug-lock-retry.test.ts:19-54

All three new test files are in the serialized lbug-db Vitest project (fileParallelism: false). ✓

Gap: No negative tests for lock-looking non-transient messages in isDbBusyError (see Findings). The test count of 38 matches the PR body claim.


Hidden Unicode / hygiene assessment

Bidi controls (U+202A–202E, U+2066–2069): None found in any changed file. ✓

Visible non-ASCII: Em dashes (), arrows (), warning emoji (⚠️), and box-drawing characters () appear in comments and one logger warning string. All are visible, intentional, and consistent with the rest of the codebase. GitHub's "hidden Unicode" warnings are false positives for these characters.

No invisible whitespace or Unicode control sequences were found. ✓

Grep rendering note: The Grep tool output rendered // as / in one result (line 243 of lbug-adapter.ts). The actual file confirms // Skip outer retry... with a double-slash comment. tsc --noEmit passing further confirms the source is correct.


Elegance / maintainability assessment

The implementation is well-structured:

  • All four retry knobs are co-located in lbug-config.ts with a single cross-reference comment block, making SRE tuning straightforward.
  • openWithLockRetry and probeSinglePath are small, purpose-specific helpers with no scope creep.
  • The exhaustion marker using Symbol.for is a clean, non-stringly-typed signaling mechanism.
  • JSDoc on waitForWindowsHandleRelease explains the 'r+' vs 'r' choice concisely — this is non-obvious Windows-specific behavior that warrants the comment.
  • The _isTestFixturePathForTest export-for-testing pattern is a standard repo convention.
  • TEST_FIXTURE_PREFIXES maintenance note is in two locations (config + test helper) which is adequate but creates a silent failure mode (see Findings).

The could not set lock term in isDbBusyError being redundant with lock is a minor clarity nit — removing it would make the intent cleaner without any behavioral change.


Final verdict

production-ready with minor follow-ups

The "Likely Issue 1" (duplicate DB construction / handle leak in openLbugConnection) is not present in the code — the concern was based on an older diff; the actual implementation is clean. No blocker-level issues were found across constructor-time retry, marker semantics, safeClose ordering and idempotence, sidecar sweep guards, or concurrency model.

The remaining items are genuinely minor: isDbBusyError's broad lock matcher lacks negative tests for edge-case LadybugDB messages (retry budget bounds the damage), the ~1.5s worst case comment overstates actual sleep by ~500ms, and TEST_FIXTURE_PREFIXES maintenance is a two-file comment contract with no enforcement. None of these prevent safe merging.

CI is fully green on 3 platforms (8264/8265 tests, 1 pre-existing skip). The branch is merge-safe. The change is appropriately scoped, well-tested for a Windows-specific reliability fix, and preserves LadybugDB's single-writer concurrency contract throughout.


- Drop redundant `could not set lock` term — already subsumed by `lock`.
- Document the intentionally-broad matcher: graph-DB lock-shaped errors
  ("deadlock", "unlock failed", "lock contention", "could not open lock
  file") are all treated as transient. If a non-transient surfaces,
  tighten the matcher rather than raise the retry budget.
- Add positive test cases covering those lock-shaped strings so the
  intent is visible and a future tightening would deliberately break
  these.
- Fix the open-retry back-off comment: max sleep is 100+200+300+400 =
  1000ms (no sleep after the final attempt), not 1.5s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@magyargergo magyargergo merged commit 1d46200 into main May 8, 2026
26 checks passed
@magyargergo magyargergo deleted the fix/windows-lbug-lock-retry branch May 8, 2026 10:58
magyargergo added a commit to SZU-WenjieHuang/GitNexus that referenced this pull request May 8, 2026
…closes Windows lbug lock flakes on cli-e2e

Brings open-time retry for , post-close handle-release
probe (main file + .wal sidecar) on Windows, schema-init lock-warning
filter, and serialized lbug-db vitest project. Without this, the
cli-e2e tests for cypher/query/impact intermittently fail on
windows-latest because the CLI's read-only lbug open hits the
documented Windows lock-acquisition race.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant