fix(lbug): robust Windows lock acquisition for CI integration tests #1430
LadybugDB's `new Database()` raises `Could not set lock on file` from local_file_system.cpp synchronously inside the constructor, before any query is issued, so `withLbugDb`'s query-time retry never sees it. On Windows CI this surfaces as flaky integration tests due to AV-scanner holds, libuv handle-release lag, and stale `.wal` sidecars from aborted prior runs.

This change closes the gap at *open time*:

- `openLbugConnection` now wraps `new lbug.Database()` in a bounded busy-retry (5x100ms back-off) inside `lbug-config.ts`. Errors that exhaust the budget are tagged via `LBUG_OPEN_RETRY_EXHAUSTED` so `withLbugDb`'s outer 3x retry skips re-retrying a freshly-exhausted path (eliminates the 3x5=15-attempt / ~6s tail latency).
- For recognized test fixtures only (immediate-parent dir matches a known prefix AND resolves under `os.tmpdir()`), one final stale-sidecar sweep removes `.wal`/`.lock` and retries once. Production paths never enter this branch.
- `safeClose` on Windows runs a bounded `fs.open` probe to absorb native handle-release lag; logs a warning if the probe exhausts so operators can spot AV interference.
- `isDbBusyError` is now defined in `lbug-config.ts` as the single source of truth, re-exported from `lbug-adapter.ts` for compatibility.
- New tests cover open-time retry (happy/retry/exhaust/non-busy/tag), stale-sidecar sweep (test-fixture-only, production-rejection, preserves-original-error), `isTestFixturePath` direct unit suite (accept/reject/traversal/nested/trailing-sep), and `waitForWindowsHandleRelease` (openable/ENOENT/no-leak).
- The two new test files are added to vitest's existing serialized `lbug-db` project (already `fileParallelism: false`).

Closes the chronic Windows CI flake on lbug-touching integration tests while preserving the existing single-writable-Database-per-process LadybugDB contract. No public API surface changed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
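The test-fixture guard described in the commit message above (immediate-parent directory matches a known prefix AND the path resolves under `os.tmpdir()`) could look roughly like the sketch below. The function name `isTestFixturePath` and the constant name `TEST_FIXTURE_PREFIXES` appear in this PR, but the prefix values, the optional `tmpRoot` parameter, and the body are assumptions made for illustration, not the PR's actual code.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical prefix values; the real list is not shown in this PR page.
const TEST_FIXTURE_PREFIXES = ["lbug-test-", "gitnexus-test-"];

/**
 * Returns true only when BOTH guards hold:
 *  1. the path resolves to a location strictly under the temp root, and
 *  2. its immediate parent directory name starts with a known fixture prefix.
 */
function isTestFixturePath(dbPath: string, tmpRoot: string = os.tmpdir()): boolean {
  // path.resolve collapses ".." segments and drops trailing separators.
  const resolved = path.resolve(dbPath);
  const rel = path.relative(path.resolve(tmpRoot), resolved);

  // Outside the temp root: the relative path is empty, climbs upward,
  // or is absolute (a different drive on Windows).
  const climbsOut = rel === ".." || rel.startsWith(".." + path.sep);
  if (rel === "" || climbsOut || path.isAbsolute(rel)) return false;

  // Only the IMMEDIATE parent counts, so nested directories are rejected.
  const parentName = path.basename(path.dirname(resolved));
  return TEST_FIXTURE_PREFIXES.some((prefix) => parentName.startsWith(prefix));
}
```

Because the sketch never calls `realpath`, a symlinked temp root is compared textually, which matches the residual symlink caveat listed in the PR description.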
… directly

The re-export from lbug-adapter.ts was a transitional convenience; with the matcher now living in lbug-config.ts, having two import paths for the same symbol invites future drift. Updated the two real consumers (lbug-lock-retry.test.ts, lbug-open-retry.test.ts) to import from lbug-config directly, removed the re-export equality test (now vacuous), and refreshed the explanatory comment so it no longer references a re-export pattern that doesn't exist.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI Report
✅ All checks passed
Test Results
✅ All 8292 tests passed, 1 test(s) skipped
… on Windows

doInitLbug logs "⚠️ Schema creation warning: ... Could not set lock on file" on every CREATE NODE TABLE call after the first init on a given dbPath, on Windows. The lock is internal to LadybugDB v0.16.1 and is resolved before the table is created (same tolerance pattern as the existing "already exists" filter). Genuine cross-process lock contention still surfaces on the next operation through withLbugDb's retry, so filtering at the schema-init catch only suppresses noise, not signal.

Also extend the safeClose Windows handle-release probe to cover the .wal sidecar (the previous Database's WAL handle was the slowest to release, surfacing as the schema-query lock contention) and switch the probe back to 'r+' so it actually detects exclusive locks.

Test loop in lbug-close-handle-release.test.ts simplified to 10 plain iterations now that the underlying noise is filtered upstream.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
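The schema-init filter described in this commit can be pictured as a small predicate applied in the catch block around each DDL statement. The sketch below is an illustration under assumptions: `isBenignSchemaWarning` and `runSchemaStatements` are hypothetical names, the executor is modelled as a synchronous callback, and whether the real filter is gated on `process.platform` is not stated here.

```typescript
// Substrings tolerated during schema creation. "already exists" is the
// pre-existing filter named in the commit; the lock message is the new one.
const BENIGN_SCHEMA_WARNINGS = ["already exists", "could not set lock on file"];

// Hypothetical predicate: true when the error is known noise.
function isBenignSchemaWarning(err: unknown): boolean {
  const message = (err instanceof Error ? err.message : String(err)).toLowerCase();
  return BENIGN_SCHEMA_WARNINGS.some((needle) => message.includes(needle));
}

// Hypothetical driver: runs every statement and returns only the warnings
// that would still be logged after filtering.
function runSchemaStatements(statements: string[], exec: (sql: string) => void): string[] {
  const logged: string[] = [];
  for (const statement of statements) {
    try {
      exec(statement);
    } catch (err) {
      if (!isBenignSchemaWarning(err)) {
        logged.push(err instanceof Error ? err.message : String(err));
      }
    }
  }
  return logged;
}
```

The filter only affects what is logged at schema-init time; it does not retry or swallow errors from later queries, which is why genuine contention still reaches the query-time retry.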
Claude finished @magyargergo's task in 6m 58s
PR #1430 Review: fix(lbug): robust Windows lock acquisition for CI integration tests
Review bar for this PR
Current PR state
Branch hygiene assessment
Verdict: merge-from-main commit present but harmless and merge-safe. The branch contains 3 authored commits (
Understanding of the change
The existing
The PR introduces three coordinated mitigations:
Findings
[minor] isDbBusyError negative test coverage is thin
[minor] Backoff comment overstates worst-case latency
[minor] TEST_FIXTURE_PREFIXES is a silent maintenance trap
[info] Windows handle-release probe cannot be validated on Linux/macOS CI
LadybugDB open/retry assessment
    db = await openWithLockRetry(
      () => createLbugDatabase(lbugModule, databasePath, options),
      databasePath,
    );
If
The post-exhaustion sidecar sweep fires only when
Busy/lock classification assessment
The matcher handles
The
Breadth concern (see Findings above):
Retry marker / outer retry assessment
    if (!isDbBusyError(err) || isOpenRetryExhausted(err) || attempt === DB_LOCK_RETRY_ATTEMPTS) {
      throw err;
    }
Logic table:
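As a hedged illustration of how the three clauses of that condition interact with the open-time retry, the sketch below pairs a synchronous inner retry with the outer rethrow decision. The identifier names come from this PR; the symbol-based tag, the `"lock"` substring matcher body, the budgets as constants, and the omission of sleeps are assumptions chosen to keep the example small.

```typescript
// Assumed representation of the tag; the PR only names the identifier.
const LBUG_OPEN_RETRY_EXHAUSTED = Symbol("lbug.openRetryExhausted");
const OPEN_RETRY_ATTEMPTS = 5; // inner open-time budget
const DB_LOCK_RETRY_ATTEMPTS = 3; // outer withLbugDb budget

const isDbBusyError = (err: unknown): boolean =>
  err instanceof Error && err.message.toLowerCase().includes("lock");

const isOpenRetryExhausted = (err: unknown): boolean =>
  typeof err === "object" &&
  err !== null &&
  (err as Record<symbol, unknown>)[LBUG_OPEN_RETRY_EXHAUSTED] === true;

// Inner loop: retry a synchronous open and tag the error only when a
// busy error survives every attempt. Non-busy errors pass through untagged.
function openWithRetry<T>(open: () => T): T {
  for (let attempt = 1; ; attempt++) {
    try {
      return open();
    } catch (err) {
      if (!isDbBusyError(err)) throw err;
      if (attempt === OPEN_RETRY_ATTEMPTS) {
        (err as Record<symbol, unknown>)[LBUG_OPEN_RETRY_EXHAUSTED] = true;
        throw err;
      }
      // A real implementation would sleep here before the next attempt.
    }
  }
}

// Outer decision, mirroring the condition quoted above:
//   not busy                      -> rethrow immediately
//   busy and already exhausted    -> rethrow (no 3x5 multiplication)
//   busy, untagged, last attempt  -> rethrow
//   busy, untagged, earlier       -> retry
function shouldRethrow(err: unknown, attempt: number): boolean {
  return !isDbBusyError(err) || isOpenRetryExhausted(err) || attempt === DB_LOCK_RETRY_ATTEMPTS;
}
```

Tagging the error object itself, rather than wrapping it, keeps the original message and stack intact for whoever finally logs it.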
The marker is NOT attached to non-lock errors. Tests verify
Test at
Windows handle-release assessment
Idempotence: second call sees
Probe design (
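For readers unfamiliar with the technique, a bounded handle-release probe of the kind this section assesses might be sketched as follows. This is not the PR's `waitForWindowsHandleRelease`: the function names, the attempt count, the delay, the `<dbPath>.wal` sidecar naming, and the absence of a `win32` gate are all assumptions made so the example runs on any platform.

```typescript
import { promises as fsp } from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical budget; the PR only says the probe is bounded.
const PROBE_ATTEMPTS = 5;
const PROBE_DELAY_MS = 20;

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// One probe: try to open for read+write, then close immediately.
async function probeOnce(file: string): Promise<boolean> {
  try {
    // 'r+' requests write access, so an exclusive lock makes the open fail.
    const handle = await fsp.open(file, "r+");
    await handle.close(); // never leak the probe handle
    return true;
  } catch (err) {
    // A file that does not exist cannot be held open by anyone.
    return (err as { code?: string }).code === "ENOENT";
  }
}

// Probes the main file and its assumed .wal sidecar; returns false when the
// budget is exhausted so that the caller can log a warning.
async function waitForHandleRelease(dbPath: string): Promise<boolean> {
  for (const file of [dbPath, `${dbPath}.wal`]) {
    let released = false;
    for (let attempt = 1; attempt <= PROBE_ATTEMPTS && !released; attempt++) {
      released = await probeOnce(file);
      if (!released && attempt < PROBE_ATTEMPTS) await sleep(PROBE_DELAY_MS);
    }
    if (!released) return false;
  }
  return true;
}

// Usage sketch: probe a scratch file while it exists and after deletion.
async function demoProbe(): Promise<boolean[]> {
  const scratch = path.join(os.tmpdir(), `lbug-probe-${process.pid}.db`);
  await fsp.writeFile(scratch, "x");
  const whileExists = await waitForHandleRelease(scratch);
  await fsp.unlink(scratch);
  const afterDelete = await waitForHandleRelease(scratch);
  return [whileExists, afterDelete];
}
```

Returning a boolean instead of throwing keeps the close path non-fatal: an exhausted probe becomes a warning, and the next open still has its own retry.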
Test-fixture sidecar sweep assessment
Symlink P3 risk (acknowledged in PR body): If a test creates a symlink named
The sweep deletes only
Production paths fail both the
Pool / concurrency / data safety assessment
Test assessment
All three new test files are in the serialized
Gap: No negative tests for lock-looking non-transient messages in
Hidden Unicode / hygiene assessment
Bidi controls (U+202A–202E, U+2066–2069): None found in any changed file. ✓
Visible non-ASCII: Em dashes (
No invisible whitespace or Unicode control sequences were found. ✓
Grep rendering note: The Grep tool output rendered
Elegance / maintainability assessment
The implementation is well-structured:
The
Final verdict
production-ready with minor follow-ups
The "Likely Issue 1" (duplicate DB construction / handle leak in
The remaining items are genuinely minor:
CI is fully green on 3 platforms (8264/8265 tests, 1 pre-existing skip). The branch is merge-safe. The change is appropriately scoped, well-tested for a Windows-specific reliability fix, and preserves LadybugDB's single-writer concurrency contract throughout.
- Drop redundant `could not set lock` term — already subsumed by `lock`.
- Document the intentionally-broad matcher: graph-DB lock-shaped errors
("deadlock", "unlock failed", "lock contention", "could not open lock
file") are all treated as transient. If a non-transient error surfaces,
tighten the matcher rather than raise the retry budget.
- Add positive test cases covering those lock-shaped strings so the
intent is visible and a future tightening would deliberately break
these.
- Fix the open-retry back-off comment: max sleep is 100+200+300+400 =
1000ms (no sleep after the final attempt), not 1.5s.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
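Two points from this commit message lend themselves to a quick check: the breadth of a `lock` substring matcher, and the back-off arithmetic. The sketch below uses hypothetical helper names (`isLockShaped`, `totalBackoffMs`) and assumes a linear schedule of `base * attempt`, which is inferred from the 100+200+300+400 sum quoted above rather than taken from the code.

```typescript
// Intentionally broad: any message containing "lock" counts as transient.
// Note that this also matches unrelated words such as "blocked".
function isLockShaped(message: string): boolean {
  return message.toLowerCase().includes("lock");
}

// Total sleep for a linear back-off: base * attempt after each failed
// attempt, with no sleep after the final one.
function totalBackoffMs(attempts: number, baseMs: number): number {
  let total = 0;
  for (let attempt = 1; attempt < attempts; attempt++) {
    total += baseMs * attempt;
  }
  return total;
}

const lockShapedSamples = [
  "Could not set lock on file",
  "deadlock",
  "unlock failed",
  "lock contention",
  "could not open lock file",
];
const allMatch = lockShapedSamples.every(isLockShaped); // true for every sample
const worstCaseSleepMs = totalBackoffMs(5, 100); // 100 + 200 + 300 + 400 = 1000
```

With five attempts there are only four sleeps, which is why the worst case is 1000 ms and not the 1500 ms that a five-term sum would give.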
…closes Windows lbug lock flakes on cli-e2e

Brings open-time retry for , post-close handle-release probe (main file + .wal sidecar) on Windows, schema-init lock-warning filter, and serialized lbug-db vitest project. Without this, the cli-e2e tests for cypher/query/impact intermittently fail on windows-latest because the CLI's read-only lbug open hits the documented Windows lock-acquisition race.
Summary
- … `new lbug.Database()` in a bounded busy-retry inside `lbug-config.ts`: closes the gap where Windows CI fails with `Could not set lock on file` raised synchronously from LadybugDB's `local_file_system.cpp` before any query is issued (so `withLbugDb`'s query-time retry never sees it).
- … `safeClose` to absorb libuv's native handle-release lag; logs a warning if the probe exhausts so operators can spot AV/Defender interference.
- … `.wal`/`.lock` sweep: fires only when the immediate-parent dir matches a known test prefix AND the path resolves under `os.tmpdir()`. Production paths never enter this branch.
- … `LBUG_OPEN_RETRY_EXHAUSTED` so `withLbugDb`'s outer 3× retry skips re-retrying; eliminates a 3×5=15-attempt / ~6s tail latency.
- `isDbBusyError` lives in `lbug-config.ts` and is re-exported from `lbug-adapter.ts`.
- `waitForWindowsHandleRelease` and all retry knobs are co-located in `lbug-config.ts` with a one-grep cross-reference comment.
Why this exists
The lock IO exception is thrown synchronously from `new lbug.Database()` (LadybugDB v0.16.1 `local_file_system.cpp:126`). The existing `withLbugDb` query-time retry cannot intercept it because no query has been issued yet. On Windows CI this surfaces as flaky integration tests due to AV-scanner holds (≤200 ms typical), libuv handle-release lag, and stale `.wal` sidecars from aborted prior runs. LadybugDB's concurrency model is preserved: single-writable-Database-per-process, multi-Connection.
Test plan
- `cd gitnexus && npx tsc --noEmit`: passes
- `cd gitnexus && npx vitest run test/integration/lbug-open-retry.test.ts test/integration/lbug-close-handle-release.test.ts test/integration/lbug-lock-retry.test.ts`: 38/38 pass (12 new open-retry scenarios + 11 new isTestFixturePath/probe scenarios + 2 close-handle-release + 13 pre-existing lock-retry regressions)
- `npx prettier --check`: clean
- … `process.platform === 'win32'`)
Scope
- … `CHANGELOG.md` edit (owned by release process per `DoD.md` § 4.5).
- … `lbug-db` project (`fileParallelism: false`).
Residual P3 follow-ups (intentional, non-blocking)
- `lstat` guard before sweep `unlink` (mitigated by POSIX semantics + tightened guards)
- `/var/folders` symlink prefix mismatch if caller realpaths first (not exercised today)
- … `isReadOnlyDbError` into `lbug-config.ts` to model the new convention (out of scope here)
🤖 Generated with Claude Code