fix(lbug): keep serve stable when sidecars are missing #1747
Shared missing-shadow WAL recovery prevents repeated read-only open warnings when LadybugDB sidecars are absent, while the Express preflight fix keeps `gitnexus serve` compatible with Express 5 route parsing.

Constraint: LadybugDB read-only replay can require a `.shadow` sidecar that may be absent after interrupted writes or checkpoint edge cases.
Rejected: keep reactive WARN-only quarantine in each adapter | it leaves repeated user-visible warnings and duplicate recovery behavior.
Confidence: high
Scope-risk: broad
Directive: Do not silently delete large orphan WALs; only quarantine tiny orphan WALs before open and keep large WALs for explicit recovery.
Tested: cd gitnexus && npx vitest run test/unit/sidecar-recovery.test.ts test/unit/lbug-adapter-wal-schema.test.ts test/unit/pool-wal-recovery.test.ts test/unit/web-ui-serving.test.ts && npx tsc --noEmit
Not-tested: full npm test in this split branch; full unit suite passed on the source branch before PR split.
Co-authored-by: OmX <omx@oh-my-codex.dev>
|
Someone is attempting to deploy a commit to the NexusCore Team on Vercel. A member of the Team first needs to authorize it. |
CI Report
✅ All checks passed
Test Results
✅ All 9409 tests passed, 1 test skipped
Generated by CI |
|
Claude finished @magyargergo's task in 8m 23s
Production-Readiness Review: PR #1747
|
| Call site | ENOENT handling |
|---|---|
| `finalizeLbugSidecarsAfterClose` (sidecar-recovery.ts:199) | ✅ `catch (err) { if (!missing(err)) { warnOnce(...) } }` |
| `preflightLbugSidecars` (sidecar-recovery.ts:145) | ❌ no try-catch → propagates to `doInitLbug` |
| `reopenReadOnlyAfterMissingShadow` (lbug-adapter.ts:459) | ❌ `catch { throw new Error(shadowSidecarRecoveryMessage) }` → ENOENT becomes "rebuild index" error |
| `reopenWritableAfterMissingShadow` (lbug-adapter.ts:486) | ❌ same pattern |
| `openReadOnlyDatabase` in pool (pool-adapter.ts:388) | ❌ `catch { throw new Error(shadowSidecarRecoveryMessage) }` |
| `replayShadowPagesWithWritableOpen` in pool (pool-adapter.ts:351) | ❌ same pattern |
Key distinction: The direct adapter (lbug-adapter.ts) wraps preflightLbugSidecars inside acquireInitLock (lbug-adapter.ts:703), a cross-process file lock (lbug-adapter.ts:310). This serializes concurrent direct-adapter opens for the same dbPath, making the TOCTOU window effectively zero there.
The pool adapter (pool-adapter.ts) uses only initPromises (in-process deduplication, line 471). No acquireInitLock call exists anywhere in pool-adapter.ts. Two pool-adapter processes opening the same dbPath concurrently are unprotected. The openReadOnlyDatabase and replayShadowPagesWithWritableOpen paths are directly exposed to the TOCTOU race.
This is the exact defect the fork PR's Codex review identified, and it remains unaddressed.
Recommended fix: Either (a) guard quarantineWalForMissingShadow internally — catch ENOENT from fs.rename, re-inspect, and return success if the WAL is already gone; or (b) wrap each call site's catch block to distinguish ENOENT (non-fatal, WAL already handled) from EACCES/EPERM (still fatal, permission problem). Option (a) is more defensive. Keep ENOENT non-fatal only when re-inspection confirms the WAL is gone.
Blocks merge: YES — pool adapter lacks cross-process init lock, and ENOENT from quarantine rename is an unhandled fatal error reachable from concurrent pool opens.
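Option (a) from the recommended fix could look roughly like the sketch below. It is an illustration only, not the PR's code: `quarantineTolerant`, `WalOps`, and `QuarantineResult` are hypothetical names, and the filesystem operations are injected and synchronous purely to keep the example self-contained (the real call sites presumably use the promise-based `fs.rename`).

```typescript
// Hypothetical sketch of option (a): tolerate ENOENT from the quarantine
// rename only when a re-check confirms the WAL is really gone.
type QuarantineResult =
  | { kind: 'quarantined'; path: string }
  | { kind: 'already-gone' };

// Injected filesystem operations (assumption: stand-ins for fs.rename / stat).
interface WalOps {
  rename(from: string, to: string): void;
  exists(path: string): boolean;
}

// Extracts a Node-style errno code such as 'ENOENT' from an unknown error.
function errorCode(err: unknown): string | undefined {
  if (typeof err === 'object' && err !== null && 'code' in err) {
    const code = (err as { code?: unknown }).code;
    return typeof code === 'string' ? code : undefined;
  }
  return undefined;
}

function quarantineTolerant(ops: WalOps, walPath: string, target: string): QuarantineResult {
  try {
    ops.rename(walPath, target);
    return { kind: 'quarantined', path: target };
  } catch (err) {
    // EACCES / EPERM and every other non-ENOENT code stay fatal.
    if (errorCode(err) !== 'ENOENT') throw err;
    // ENOENT but the WAL is still present: something else is wrong, so rethrow.
    if (ops.exists(walPath)) throw err;
    // A concurrent opener already moved the WAL; report that instead of failing.
    return { kind: 'already-gone' };
  }
}
```

Returning a tagged result rather than `void` lets each caller decide whether "already gone" needs a log line, while permission failures still reach the existing error path.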
FINDING 2 — Large WAL silently quarantined via writable schema-creation path, bypassing preflight policy
Severity: MEDIUM
Risk: preflightLbugSidecars intentionally warns but does NOT quarantine large orphan WALs (kind: 'orphan-wal', size > 4 KB) — the code at sidecar-recovery.ts:153-160 is explicit about this. However, after open, runSchemaCreationQueries (lbug-adapter.ts:555) can catch isMissingShadowSidecarError and return it to doInitLbug:759, which then calls reopenWritableAfterMissingShadow (lbug-adapter.ts:481), which calls quarantineWalForMissingShadow unconditionally — no size check. A large orphan WAL that triggers a missing-shadow error during the first DDL write will be quarantined silently without the user seeing the preflight warning.
Evidence found: lbug-adapter.ts:759-773:

```ts
const missingShadowError = await runSchemaCreationQueries(dbPath);
if (missingShadowError) {
  await safeClose();
  resetOpenConnectionState();
  const reopened = await reopenWritableAfterMissingShadow(dbPath, missingShadowError);
  // ...
}
```

`reopenWritableAfterMissingShadow` (lbug-adapter.ts:481-496) quarantines without checking WAL size.
Mitigating factor: The quarantined WAL is renamed to .wal.missing-shadow.{ts}-{rand}, not deleted. listQuarantinedMissingShadowWals allows recovery. So no silent permanent data loss occurs — the WAL is findable.
Recommended fix: In reopenWritableAfterMissingShadow, check WAL size before quarantining. If the WAL is large, fail with an actionable message ("large WAL found without shadow sidecar; manual recovery required") instead of quarantining. Alternatively, emit a WARN-level message before the rename explaining that a large WAL was quarantined during write recovery, so the user is not surprised.
Blocks merge: MAYBE — the WAL is preserved (not deleted), so there is no silent permanent data loss. However, the stated DoD criterion "large WAL files are preserved for explicit recovery and never silently deleted/quarantined in a way that hides user data loss" is inconsistent with the current writable recovery path. Whether the rename-to-quarantine-dir counts as "preserved for explicit recovery" is a judgment call. Treat as a required follow-up at minimum.
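A size gate of the kind recommended for Finding 2 might be sketched as follows. The function name `decideWritableRecovery` and its return shape are hypothetical; the constant name and the 4096-byte value are the ones quoted in the follow-up commit message later in this thread, with the tiny threshold treated as inclusive as in the state table.

```typescript
// Threshold for "tiny" orphan WALs (4 KB, inclusive), as quoted in this thread.
const TINY_ORPHAN_WAL_BYTES = 4096;

type WalRecoveryDecision =
  | { action: 'quarantine' }
  | { action: 'refuse'; message: string };

// Hypothetical gate: quarantine only tiny WALs, refuse large ones with an
// actionable message instead of renaming them silently.
function decideWritableRecovery(walSizeBytes: number): WalRecoveryDecision {
  if (walSizeBytes <= TINY_ORPHAN_WAL_BYTES) {
    return { action: 'quarantine' };
  }
  return {
    action: 'refuse',
    message: `large WAL (${walSizeBytes} bytes) found without shadow sidecar; manual recovery required`,
  };
}
```

Applying the same gate in both the read-only and the writable recovery path would keep them consistent with the preflight policy.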
FINDING 3 — No TOCTOU test for quarantine rename ENOENT in adapter-level callers
Severity: MEDIUM
Risk: No test verifies behavior when fs.rename in quarantineWalForMissingShadow throws ENOENT from a pool-adapter path (openReadOnlyDatabase, replayShadowPagesWithWritableOpen). The existing test at pool-wal-recovery.test.ts:247 covers ENOENT during the corrupt WAL quarantine path (.wal.corrupt.), not the missing-shadow quarantine path (.wal.missing-shadow.).
Evidence found: pool-wal-recovery.test.ts:247-258 tests ENOENT in tryQuarantineAndReopen (corrupt WAL path). No equivalent test exists for openReadOnlyDatabase's quarantineWalForMissingShadow call at pool-adapter.ts:388, nor for replayShadowPagesWithWritableOpen at pool-adapter.ts:351.
Recommended fix: Add a test where connectionQueryMock rejects with the missing-shadow error AND fs.rename rejects with ENOENT. Assert that initLbug either succeeds (ideal) or fails with an informative message (not a raw ENOENT stack trace).
Blocks merge: NO independently, but this gap is what allowed Finding 1 to ship undetected.
FINDING 4 — Error string matching in isMissingShadowSidecarError and isReadOnlyShadowReplayError is brittle
Severity: LOW
Risk: isMissingShadowSidecarError matches /Cannot open file .*\.shadow: No such file or directory/i (sidecar-recovery.ts:64). isReadOnlyShadowReplayError matches /replay shadow pages under read-only mode/i (pool-adapter.ts:317, lbug-adapter.ts:451). These are hardcoded native LadybugDB message fragments. A LadybugDB version bump that rewrites these strings (e.g., changes "Cannot open file" to "Failed to open file", or localizes error messages) would silently break recovery — the missing-shadow errors would no longer be caught, and serve would emit the native error raw.
Evidence found: Tests use realistic strings matching the current LadybugDB output (e.g., pool-wal-recovery.test.ts:175: "IO exception: Cannot open file ${dbPath}.shadow: No such file or directory"). No LadybugDB version pinned to a specific string contract is visible. The @ladybugdb/core version in gitnexus/package.json is the source of truth (not read in this review — verify directly).
Recommended fix: Add a comment citing the LadybugDB version(s) and source file where these messages originate. If LadybugDB provides structured error codes, prefer those. At minimum, widen the shadow-missing regex to tolerate minor variations (e.g., /.shadow.*[Nn]o such file|[Cc]annot open.*\.shadow/).
Blocks merge: NO — tests use realistic strings; immediate risk is low. Mark as a follow-up before the next LadybugDB major version bump.
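The brittleness can be illustrated with a small comparison. The strict pattern is the one quoted above; the widened pattern is a hypothetical variant in the spirit of the recommendation (dot escaped, `i` flag instead of character classes), and the "Failed to open file" and lock messages are invented samples, not strings LadybugDB is known to emit.

```typescript
// Strict matcher as quoted in the review (sidecar-recovery.ts:64).
const strictMissingShadow = /Cannot open file .*\.shadow: No such file or directory/i;

// Hypothetical widened variant; not part of the PR.
const widenedMissingShadow = /\.shadow.*no such file|cannot open.*\.shadow/i;

// Message shape used by the existing tests (the path is an example value).
const currentMessage =
  'IO exception: Cannot open file /tmp/example/lbug.shadow: No such file or directory';
// Invented rewording, to show what a future library change could look like.
const rewordedMessage =
  'IO exception: Failed to open file /tmp/example/lbug.shadow: No such file or directory';
// Invented unrelated error that must not trigger recovery.
const unrelatedMessage = 'IO exception: Could not set lock on file /tmp/example/lbug';

function matchReport(message: string): { strict: boolean; widened: boolean } {
  return {
    strict: strictMissingShadow.test(message),
    widened: widenedMissingShadow.test(message),
  };
}
```

With these samples the strict pattern misses the reworded message while the widened one still matches it. The trade-off is that a looser pattern is also more likely to match errors that are not missing-shadow conditions.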
FINDING 5 — Express preflight test does not verify PNA header end-to-end
Severity: LOW
Risk: web-ui-serving.test.ts:328-333 verifies that app.options(PNA_PREFLIGHT_PATH_REGEX, ...) does not throw at registration time, and that a real OPTIONS request returns 404 (because no CORS/PNA middleware is set up in the test app). This confirms the route registration fix but does not verify that Access-Control-Allow-Private-Network: true is returned in production, or that an unauthorized origin does not receive the private-network grant.
Evidence found: The global middleware setting Access-Control-Allow-Private-Network: true (api.ts:713-716) runs on all requests regardless of origin, meaning even non-allowlisted origins receive the PNA header. The CORS origin callback only controls Access-Control-Allow-Origin. A browser receiving the PNA header without Access-Control-Allow-Origin will still block the request — so there is no exploitable CORS bypass. The security boundary is intact.
Recommended fix: Add a test that instantiates the full CORS + PNA middleware stack and verifies Access-Control-Allow-Private-Network: true is returned on OPTIONS requests. Not blocking, but would complete the regression coverage claimed in the PR.
Blocks merge: NO — security boundary is maintained by existing CORS middleware; this is a coverage gap, not a defect.
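The header behaviour described in this finding can be modelled without Express. The sketch below is a simplified model under stated assumptions, not the code in api.ts: `preflightHeaders` and `browserAllows` are hypothetical helpers, and the browser rule is reduced to "the response must echo the requesting origin in `Access-Control-Allow-Origin`".

```typescript
type HeaderMap = Record<string, string>;

// Model of the middleware order described above: the PNA header is set for
// every request, while Access-Control-Allow-Origin is only added when the
// origin passes the allowlist check.
function preflightHeaders(
  requestOrigin: string,
  isAllowedOrigin: (candidate: string) => boolean,
): HeaderMap {
  const headers: HeaderMap = { 'Access-Control-Allow-Private-Network': 'true' };
  if (isAllowedOrigin(requestOrigin)) {
    headers['Access-Control-Allow-Origin'] = requestOrigin;
  }
  return headers;
}

// Simplified browser-side rule: without a matching Access-Control-Allow-Origin
// the cross-origin request is blocked, whether or not the PNA header is present.
function browserAllows(requestOrigin: string, headers: HeaderMap): boolean {
  return headers['Access-Control-Allow-Origin'] === requestOrigin;
}
```

An end-to-end test along the lines of the recommended fix would assert both halves: the PNA header is present on OPTIONS responses, and a non-allowlisted origin still fails the origin check.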
FINDING 6 — warnedKeys module-level global survives across serve restarts within a process
Severity: INFO
Risk: warnedKeys at sidecar-recovery.ts:19 is a Set<string> in module scope. A missing-shadow warning for a given dbPath will fire at most once per process lifetime. If serve recovers and the condition recurs, subsequent warnings are silently downgraded to debug. This is intentional anti-spam behavior, but if conditions fluctuate (e.g., repeated checkpoint failures), users may miss persistent problems.
_resetSidecarRecoveryWarningsForTest is exported for test cleanup. Production has no reset path.
Blocks merge: NO — intentional design decision, test reset provided. Worth a follow-up if serve becomes long-lived and users report persistent sidecar issues being silent.
9. PR-specific assessment sections
LadybugDB/lbug sidecar expert lane
State machine reconstructed:
| State | Preflight action | Open action | Writable schema action | Post-close action |
|---|---|---|---|---|
| clean (no wal, no shadow) | return clean | open normally | run schema queries | no-op |
| wal + shadow (healthy WAL) | return wal-with-shadow | LadybugDB replays normally | run schema queries | no-op |
| tiny orphan wal (≤ 4KB, no shadow) | quarantine WAL (rename) → re-inspect → clean | open normally | run schema queries | quarantine if still present |
| large orphan wal (> 4KB, no shadow) | warn, return orphan-wal, do NOT quarantine | LadybugDB replay attempt | quarantine if schema query triggers missing-shadow ← inconsistency | warn, preserve |
| missing shadow after open (probe error) | N/A | probe → isMissingShadowSidecarError → quarantine + reopen | - | no-op |
| shadow replay needed (read-only) | N/A | probe → isReadOnlyShadowReplayError → writable open → re-read-only | - | no-op |
The state machine is largely sound. The inconsistency in the "large orphan wal → writable schema" cell is Finding 2. All other transitions are correct.
Quarantine file naming: `${dbPath}.wal.missing-shadow.${Date.now()}-${Math.random().toString(36).slice(2)}`, that is, a timestamp plus a random suffix. The timestamp records when the quarantine occurred, which is enough for forensic debugging. The random base-36 suffix (typically around 10 to 12 characters, not a fixed length) makes collisions on rapid concurrent calls very unlikely. `listQuarantinedMissingShadowWals` correctly enumerates these files.
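As a small illustration of that naming scheme, the sketch below builds a name in the quoted format and recovers the timestamp from it. `quarantineName` and `quarantineTimestamp` are hypothetical helpers; the injectable `now` and `rand` parameters are an assumption of this sketch so that the output is reproducible.

```typescript
// Builds a quarantine file name in the format quoted above.
function quarantineName(
  dbPath: string,
  now: number = Date.now(),
  rand: number = Math.random(),
): string {
  return `${dbPath}.wal.missing-shadow.${now}-${rand.toString(36).slice(2)}`;
}

// Recovers the quarantine timestamp (milliseconds since the epoch) from such
// a name, or returns null if the name does not follow the scheme.
function quarantineTimestamp(fileName: string): number | null {
  const match = /\.wal\.missing-shadow\.(\d+)-[a-z0-9]*$/.exec(fileName);
  return match ? Number(match[1]) : null;
}
```

For example, `quarantineName('/tmp/lbug', 1700000000000, 0.5)` yields `/tmp/lbug.wal.missing-shadow.1700000000000-i`, because `0.5` is `0.i` in base 36.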
EACCES/EPERM handling: `finalizeLbugSidecarsAfterClose` correctly surfaces non-ENOENT errors via `warnOnce` instead of swallowing them. The adapter-level callers swallow all errors in the quarantine catch block (converting them to "rebuild index" messages), which may hide EACCES/EPERM. This is acceptable for the recovery path but could make permission problems harder to diagnose.
Windows handle-release timing: safeClose in lbug-adapter.ts:1542 explicitly probes Windows handle release before calling finalizeLbugSidecarsAfterClose. The pool adapter closeOne uses conn.close().catch(() => {}) (fire-and-forget) with no Windows probe — this is a pre-existing pattern, not introduced by this PR.
Direct adapter lane
preflightLbugSidecars is called at doInitLbug:741 inside the acquireInitLock critical section (doInitLbug:703). This cross-process lock serializes concurrent direct-adapter opens for the same dbPath, effectively preventing the TOCTOU race in this path. ✓
ensureReadOnlyConnectionUsable (lbug-adapter.ts:498) is also inside the lock. Its connection handle lifecycle is correct:
- Initial handle closed before writable open (`closeLbugConnection(handle)` at 507, 511, 516) ✓
- Writable handle closed in `finally` block at 528-530 ✓
- Reopened handle closed on probe failure at 540 ✓
runSchemaCreationQueries (lbug-adapter.ts:555) correctly distinguishes:
- `isMissingShadowSidecarError` → returns error for caller to handle
- `isWalCorruptionError` → calls `safeClose()` + `resetOpenConnectionState()` + throws with actionable message ✓
- Other errors: logs WARN if not "already exists" / busy / read-only ✓
resetOpenConnectionState (lbug-adapter.ts:548) clears currentDbPath, ftsLoaded, vectorExtensionLoaded, ensuredFTSIndexes — state reset is consistent on error paths. ✓
FTS: read-only open uses policy: 'load-only' at lbug-adapter.ts:780. Schema DDL is skipped for read-only opens (if (!readOnly) { ... runSchemaCreationQueries ... } at line 758). Read-only path does not trigger network INSTALL. ✓
safeClose (lbug-adapter.ts:1514): calls flushWAL then closes conn then db, then Windows handle probe, then finalizeLbugSidecarsAfterClose. Sequence is correct; no resources left open on the happy path. ✓
queryAndDrain resource management: lbug-adapter.ts:442 always drains (await drainQueryResult(queryResult)) before returning. Not verified from the implementation of drainQueryResult in this review, but callers depend on it consistently. No obvious leak found.
Pool adapter lane
The pool adapter does not use acquireInitLock. In-process deduplication via initPromises (pool-adapter.ts:454) is correct for single-process use. Cross-process protection is absent — this is the source of Finding 1.
openReadOnlyDatabase (pool-adapter.ts:367): preflight + read-only open + probe + quarantine-on-shadow-missing + second preflight + second open + second probe. Connection handle lifecycle in the missing-shadow path (pool-adapter.ts:384-407):
- `db.close()` called when probe fails at 385 ✓
- `db` set to `undefined` at 386 ✓
- On final throw at 423, `if (db) await db.close()` ✓
replayShadowPagesWithWritableOpen (pool-adapter.ts:342): opens a writable DB, probes, closes in finally. Correct handle management. ✓
Pool integrity check at pool-adapter.ts:698-705: throws if totalConns < MAX_CONNS_PER_REPO, catching connection leaks during pool warmup. ✓
A single bad WAL sidecar does NOT poison the pool — doInitLbug throws before the pool entry is registered (pool entry registered last, at line 602). Concurrent callers waiting on initPromises will all see the same error and each retry independently. ✓
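The in-process deduplication described in this lane can be pictured with the sketch below. It is a generic illustration of the pattern, not the code in pool-adapter.ts: `initOnce` is a hypothetical name, and removing the entry on failure is an assumption made here so that later callers can retry.

```typescript
// One pending (or completed) initialization per database path, per process.
const initPromises = new Map<string, Promise<unknown>>();

// Hypothetical helper: concurrent callers for the same dbPath share one
// promise; a failed initialization is forgotten so the next call starts over.
function initOnce<T>(dbPath: string, init: () => Promise<T>): Promise<T> {
  const existing = initPromises.get(dbPath);
  if (existing) return existing as Promise<T>;

  const pending = init().catch((err: unknown) => {
    initPromises.delete(dbPath); // do not poison later attempts
    throw err;
  });
  initPromises.set(dbPath, pending);
  return pending;
}
```

The map lives in one process's memory, so two separate processes opening the same path each run their own initialization. That is the gap a cross-process file lock closes in the direct adapter.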
Express 5 serve startup lane
The fix: api.ts:133: export const PNA_PREFLIGHT_PATH_REGEX = /^\/.*$/; used in place of the Express 4 '*' glob. Express 5 accepts regex routes. The regex matches all paths starting with / — correct scope for a global OPTIONS handler.
Behavior preserved: The handler at api.ts:722-724 calls next(). The Access-Control-Allow-Private-Network: true header is already set by global middleware at api.ts:713. This is correct — the OPTIONS handler does not need to set headers itself; it just lets the CORS middleware (which runs on all methods including OPTIONS) handle the response.
CORS security boundary: isAllowedOrigin (api.ts:60) returns false for non-allowlisted origins. The cors() middleware at api.ts:701 uses this. The PNA header (Access-Control-Allow-Private-Network: true) is sent to all origins by the global middleware, but browsers only act on it when Access-Control-Allow-Origin is also present — and that is controlled by the CORS callback. No security regression introduced. ✓
Test at web-ui-serving.test.ts:328-333: Verifies startup no longer crashes. The test makes a real OPTIONS request and gets 404 (no routes registered on the minimal test app). This confirms the route registration fix without false positives. Gap noted in Finding 5.
Test coverage lane
sidecar-recovery.test.ts (real FS, no mocks):
- `inspectLbugSidecars`: clean state ✓, wal+shadow ✓
- Preflight: tiny orphan WAL quarantined without WARN ✓, large orphan WAL not quarantined + WARN ✓
- Finalize: tiny orphan WAL quarantined ✓
- Disable via env var ✓
- `listQuarantinedMissingShadowWals` filter ✓
- Gap: No TOCTOU test (ENOENT from quarantine rename). No concurrent open simulation.
pool-wal-recovery.test.ts:
- Corrupt WAL quarantine + retry ✓, shadow replay ✓, missing shadow quarantine + reopen ✓, lock error no-quarantine ✓, retry failure error message ✓, no poisoned state after failure ✓
- Line 247 test ("handles quarantine gracefully when .wal file does not exist"): Tests ENOENT during the corrupt WAL quarantine path (`tryQuarantineAndReopen`). This path does handle ENOENT (it throws, and the test asserts `rejects.toThrow(/gitnexus analyze/)`). This test does NOT cover the missing-shadow quarantine ENOENT path in `openReadOnlyDatabase`.
lbug-adapter-wal-schema.test.ts:
- Writable missing-shadow quarantine + retry schema creation ✓
- Read-only open skips DDL, uses load-only FTS ✓
- Shadow replay with temp writable open ✓
- Read-only missing-shadow quarantine + reopen ✓
- WAL corruption mid-schema → safeClose called ✓
- Gap: No test for ENOENT from quarantine rename in `reopenReadOnlyAfterMissingShadow` or `reopenWritableAfterMissingShadow`.
web-ui-serving.test.ts:
- Express 5 route registration no-throw ✓
- Real OPTIONS request handled without crash ✓
- Gap: No end-to-end test for `Access-Control-Allow-Private-Network` header presence.
Overall: Tests prove the happy-path recovery and the large-WAL non-quarantine behavior. The TOCTOU error path (Finding 1) is not tested, which is why the defect shipped.
Cross-domain lane
Both changes (Express 5 route, lbug sidecar recovery) are framed as gitnexus serve startup stability. They are causally connected at the serve-startup level. The Express route fix is a 2-line change with no interaction with the lbug changes. Combining them in one PR is acceptable given the scope, but the lbug sidecar changes are substantially larger and independently risky. If the Express fix is urgently needed, it could be fast-tracked separately. As-is, the combined PR is structurally acceptable — but the blocking lbug defect (Finding 1) prevents the whole PR from merging.
10. Back-and-forth avoided by verifying
- Init lock scope verified: `acquireInitLock` is a cross-process file lock at `lbug-adapter.ts:310`. It protects the direct-adapter init path including `preflightLbugSidecars`. This eliminates the TOCTOU risk in the direct-adapter path but does NOT cover the pool adapter.
- Large WAL not deleted verified: `quarantineWalForMissingShadow` uses `fs.rename`, not `fs.unlink`. Quarantined files are findable via `listQuarantinedMissingShadowWals`. `cleanQuarantinedMissingShadowWals` is the only deletion path and is not auto-invoked anywhere in this PR.
- FTS policy verified: Read-only open uses `policy: 'load-only'` at both direct adapter (lbug-adapter.ts:780) and pool adapter (pool-adapter.ts:595). No DDL or network INSTALL in read-only paths.
- State reset verified: `resetOpenConnectionState` clears all 4 module-level fields (`currentDbPath`, `ftsLoaded`, `vectorExtensionLoaded`, `ensuredFTSIndexes`). Called on all error paths in `doInitLbug`.
- Probe query safety verified: `MATCH (n) RETURN n LIMIT 1` is valid Cypher on empty/no-node graphs: it returns an empty result set, not an error. No schema or extension requirement. No DDL side effect. ✓
- Branch hygiene verified: `git diff --name-status HEAD~1...HEAD` confirms exactly 8 files. No i18n pollution from PR #1748 (feat(i18n): make web and CLI language-aware). Unicode scan clean.
11. Open questions (unavoidable)
- Live CI state: The 2 cancelled PR Description Check annotations must be confirmed as stale superseded runs before merge. Cannot verify from local checkout.
- LadybugDB version and string contract: The `@ladybugdb/core` version in `gitnexus/package.json` was not read. Verify the missing-shadow and shadow-replay error strings are stable across the pinned version range before merge.
- Pool adapter cross-process init lock: Was the absence of `acquireInitLock` in `pool-adapter.ts` intentional (e.g., "pool is MCP-only and single-process")? If the pool is always single-process, Finding 1's TOCTOU risk is reduced to a largely theoretical concern, but the ENOENT handling in the recovery callers remains incorrect regardless.
12. Final verdict
not production-ready
Justification: Finding 1 is a confirmed defect: quarantineWalForMissingShadow propagates ENOENT from fs.rename as a fatal error in all pool-adapter callers (openReadOnlyDatabase, replayShadowPagesWithWritableOpen), while finalizeLbugSidecarsAfterClose correctly treats ENOENT as non-fatal. The fork PR's Codex review explicitly flagged this, and it remains unaddressed. Two concurrent pool-adapter processes opening the same DB path can race on the WAL quarantine, causing serve initialization to abort with a spurious "rebuild the index" error even though the WAL was already handled. Finding 2 (large WAL quarantined during writable schema creation despite preflight deliberately warning-only for large WALs) is a secondary concern — the WAL is preserved via rename, but the behavior is inconsistent with the stated design goal. Both findings require small, targeted fixes. The Express 5 route fix and the overall sidecar recovery architecture are sound and well-tested, and both fixes should be unblocked once ENOENT handling is added to the quarantine rename call sites in the pool adapter paths.
|
@ChamHerry Could you please look into Claude's findings and address them? 🙏 |
|
I'm working on it now! This is an important fix and thank you for raising it! 🙏 |
…aware errors (PR abhigyanpatwari#1747 review)

Addresses the production-readiness review of PR abhigyanpatwari#1747 (Findings 1, 2, 3 of 6). Findings 4, 5, 6 are deferred to follow-ups per the plan.

1. ENOENT-tolerance scoped to pool-adapter callers only
   - `quarantineWalForMissingShadow` stays strict in `sidecar-recovery.ts`. The direct adapter calls it inside `acquireInitLock` (cross-process file lock), so ENOENT there means the file vanished under lock and remains a real bug to surface.
   - New `tryQuarantineForMissingShadow` local helper in `pool-adapter.ts` returns a discriminated union `{ kind: 'quarantined', path } | { kind: 'peer-handled' }`. Catches ENOENT, re-verifies via `statIfExists`, and converts to 'peer-handled' only when WAL really is gone. Defensive: if ENOENT but WAL still present, throws as classified error rather than silently returning success.
2. Symmetric WAL-size gate on both recovery paths
   - `refuseLargeWalQuarantine` applied in both `reopenReadOnlyAfterMissingShadow` and `reopenWritableAfterMissingShadow`. Closes the read-only data-loss vector (large orphan WAL silently discarded would never be replayed by a later writable open).
3. Permission-aware error classifier
   - New `renameFailureMessage` and `isPermissionRenameError` in `sidecar-recovery.ts`. EACCES / EPERM / EBUSY now surface a permission-specific message pointing at ACLs, AV exclusions, and file-locks. Other codes (ENOSPC, EROFS, EIO, ENOENT) fall through to `shadowSidecarRecoveryMessage`.
   - Used at both pool-adapter and direct-adapter caller catches around `quarantineWalForMissingShadow`.
   - `doInitLbug`'s pass-through classifier extended to include the new permission message. The lock-retry substring match tightened so "file-lock error" in the permission message is not mistaken for a LadybugDB lock-retry trigger.

Tests
- sidecar-recovery.test.ts: 7 new tests for `renameFailureMessage` and `isPermissionRenameError`.
- pool-wal-recovery.test.ts: 6 new tests covering ENOENT race, EACCES/EPERM/EBUSY classification, ENOSPC fallthrough, and the defensive "WAL still present after ENOENT" branch.
- lbug-adapter-wal-schema.test.ts: 5 new tests covering the symmetric size gate on both recovery paths, including the boundary at exactly TINY_ORPHAN_WAL_BYTES (4096) and the off-by-one at 4097.

Deferred (tracked as follow-up work)
- Brittle LadybugDB error-string matching (Finding 4).
- PNA header end-to-end coverage gap (Finding 5).
- warnedKeys module-global persistence (Finding 6).
- Cross-process init lock for pool-adapter.
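The permission-aware classification described in item 3 might be pictured as below. This is a sketch under assumptions, not the committed helpers: `classifyRenameFailure` and `renameErrorCode` are hypothetical names, and only the split between permission-style codes and everything else is taken from the commit message.

```typescript
// Codes the commit message lists as permission-style rename failures.
const PERMISSION_CODES = new Set(['EACCES', 'EPERM', 'EBUSY']);

// Extracts a Node-style errno code from an unknown error value.
function renameErrorCode(err: unknown): string | undefined {
  if (typeof err === 'object' && err !== null && 'code' in err) {
    const code = (err as { code?: unknown }).code;
    return typeof code === 'string' ? code : undefined;
  }
  return undefined;
}

// Hypothetical classifier: 'permission' for EACCES / EPERM / EBUSY, 'generic'
// for every other code (ENOSPC, EROFS, EIO, ENOENT) and for errors without one.
function classifyRenameFailure(err: unknown): 'permission' | 'generic' {
  const code = renameErrorCode(err);
  return code !== undefined && PERMISSION_CODES.has(code) ? 'permission' : 'generic';
}
```

A caller could then pick the permission-specific message for `'permission'` and fall back to the general recovery message otherwise.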
# Conflicts: # gitnexus/src/server/api.ts
…am (PR abhigyanpatwari#1747 review, Findings 4 & 6)

Smallest viable response to the two remaining non-blocking findings from the production-readiness review of PR abhigyanpatwari#1747. An earlier-revision plan proposed regex widening + a near-miss detector + per-dbPath warn scoping; an adversarial doc-review found those defended against hypothetical strings LadybugDB does not produce, added observability theater with no recovery behavior change, and did not actually fix the long-running gitnexus serve case for hot dbPaths (where finalizeLbugSidecarsAfterClose rarely fires). Scope shrunk to dedup + counter-based: strictly behavior-changing and fully testable.

Finding 4: dedup + version-coupling markers
- `isReadOnlyShadowReplayError` was inlined in both `lbug-adapter.ts:451` and `pool-adapter.ts:317`. Centralized as an export from `sidecar-recovery.ts`. The two local copies are removed; both adapters now import from the shared module.
- Both LadybugDB-coupled predicates (`isMissingShadowSidecarError` and `isReadOnlyShadowReplayError`) gain a `// LADYBUGDB-CONTRACT:` marker comment citing `@ladybugdb/core ^0.16.1`. When bumping LadybugDB, `git grep "LADYBUGDB-CONTRACT"` enumerates every version-coupled spot.
- Strict matcher unchanged: when LadybugDB actually changes the error format, the failure mode stays loud (raw native error propagates) and the markers make every affected predicate trivially greppable.

Finding 6: counter-based warn anti-spam
- `warnedKeys: Set<string>` → `warnedKeyCounts: Map<string, number>`. `warnOnce` keeps its signature `(logger, key, message)` and keying convention unchanged; the swap is internal.
- `WARN_MILESTONES = [1, 10, 100, 1000, 10000]`. Logarithmic spacing gives O(log N) warns for a condition that fires N times. Past the first occurrence the warn message is suffixed with "(Nth occurrence of this condition)" so persistence is visible in the log line itself.
- Solves the long-running serve case: a hot dbPath hitting the same condition 100 times now fires 3 warns (occurrences 1, 10, 100) instead of 1 warn + 99 silent debug lines.

Tests (10 new in sidecar-recovery.test.ts, all green)
- Centralized isReadOnlyShadowReplayError: positive match, false-positive guard, structural assertion that the duplicate regex is gone from both adapter files, LADYBUGDB-CONTRACT marker count.
- Counter-based warnOnce: milestone-at-10 with suffix, milestone-at-100, key isolation across dbPaths, reset zeroes the counter, first-occurrence message does NOT carry the suffix.

Deferred (tracked separately)
- Finding 5: PNA header end-to-end coverage gap (CORS boundary is sound).
- LadybugDB structured error codes (if/when the library exposes them).
- Per-call milestone configurability: re-open if tuning is needed.
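The counter-based anti-spam described under Finding 6 can be sketched as follows. This is a reconstruction from the commit message, not the committed code: the `WarnLogger` interface is hypothetical, and logging non-milestone occurrences at debug level is assumed from the "silent debug lines" wording.

```typescript
// Minimal logger shape assumed for this sketch.
interface WarnLogger {
  warn(message: string): void;
  debug(message: string): void;
}

const WARN_MILESTONES = new Set([1, 10, 100, 1000, 10000]);
const warnedKeyCounts = new Map<string, number>();

// Warns on the 1st, 10th, 100th, ... occurrence per key; all other
// occurrences are logged at debug level only.
function warnOnce(logger: WarnLogger, key: string, message: string): void {
  const count = (warnedKeyCounts.get(key) ?? 0) + 1;
  warnedKeyCounts.set(key, count);
  if (!WARN_MILESTONES.has(count)) {
    logger.debug(message);
    return;
  }
  // Every milestone after the first is a power of ten, so "th" is always right.
  const suffix = count === 1 ? '' : ` (${count}th occurrence of this condition)`;
  logger.warn(message + suffix);
}
```

Calling `warnOnce` 100 times with the same key produces three warn lines (occurrences 1, 10, and 100) and 97 debug lines, which matches the arithmetic in the commit message.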
✨ PR Autofix
Found fixable formatting / unused-import issues across 96 changed lines. Comment |
|
/autofix |
|
✅ Applied autofix and pushed a commit. (apply run) |
Add shared CLI translation resources and web i18next/react-i18next wiring so GitNexus can present core command and UI text in English or Chinese while preserving existing defaults.

Constraint: PR abhigyanpatwari#1748 was previously stacked on obsolete abhigyanpatwari#1747 work, so this branch is rebuilt from upstream/main and contains only i18n changes.
Rejected: Keeping the old stacked branch | it carried obsolete lbug sidecar commits and stayed conflicting after abhigyanpatwari#1747 merged.
Confidence: medium
Scope-risk: broad
Directive: Keep future i18n follow-ups separated from native-worker and lbug recovery fixes; update both locale trees together when adding translatable UI text.
Tested: GitNexus impact analysis for App, Header, useAppState, cleanCommand, doctorCommand; GitNexus detect_changes on staged linked worktree; CLI i18n vitest targets; web i18n vitest targets; gitnexus npx tsc --noEmit; gitnexus-web npx tsc -b --noEmit; gitnexus-web npm test; gitnexus-web npm run build; gitnexus npm run build.
Not-tested: Browser visual/manual language switching; production Vercel deployment; full gitnexus npm test ended with Vitest unhandled simulated worker-pool startup crash after all 355 files and 9180 assertions passed.
Add shared CLI translation resources and web i18next/react-i18next wiring so GitNexus can present core command and UI text in English or Chinese while preserving existing defaults. Keep the heartbeat reconnect banner on a dedicated translation key so disconnect recovery remains distinguishable from initial server connection failures.

Constraint: PR abhigyanpatwari#1748 was previously stacked on obsolete abhigyanpatwari#1747 work, so this branch is rebuilt from upstream/main and contains only i18n changes.
Rejected: Keeping the old stacked branch | it carried obsolete lbug sidecar commits and stayed conflicting after abhigyanpatwari#1747 merged.
Confidence: medium
Scope-risk: broad
Directive: Keep future i18n follow-ups separated from native-worker and lbug recovery fixes; update both locale trees together when adding translatable UI text.
Tested: GitNexus impact analysis for App, Header, useAppState, cleanCommand, doctorCommand, and AppContent; GitNexus detect_changes on the linked worktree; CLI i18n vitest targets; web i18n vitest targets; gitnexus npx tsc --noEmit; gitnexus-web npx tsc -b --noEmit; gitnexus-web npm test; gitnexus-web npm run build; gitnexus npm run build; local Playwright heartbeat-reconnect e2e spec.
Not-tested: Full browser visual/manual language switching; production Vercel deployment; full gitnexus npm test ended with Vitest unhandled simulated worker-pool startup crash after all 355 files and 9180 assertions passed.
Summary
`gitnexus serve` can start reliably.
Why
Local serving could surface repeated `lbug.shadow` missing-sidecar warnings after interrupted writes or checkpoint edge cases. The recovery now happens before read-only open for safe tiny orphan WALs, while larger WALs are preserved for explicit recovery.
Validation
`cd gitnexus && npx vitest run test/unit/sidecar-recovery.test.ts test/unit/lbug-adapter-wal-schema.test.ts test/unit/pool-wal-recovery.test.ts test/unit/web-ui-serving.test.ts && npx tsc --noEmit`
`gitnexus` typecheck
Follow-up
The i18n changes are intentionally kept on a separate stacked branch (`ChamHerry:codex/i18n-20260521`). They should be submitted to upstream after this PR lands or after an upstream base branch is available, so reviewers do not see this stability diff duplicated in the i18n PR.