fix(parse): survive non-cloneable worker results so large-repo analyze doesn't crash (#2112)#2135
Conversation
…e doesn't crash (abhigyanpatwari#2112) A parse worker delivers its accumulated result to the main thread via postMessage, which structured-clones the payload synchronously on the worker thread and throws a DataCloneError on the first value it can't serialize. The reporter's case was a node `properties` value pointing at a native `toString`. The worker re-posted the throw as {type:'error'}, the pool counted it as a worker death, and under GITNEXUS_WORKER_POOL_SIZE=1 the same graph re-threw on every respawn until the slot's budget was exhausted and the whole parse phase aborted -- defeating even the conservative single-worker workaround. Add a clone-safety net at the worker result boundary. On a clone failure the worker isolates the offending file, strips the non-cloneable value from a plain extraction record (keeping the record -- strictly-missing data, never wrong) or drops a whole ParsedFile so scope-resolution re-derives it on the main thread with intact edge data, records the affected paths on the result, warns naming the field + file so the leak is diagnosable, and re-posts. Healthy runs are byte-identical: the net runs only after a real DataCloneError, so there is zero overhead on the fast path. Skipped paths surface via the parsing processor alongside the skipped-language telemetry. The strip drops the same values the store path's JSON.stringify already silently removes, so store/no-store runs converge. Scope: PR-1 -- failure mode C, the deterministic POOL_SIZE=1 killer. The timeout/native-abort graceful-degradation cascade (failure modes A & B) is coupled to downstream-exclusion + a hard worker watchdog and is tracked as follow-up work. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
@magyargergo is attempting to deploy a commit to the NexusCore Team on Vercel. A member of the Team first needs to authorize it. |
CI Report✅ All checks passed Pipeline Status
Test Results
✅ All 11018 tests passed 16 test(s) skipped — expand for details
Code CoverageTests
📋 View full run · Generated by CI |
magyargergo
left a comment
There was a problem hiding this comment.
Tri-review: 3 methods — GitNexus swarm + Compound-Engineering personas + Codex
Engine breakdown: 6 Claude reviewer lanes (risk, test/CI, correctness, adversarial, maintainability, performance) + Codex (live — the one genuinely independent engine). Two of the three methods are Claude under different persona prompts, so Claude-only agreement is "consistent across personas," not independent confirmation; the strong signal is Codex+Claude agreement, flagged below.
What's solid (validated by the review, not merely unchallenged)
- Healthy path is zero-overhead — all lanes + Codex confirm the fast path is a single
postMessagewith an earlyreturn; the sanitizer runs only inside theDataCloneErrorcatch. worker-pool.tsis genuinely comments-only — risk + correctness + Codex each diffed it; no logic change (two stale comments corrected).- The fix neutralizes the #2112 deterministic
POOL_SIZE=1killer — theDataCloneErroris intercepted before the worker re-posts{type:'error'}; the integration GREEN/RED pair demonstrates it end-to-end. skippedPathsis backward-compatible (optional,?? []-guarded everywhere) and structured-clone reject-set parity holds for the realParseWorkerResultshapes — the correctness lane probed functions/symbols/Promises/WeakMaps/data-class-instances/BigInt/boxed-primitives/null-proto/ArrayBuffer/getters empirically. The earlier "Maps don't clone" hypothesis stays refuted — Maps clone fine and are left untouched.
Headline (inline) — [P2 · defensive hardening] the safety net can re-arm the cascade it fixes
In postResultCloneSafe the recovery call makeWorkerResultCloneSafe(...) and the re-post sit outside the try/catch (which wraps only the first post), and containsNonCloneable/stripNonCloneable recurse with a cycle guard but no depth bound. A throw inside the sanitizer — a RangeError from a deeply-nested record (the adversarial lane reproduced this at depth ≥3000 against the real exported helper) or a throwing getter — escapes to the handler's {type:'error'}, which under POOL_SIZE=1 is exactly the worker-death → respawn-exhaustion → phase-abort cascade this PR fixes. Reachability is low and both engines agree (Codex + correctness: real parse-result records are shallow plain-JSON, no getters), so this is hardening, not a production-likely trigger. Cheap fix: wrap the sanitizer + re-post in try/catch (fail closed deliberately) and bound the recursion depth.
Inline — [P3] integration-test fidelity (Codex + 2 Claude lanes)
The GREEN worker re-implements postResultCloneSafe inline (it imports the real helper functions but hand-rolls the orchestration), so the production wiring — the {type:'warning'} post and the skippedPaths append/merge — has no integration coverage. The RED control asserts a bare .rejects.toThrow(), so an unrelated failure would satisfy it. (CI builds dist/ via setup-gitnexus build:'true' before vitest run, so the "stale dist → false green" risk is mitigated in CI; it bites only local runs that skip the build.)
Lower-priority (body only)
- [P3]
stripNonCloneableover-drops DAG-aliased records (Claude-only): the sharedseenset returns the original on revisit, so when a non-cloneable is reachable via two paths the whole record is dropped (reported as skipped) instead of stripped-and-kept — contradicting the "record kept" intent. The last-resort guard prevents a crash. Fix: memoize the stripped copy per object. - Failure-path efficiency (Claude-only, severity disputed): the performance lane flags a transient ~2× memory spike (
containsNonCloneablestructuredClones a whole non-plain subtree to probe it — on a path that fires exactly when the worker is near its heap limit), plus a 2× per-element scan and a scan of all ~15 array fields regardless of which is dirty. The adversarial lane measured it as bounded (~185 ms / 1000 records, once per flush) and refuted it as a DoS. In store-mode the non-plainScopeMaps live inparsedFiles, which are emptied before the post, so the worst case is mostly the no-store path. A single-pass + shared-seencleanup is worthwhile but not blocking. - [P2 type-safety] maintainability: the
result as unknown as Record<string,unknown>double-cast and the untypedSet(['parsedFiles'])/Set(['skippedPaths'])field-name literals would silently break the drop-whole protection if a field is renamed — pin them withsatisfies keyof ParseWorkerResult. - [P3]
parsing-processorlogs the skipped paths a second time but drops the per-file reason the worker already logged; [P3]findFilePathcan attribute to a sibling child's path (telemetry-only); [P3]skippedPathsisn't zeroed inslimParseWorkerResultsForCache→ minor cache-shard bloat (ignored on replay, no correctness impact).
Test gaps
No coverage for: a deeply-nested / throwing-getter / detached-buffer element (the slow-path escape); a DAG-aliased element (the over-drop); the {type:'warning'} + skippedPaths production wiring; the last-resort "unsalvageable → drop" branch.
CI
Quality (typecheck/lint/format/typecheck-web), CodeQL, gitleaks, benchmarks, packaged-install-smoke (ubuntu), tree-sitter ABI — pass. Pending at review time: Analyze (js-ts), Build & Push, windows/macos platform tests, ubuntu coverage (runs the new unit+integration tests). Vercel fail = deploy-authorization, unrelated to code.
Automated multi-tool digest (GitNexus swarm + CE personas + Codex), critic-gated. Verify before acting — the headline is a defensive-hardening recommendation with low production reachability, not a known production break.
…abhigyanpatwari#2135 review) The clone-safety recovery path could re-arm the abhigyanpatwari#2112 worker-death cascade it was built to prevent: in postResultCloneSafe the sanitizer call and the re-post sat outside the try/catch, and containsNonCloneable/stripNonCloneable recursed with a cycle guard but no depth bound. A throw inside the sanitizer (a RangeError from a deeply-nested record, reproduced at depth >=3000) escaped to the message handler's {type:'error'}, which under GITNEXUS_WORKER_POOL_SIZE=1 is the respawn-budget-exhaustion abort. Wrap the sanitizer + re-post in their own try/catch so any throw fails closed to a primitive-only {type:'error'} deliberately, and thread a MAX_CLONE_DEPTH bound through both scan/strip functions so an over-deep subtree is treated as non-cloneable (dropped/undefined) instead of overflowing the stack. The isStructuredCloneable catch-all is left broad on purpose — it bounds structuredClone's own internal recursion in the non-plain-object probe. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… buffers (abhigyanpatwari#2135 review) Two sanitizer-defeat vectors let the re-post throw a DataCloneError again: - A throwing getter on a record: containsNonCloneable/stripNonCloneable read obj[key], so a getter that throws escaped the scan/strip pass. Read defensively — a throwing property read is treated as non-cloneable (scan returns true, strip drops the property). - A detached ArrayBuffer/TypedArray: both passed buffers/views through unconditionally, but structuredClone rejects a detached one, so the re-post threw. Route buffers/views through the authoritative isStructuredCloneable probe instead. No byteLength heuristic — a legitimately empty new Uint8Array(0) also has byteLength 0 yet clones fine, so a length check would false-positive. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…r-dropped (abhigyanpatwari#2135 review) stripNonCloneable carried a shared `seen` WeakSet and returned the ORIGINAL (un-stripped) value on revisit. When a non-cloneable was reachable via two paths (a DAG), the second path spliced the original function-bearing object back into the output, so the rebuilt element failed the last-resort isStructuredCloneable guard and the whole record was dropped as "unsalvageable" — contradicting the "record kept, value stripped" contract. Replace the WeakSet with a Map<object, stripped-copy>: allocate the empty copy, memoize it before recursing into children (so cycles return the in-progress copy), and return the memoized copy on revisit. DAG-aliased subtrees now collapse to one shared stripped copy and are kept-and-stripped, not dropped. The array branch moves from .map() to allocate-then-push so its identity can be pre-inserted. Object Map/Set keys aren't identity-preserved across stripping — acceptable because parse-result Maps are primitive-keyed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…bhigyanpatwari#2135 review) makeWorkerResultCloneSafe scanned each dirty array twice — a field-level whole-array containsNonCloneable probe, then a per-element pass — and always reassigned the field. Fold into one per-element pass that builds the output array lazily (copying the clean prefix only once the first dirty element appears) and reassigns the field only when something changed. A fully-clean array is now scanned once and keeps its referential identity; the clean prefix of a dirty array is copied by reference. Behavior is otherwise identical (failure-path-only code). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…keyof (abhigyanpatwari#2135 review) makeWorkerResultCloneSafe carried a generic `<T extends Record<string,unknown>>` that was never load-bearing (it mutates in place and returns {skipped}), and the call site passed untyped string-literal option sets — so renaming `parsedFiles` or `skippedPaths` would silently disable the drop-whole / skip protection. Drop the generic (plain `Record<string,unknown>` param) and type the option sets at the call site as `Set<keyof ParseWorkerResult>`, so a field rename is now a compile error. The `as unknown as Record<string,unknown>` widening stays — it's the standard cast for a no-index-signature interface (TS rejects a single-step `as`); the function genuinely operates structurally on the result's arrays. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…higyanpatwari#2135 review) The processor's skipped-file warning logged only the paths, dropping the per-file reason the worker already attached — losing the distinction between a recoverable "stripped N value(s)" and a whole-record "dropped" entry. Format each entry as `path (reason)` so the aggregate line carries the diagnostic detail. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…higyanpatwari#2135 review) findFilePath swept all child objects one level deep in Object.keys order, so a ParsedNode could be attributed to a sibling child's path-like key instead of its real path at properties.filePath. Check the known `properties` child first, then fall back to the generic sweep, so node attribution is deterministic. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ri#2135 review) slimParseWorkerResultsForCache spread the worker result without clearing the clone-safety skippedPaths telemetry, so a sanitized result persisted its skip list into the on-disk parse-cache shard. Replay already ignores the field; zero it (like calls/assignments/parsedFiles) to keep shards lean and the intent explicit. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ontrol (abhigyanpatwari#2135 review) The integration GREEN worker re-implemented postResultCloneSafe inline, so the production wiring (the {type:'warning'} post + the skippedPaths append) had no coverage, and the RED control asserted a bare .rejects.toThrow() that any failure would satisfy. Extract postResultCloneSafe into a side-effect-free module (post-result.ts) — importing it from the parse-worker entry module would construct the parser, post ready, and attach the real handler — and have the GREEN test worker import and call the real one. Tighten the RED matcher to the actual abort contract (/circuit breaker|consecutive failures|respawn budget|could not be cloned/), which also documents that the raw poison result aborts via the pool's consecutive-failure circuit breaker under POOL_SIZE=1. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Tri-review findings addressed (9 commits)All findings from the tri-review are fixed, one commit each (rebased onto the updated branch):
Notes worth surfacing:
Verified locally: 🤖 Generated with Claude Code |
…nly DataCloneError (abhigyanpatwari#2135 review) The V8 structured-clone research surfaced the net's one real correctness hole: structuredClone invokes getters, and a getter that THROWS surfaces its own error (a RangeError, etc.) — NOT a DataCloneError (confirmed against a real MessageChannel). postResultCloneSafe gated recovery on isDataCloneError, so such a throw re-threw past the sanitizer and re-armed, under POOL_SIZE=1, the worker-death cascade the net exists to prevent. Attempt the sanitize + re-post recovery for ANY first-post failure (the sanitizer already reads properties defensively, so a throwing getter is dropped), falling closed to a primitive-only {type:'error'} only if the re-post still fails. Adds an integration case: a node with a throwing getter is recovered and delivered, not re-thrown. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…nostic (abhigyanpatwari#2135 review) The clone-safety net's skip reason named only the array field + file ("stripped 1 value from nodes"), not the offending property key — which is precisely why the original abhigyanpatwari#2112 leak stayed unpinned. Thread a dotted key path through stripNonCloneable (recording each stripped value's path: properties.toString, meta.data[3], …) and surface the first few in the reason ("from nodes: properties.toString"). Now a single log line — or the contract/strict checks — names the leaking property, so a residual runtime escape can be fixed at source. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tructured-cloneable (abhigyanpatwari#2135 review) Shape-regression guard: builds a representative ParseWorkerResult (typed as the real interface) and asserts isStructuredCloneable. Typing it as ParseWorkerResult makes adding a new boundary field a compile error here until the test is updated, and the runtime assert catches a field whose type regresses to a non-cloneable shape — independent of language input. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…udly instead of silent sanitize (abhigyanpatwari#2135 review) The runtime net's silent recovery in production is exactly what let the original abhigyanpatwari#2112 leak stay unpinned. Add an opt-in strict mode (GITNEXUS_STRICT_CLONE=1, inherited by workers): on a clone failure, postResultCloneSafe THROWS with the exact offending key path instead of sanitizing + delivering, so a leak introduced by a future provider/extractor change fails loudly at its origin (CI/dev) rather than being quietly stripped. Off in production, where the net keeps the run alive. Adds a self-contained integration case (sets the flag, asserts the poison run rejects with the key path) and skips the synthetic-poison suite under a global strict run (its value there is running the REAL-extractor integration tests under strict). Wiring a strict CI lane (GITNEXUS_STRICT_CLONE=1 on a vitest integration step) is left to the maintainer — it needs a green full-suite verification and touches the protected workflow. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Root-cause deep-dive + prevention layerA deep investigation of the V8/Node structured-clone serializer (the
Added to this PR (the runtime guarantee)
Deferred (compile-time complement) → #2143The typed The runtime net stays as the production backstop; strict mode + the contract test give the CI/test guarantee that a silent recurrence can't slip through unnoticed. 🤖 Generated with Claude Code |
…boundary (abhigyanpatwari#2112) The forked analyze worker reports completion to the parent over child_process IPC, which uses Node's DEFAULT 'json' serialization (api.ts forks with no `serialization:` option). `AnalyzeResult.pipelineResult` is populated on every successful analysis and carries `pipelineResult.graph` — the live KnowledgeGraph closure object. Sending the raw result is wrong three ways: (1) the graph's nodes/relationships getters force-materialize the entire graph into two arrays and JSON-stringify them on every analyze, discarded immediately (a multi-hundred-MB no-op on a large repo — the abhigyanpatwari#2112 scenario); (2) the graph's methods are own function properties that JSON drops silently, so a surviving graph is a husk whose forEachNode() throws far from the cause; (3) a BigInt/circular value anywhere in the payload makes process.send throw TypeError synchronously — caught and re-sent as {type:'error'}, mis-reporting a SUCCESSFUL analysis (DB already written) as a FAILURE. This is the abhigyanpatwari#2112 failure family on the server path, and unlike the parse-worker result boundary it has no clone-safety net. The parent (api.ts) reads only result.repoName; pipelineResult's real consumers (CLI skill generation, cli/analyze.ts) call runFullAnalysis in-process and never cross this fork. So project the result down to an explicit JSON-safe allowlist of scalar fields. Typed as Omit<AnalyzeResult,'pipelineResult'> so a future non-serializable field added to AnalyzeResult fails to compile until handled here deliberately. Found by the abhigyanpatwari#2112 cross-process serialization-boundary audit. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…boundary guard (abhigyanpatwari#2143) The runtime clone-safety net is the production backstop; this is its compile-time complement. The worker result is plain data except a few `unknown`-typed sinks (a node's `properties` bag, the provider `extractTemplateConstraints` / `collectCaptureSideChannel` hook returns) — `unknown` lets a non-serializable value (a function, a leaked tree-sitter SyntaxNode, …) cross the structured-clone boundary with no compile-time guard. That is the structural hole abhigyanpatwari#2112 leaked through. `Cloneable<T>` is a homomorphic recursive mapped type that maps a function or symbol member to `never`, so a struct carrying one is no longer assignable to its own `Cloneable<T>`. `assertCloneable(value)` is a runtime identity (zero cost) whose parameter is `T extends Cloneable<T> ? T : Cloneable<T>`, so a clone-unsafe argument fails to compile, naming the offending key. Because it is a homomorphic mapped type it preserves `interface` shapes and `readonly` modifiers and needs NO index signature on the payload types — this sidesteps the "closed interface is not assignable to a recursive index-signature type" wall that blocked the original value-typed-`Cloneable` attempt (the reason abhigyanpatwari#2143 was deferred from PR abhigyanpatwari#2135). The conditional parameter type avoids the `T extends Cloneable<T>` circular-constraint error. Tests: runtime identity contract, plus type-level @ts-expect-error assertions (enforced by tsconfig.test.json) that a function/symbol member is rejected and clean interface payloads are accepted. Applied to the real provider hooks in the next commit. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…able (abhigyanpatwari#2143) Apply the compile-time guard to the provider hooks that feed the `unknown`-typed worker-result sinks, so a future non-serializable value in their payloads is a compile error at the source site rather than a runtime DataCloneError at the worker post: - C++ extractTemplateConstraints (CppConstraintPayload) - C++ collectCaptureSideChannel (CppCaptureSideChannel) - C collectCaptureSideChannel (CCaptureSideChannel) - Kotlin collectCaptureSideChannel (KotlinCaptureSideChannel) The C++ template-constraint adapter previously returned `unknown`; it now returns the concrete `CppConstraintPayload | undefined` and routes its payload through `assertCloneable`. The side-channel hooks are wrapped at their provider wiring sites. `assertCloneable` is a runtime identity, so behavior is unchanged (C static-linkage + C++ constraint suites stay green); the guarantee is the type-check — src tsc now proves every nested member of those real payload trees is structured-clone safe. Test: type-level assertions (enforced by tsconfig.test.json) that each concrete payload type is `Cloneable<T>`, INDEPENDENT of the provider wiring — so the regression is caught even if the assertCloneable wrapper is later removed. Proven non-vacuous (a function-bearing type fails the same assertion). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
✨ PR AutofixFound fixable formatting / unused-import issues across 28 changed lines. Comment |
magyargergo
left a comment
There was a problem hiding this comment.
Tri-review (dismiss-and-replace): clone-safety + #2143 + analyze-worker IPC
Methods — 3, with engine breakdown: 7 Claude lanes (GitNexus swarm: risk, test/CI · CE personas: correctness, adversarial, performance, maintainability, testing) + Codex (live — the only independent engine). Two Claude lanes share priors, so Codex+Claude agreement is weighted strong and Claude-only as "consistent across personas." This review supersedes the prior tri-review (its two findings — recovery-path re-arm and test fidelity — were addressed in this PR's review-fix commits; those stale inline comments are minimized).
Verdict: the runtime clone-safety net and the #2143 compile-time guard are sound and well-tested for every payload the parse pipeline actually produces. Codex found no P0–P2 defect; adversarial+correctness refuted the scary hypotheses. The findings below are defense-in-depth holes for value shapes that don't occur today (P2/P3) plus one real CI failure. No P0/P1.
Credit (validated, not merely unchallenged):
- Fast-path is genuine zero-overhead —
postResultCloneSafeposts as-is and returns; the sanitizer runs only after a real post failure (performance lane, confirmed). - The analyze-worker IPC projection drops the right field — Codex + correctness confirmed the parent (
api.ts:1560) reads onlyresult.repoName; omittingpipelineResult(the liveKnowledgeGraph) is correct and a perf win. - Refuted (validation is a feature): general sanitizer divergence on non-plain objects (both paths defer to the authoritative
structuredCloneprobe), detached-buffer false-negatives, BigInt/circular instats(allNumber()-wrapped), boxed primitives / SharedArrayBuffer views, probe-induced OOM/hang, deep/cyclic/DAG over-drop, strict-mode double-throw, and transferList detached-buffer reuse.
Headline findings (inline)
- [P2]
analyze-worker-ipc.ts— type/runtime contract split; theOmit<>guarantee is overstated. (3 lanes · code-read) The typeAnalyzeResultIpc = Omit<AnalyzeResult,'pipelineResult'>(:42) includesisPrimaryBranch?, but the runtimeprojectAnalyzeResultForIpc(:54-61) is an explicit allowlist that omits it — so the type claims a field the wire never carries. The doc's "a future field fails to compile until handled here" holds only for required fields; an optional one (the common case forAnalyzeResult) is silently absent. No live break (parent reads onlyrepoName;isPrimaryBranch?:booleanatrun-analyze.ts:200isn't consumed server-side over IPC). Fix:Pick<AnalyzeResult, 'repoName'|'repoPath'|'stats'|'alreadyUpToDate'|'ftsRepairedOnly'|'ftsSkipped'>makes the allowlist the type — exhaustive by construction. - [P3]
clone-safety.ts— the array branch ignores non-index own-enumerable properties. (correctness · reproduced)structuredCloneof an array carrying e.g.arr.meta = fnTHROWS, butcontainsNonCloneable/stripNonCloneableiterate numeric indices only (:169-173/:279-284). SocontainsNonCloneablereturns false,makeWorkerResultCloneSafeleaves the field unrewritten (skipped:[]), the re-post still throws, falls through to the outer catch (post-result.ts:87), and emits{type:'error'}— the same worker-death signal that arms thePOOL_SIZE=1cascade #2112 fixes. Not pipeline-reachable today (no producer attaches non-index array props), but a hole in a net whose purpose is the unanticipated leak. Fix: also scan/copy non-indexObject.keys, or fall back to theisStructuredCloneableprobe when an array's own-key set exceeds its indices. - [P3]
clone-safety.tsfindFilePathviolates its documented "never throws"; a sanitizer-internal throw escapes recovery. (correctness + adversarial · reproduced)findFilePath(:347doc) readsrec[key]unguarded in its generic sweep (:353,:361-363;pathFromChild:338-345). A throwing getter at a non-path key (reproduced:makeWorkerResultCloneSafethrowsRangeErrorout) — or a Proxy with a throwinggetPrototypeOf/ownKeystrap (adversarial) — escapes to the fail-closed{type:'error'}, mis-reporting a recoverable result as a worker death. Not pipeline-reachable (real records aren't Proxies/throwing getters), but the "recover from ANY failure" promise doesn't cover throws during the sanitizer's own structural enumeration. Fix: try/catchfindFilePath's reads (honor its doc) and wrap each element's sanitize inmakeWorkerResultCloneSafeto drop-on-throw.
Lower-priority (body only)
- Maintainability:
isDataCloneError(clone-safety.ts:58) is now a dead export — the net recovers on ANY post failure and never inspects the error type; onlyclone-safety.test.tsimports it (git grepconfirms no prod caller).makeWorkerResultCloneSafe's JSDoc still says "Call this ONLY after a realDataCloneError" though the caller is deliberately broader.parsing-processor.tsinlines{path,reason}instead of importing the exportedSkippedPath. All trivial. - Forward-looking:
makeWorkerResultCloneSafesanitizes only array fields (:401) — a future non-array result sink would bypass the net;Cloneable<any>false-accepts (anany-typed member; add anIsAnybranch); theextractTemplateConstraints/collectCaptureSideChannelhook contracts inlanguage-provider.tsdon't document theassertCloneablerequirement for a future language implementer. - Performance: the failure path traverses each dirty element twice (
containsNonCloneablethenstripNonCloneable). Failure-path-only; the pre-scan is what preserves clean-element referential identity, so it's a deliberate tradeoff — notable only if the failure path becomes hot. - Tests: the GREEN integration test asserts graph content but not that
skippedPaths/the{type:'warning'}post actually surfaced (its docstring claims it covers that wiring —dispatchChunkParseonly logs them); the IPC test uses a synthetic object, not a realKnowledgeGraph; theGITNEXUS_STRICT_CLONE=1gate exists but no CI job runs the suite under it, so it gives no continuous protection; the "dropped unsalvageable" branch and themergeResultskippedPathsunion are untested.
CI
quality / format(root prettier) FAILS onc-cpp.ts+kotlin.ts(the #2143 wrapping lines) — a real merge-blocker, fixable withprettier --write.CI Gate(2s) is the aggregate gating on it.Vercelfails on deploy authorization (not code). Everything substantive passes: typecheck, lint, all test matrices (ubuntu/windows/macOS coverage), tree-sitter ABI, CodeQL, packaged-install smoke, benchmarks.
Automated multi-tool digest (GitNexus swarm + CE personas + live Codex), critic-gated. All findings P2/P3 — none live-reachable from current parse output; verify before acting.
| // yet clones fine, so a length check would false-positive. | ||
| if (obj instanceof ArrayBuffer || ArrayBuffer.isView(obj)) return !isStructuredCloneable(obj); | ||
| seen.add(obj); | ||
| if (Array.isArray(obj)) { |
There was a problem hiding this comment.
[P3 · defense-in-depth hole] (correctness · reproduced) This array branch (and stripNonCloneable at line 279) iterates numeric indices only, but structuredClone of an array also serializes non-index own-enumerable properties and throws on a non-cloneable one. Reproduced: with arr.meta = ()=>{}, structuredClone throws, yet makeWorkerResultCloneSafe returns skipped:[] and leaves the field unrewritten — the re-post then throws, falls through to post-result.ts:87, and emits {type:'error'}, the worker-death signal that arms the POOL_SIZE=1 cascade. Not produced by any current extractor, but it's a silent gap in a net meant for unanticipated leaks. Fix: scan/copy non-index Object.keys here and at line 279, or fall back to the isStructuredCloneable probe when an array's own-key set exceeds its indices. [reproduced]
…itizer (abhigyanpatwari#2135 review) structuredClone serializes an array's NON-index own-enumerable properties (e.g. `arr.meta = fn`) and throws DataCloneError on a non-cloneable one. The clone sanitizer's array branches iterated numeric indices only, so such an array was waved through (containsNonCloneable returned false, makeWorkerResultCloneSafe left the field unrewritten with skipped:[]) — the re-post then threw, fell through to the fail-closed {type:'error'}, and re-armed the POOL_SIZE=1 cascade the net exists to prevent. Add isArrayIndexKey() and, in BOTH containsNonCloneable and stripNonCloneable array branches (kept in lockstep), scan/strip the non-index own-enumerable keys after the index loop. A cloneable non-index prop is carried onto the stripped copy; a non-cloneable one is stripped and recorded. Not reachable from current parse output (no extractor attaches non-index array props) — a defense-in-depth hole closed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…aping to fail-closed (abhigyanpatwari#2135 review) findFilePath was documented "never throws" but read element properties unguarded in its generic sweep — a throwing getter at a non-path key (or a Proxy with a throwing ownKeys trap) threw out of makeWorkerResultCloneSafe, past postResultCloneSafe's recovery, to the fail-closed {type:'error'} that under POOL_SIZE=1 re-arms the cascade the net prevents. Likewise a Proxy with a throwing getPrototypeOf trap throws inside containsNonCloneable's instanceof checks. - findFilePath/pathFromChild now read via safeGet (try/catch) and guard Object.keys, honoring the "never throws" contract. - Each element's sanitize in makeWorkerResultCloneSafe is wrapped: a throw during scan/strip drops that one element (recorded as "sanitizer error") rather than sinking the whole result — so one pathological element can't fail-close the run. - Corrected the makeWorkerResultCloneSafe JSDoc ("ONLY after a DataCloneError" → after ANY post failure, matching the caller) and documented the deliberate failure-path double-traversal (the non-allocating pre-scan is what preserves clean-element referential identity). Tests: a throwing getter on a path-less element is stripped & delivered (not escaped); a Proxy structural-trap element is dropped, clean siblings survive. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…itizer (abhigyanpatwari#2135 review) makeWorkerResultCloneSafe rewrote only ARRAY result fields, so a future non-array sink (a nested object / Map result field) carrying a non-cloneable value — or an array field whose own non-index property the element loop didn't reach — would survive the sanitizer and throw on the re-post. Add a final `if (!isStructuredCloneable(result))` gate that strips any remaining offending field in place, making "the returned result is structured-cloneable" a hard postcondition independent of future ParseWorkerResult shape. Failure-path-only and a no-op once the array loop already made the result clean (the per-field probe short-circuits every clean field, so it adds no work or skip entries then). Tests: a function on a non-array result field is stripped & the result becomes cloneable; the gate adds no skip entry when the array loop already cleaned up. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…-time guard (abhigyanpatwari#2135 review) `Cloneable<any>` previously resolved to `any` (not `never`), so a payload with an `any`-typed member — the most likely escape hatch, since `unknown` is already blocked — passed `assertCloneable` with no compile error. Add an `IsAny<T>` branch (the canonical `0 extends 1 & T` probe) as the FIRST arm so `any` resolves to `never`, matching how `unknown` is already rejected. It must precede the primitive arm: `any extends CloneablePrimitive` would otherwise resolve to `any` and re-admit it. The IsAny-first arm perturbs inference for a bare `undefined` literal argument (T infers as `unknown` → never); real consumers pass `X | undefined` unions (the provider hooks), which are unaffected (src tsc clean), so the runtime identity test now uses a `string | undefined` value — the realistic shape. Tests: an `any` member fails `assertCloneable` (@ts-expect-error, enforced by tsconfig.test.json) and `Cloneable<any>` resolves to `never` at the type level. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…st, not Omit (abhigyanpatwari#2135 review) `AnalyzeResultIpc = Omit<AnalyzeResult,'pipelineResult'>` kept every other field in the type — including optional ones like `isPrimaryBranch?` — so the type advertised a field the runtime allowlist never sends, and the doc-comment's "a future field fails to compile until handled here" only held for REQUIRED fields. Switch to `Pick<AnalyzeResult, …the six scalar fields…>`: the allowlist IS the type, so the projection return literal is exhaustive by construction (omitting a key is a compile error) and a new `AnalyzeResult` field is simply absent from the wire until deliberately added here. `isPrimaryBranch` is intentionally excluded (nothing consumes it server-side over this fork; the parent reads only `repoName`). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…npatwari#2135 review) postResultCloneSafe recovers on ANY fast-path post failure and never inspects the error type (a throwing getter surfaces a RangeError, not a DataCloneError — gating on the type was the original net-gap bug). isDataCloneError has no production caller; it was only exercised by its own unit test. Remove the function and that test block. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…or (abhigyanpatwari#2135 review) The clone-safety telemetry accumulator inlined `Array<{path,reason}>` — a structural duplicate of the exported `SkippedPath`. Import and use the canonical type so a future rename of its fields is a compile error here instead of a silent structural drift. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ndary hooks (abhigyanpatwari#2135 review) extractTemplateConstraints and collectCaptureSideChannel return `unknown` and feed values across the worker structured-clone boundary, but the hook contracts didn't state the cloneability requirement — a future language implementing them without care could leak a non-serializable value. Document that the return MUST be structured-clone-safe and should be wrapped with assertCloneable, so the guarantee is a compile error at the source (abhigyanpatwari#2143). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tegration case (abhigyanpatwari#2135 review) The GREEN clone-safety integration test asserted only graph content (all files present), not that the skippedPaths / {type:'warning'} wiring its docstring claims to cover actually fired. Capture the production logger via _captureLogger and assert the sanitize telemetry names the offending file (poison.ts) AND the exact stripped key path (properties.toString) — proving the worker's skippedPaths append + the parsing-processor warning surfaced end to end. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…bhigyanpatwari#2135 review) The IPC projection tests used a hand-built hostile object. Add a case that puts a real createKnowledgeGraph (whose nodes/relationships getters would materialize the whole graph under JSON.stringify) in pipelineResult and asserts the projection drops it entirely — the serialized payload stays under 300 bytes (a materialized 50-node graph would be thousands), with the scalar fields intact. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… merge union (abhigyanpatwari#2135 review) Two untested clone-safety branches from the tri-review: - "dropped unsalvageable": a dirty element whose stripped copy is STILL not structured-cloneable must be dropped, not delivered (else the re-post throws). Add a deterministic test (a non-plain member with a stateful getter that the strip-time probe sees clean but that turns into a function on the post-strip verification) asserting the element is dropped and the run survives. - mergeResult skippedPaths union across sub-batches. mergeResult (and its appendAll helper) was module-private in the parse-worker ENTRY module, which a main-thread test can't import (it runs MessagePort setup). Extract it to a side-effect-free result-merge.ts (mirroring post-result.ts) and unit-test the union (including the `??=` target-init path), the skippedLanguages sum, and array append. parse-worker imports it back; verified the built worker still parses + merges via the real-worker integration path. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…bhigyanpatwari#2135 review) Clears the failing `quality / format` CI gate (root prettier, not the gitnexus-local config). Reformats the pre-existing abhigyanpatwari#2143 wrapping lines in c-cpp.ts + kotlin.ts plus the clone-safety review-fix files touched in this PR-update (clone-safety.ts and the new/updated tests). Formatting-only — no behavior change; tsc, the type-level assertions (tsconfig.test.json), and the unit + integration suites stay green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ertions (abhigyanpatwari#2135 review) CodeQL flagged the `expect(a && b && c).toBe(true)` lines in the type-level test assertions as js/trivial-conditional: after type erasure the operands are constant `true`, so the `&&` chain always evaluates the same. Replace the `&&` chain with array equality (`expect([...]).toEqual([true, ...])`) — no conditional, and the real assertions remain the `const x: …IsNever = true` / `: IsCloneable<…> = true` annotations (enforced by tsconfig.test.json, which fail to compile if a guard regresses). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Problem (#2112)
gitnexus analyzeaborts on large repos of vendored/generated C++ (ExecutorTorch, TensorFlow Lite). One of the three failure modes in the issue is a deterministic crash that defeats even the conservativeGITNEXUS_WORKER_POOL_SIZE=1workaround:A parse worker delivers its accumulated result via
postMessage, which structured-clones the payload synchronously on the worker thread and throws on the first value it can't serialize (here, a nodepropertiesvalue pointing at a nativetoString). The worker re-posts the throw as{type:'error'}; the pool counts it as a worker death; and underPOOL_SIZE=1the same graph re-throws on every respawn until the slot's budget is exhausted and the wholeparsephase aborts.This was confirmed empirically —
structuredClone({ toString: Object.prototype.toString })reproduces the exact issue string. (It is not theScopeMaps — those clone fine — and not a leakedError— those clone fine. Because production analyze runs in store-mode, whereparsedFilesare emptied before the post and the shard write'sJSON.stringifysilently drops functions, the leak rides in a plain-record array, notcaptureSideChannel.)What this PR does
Adds a clone-safety net at the worker result boundary (new
workers/clone-safety.ts). On a clone failure the worker:structuredCloneprobe — clean arrays keep referential identity, zero copy),ParsedFileso scope-resolution re-derives it on the main thread with intact edge data,result.skippedPaths, warns naming the field + file so the still-unpinned leak is diagnosable from logs and fixable at source, and re-posts.Healthy runs are byte-identical — the net runs only after a real
DataCloneError, so there's zero overhead on the fast path. The strip drops the same values the store path'sJSON.stringifyalready silently removes, so store/no-store runs converge. Skipped paths surface via the parsing processor alongside the existing skipped-language telemetry.The worker-pool resilience machinery is untouched (doc-comment corrections only — two stale comments wrongly claimed clone failures route to
messageerror; the failure is a sender-side synchronous throw). U1 sanitizes at the source, so the pool's death/respawn path is never reached.Scope
PR-1 — failure mode C, the deterministic
POOL_SIZE=1killer. The timeout/native-abort graceful-degradation cascade (failure modes A & B) is coupled to downstream re-parse exclusion + a hard per-file worker watchdog and is tracked as follow-up work (it would otherwise just relocate the hang/native-abort to the main thread).Test plan
test/unit/clone-safety.test.ts(10 cases): the exactfunction toString() { [native code] }repro; the refuted Map case; strip-and-keep vs drop-and-reparse; referential identity on healthy results.test/integration/parse-impl-clone-skip.test.ts: real worker,POOL_SIZE=1— GREEN (a non-cloneable result is sanitized and the run completes with all files) + RED control (the same raw result aborts the phase). The test worker imports the built helper and applies the exact production pattern, exercising the real clone boundary the fake-worker doubles can't.tsc --noEmit, prettier — all green.🤖 Generated with Claude Code