fix(grammars): load vendored tree-sitter grammars from vendor/ by absolute path (#2111)#2144
Conversation
|
@magyargergo is attempting to deploy a commit to the NexusCore Team on Vercel. A member of the Team first needs to authorize it. |
CI Report✅ All checks passed Pipeline Status
Test Results
✅ All 11025 tests passed 16 test(s) skipped — expand for details
Code CoverageTests
📋 View full run · Generated by CI |
…i-review Addresses findings from the adversarial + maintainability review lanes: - Publish hygiene (P2): assert-publish-grammar-coverage.cjs (prepack) now fails if a stray `vendor/<name>/build/` exists. files:["vendor"] would ship it and node-gyp-build resolves build/Release BEFORE prebuilds/, so a locally source-built binding could shadow the curated committed prebuild on consumers. - Regression guard (P2): vendored-grammars.test.ts now also catches dynamic `import()`, side-effect `import 'x'`, `/subpath`, and backtick-quoted loads, scans test/ (not just src/), drops the `//`-substring false-negative, and adds a self-test asserting every load form is caught (and prose/Cpp ignored). - requireVendoredGrammar throws on a non-vendored name (drift guard across the package set / CLI probe / build registry) instead of a confusing path miss. - Fix stale comments: kotlin/query.ts called the grammar an "optionalDependency" (now vendored); proto.ts clarifies its remaining `_require` is for tree-sitter. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
magyargergo
left a comment
There was a problem hiding this comment.
Tri-review digest — PR #2144 (load vendored tree-sitter grammars from vendor/)
Methods & independence: Compound-Engineering personas (correctness, adversarial, maintainability) + GitNexus risk/test lanes — all Claude. Codex was unavailable (usage limit), so this is a two-method, both-Claude review — cross-persona agreement here is "consistent across personas," not three independent engines. Every finding was coordinator-verified and fixed in d1cb5e0c before posting.
Validated (credit where due)
- Correctness lane: clean
tsc --noEmit, live-loaded all 5 vendored grammars, 933 unit tests green. Refuted with evidence:VENDOR_ROOT(src/core/tree-sitter/vendored-grammars.ts:16) resolves correctly in dev/dist/worker (worker imports the helper relatively; tsc doesn't bundle); every consumer converted (grep: only comment mentions remain); no dangling/unusedcreateRequire;unknown-vs-anytypes flow clean; the optional-grammar missing-vs-broken probe is preserved (node-gyp-build's "No native build was found" matches the existing regex inoptional-grammars.ts). - CI: tree-sitter ABI green on windows/ubuntu/macos; packaged install smoke (ubuntu) green; quality lint/typecheck/format green. (Vercel fail = deploy-auth, not code.)
- Structural soundness: the fix removes the
node_modulesparticipation that caused the extraneous-prune/EPERM bug; loading by absolute path twice returns the same require-cache object (Node caches by realpath);node-gyp-buildresolves fromvendor/(direct dep —package.json:76— survives--omit=optional/pnpm-strict).
Findings — all addressed in d1cb5e0c
- [P2 · adversarial · reproduced] Publish hygiene. Retargeting the source-build into
vendor/<name>/build/means a stray build dir would ship (files:["vendor"]overrides.gitignore/.npmignore) and shadow the committed prebuild —node-gyp-build(node-gyp-build.js:33-41) resolvesbuild/Releasebeforeprebuilds/. → Added a prepack guard inscripts/assert-publish-grammar-coverage.cjs(findStrayBuildArtifacts) that failsnpm packif anyvendor/*/buildexists. - [P2 · adversarial + maintainability · reproduced] Regression-guard holes. The new guard's regex missed dynamic
import(), side-effectimport 'x',/subpath, and backtick loads, and only scannedsrc/. → Broadened the regex, now scanstest/(excl. fixtures), dropped the//-substring false-negative, and added a self-test asserting every load form is caught (prose/tree-sitter-cppignored) —test/unit/vendored-grammars.test.ts. - [P2/P3 · maintainability] Drift guard.
requireVendoredGrammaraccepted any string (three grammar lists could silently drift). → It now throws on a non-vendored name —src/core/tree-sitter/vendored-grammars.ts. - [P3 · maintainability] Stale comments.
kotlin/query.ts:3called the grammar an "optionalDependency" (now vendored);proto.ts:26_require. → Corrected.
Refuted / not issues
node-gyp-build resolution under --omit=optional/pnpm-strict (direct dep); double-load ABI (same cache object); tarball completeness (verified via npm pack of 1.6.8-rc.6 — all 6 prebuilds + bindings/node + src/node-types.json ship per grammar); anything else recreating node_modules/tree-sitter-* (only the now-deleted materialize did).
Coverage limit
No automated test reproduces the original Windows-only 2nd-npx-reify EPERM (install-level). The packaged install smoke CI job and the no-bare-require guard are the proxies.
Automated multi-tool digest — 2 Claude methods; Codex unavailable (not three independent engines). Verify before acting. Note: this is a self-review (PR author == reviewing identity); findings were applied directly rather than left as change-requests.
…olute path (abhigyanpatwari#2111) The recurring Windows `EPERM: operation not permitted, symlink` (errno -4048) when adding the MCP server to Antigravity is NOT the abhigyanpatwari#2101/abhigyanpatwari#2110 module-load crash — it is an install-time arborist failure during the `_npx` reify that the MCP client triggers on every `npx gitnexus` launch. Root cause: the `postinstall` materialize step copied each vendored grammar (`vendor/tree-sitter-{c,dart,proto,swift,kotlin}`) into `node_modules/gitnexus/node_modules/tree-sitter-*` as a real package so runtime `require('tree-sitter-dart')` would resolve. Those packages are in no dependency graph, so every subsequent npm/npx reify treats them as **extraneous** and prunes/relocates them — on Windows the relocation goes through `@npmcli/move-file`'s symlink path and throws EPERM (symlinks need Developer Mode/admin), and on every OS the 2nd run silently deletes the grammars. This is the same class as abhigyanpatwari#1728, which the materialize step itself claimed to have fixed. Fix (the prebuildify + node-gyp-build ecosystem pattern): never copy grammars into node_modules. Load each by absolute path from `vendor/<name>` via the new `requireVendoredGrammar` helper — the grammar's own `bindings/node` runs `node-gyp-build(<dir>)` and loads the committed `vendor/<name>/prebuilds/ <platform>-<arch>/…` directly (all 5 ship all 6 tuples). vendor/ is inside the package but not a node_modules subtree, so arborist never sees the grammars and the reify is idempotent — no EPERM, no silent deletion. - new src/core/tree-sitter/vendored-grammars.ts (requireVendoredGrammar / vendoredGrammarDir / VENDORED_GRAMMAR_PACKAGES; VENDOR_ROOT stable in dev+dist) - route all consumers through it: parser-loader, parse-worker, grpc proto, include-extractor (C), http-patterns kotlin, cli optional-grammars probe - postinstall drops the materialize step; build-tree-sitter-grammars.cjs builds in-place under vendor/ (gitignored) and deletes materialize-vendor-grammars.cjs - tests + grammar-introspection helper load grammars from vendor/ too (single source of truth); new vendored-grammars.test.ts guards against reintroducing a bare `require('tree-sitter-<vendored>')` Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Drift guard (PR abhigyanpatwari#2144 review, P3): validate the argument against VENDORED_GRAMMAR_PACKAGES and fail loudly on an unknown name, so the three grammar lists (package set / CLI probe / build registry) drifting out of sync surfaces as a clear error instead of a confusing absolute-path require miss. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…g prebuilds Publish hygiene (PR abhigyanpatwari#2144 review, P2). Now that build-tree-sitter-grammars.cjs source-builds into vendor/<name>/build/, a stray build dir would ship in the tarball (files:["vendor"] overrides .gitignore/.npmignore) AND shadow the committed prebuild — node-gyp-build resolves build/Release before prebuilds/. assert-publish-grammar-coverage.cjs (prepack) now fails `npm pack` if any vendor/*/build exists (findStrayBuildArtifacts), with a clear `rm -rf` fix hint. Adds unit coverage for the new pure function. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ssion guard PR abhigyanpatwari#2144 review (P2). The guard regex missed dynamic import(), side-effect `import 'x'`, /subpath, and backtick loads, and only scanned src/. It now covers every node_modules-forcing form (single/double/backtick quotes, optional subpath), scans test/ too (excluding fixtures and the guard file itself), drops the `//`-substring false-negative (leading-comment-only heuristic), and adds a self-test asserting every load form is caught while prose mentions and tree-sitter-cpp are ignored. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
PR abhigyanpatwari#2144 review (P3). kotlin/query.ts called tree-sitter-kotlin an "optionalDependency" — it is vendored and loaded from vendor/ by absolute path (abhigyanpatwari#2111). proto.ts now states its remaining `_require` is only for the real `tree-sitter` dependency, not a vendored grammar (which goes through requireVendoredGrammar). Comment-only; no behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
dacd24b to
a78c904
Compare
Summary
Fixes the recurring Windows
npm error code EPERM … syscall symlink(errno-4048) reported in #2111 when adding the MCP server to Antigravity. This is not the #2101/#2110 module-load crash — it is an install-time arborist failure during the_npxreify that the MCP client triggers on everynpx gitnexuslaunch (note the…\_npx\<hash>\…path in the report).Root cause
The
postinstallmaterialize step copied each vendored grammar (vendor/tree-sitter-{c,dart,proto,swift,kotlin}) intonode_modules/gitnexus/node_modules/tree-sitter-*as a real package, so runtimerequire('tree-sitter-dart')would resolve.Those copies are in no dependency graph, so every subsequent
npm/npxarborist reify treats them as extraneous and prunes/relocates them:@npmcli/move-file's symlink path →EPERM: operation not permitted, symlink(symlinks require Developer Mode/admin). Install/setup aborts.This is the same class as #1728, which the materialize step itself claimed to have fixed (it only moved the symlink trigger from
file:deps to extraneous-node cleanup).Reproduced
Fix
Adopt the ecosystem-canonical
prebuildify+node-gyp-buildpattern: never copy grammars intonode_modules. Load each by absolute path fromvendor/<name>via a newrequireVendoredGrammarhelper — the grammar's ownbindings/noderunsnode-gyp-build(<dir>)and loads the committedvendor/<name>/prebuilds/<platform>-<arch>/…directly (all 5 grammars ship all 6 tuples).vendor/is inside the package but not anode_modulessubtree, so arborist never sees the grammars and the reify is idempotent — no EPERM, no silent deletion.src/core/tree-sitter/vendored-grammars.ts(requireVendoredGrammar/vendoredGrammarDir/VENDORED_GRAMMAR_PACKAGES;VENDOR_ROOTstable in dev anddist/).parser-loader,parse-worker, gRPCproto,include-extractor(C), http-patternskotlin, and the CLIoptional-grammarsprobe.postinstalldrops the materialize step;build-tree-sitter-grammars.cjsnow source-builds in place undervendor/(gitignored) only when a platform lacks a prebuild; deletedscripts/materialize-vendor-grammars.cjs.grammar-introspectionhelper load grammars fromvendor/too (single source of truth).Testing
test/unit/vendored-grammars.test.ts— regression guard: fails if anyone reintroduces a barerequire('tree-sitter-<vendored>'), and asserts every grammar resolves undervendor/(nevernode_modules) and loads.node_modules/tree-sitter-*, then all 5 grammars load fromvendor/by absolute path (cold), and confirmed the published rc tarball ships all 6 prebuilds +node-types.jsonper grammar.Refs #2111, #1728.
🤖 Generated with Claude Code