fix(group): contract extractors honour .gitnexusignore via shared IgnoreService (#1185) #1247
…oreService (abhigyanpatwari#1185)
The HTTP, gRPC, and topic contract extractors each globbed the repo with a hardcoded `ignore: ['**/node_modules/**', '**/.git/**', '**/dist/**', '**/build/**', '**/vendor/**']` array, bypassing the shared `IgnoreService` that the rest of the ingestion pipeline uses for `.gitnexusignore` and `.gitignore` parsing. Result: a vendored Python venv (`mentor_env/`), generated stubs, or any user-defined exclusion silently produced false-positive contracts.
Replace each hardcoded array with `createIgnoreFilter(repoPath)`, mirroring the canonical pattern in `filesystem-walker.ts`. The 5 hardcoded names are all in `DEFAULT_IGNORE_LIST`, so default behaviour is preserved; users now also get `.gitnexusignore` patterns, the rest of the hardcoded list (e.g. `__pycache__`, `.pytest_cache`), and the `.gitnexusignore` negation semantics introduced in abhigyanpatwari#771.
The topic extractor additionally filters Go `*_test.go` at the glob level. That filter is preserved via a small wrapper around `createIgnoreFilter` that short-circuits before delegating, so glob-level pruning still applies and the existing `_test.go` skip test (with new content asserting the pruning is real) still passes.
Tests added to all three `*-extractor.test.ts` files exercising `.gitnexusignore` honouring end-to-end via real temp directories.
CI Report: ✅ All checks passed
Test Results: ✅ All 7736 tests passed, 1 test(s) skipped
Claude finished @magyargergo's task in 2m 50s
PR #1247 Review: fix(group): contract extractors honour .gitnexusignore via shared IgnoreService
Tasks:
Review bar for this PR
Issue and PR context considered
#1185 symptom: HTTP, gRPC, and topic extractors used hardcoded glob ignore arrays (…
Affected extractors:
Changed files: 3 source files (+36 lines), 3 test files (+142 lines), 0 public API or contract format changes.
CI status: ✅ typecheck clean, ✅ 7,735 tests passed on 3 platforms, ✅ E2E passed, 1 unrelated Ruby skip. SHA …
Core risk areas: (1) …
Understanding of the change
#1185 showed that group extractors were silently scanning directories that …
Findings
[medium] gRPC test claims source-scan coverage it does not exercise
[low] Extractor tests cover only
| DoD item | Status |
|---|---|
| HTTP route extraction honours .gitnexusignore, .gitignore, and default ignore rules | ✅ satisfied: createIgnoreFilter wired on real path; test proves .gitnexusignore case |
| gRPC extraction honours ignores in both .proto context building and source scanning | ✅ satisfied (implementation) / |
| Topic extraction honours ignores while preserving *_test.go glob-level pruning | ✅ satisfied: wrapper is correct; two tests prove both behaviours |
| Existing default ignores remain effective for node_modules, .git, dist, build, vendor | ✅ satisfied: all 5 names confirmed in DEFAULT_IGNORE_LIST |
| .gitnexusignore negation semantics consistent with ingestion walker | ✅ satisfied at implementation level (same filter object); not tested at extractor level |
| Excluded files produce no provider/consumer contracts and no symbolRef paths under ignored dirs | ✅ satisfied: all three tests assert symbolRef.filePath check |
| Existing HTTP/gRPC/topic contract detection behaviour unchanged for non-ignored files | ✅ satisfied: all pre-existing tests pass (7,735 total) |
| Tests cover reporter-style mentor_env/ false-positive case | ✅ satisfied: all three tests include mentor_env/ scenario |
| CI, typecheck, focused tests, and full relevant tests are green | ✅ satisfied |
Ignore semantics assessment
.gitnexusignore: Correctly wired through createIgnoreFilter in all three extractors. Tested in all three new test blocks with the reporter's exact directory name (mentor_env/).
.gitignore: Correctly included via loadIgnoreRules (both .gitignore and .gitnexusignore are read unless GITNEXUS_NO_GITIGNORE=1). Not tested at the extractor level, but the integration point is correct.
GITNEXUS_NO_GITIGNORE: Flows through createIgnoreFilter → loadIgnoreRules unchanged. Extractors automatically respect this env var with no additional code; no extractor bypasses it.
Default ignores: node_modules, .git, dist, build, vendor are all present in DEFAULT_IGNORE_LIST (lines 24, 7, 44, 45, 26 of ignore-service.ts). Default behaviour is byte-identical for repos without .gitnexusignore. Additionally, the richer DEFAULT_IGNORE_LIST entries (__pycache__, venv, .venv, etc.) are now also effective.
Negation semantics: createIgnoreFilter implements hasExplicitUnignore with ancestor-walk and last-match-wins semantics. Extractor tests do not cover this, but the shared filter object is used correctly.
Windows/path normalization: p.relative() from path-scurry returns POSIX paths on all platforms (documented in ignore-service.ts:435). Topic wrapper's endsWith('_test.go') is safe. makeContract in gRPC normalizes backslashes via .replace(/\\/g, '/'). CI on 3 platforms confirms this.
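The last-match-wins negation behaviour mentioned above can be illustrated with a deliberately simplified model. This is only a sketch: `Rule`, `parseRules`, and `isIgnored` are hypothetical names, rules are treated as plain directory prefixes rather than real gitignore globs, and the ancestor-walk in `hasExplicitUnignore` is not reproduced.

```typescript
// Simplified model of last-match-wins ignore rules.
// ASSUMPTION: each rule is a directory prefix, optionally negated with "!".
// This is NOT the real gitignore grammar and NOT the IgnoreService code.
interface Rule {
  prefix: string;
  negated: boolean;
}

function parseRules(lines: string[]): Rule[] {
  return lines
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && !line.startsWith('#'))
    .map((line) => {
      const negated = line.startsWith('!');
      const body = negated ? line.slice(1) : line;
      // Drop trailing slashes so "mentor_env/" and "mentor_env" compare equal.
      return { prefix: body.replace(/\/+$/, ''), negated };
    });
}

function isIgnored(relPath: string, rules: Rule[]): boolean {
  let ignored = false;
  for (const rule of rules) {
    const matches = relPath === rule.prefix || relPath.startsWith(rule.prefix + '/');
    // A later matching rule overrides any earlier decision.
    if (matches) ignored = !rule.negated;
  }
  return ignored;
}

const venvRules = parseRules(['mentor_env/', '!mentor_env/keep/']);
console.log(isIgnored('mentor_env/lib/leaked.ts', venvRules)); // true
console.log(isIgnored('mentor_env/keep/ok.ts', venvRules)); // false
console.log(isIgnored('src/routes/users.ts', venvRules)); // false

// Order matters: the later "parent/child/" rule wins over the earlier "!parent/".
const orderedRules = parseRules(['!parent/', 'parent/child/']);
console.log(isIgnored('parent/child/x.ts', orderedRules)); // true
console.log(isIgnored('parent/a.ts', orderedRules)); // false
```

Because the extractors only consume the filter object, any such ordering logic stays in one place and the extractors never need to inspect patterns themselves.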
Extractor assessment
HTTP route extractor: scanFiles(repoPath) correctly calls createIgnoreFilter(repoPath) once per extract() invocation (lazy evaluation via getScannedFiles() caches the result). nodir: true, cwd: repoPath, HTTP_SCAN_GLOB unchanged. Detection logic untouched. ✅
gRPC proto context: buildProtoContext at line 235 calls createIgnoreFilter(repoPath) before the .proto glob. Ignored .proto files never enter the service map. Provider contracts from ignored protos cannot be emitted. ✅
gRPC source scan: extract() at line 414 calls createIgnoreFilter(repoPath) before the source-scan glob. Creates a separate filter instance (slight duplication of I/O vs. proto-context call, but createIgnoreFilter reads at most 2 files at root — acceptable). Ignored source files are not parsed. ✅ implementation /
Topic extractor: createIgnoreFilter(repoPath) called once; result wrapped with *_test.go short-circuit before delegating to base filter for ignored; childrenIgnored delegates cleanly. One parser reused across files. ✅
Topic *_test.go pruning: Wrapper is at line 68–71. p.relative().endsWith('_test.go') fires before baseFilter.ignored(p), so test files are never read. Second test (still prunes _test.go even when .gitnexusignore is empty) writes a real Sarama consumer call and asserts zero contracts — this is a strong regression guard. ✅
Unchanged contract detection for non-ignored files: Pre-existing tests (Spring, Express, Gin, NestJS, gRPC, Kafka, RabbitMQ, NATS) all pass, proving the change is transparent for normal repos. ✅
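The topic wrapper described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in topic-extractor.ts: `PathLike` and `IgnoreFilter` are minimal local stand-ins for the glob/path-scurry shapes, `withGoTestPruning` is a hypothetical helper name, and the toy `baseFilter` stands in for whatever `createIgnoreFilter(repoPath)` returns.

```typescript
// Minimal stand-ins for the shapes the review mentions.
// ASSUMPTION: the real types come from glob / path-scurry and are richer.
interface PathLike {
  relative(): string; // POSIX-style path relative to the scan root
}

interface IgnoreFilter {
  ignored(p: PathLike): boolean;
  childrenIgnored(p: PathLike): boolean;
}

// Hypothetical helper: checks the *_test.go suffix first, then delegates.
function withGoTestPruning(base: IgnoreFilter): IgnoreFilter {
  return {
    // Short-circuit: the base filter is not consulted for Go test files.
    ignored: (p) => p.relative().endsWith('_test.go') || base.ignored(p),
    // Directory pruning is left entirely to the base filter.
    childrenIgnored: (p) => base.childrenIgnored(p),
  };
}

// Toy base filter standing in for createIgnoreFilter(repoPath): ignores mentor_env/.
const baseFilter: IgnoreFilter = {
  ignored: (p) => p.relative().startsWith('mentor_env/'),
  childrenIgnored: (p) => p.relative() === 'mentor_env',
};

const asPath = (rel: string): PathLike => ({ relative: () => rel });
const topicFilter = withGoTestPruning(baseFilter);

console.log(topicFilter.ignored(asPath('src/orders_test.go'))); // true
console.log(topicFilter.ignored(asPath('mentor_env/lib/LeakedHandler.java'))); // true
console.log(topicFilter.ignored(asPath('src/EventHandler.java'))); // false
```

Keeping the suffix check inside the filter, rather than in a post-hoc array filter, is what lets the glob walk skip those files instead of returning them and discarding them afterwards.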
Test assessment
Tests added: 3 describe('respects .gitnexusignore (#1185)') blocks, 1–2 it cases each. Total ~142 lines.
HTTP ignore tests: Control at src/routes/users.ts, ignored under mentor_env/lib/leaked.ts using same Express pattern. Asserts control emitted, ignored not emitted, and no symbolRef.filePath starts with mentor_env/. Calls extract(null, ...) to force source-scan path. ✅
gRPC proto coverage: Control at proto/auth.proto, ignored at mentor_env/lib/leaked.proto. Both assert contract presence/absence and symbolRef path. ✅ for proto-context path.
gRPC source-scan coverage: No source file in mentor_env/ with a detectable gRPC consumer pattern. The test name and comment overclaim.
Topic mentor_env test: Control at src/EventHandler.java, ignored at mentor_env/lib/LeakedHandler.java with @KafkaListener. Asserts ids contain user.created, not leaked.event, and no mentor_env/ in symbolRef. ✅
Topic _test.go test: Writes real Sarama ConsumePartition("real-topic-from-test", ...) in src/orders_test.go, asserts zero contracts. This is a strong, meaningful test. ✅
Missing cases: .gitignore-only test per extractor; gRPC ignored source file; negation (!pattern) at extractor level.
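The fixture layout these tests rely on (a control file, a file under an ignored directory, and a `.gitnexusignore` rule) can be sketched with a real temp directory. This is a hedged illustration only: `scanForKafkaListeners` is a hypothetical toy scanner, not the topic extractor, and it treats each `.gitnexusignore` line as a plain directory prefix.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Hypothetical toy scanner, NOT the real extractor.
// ASSUMPTION: every non-comment line of .gitnexusignore is a directory prefix.
function scanForKafkaListeners(root: string): string[] {
  const ignoreFile = path.join(root, '.gitnexusignore');
  const ignoredDirs = fs.existsSync(ignoreFile)
    ? fs
        .readFileSync(ignoreFile, 'utf8')
        .split('\n')
        .map((line) => line.trim().replace(/\/+$/, ''))
        .filter((line) => line.length > 0 && !line.startsWith('#'))
    : [];
  const hits: string[] = [];
  const walk = (dir: string): void => {
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
      const abs = path.join(dir, entry.name);
      // Normalise to POSIX separators so prefix checks behave the same on Windows.
      const rel = path.relative(root, abs).split(path.sep).join('/');
      if (ignoredDirs.some((d) => rel === d || rel.startsWith(d + '/'))) continue;
      if (entry.isDirectory()) walk(abs);
      else if (fs.readFileSync(abs, 'utf8').includes('@KafkaListener')) hits.push(rel);
    }
  };
  walk(root);
  return hits.sort();
}

// Fixture mirroring the review's description: control file, ignored file, ignore rule.
const repo = fs.mkdtempSync(path.join(os.tmpdir(), 'ignore-fixture-'));
fs.mkdirSync(path.join(repo, 'src'), { recursive: true });
fs.mkdirSync(path.join(repo, 'mentor_env', 'lib'), { recursive: true });
fs.writeFileSync(
  path.join(repo, 'src', 'EventHandler.java'),
  '@KafkaListener(topics = "user.created")\n',
);
fs.writeFileSync(
  path.join(repo, 'mentor_env', 'lib', 'LeakedHandler.java'),
  '@KafkaListener(topics = "leaked.event")\n',
);
fs.writeFileSync(path.join(repo, '.gitnexusignore'), 'mentor_env/\n');

const found = scanForKafkaListeners(repo);
fs.rmSync(repo, { recursive: true, force: true }); // clean up the temp fixture
console.log(found);
```

Asserting both that the control file is found and that no result path starts with `mentor_env/` mirrors the two-sided check the review credits the new tests with.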
Validation assessment
CI status: ✅ All checks green at HEAD SHA 61520aa. Typecheck (tsc --noEmit), unit tests (3 platforms), E2E passed. 7,735 passed / 1 skipped (unrelated Ruby). The 3 pre-existing local failures cited in the PR body (git-utils, skip-git-cli) are confirmed pre-existing and do not appear in CI.
Targeted tests: The 3 new extractor test files are part of npm run test:unit which passed. The PR reports 108/108 pass on targeted extractor suites.
Platform coverage: macOS, Windows, Linux — all passing.
Unverified: .gitignore-only code path not tested in extractors. Negation semantics not tested in extractors. Neither blocks merge given IgnoreService coverage.
Final verdict
production-ready with minor follow-ups
All three extractors are correctly wired to createIgnoreFilter on the real runtime path. DEFAULT_IGNORE_LIST covers every previously hardcoded entry. The topic *_test.go wrapper is correct and strongly regression-locked. CI is green on 3 platforms with 7,735 tests passing. The only meaningful gap is that the gRPC test claims to cover both proto-context and source-scan ignore paths but only exercises the proto-context path — the source-scan implementation is correct by inspection (identical pattern to the proto-scan), but the test overclaims its coverage. The .gitignore and negation paths are not tested at the extractor level, which is acceptable given IgnoreService-level coverage. These are follow-up items, not blockers.
@azizur100389 can you please look at these findings? 🙏
…overage (abhigyanpatwari#1185)
Addresses two findings from the @claude review on PR abhigyanpatwari#1247:
[medium] The gRPC ignore test claimed to cover both proto-context and source-scan paths but only wrote a .proto file under mentor_env/. Added a Python `_pb2_grpc.<Name>Stub(channel)` consumer file under the same ignored dir (mirroring the canonical pattern from `test_extract_python_stub_returns_consumer`); without the `.gitnexusignore` filter that file would emit a consumer contract. The test now exercises both `createIgnoreFilter` calls inside the gRPC extractor (`buildProtoContext` + `extract`) in a single run, with both defence-in-depth path-prefix assertions and a specific `role: consumer` LeakedService assertion.
[low] Added one shared .gitignore-only test on the HTTP extractor. `createIgnoreFilter` reads both `.gitignore` and `.gitnexusignore` via `loadIgnoreRules`, but no extractor-level test exercised the `.gitignore` path. One shared test is sufficient because all three extractors consume the same filter object, which is already verified at `IgnoreService` level.
The remaining [low] finding, "negation semantics (!pattern) not tested at extractor level", is deferred deliberately, not skipped. Three reasons:
1. The negation logic (introduced in abhigyanpatwari#771) lives entirely inside `createIgnoreFilter`'s `hasExplicitUnignore` ancestor-walk in `ignore-service.ts`. The extractors only consume the returned filter object: they never inspect patterns, never call `hasExplicitUnignore` directly, and have no code path that could diverge from the IgnoreService's negation behaviour.
2. Negation is already locked in by 8 dedicated unit tests in `test/unit/ignore-service.test.ts` (the abhigyanpatwari#771 suite), plus the `!parent/` + `parent/child/` last-match-wins regression test added in PR abhigyanpatwari#1046. An extractor-level negation test would re-prove the same code path and would not catch any failure mode the existing tests don't already catch.
3. The bot itself flagged the gap as "Acceptable to leave as follow-up referencing existing IgnoreService negation tests"; the deferral matches its own recommendation.
If a future change inserts an extractor-side wrapper around the filter (as topic-extractor.ts already does for `*_test.go`) that could plausibly affect negation, an extractor-level negation test should be added at that point, not pre-emptively here.
Addressed in …
[medium] gRPC source-scan path now actually exercised. The original test wrote only a …
Both globs are now exercised in the same run, and the test name + comment are updated to accurately reflect what's covered.
[low] …
[low] negation deferred deliberately. The commit body spells out the rationale; short version: …
If a future change introduces an extractor-side wrapper around the filter (as …
Local verification on …
CI re-running.
Claude finished @magyargergo's task in 2m 54s
PR #1247 Review: fix(group): contract extractors honour .gitnexusignore via shared IgnoreService
Tasks:
Review bar for this PR
Issue and PR context considered
#1185 symptom: HTTP, gRPC, and topic extractors maintained their own hardcoded …
Affected extractors:
Changed files: 3 source files (+22 lines net), 3 test files (+205 lines). No public API or contract-format changes.
CI status: ✅ All checks green at HEAD SHA …
Core risk areas verified: (1) IgnoreService glob wiring, confirmed correct; (2) gRPC both glob paths, both tested with real detectable patterns; (3) …
Understanding of the change
#1185 showed that group extractors silently scanned directories that …
Findings
No production-readiness issues found against the current DoD bar.
Definition of Done check
Ignore semantics assessment
Default ignores:
Negation semantics:
Windows/path normalization:
Extractor assessment
HTTP route extractor:
gRPC proto context:
gRPC source scan:
Topic extractor:
Topic …
Unchanged contract detection for non-ignored files: All pre-existing tests pass (7,736 total across 3 platforms). Spring …
Test assessment
Tests added: 3 …
HTTP …
HTTP …
gRPC proto + source-scan: Control at …
Topic …
Topic …
Missing cases: Negation (…
Validation assessment
CI: ✅ All checks green at HEAD SHA …
Test counts: 7,736 passed / 1 skipped (pre-existing Ruby skip, unrelated). PR body reports 3 pre-existing local failures (…
Targeted extractor tests: Part of …
Platform coverage: macOS, Windows, Linux, all passing. Backslash normalization in …
Not verified by tests: gRPC …
Final verdict
All three extractor paths are correctly wired to …
Thank you for your contribution!
…oreService (abhigyanpatwari#1185) (abhigyanpatwari#1247)
Summary
Reported in #1185 by @DungNg03051999: the HTTP, gRPC, and topic contract extractors globbed the repo with a hardcoded `ignore: ['**/node_modules/**', '**/.git/**', '**/dist/**', '**/build/**', '**/vendor/**']` array, bypassing the shared `IgnoreService` that the rest of the ingestion pipeline uses for `.gitnexusignore` + `.gitignore` parsing. Reporter measured 8 false-positive consumer contracts from a Python venv (`mentor_env/`) that `.gitnexusignore` was supposed to exclude.
Fix
Each extractor's `glob` call now consumes the shared filter, mirroring the canonical pattern in `filesystem-walker.ts:6,34`.
Touched call sites:
- `gitnexus/src/core/group/extractors/http-route-extractor.ts`: `scanFiles`
- `gitnexus/src/core/group/extractors/grpc-extractor.ts`: both glob calls (`buildProtoContext` for `.proto` files and the source-scan inside `extract`)
- `gitnexus/src/core/group/extractors/topic-extractor.ts`: `extract`'s glob
Backward compatibility
All 5 hardcoded entries (`node_modules`, `.git`, `dist`, `build`, `vendor`) are in `DEFAULT_IGNORE_LIST` already (verified). Default behaviour for any repo without a `.gitnexusignore` is byte-identical. Users now additionally get:
- `.gitnexusignore` and `.gitignore` patterns honoured;
- the rest of `DEFAULT_IGNORE_LIST` (`__pycache__`, `.pytest_cache`, `venv`, `.venv`, etc.);
- `.gitnexusignore` negation semantics from Feature: Allow indexing test directories (__tests__, __mocks__) #771.
Topic-extractor specifics
`topic-extractor.ts` previously had a 6th hardcoded entry `**/*_test.go` (Go test convention pushed to the glob level so the orchestrator stays language-agnostic; comment preserved in spirit). The fix wraps `createIgnoreFilter` so that glob-level pruning of `*_test.go` is preserved without falling back to a post-hoc `.filter()`.
Wrapper short-circuits on the `_test.go` check before delegating to the base filter, so `glob` never reads those files. Regression-locked by a new test that writes a Sarama consumer call inside `*_test.go` and asserts no contract is emitted.
Tests
A `respects .gitnexusignore (#1185)` describe block was added to each of:
- `test/unit/group/http-route-extractor.test.ts`
- `test/unit/group/grpc-extractor.test.ts`
- `test/unit/group/topic-extractor.test.ts`
Each exercises the full extractor with real temp directories: a control file in a normal path, a vendored file under `mentor_env/` (matching reporter's repro), and a `.gitnexusignore` excluding `mentor_env/`. Asserts the control survives and the excluded path produces no contracts. The topic test additionally pins `*_test.go` glob-level pruning.
Verification
- `tsc --noEmit`
- `npm run test:unit` (full)
The 3 unit-suite failures (2× `git-utils.test.ts` plus 1× `skip-git-cli.test.ts`) are pre-existing environment failures on the current `upstream/main`, verified via `git stash` + re-run on a clean tree (same 3 failures, unchanged). No CI-relevant regression.
Why this is safe
- … `IgnoreService` (already battle-tested in the main ingestion walker).
- `AGENTS.md` / `--help` accuracy: … `GITNEXUS_NO_GITIGNORE=1 Skip .gitignore parsing (still reads .gitnexusignore)`, implying universal honouring. This PR makes that contract true for the contract-extractor surface.
- `createIgnoreFilter` is called once per `glob` invocation, identical cost to the previous hardcoded array (no per-file overhead).
Closes #1185.