feat(analyze): add .gitnexusignore support with --no-user-ignore override #203

Closed
L1nusB wants to merge 1 commit into abhigyanpatwari:main from L1nusB:ignore_analyze
L1nusB commented Mar 6, 2026

Contributor

Summary

This PR adds repository-level user ignore support for gitnexus analyze via a new .gitnexusignore file.

By default, when .gitnexusignore exists in the root of the analyzed repository, its patterns are applied during file scanning in addition to GitNexus's built-in hardcoded ignore rules. A new CLI escape hatch, --no-user-ignore, disables this behavior so that only the built-in ignore rules apply.

Why

Many repositories include large non-core areas (for example: data/, generated snapshots, imported datasets, domain artifacts) that are useful for the repo but noisy for code intelligence indexing.

Today, users cannot control this in a repo-specific way without changing GitNexus source itself. This PR makes indexing more practical and intentional for real-world repos by giving users a simple, local ignore file.

What Changed

1. User ignore parsing + matching

Implemented .gitnexusignore support in the ignore service layer.

  • Added USER_IGNORE_FILE = '.gitnexusignore'
  • Added loader and parser:
    • loadUserIgnoreRules(...)
    • parseUserIgnoreRules(...)
  • Added matcher:
    • shouldIgnorePathByUserRules(...)

Supported rule behavior:

  • blank lines and comments (# ...)
  • glob patterns (*, **)
  • directory rules with trailing slash (e.g. data/)
  • anchored root-style patterns with leading /
  • negation rules with ! (later rules override earlier rules)
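The rule behavior listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's actual parser: the names IgnoreRule, globToRegexSource, parseRules, and isIgnored are hypothetical, unanchored patterns are allowed to match at any depth, and a trailing slash is treated the same as a bare name (the match covers the entry and its descendants).

```typescript
// Hypothetical sketch; names and shapes are illustrative, not the PR's code.
interface IgnoreRule {
  negated: boolean; // rule started with "!"
  regex: RegExp; // compiled form of the glob
}

// Translate a glob into regex source: "**" spans directories, "*" stays within one segment.
function globToRegexSource(glob: string): string {
  let out = '';
  for (let i = 0; i < glob.length; i++) {
    const ch = glob.charAt(i);
    if (ch !== '*') {
      out += ch.replace(/[.+?^${}()|[\]\\]/g, '\\$&'); // escape regex metacharacters
    } else if (glob[i + 1] !== '*') {
      out += '[^/]*'; // single star: anything except a path separator
    } else if (glob[i + 2] === '/') {
      i += 2;
      out += '(?:.*/)?'; // "**/": zero or more leading directories
    } else {
      i += 1;
      out += '.*'; // bare "**": anything, including separators
    }
  }
  return out;
}

function parseRules(content: string): IgnoreRule[] {
  const rules: IgnoreRule[] = [];
  for (const rawLine of content.split(/\r?\n/)) {
    let pattern = rawLine.trim();
    if (pattern === '' || pattern.startsWith('#')) continue; // blank line or comment
    const negated = pattern.startsWith('!');
    if (negated) pattern = pattern.slice(1);
    const anchored = pattern.startsWith('/');
    if (anchored) pattern = pattern.slice(1);
    if (pattern.endsWith('/')) pattern = pattern.slice(0, -1);
    if (pattern === '') continue;
    // Anchored rules match from the repo root; others may start at any depth.
    const prefix = anchored ? '^' : '^(?:.*/)?';
    // The optional "/..." suffix makes a match cover descendants as well.
    const source = prefix + globToRegexSource(pattern) + '(?:/.*)?$';
    rules.push({ negated, regex: new RegExp(source) });
  }
  return rules;
}

// Later rules override earlier ones, so the last matching rule decides.
function isIgnored(relPath: string, rules: IgnoreRule[]): boolean {
  let ignored = false;
  for (const rule of rules) {
    if (rule.regex.test(relPath)) ignored = !rule.negated;
  }
  return ignored;
}

const sampleRules = parseRules('# data\n/data/\n\n**/*.json\n!src/config/keep.json');
console.log(isIgnored('data/raw/a.csv', sampleRules)); // true
console.log(isIgnored('src/config/keep.json', sampleRules)); // false
```

Paths are assumed to be repo-relative with forward slashes. A full gitignore engine has more cases (for example, patterns containing a slash are implicitly anchored), which this sketch deliberately leaves out.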

2. Scan pipeline integration

Threaded user-ignore options through the ingestion scan path:

  • walkRepositoryPaths(...) now accepts options and applies user rules during filtering
  • runPipelineFromRepo(...) now accepts pipeline options and forwards them to the walker

This keeps all existing hardcoded ignore behavior intact and adds user ignore as an additive layer.

3. Analyze CLI override flag

Added new analyze option:

  • --no-user-ignore

Behavior:

  • default: .gitnexusignore is used if present
  • with --no-user-ignore: .gitnexusignore is ignored, built-in rules only
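As a rough illustration of the default and the override, the sketch below maps a parsed CLI option onto the walker option. It assumes a Commander-style negatable flag, where --no-user-ignore yields a boolean userIgnore that defaults to true; the interfaces and the toPipelineOptions helper are hypothetical, and only useUserIgnoreFile is a name taken from this PR's tests.

```typescript
// Hypothetical option shapes; not the PR's actual types.
interface AnalyzeCliOptions {
  userIgnore?: boolean; // false only when --no-user-ignore was passed
}

interface PipelineOptions {
  useUserIgnoreFile: boolean;
}

function toPipelineOptions(cli: AnalyzeCliOptions): PipelineOptions {
  // An absent value counts as enabled, so a plain `gitnexus analyze` honors .gitnexusignore.
  return { useUserIgnoreFile: cli.userIgnore !== false };
}

console.log(toPipelineOptions({})); // { useUserIgnoreFile: true }
console.log(toPipelineOptions({ userIgnore: false })); // { useUserIgnoreFile: false }
```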

How It Works

  1. Analyzer starts pipeline.
  2. Filesystem walker loads .gitnexusignore rules (if enabled and file exists).
  3. During scan filtering, each path is excluded if:
    • built-in ignore rules match, OR
    • user ignore rules match.
  4. Remaining files proceed through existing structure/parse/index pipeline unchanged.
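The filtering step in the list above might look roughly like this. It is a sketch only: filterScannedPaths and both predicates are hypothetical stand-ins, and the sample built-in rule (skipping node_modules) is an assumption made for illustration rather than a statement about GitNexus's real built-in list.

```typescript
// Hypothetical sketch; predicate names and sample rules are illustrative only.
type PathPredicate = (relPath: string) => boolean;

function filterScannedPaths(
  paths: string[],
  matchesBuiltIn: PathPredicate,
  matchesUserRules: PathPredicate | null, // null when user rules are disabled or absent
): string[] {
  return paths.filter((p) => {
    // A path is excluded if either layer matches; user rules only add exclusions.
    const excluded = matchesBuiltIn(p) || (matchesUserRules !== null && matchesUserRules(p));
    return !excluded;
  });
}

// Stand-in predicates for the two layers.
const builtIn: PathPredicate = (p) => p.split('/').includes('node_modules');
const user: PathPredicate = (p) => p.startsWith('data/');

const scanned = ['src/index.ts', 'node_modules/x/index.js', 'data/big.csv'];
console.log(filterScannedPaths(scanned, builtIn, user)); // [ 'src/index.ts' ]
console.log(filterScannedPaths(scanned, builtIn, null)); // [ 'src/index.ts', 'data/big.csv' ]
```

Passing null for the user predicate models the --no-user-ignore path, where only the built-in layer applies.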

Examples

Example .gitnexusignore:

# ignore non-code data
/data/

# ignore json files globally
**/*.json

# but keep one config file
!src/config/keep.json

Default run:

gitnexus analyze

Bypass user ignore file:

gitnexus analyze --no-user-ignore

Testing

Added tests for both parsing/matching and end-to-end scan behavior:

  • Unit tests (ignore-service.test.ts)
    • comments/blanks
    • glob patterns
    • negation
    • root-anchored patterns
  • Integration tests (filesystem-walker.test.ts)
    • .gitnexusignore applied by default
    • useUserIgnoreFile: false bypass path

Validation performed:

  • npm run build
  • npm run test
  • npm run test:integration ✅ All test assertions pass; pre-existing, unrelated worker-module warnings/errors in the integration environment are unchanged by this PR.

Backward Compatibility

  • No breaking change.
  • Existing built-in ignore behavior is preserved.
  • New behavior is opt-in by file presence and opt-out via flag.

Notes

This implementation intentionally focuses on practical glob-based repo-level control for indexing scope and avoids introducing a heavier full gitignore engine. It is designed to solve the high-value use case (excluding non-core directories/files from indexing) with minimal complexity.

vercel Bot commented Mar 6, 2026

@L1nusB is attempting to deploy a commit to the NexusCore Team on Vercel.

A member of the Team first needs to authorize it.

xkonjin left a comment

Quick review pass:

  • Main risk area here is data integrity, transaction boundaries, and backward-compatible persistence.
  • Good to see test coverage move with the code; I’d still make sure it exercises the unhappy path around data integrity, transaction boundaries, and backward-compatible persistence rather than only the happy path.
  • Before merge, I’d smoke-test the behavior touched by README.md, README.md, analyze.ts (+6 more) with malformed input / retry / rollback cases, since that’s where this class of change usually breaks.

reversTeam left a comment

Clean, well-implemented feature. The .gitnexusignore support follows familiar .gitignore semantics, which is great for user ergonomics.

Strengths:

  • Solid parsing logic: supports comments, blank lines, directory patterns (trailing /), negation (!), root-anchored patterns (leading /), and glob patterns. The parseUserIgnoreRule function is well-thought-out.
  • Correct .gitignore-like last-match-wins semantics in shouldIgnorePathByUserRules.
  • The --no-user-ignore CLI flag integrates cleanly via Commander's built-in --no- prefix convention.
  • Good error handling in loadUserIgnoreRules — ENOENT/ENOTDIR are gracefully handled while unexpected errors are re-thrown.
  • .gitnexusignore itself is added to the IGNORED_FILES set so it won't be indexed.
  • Comprehensive test coverage: unit tests for parsing, negation, anchoring, bare patterns, and load error handling. Integration tests for filesystem-walker with both enabled and disabled ignore rules.
  • Documentation updated in both README files with clear examples.

Minor note:

  • The --skip-embeddings flag was replaced by --embeddings (opt-in instead of opt-out) in the README. This looks like a separate behavioral change bundled with this PR — if it was intentional, it might warrant a note in the PR description. Not a blocker.

LGTM!

feat(analyze): support .gitnexusignore with optional bypass

Apply repo-level user ignore patterns during indexing by default.

Add --no-user-ignore to keep built-in ignore behavior only.

Cover parsing and walker behavior with unit/integration tests.

Include analyzer-generated context files per workspace request.

docs(readme): explain index scope control with .gitnexusignore

Add a dedicated Index Scope Control section in both READMEs.

Document --no-user-ignore behavior and add practical examples.

Keep agent guidance minimal via ai-context template and regenerated files.

chore(pr): revert auto-generated and non-feature docs churn

Reset generated AGENTS/CLAUDE/.claude artifacts and package-lock noise to base state.

Keep PR focused on .gitnexusignore feature code, tests, and README docs only.

fix(ignore): cover directory matches and surface fs errors

Expand user ignore rule parsing so bare and trailing-slash directory patterns match both the directory and descendants.

Only treat missing .gitnexusignore as optional; rethrow other read errors to avoid silently hiding IO/permission failures.

Add unit tests for bare-pattern directory behavior and non-ENOENT error propagation.
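The error-handling policy described in this commit could be sketched as below. This is an assumption-laden illustration, not the actual loadUserIgnoreRules: the helper readUserIgnoreFile is hypothetical, it is synchronous for brevity, and it treats both ENOENT and ENOTDIR as "file missing", as the review comment describes.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

const USER_IGNORE_FILE = '.gitnexusignore';

// Hypothetical loader: returns the file content, or null when the file is absent.
function readUserIgnoreFile(repoRoot: string): string | null {
  try {
    return readFileSync(join(repoRoot, USER_IGNORE_FILE), 'utf8');
  } catch (err) {
    const code = (err as { code?: string }).code;
    // A missing control file is optional, so it is not an error.
    if (code === 'ENOENT' || code === 'ENOTDIR') return null;
    // Anything else (permissions, IO failures) is surfaced rather than hidden.
    throw err;
  }
}

// Usage against a throwaway directory.
const repo = mkdtempSync(join(tmpdir(), 'gitnexus-sketch-'));
console.log(readUserIgnoreFile(repo)); // null, since no file exists yet
writeFileSync(join(repo, USER_IGNORE_FILE), '/data/\n');
console.log(JSON.stringify(readUserIgnoreFile(repo))); // "/data/\n"
```

Rethrowing unexpected errors means a permission problem fails the run visibly instead of silently indexing files the user meant to exclude.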

fix(ignore): treat .gitnexusignore as metadata file

Add USER_IGNORE_FILE to IGNORED_FILES so the control file is explicitly excluded as repository metadata.

Keep behavior robust even if glob dotfile settings change by relying on ignore-service defaults.

Extend unit coverage to assert .gitnexusignore is ignored by exact-name rules.

L1nusB commented Mar 11, 2026

Contributor Author

Resolve merge conflicts

@github-actions

Contributor

CI Report

All checks passed

Pipeline Status

| Stage | Status | Details |
| --- | --- | --- |
| ✅ Typecheck | success | tsc --noEmit |
| ✅ Unit Tests | success | 3 platforms |
| ✅ Integration | success | 3 OS x 4 groups = 12 jobs |

Test Results

1152 passed · 316 suites · 1152 total · ⏱️ 18s · 📊 926 unit + 226 integration

Code Coverage

Combined (Unit + Integration)

| Metric | Coverage | Covered | Threshold | Status |
| --- | --- | --- | --- | --- |
| Statements | 40.86% | 2552/6245 | 26% | 🟢 ████████░░░░░░░░░░░░ |
| Branches | 34.24% | 1442/4211 | 23% | 🟢 ██████░░░░░░░░░░░░░░ |
| Functions | 43.67% | 297/680 | 28% | 🟢 ████████░░░░░░░░░░░░ |
| Lines | 42.03% | 2391/5688 | 27% | 🟢 ████████░░░░░░░░░░░░ |

Coverage breakdown by test suite

Unit Tests

| Metric | Coverage | Covered | Threshold | Status |
| --- | --- | --- | --- | --- |
| Statements | 29.36% | 1834/6245 | 26% | 🟢 █████░░░░░░░░░░░░░░░ |
| Branches | 26.26% | 1106/4211 | 23% | 🟢 █████░░░░░░░░░░░░░░░ |
| Functions | 31.61% | 215/680 | 28% | 🟢 ██████░░░░░░░░░░░░░░ |
| Lines | 30.32% | 1725/5688 | 27% | 🟢 ██████░░░░░░░░░░░░░░ |

Integration Tests

| Metric | Coverage | Covered | Threshold | Status |
| --- | --- | --- | --- | --- |
| Statements | 21.79% | 1361/6245 | 26% | 🔴 ████░░░░░░░░░░░░░░░░ |
| Branches | 16.97% | 715/4211 | 23% | 🔴 ███░░░░░░░░░░░░░░░░░ |
| Functions | 23.67% | 161/680 | 28% | 🔴 ████░░░░░░░░░░░░░░░░ |
| Lines | 22.45% | 1277/5688 | 27% | 🔴 ████░░░░░░░░░░░░░░░░ |

Coverage thresholds are auto-ratcheted — they only go up

Vitest thresholds.autoUpdate bumps the floor whenever local coverage exceeds it.
CI enforces the current thresholds; developers commit the ratcheted values.
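
For reference, a Vitest configuration using this ratchet might look like the fragment below. It is a sketch that assumes the floor values shown in the Threshold column above; the project's real config file is not part of this page and may differ.

```typescript
// Hypothetical vitest.config.ts fragment; values copied from the Threshold column above.
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    coverage: {
      thresholds: {
        statements: 26,
        branches: 23,
        functions: 28,
        lines: 27,
        // Rewrites the numbers above in this file when measured coverage exceeds them.
        autoUpdate: true,
      },
    },
  },
});
```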



L1nusB requested a review from reversTeam March 12, 2026 19:43
@jecanore

Contributor

Hi @L1nusB! I opened PR #301 which addresses .gitnexusignore alongside the unsupported language crash. I went with the ignore npm package (.gitignore-spec compliant) rather than a custom parser to keep the surface area small, but your test coverage was a great reference. Would love your review if you have time!

4 participants