style: improve parse-worker JSDoc for isLanguageAvailable#301
Conversation
|
@jecanore is attempting to deploy a commit to the NexusCore Team on Vercel. A member of the Team first needs to authorize it. |
|
Thank you for your contribution! This is an impressive work! |
CI Report✅ All checks passed Pipeline Status
Test Results
✅ All 2068 tests passed 17 test(s) skipped — expand for detailsIntegration:
Code CoverageCombined (Unit + Integration)
Coverage breakdown by test suiteUnit Tests
Integration Tests
📋 View full run · Generated by CI |
|
Thank you for your contribution! |
|
@claude try to find gaps and also take a careful look at the description and the problem space. Was this tackled properly as well as covered with properly? |
|
Claude finished @magyargergo's task in 3m 14s —— View job PR Review:
|
| What's Tested | What's Missing |
|---|---|
shouldIgnorePath with .gitnexusignore patterns ✓ |
Wiki generator shouldIgnorePath with user patterns |
isLanguageAvailable for installed/fabricated langs ✓ |
Worker's local isLanguageAvailable function (only integration coverage) |
| Cache/reset behavior ✓ | Multi-repo singleton behavior (different repoPath on second call) |
No .gitnexusignore file ✓ |
console.warn output for skip logging in sequential path |
The negation test (!Package.swift) correctly validates the ignore package behavior. The "no file" test correctly validates that default ignore rules still apply. These are good.
Summary
The PR correctly and completely addresses the unsupported language crash in both parsing paths. The .gitnexusignore feature works correctly within the analysis pipeline. The main gaps are: (1) wiki generation doesn't respect .gitnexusignore (functional bug), (2) the singleton can silently ignore patterns for a second repo in the same process (latent bug), and (3) the skip-logging feature doesn't work on the primary worker path (documentation mismatch + usability gap).
… wiki init, worker skip logging - Wiki generator now calls loadUserIgnore() before shouldIgnorePath() (Fix 1) - Ignore service caches by repoPath instead of boolean, so switching repos reloads .gitnexusignore patterns correctly (Fix 2) - Worker parse results include skippedLanguages; parsing-processor merges and logs the summary -- matching the sequential path behavior (Fix 3) - Replace continue with if-guard so unavailable regular files don't skip the TSX block for the same language (Fix 4) - Add multi-repo cache reload test
|
Addressed all review gaps identified by @magyargergo's Claude bot review in commit
Tests: Added multi-repo cache reload test. All 1361 unit tests pass. |
|
@claude try to identify missing gaps and use ultra-thinking to fit this into the architecture that will keep it futures proof |
|
Claude finished @magyargergo's task in 4m 29s —— View job Claude finished @magyargergo's task — Deep Architectural Review:
|
| Location | Signature |
|---|---|
parser-loader.ts:41 |
isLanguageAvailable(language): boolean |
parse-worker.ts:179 |
isLanguageAvailable(language, filePath): boolean |
The worker's filePath parameter exists to distinguish .tsx from .ts (which use different grammars under the same SupportedLanguages.TypeScript key). The main thread doesn't need this because loadLanguage handles TSX inline.
This divergence is intentional right now, but the comment at parse-worker.ts:177 only says "duplicated because workers can't import from the main thread" — it doesn't explain why the signatures differ. The next contributor who adds a language with file-variant splits (e.g., a future .jsx vs .js grammar split) will likely update only one copy. A concrete safeguard: add a type-level satisfies assertion or a brief comment explaining the extra parameter to prevent silent divergence.
🔴 Pre-existing but now architecturally dangerous: Kotlin hard-imports invalidate the guard pattern
parser-loader.ts:11 and parse-worker.ts:12:
import Kotlin from 'tree-sitter-kotlin'; // hard static importtree-sitter-kotlin is in optionalDependencies. If it's not installed, the module fails to load entirely — the worker crashes before a single isLanguageAvailable call can execute. This PR correctly adds isLanguageAvailable guards, but for Kotlin those guards are unreachable because the import barrier comes first.
Swift is correctly guarded:
let Swift: any = null;
try { Swift = _require('tree-sitter-swift'); } catch {}Kotlin should follow the same pattern. This is pre-existing and not introduced by this PR, but the PR's stated goal is "complete unsupported language handling" — and Kotlin on an environment without the optional package is currently handled worse than Swift. A user without tree-sitter-kotlin installed hits a crash at import time, not a graceful skip.
Future-proofing: Structural recommendations
On ignore-service.ts — the "call-before-use" coupling is fragile at scale. Currently every consumer must remember to call loadUserIgnore(repoPath) before using shouldIgnorePath. fullGeneration does. incrementalUpdate now doesn't (gap above). The wiki's init flow does. Any new entry point (a future gitnexus lint or gitnexus check command) will silently bypass user patterns until someone notices.
A more future-proof pattern is lazy initialization inside shouldIgnorePath itself — accept repoPath as an optional parameter and auto-load if not yet loaded for that path. This eliminates the call-before-use requirement entirely:
// More robust API:
export const shouldIgnorePath = (filePath: string, repoPath?: string): boolean => {
if (repoPath) loadUserIgnore(repoPath);
// ... rest of function
};Or alternatively, a stronger version: throw (in dev/test) if shouldIgnorePath is called without a prior loadUserIgnore. Silent no-ops are harder to debug than loud failures.
On language extensibility: The languageMap pattern in two places (main thread + worker) means adding a new language requires exactly 2 coordinated edits — and for optional languages, exactly 4 coordinated changes (map entry + conditional spread + worker map + worker conditional spread). This is manageable now but error-prone at scale. A registry pattern in supported-languages.ts where each language declares its package name and whether it's optional would give a single source of truth.
…iscovery Add support for excluding files from indexing based on .gitignore and .gitnexusignore patterns. Previously, GitNexus used only a hardcoded ignore list, causing significant index pollution in repositories with git-ignored directories containing code (e.g., Docker-mounted volumes). Changes: - Add `ignore` package for gitignore-spec pattern matching - Add `loadIgnoreRules()` to parse .gitignore + .gitnexusignore - Add `createIgnoreFilter()` returning glob-compatible IgnoreLike object - Integrate filter into glob's `ignore` option for directory-level pruning - Remove post-glob `.filter()` call (now handled during traversal) The hardcoded DEFAULT_IGNORE_LIST remains as fallback for non-git repos. Closes abhigyanpatwari#228 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…port, wiki ignore, dead code
- Move loadUserIgnore() from fullGeneration() to run() so incrementalUpdate()
also loads .gitnexusignore patterns before calling shouldIgnorePath()
- Convert Kotlin from hard static import to try/catch dynamic require (matches
Swift pattern) in both parser-loader.ts and parse-worker.ts — prevents worker
crash when tree-sitter-kotlin is not installed
- Remove dead `if (result.skippedLanguages)` guard in parsing-processor.ts —
field is always {} (truthy), guard never filtered anything
- Improve isLanguageAvailable comment in parse-worker.ts explaining why the
worker version takes an extra filePath parameter vs main-thread version
|
Addressed the second review round in
All unit tests pass (47/49 — 2 pre-existing failures unrelated to this PR). |
- Distinguish ENOENT vs EACCES in loadIgnoreRules (warn on permission errors) - Add GITNEXUS_NO_GITIGNORE env var to bypass .gitignore parsing - Fix bare-name pattern matching in childrenIgnored (check both with/without trailing slash) - Rename isIgnoredDirectory to isHardcodedIgnoredDirectory for clarity - Add clarifying comments for design decisions (D2 negation, D3 dot:false redundancy) - Add tests for bare-name patterns, file-glob patterns, EACCES handling, env var Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- G1: Document GITNEXUS_NO_GITIGNORE in `analyze --help` and log when active - G2: Add comment clarifying path-scurry POSIX normalization contract - G3: Add IgnoreOptions interface — env var now falls back, callers can pass `noGitignore` explicitly for testability and future CLI flag - G4: Add integration test verifying walkRepositoryPaths respects the env var Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Would you mind rebasing this on top of #231? I think the two PRs are complementary, but #231 feels like the better base for the ignore-handling side, while this PR adds useful resilience for unsupported languages. Rebasing on top of that would help us combine the strengths of both and avoid having two overlapping |
|
@magyargergo Makes sense — just replied on #231 with a detailed breakdown of where each PR is stronger and a consolidation plan. Short version: I'll rebase #301 on top of #231 once it merges, dropping the overlapping
No file overlap with #231 after the rebase. I'll update this PR as soon as #231 lands. |
…rammars Port unsupported language resilience from PR abhigyanpatwari#301 by @jecanore. - Make Kotlin import optional (like Swift) in parser-loader and parse-worker - Add worker-local isLanguageAvailable() with filePath param for tsx distinction - Track and log skipped files per language in both sequential and worker paths - Add skippedLanguages to ParseWorkerResult for worker→main aggregation - Add isLanguageAvailable unit tests Refs: abhigyanpatwari#301, abhigyanpatwari#155, abhigyanpatwari#228 Co-Authored-By: jecanore <juan@housingbase.io> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a fixture repo (test/fixtures/ignore-and-skip-repo/) with .gitignore, .gitnexusignore, TypeScript source files, and a Swift file to exercise all three features end-to-end: - File discovery: verifies .gitignore excludes data/ and *.log, .gitnexusignore excludes vendor/, source files are discovered - Parsing: verifies TypeScript files produce Function nodes and DEFINES relationships, Swift files are skipped gracefully when grammar is unavailable Add the test to the standalone group in ci-integration.yml and coverage job. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Hi @jecanore! 👋 @magyargergo asked us to consolidate our PR #231 ( We intentionally kept our ignore-service implementation since it covers Wanted to check: are you okay with this consolidation? If you'd prefer a different attribution approach or have any concerns, happy to adjust. Your work is credited in the commit and referenced in the PR comment. |
Clarify that the filePath parameter handles .tsx distinction at the worker level, while the main-thread version does this inside loadLanguage() instead. Rebased on top of abhigyanpatwari#231 — all ignore and language resilience changes from abhigyanpatwari#301 were already ported by ivkond in 505fc9b. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cbf52c7 to
b6b5648
Compare
|
Rebased on top of #231 per @magyargergo's consolidation request. All language resilience work from this PR was already ported into #231 by @ivkond in This PR can be closed if the maintainers prefer — the substantive work now lives entirely in #231. |
…iscovery (#231) * feat(ingestion): respect .gitignore and .gitnexusignore during file discovery Add support for excluding files from indexing based on .gitignore and .gitnexusignore patterns. Previously, GitNexus used only a hardcoded ignore list, causing significant index pollution in repositories with git-ignored directories containing code (e.g., Docker-mounted volumes). Changes: - Add `ignore` package for gitignore-spec pattern matching - Add `loadIgnoreRules()` to parse .gitignore + .gitnexusignore - Add `createIgnoreFilter()` returning glob-compatible IgnoreLike object - Integrate filter into glob's `ignore` option for directory-level pruning - Remove post-glob `.filter()` call (now handled during traversal) The hardcoded DEFAULT_IGNORE_LIST remains as fallback for non-git repos. Closes #228 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(ingestion): address review feedback on ignore filtering - Distinguish ENOENT vs EACCES in loadIgnoreRules (warn on permission errors) - Add GITNEXUS_NO_GITIGNORE env var to bypass .gitignore parsing - Fix bare-name pattern matching in childrenIgnored (check both with/without trailing slash) - Rename isIgnoredDirectory to isHardcodedIgnoredDirectory for clarity - Add clarifying comments for design decisions (D2 negation, D3 dot:false redundancy) - Add tests for bare-name patterns, file-glob patterns, EACCES handling, env var Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(ingestion): address second round of review feedback - G1: Document GITNEXUS_NO_GITIGNORE in `analyze --help` and log when active - G2: Add comment clarifying path-scurry POSIX normalization contract - G3: Add IgnoreOptions interface — env var now falls back, callers can pass `noGitignore` explicitly for testability and future CLI flag - G4: Add integration test verifying walkRepositoryPaths respects the env var Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat(ingestion): gracefully skip files with unavailable tree-sitter grammars Port unsupported language resilience from PR #301 by @jecanore. - Make Kotlin import optional (like Swift) in parser-loader and parse-worker - Add worker-local isLanguageAvailable() with filePath param for tsx distinction - Track and log skipped files per language in both sequential and worker paths - Add skippedLanguages to ParseWorkerResult for worker→main aggregation - Add isLanguageAvailable unit tests Refs: #301, #155, #228 Co-Authored-By: jecanore <juan@housingbase.io> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test(e2e): add ignore + language-skip end-to-end test with fixture repo Add a fixture repo (test/fixtures/ignore-and-skip-repo/) with .gitignore, .gitnexusignore, TypeScript source files, and a Swift file to exercise all three features end-to-end: - File discovery: verifies .gitignore excludes data/ and *.log, .gitnexusignore excludes vendor/, source files are discovered - Parsing: verifies TypeScript files produce Function nodes and DEFINES relationships, Swift files are skipped gracefully when grammar is unavailable Add the test to the standalone group in ci-integration.yml and coverage job. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(ci): move ignore-and-skip-e2e test to e2e group per review feedback Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(test): use temp directory instead of fixture for e2e ignore test The fixture's .gitignore prevented data/seed.json and debug.log from being committed — these files would be missing after checkout in CI. Switch to creating the entire test structure in a temp directory via beforeAll (matching filesystem-walker.test.ts pattern). This ensures all files exist regardless of git ignore rules. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(test): correct graph API usage in e2e ignore test Use graph.nodes property getter instead of graph.getNodes(), and check Function node filePath instead of non-existent File nodes (File nodes are created by processStructure, not processParsing). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * ci: add workflows permission to ci-integration.yml Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * ci: change workflows permission to write per review Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * ci: move workflows permission from ci-integration.yml to ci.yml caller Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(ci): fix Claude workflows for fork PRs, remove misplaced workflows perm Three issues prevented Claude from running on fork PRs: 1. claude-code-review.yml lacked workflows:write — push failed when fork PRs modify .github/workflows/ files 2. claude.yml had no fork PR support — checked out main and couldn't fetch the fork's branch from origin 3. Cleanup step unconditionally deleted branches even when push failed, breaking the concurrent claude.yml workflow Also removes workflows:write from ci.yml's integration job — CI tests don't need that permission. The permission belongs on the claude workflows that push fork branches. Changes: - Add workflows:write to both claude workflow permissions blocks - Add fork PR detection + branch push/cleanup to claude.yml - Add step id to push-fork; cleanup only runs if push succeeded - Pass branch names via env vars to prevent shell injection (security) - Add concurrency groups to prevent race conditions between workflows - Remove misplaced workflows:write from ci.yml integration job * fix(ci): use GitHub API for fork branch refs instead of git push GITHUB_TOKEN cannot have 'workflows' permission — it's only valid for PATs and GitHub Apps. This means git push fails whenever a fork PR modifies .github/workflows/ files. Replace git push with the GitHub REST API (POST/PATCH /git/refs) to create temporary branch refs. The API creates a pointer to the already-existing PR head commit without triggering the workflow file push protection. Similarly, cleanup uses DELETE /git/refs instead of git push --delete. Also removes the invalid 'workflows: write' from permissions blocks. --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: jecanore <juan@housingbase.io> Co-authored-by: Gergo Magyar <gergomagyar@icloud.com>
|
#231 has been merged and as dicussed we are closing this PR. |
|
Thank you for your contribution! |
…iscovery (abhigyanpatwari#231) * feat(ingestion): respect .gitignore and .gitnexusignore during file discovery Add support for excluding files from indexing based on .gitignore and .gitnexusignore patterns. Previously, GitNexus used only a hardcoded ignore list, causing significant index pollution in repositories with git-ignored directories containing code (e.g., Docker-mounted volumes). Changes: - Add `ignore` package for gitignore-spec pattern matching - Add `loadIgnoreRules()` to parse .gitignore + .gitnexusignore - Add `createIgnoreFilter()` returning glob-compatible IgnoreLike object - Integrate filter into glob's `ignore` option for directory-level pruning - Remove post-glob `.filter()` call (now handled during traversal) The hardcoded DEFAULT_IGNORE_LIST remains as fallback for non-git repos. Closes abhigyanpatwari#228 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(ingestion): address review feedback on ignore filtering - Distinguish ENOENT vs EACCES in loadIgnoreRules (warn on permission errors) - Add GITNEXUS_NO_GITIGNORE env var to bypass .gitignore parsing - Fix bare-name pattern matching in childrenIgnored (check both with/without trailing slash) - Rename isIgnoredDirectory to isHardcodedIgnoredDirectory for clarity - Add clarifying comments for design decisions (D2 negation, D3 dot:false redundancy) - Add tests for bare-name patterns, file-glob patterns, EACCES handling, env var Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(ingestion): address second round of review feedback - G1: Document GITNEXUS_NO_GITIGNORE in `analyze --help` and log when active - G2: Add comment clarifying path-scurry POSIX normalization contract - G3: Add IgnoreOptions interface — env var now falls back, callers can pass `noGitignore` explicitly for testability and future CLI flag - G4: Add integration test verifying walkRepositoryPaths respects the env var Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat(ingestion): gracefully skip files with unavailable tree-sitter grammars Port unsupported language resilience from PR abhigyanpatwari#301 by @jecanore. - Make Kotlin import optional (like Swift) in parser-loader and parse-worker - Add worker-local isLanguageAvailable() with filePath param for tsx distinction - Track and log skipped files per language in both sequential and worker paths - Add skippedLanguages to ParseWorkerResult for worker→main aggregation - Add isLanguageAvailable unit tests Refs: abhigyanpatwari#301, abhigyanpatwari#155, abhigyanpatwari#228 Co-Authored-By: jecanore <juan@housingbase.io> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test(e2e): add ignore + language-skip end-to-end test with fixture repo Add a fixture repo (test/fixtures/ignore-and-skip-repo/) with .gitignore, .gitnexusignore, TypeScript source files, and a Swift file to exercise all three features end-to-end: - File discovery: verifies .gitignore excludes data/ and *.log, .gitnexusignore excludes vendor/, source files are discovered - Parsing: verifies TypeScript files produce Function nodes and DEFINES relationships, Swift files are skipped gracefully when grammar is unavailable Add the test to the standalone group in ci-integration.yml and coverage job. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(ci): move ignore-and-skip-e2e test to e2e group per review feedback Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(test): use temp directory instead of fixture for e2e ignore test The fixture's .gitignore prevented data/seed.json and debug.log from being committed — these files would be missing after checkout in CI. Switch to creating the entire test structure in a temp directory via beforeAll (matching filesystem-walker.test.ts pattern). This ensures all files exist regardless of git ignore rules. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(test): correct graph API usage in e2e ignore test Use graph.nodes property getter instead of graph.getNodes(), and check Function node filePath instead of non-existent File nodes (File nodes are created by processStructure, not processParsing). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * ci: add workflows permission to ci-integration.yml Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * ci: change workflows permission to write per review Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * ci: move workflows permission from ci-integration.yml to ci.yml caller Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(ci): fix Claude workflows for fork PRs, remove misplaced workflows perm Three issues prevented Claude from running on fork PRs: 1. claude-code-review.yml lacked workflows:write — push failed when fork PRs modify .github/workflows/ files 2. claude.yml had no fork PR support — checked out main and couldn't fetch the fork's branch from origin 3. Cleanup step unconditionally deleted branches even when push failed, breaking the concurrent claude.yml workflow Also removes workflows:write from ci.yml's integration job — CI tests don't need that permission. The permission belongs on the claude workflows that push fork branches. Changes: - Add workflows:write to both claude workflow permissions blocks - Add fork PR detection + branch push/cleanup to claude.yml - Add step id to push-fork; cleanup only runs if push succeeded - Pass branch names via env vars to prevent shell injection (security) - Add concurrency groups to prevent race conditions between workflows - Remove misplaced workflows:write from ci.yml integration job * fix(ci): use GitHub API for fork branch refs instead of git push GITHUB_TOKEN cannot have 'workflows' permission — it's only valid for PATs and GitHub Apps. This means git push fails whenever a fork PR modifies .github/workflows/ files. Replace git push with the GitHub REST API (POST/PATCH /git/refs) to create temporary branch refs. The API creates a pointer to the already-existing PR head commit without triggering the workflow file push protection. Similarly, cleanup uses DELETE /git/refs instead of git push --delete. Also removes the invalid 'workflows: write' from permissions blocks. --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: jecanore <juan@housingbase.io> Co-authored-by: Gergo Magyar <gergomagyar@icloud.com>
Summary
Minor JSDoc improvement for the
isLanguageAvailableworker function — clarifies that thefilePathparameter handles.tsxdistinction at the worker level, while the main-thread version does this insideloadLanguage()instead.Context
This PR was originally
fix: add .gitnexusignore support and complete unsupported language handling. After @magyargergo requested consolidation with #231, all substantive changes (ignore infrastructure + language resilience) were ported into #231 by @ivkond in 505fc9b.This branch has been rebased on top of #231 HEAD (fdbeb0b) and carries only the remaining JSDoc tweak.
Changes
parse-worker.tsisLanguageAvailable— added backticks, clarified main-thread vs worker distinctionTest plan
vitest run test/unit/language-skip.test.ts— 5 tests passtsc --noEmit— clean