refactor: split global BUILT_IN_NAMES into per-language provider fields #523
…N_NAMES Add builtInNames field to LanguageProviderConfig. Rewrite noise-filter.ts to accept a LanguageProvider and check provider.builtInNames instead of a global Set. Update all 3 call sites to pass their existing provider. Built-in entries will be added per-language in subsequent commits.
…ests - Update ingestion-utils.test.ts to pass provider to isBuiltInOrNoise - Add noise-filter.test.ts with 15 cross-language isolation tests - Fix Java heritage test: serialize() is now correctly unfiltered for Java (was false-positive noise from global PHP serialize entry)
CI Report: ✅ All checks passed
Test Results: ✅ All 4345 tests passed, 1 test skipped
|
@claude Can you check whether the built-ins are correct for each language? Find gaps and/or anything that doesn't belong in a language's built-ins.
|
Built-in Names Audit
Findings
| Language | Framework entries mixed in | Overly generic names | Notable gaps |
|---|---|---|---|
| JS/TS | React hooks + APIs | log, error, map, filter, etc. | Web Crypto, modern globals |
| Python | None | append, update | map, filter, any, all, open, input + ~30 others |
| Kotlin | kotlinx.coroutines, Flow | emit, collect, catch | — |
| C/C++ | Linux kernel macros/functions | get, put | C++ stdlib, C++ specific |
| C# | None | Add, Remove, Count, Run | Throw is not a method |
| PHP | dd, dump (Laravel/Symfony) | — | — |
| Swift | UIKit, Combine | async, sync, post, prepare | Server-side Swift has zero coverage |
| Rust | None | read, write | — |
| Ruby | None | — | map, reduce, find aliases |
| Dart | Nearly all Flutter | push, pop, listen | Core Dart built-ins almost absent |
|
Great audit. The per-language architecture is working as intended: this PR delivers the mechanism (provider-aware filtering, cross-language isolation, compile-time exhaustiveness). The audit now reveals that the content of each list needs work too. Three categories of issues surfaced:

1. Framework entries mixed with language built-ins: React hooks in JS/TS, kotlinx.coroutines/Flow in Kotlin, UIKit/Combine in Swift, Laravel/Symfony
2. Overly generic names
3. Missing core built-ins: Python is missing ~30 stdlib functions

@magyargergo, would you be OK with merging this PR as the architectural foundation and opening a follow-up for the content audit? The mechanism change (per-language isolation, provider field, call site updates) is clean and tested. The list curation is a separate concern that benefits from per-language domain expertise and can be iterated on incrementally.
|
I want to remove the file I pointed out in an in-line comment. |
Per review feedback: delete noise-filter.ts entirely and move the check into LanguageProvider as isBuiltInName(name) method, generated by defineLanguage() from the builtInNames set. Call sites now use provider.isBuiltInName(calledName) directly.
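A minimal sketch of how `defineLanguage()` might derive `isBuiltInName(name)` from the `builtInNames` set, for readers following the review thread. Only `defineLanguage`, `LanguageProviderConfig`, `LanguageProvider`, `builtInNames`, and `isBuiltInName` are names taken from this PR; the `id` field, the `EMPTY_NAMES` constant, and the sample entries are assumptions made for illustration and may not match the real code.

```typescript
// Hypothetical, simplified shapes. The real config has more fields
// (exportChecker, typeConfig, importResolver, ...), omitted here.
interface LanguageProviderConfig {
  id: string; // assumed identifier field, for illustration only
  builtInNames?: ReadonlySet<string>;
}

interface LanguageProvider extends LanguageProviderConfig {
  isBuiltInName(name: string): boolean;
}

// Shared empty set so languages that omit builtInNames filter nothing.
const EMPTY_NAMES: ReadonlySet<string> = new Set<string>();

function defineLanguage(config: LanguageProviderConfig): LanguageProvider {
  const names = config.builtInNames ?? EMPTY_NAMES;
  return {
    ...config,
    // The check is generated once per provider from its own set.
    isBuiltInName: (name: string): boolean => names.has(name),
  };
}

// Illustrative entries only, not the actual lists from the PR.
const phpProvider = defineLanguage({
  id: 'php',
  builtInNames: new Set(['serialize', 'unserialize']),
});
const javaProvider = defineLanguage({ id: 'java' }); // omits the field

console.log(phpProvider.isBuiltInName('serialize')); // true
console.log(javaProvider.isBuiltInName('serialize')); // false
```

With this shape a call site needs only its existing provider, as in `provider.isBuiltInName(calledName)`, and no separate filter module.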
|
Done — addressed both inline review comments:
|
* main: (114 commits)
  feat(csharp): C# MethodExtractor config (abhigyanpatwari#582)
  docs: add gitnexus-shared build step before gitnexus-web (abhigyanpatwari#585)
  chore: add enterprise offering section to README, ignore local_docs/ (abhigyanpatwari#579)
  fix(eval): exclude litellm 1.82.7 and 1.82.8 due to compatibility issues (abhigyanpatwari#580)
  feat(java,kotlin): MethodExtractor abstraction with per-language configs (abhigyanpatwari#576)
  feat: added skip-agents-md cli flag (abhigyanpatwari#517)
  feat(wiki): Azure OpenAI support for wiki command (abhigyanpatwari#562)
  refactor: reduce explicit any types (abhigyanpatwari#566)
  feat(java): method references, worker overload disambiguation, interface dispatch (abhigyanpatwari#540)
  feat: configure eslint with unused import removal (abhigyanpatwari#564)
  feat: configure prettier with pre-commit hook (abhigyanpatwari#563)
  feat: unify web and cli ingestion pipeline (abhigyanpatwari#536)
  fix/opencode mcp gitnexus timeout (abhigyanpatwari#363)
  chore: bump version to 1.4.10, update CHANGELOG
  fix: resolve tree-sitter peer dependency conflicts (abhigyanpatwari#538)
  chore: bump version to 1.4.9, add CHANGELOG.md
  refactor: Phase 8 & 9 — Field Types and Return-Type Binding (abhigyanpatwari#494)
  feat: add COBOL language support with regex extraction pipeline (abhigyanpatwari#498)
  fix: close remaining Dart language support gaps (abhigyanpatwari#524)
  refactor: split global BUILT_IN_NAMES into per-language provider fields (abhigyanpatwari#523)
  ...

# Conflicts:
#	gitnexus/src/core/wiki/llm-client.ts
…ds (abhigyanpatwari#523)

* refactor: make isBuiltInOrNoise provider-aware, remove global BUILT_IN_NAMES

  Add builtInNames field to LanguageProviderConfig. Rewrite noise-filter.ts to accept a LanguageProvider and check provider.builtInNames instead of a global Set. Update all 3 call sites to pass their existing provider. Built-in entries will be added per-language in subsequent commits.

* refactor(js/ts): add per-language builtInNames to JS/TS providers
* refactor(python): add per-language builtInNames
* refactor(kotlin): add per-language builtInNames
* refactor(c/cpp): add per-language builtInNames
* refactor(csharp): add per-language builtInNames
* refactor(php): add per-language builtInNames
* refactor(swift): add per-language builtInNames
* refactor(rust): add per-language builtInNames
* refactor(ruby): add per-language builtInNames
* refactor(dart): add per-language builtInNames
* test: update noise-filter tests for per-language API, add isolation tests

  - Update ingestion-utils.test.ts to pass provider to isBuiltInOrNoise
  - Add noise-filter.test.ts with 15 cross-language isolation tests
  - Fix Java heritage test: serialize() is now correctly unfiltered for Java (was false-positive noise from global PHP serialize entry)

* refactor: remove noise-filter.ts, add provider.isBuiltInName() method

  Per review feedback: delete noise-filter.ts entirely and move the check into LanguageProvider as isBuiltInName(name) method, generated by defineLanguage() from the builtInNames set. Call sites now use provider.isBuiltInName(calledName) directly.
Closes #522
Summary

- Add `builtInNames?: ReadonlySet<string>` field to `LanguageProviderConfig`
- Split the global `BUILT_IN_NAMES` set into per-language provider definitions (`languages/*.ts`)
- Update `isBuiltInOrNoise(name, provider)` to check `provider.builtInNames` instead of a global set
- Update all 3 call sites (`parse-worker.ts`, `call-processor.ts`, `type-env.ts`) to pass their existing `provider`

Each language defines its own noise entries inline, following the same pattern as `exportChecker`, `typeConfig`, and `importResolver`. Languages without built-in noise (Java, Go) simply omit the field.

Cross-language pollution fixed: `serialize` was previously filtered globally (PHP section) and suppressed a legitimate `user.serialize()` call in Java. The Java heritage test now correctly expects 3 CALLS edges instead of 2.

Per-language commits

One commit per language for easy review:

- `isBuiltInOrNoise` signature change + call site updates

Test plan

- `tsc --noEmit` clean
- `noise-filter.test.ts` verifies cross-language isolation (e.g., `console` filtered for JS but not Python)
- `serialize()` correctly unfiltered
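
The cross-language isolation described in this summary can be illustrated with a small sketch of the provider-aware `isBuiltInOrNoise(name, provider)` signature. This is an approximation under stated assumptions, not the code from the PR: the `ProviderLike` type and the sample set contents are hypothetical, and the real function may perform additional noise checks.

```typescript
// Hypothetical stand-in for the part of LanguageProvider used by the filter.
interface ProviderLike {
  builtInNames?: ReadonlySet<string>;
}

// Provider-aware check: consults only the given language's own set.
function isBuiltInOrNoise(name: string, provider: ProviderLike): boolean {
  return provider.builtInNames?.has(name) ?? false;
}

// Illustrative entries only; the real per-language lists live in languages/*.ts.
const jsProvider: ProviderLike = { builtInNames: new Set(['console', 'setTimeout']) };
const pythonProvider: ProviderLike = { builtInNames: new Set(['print', 'len']) };
const phpProvider: ProviderLike = { builtInNames: new Set(['serialize']) };
const javaProvider: ProviderLike = {}; // Java omits the field entirely

const results = {
  jsConsole: isBuiltInOrNoise('console', jsProvider), // filtered for JS
  pythonConsole: isBuiltInOrNoise('console', pythonProvider), // kept for Python
  phpSerialize: isBuiltInOrNoise('serialize', phpProvider), // filtered for PHP
  javaSerialize: isBuiltInOrNoise('serialize', javaProvider), // kept for Java
};

console.log(results);
```

Because each lookup touches a single provider's set, an entry added for PHP cannot suppress a call edge in Java, which is the behavior the Java heritage test change relies on.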