feat: source-sink security scanning with BFS path discovery #578
marxo126 wants to merge 3 commits into
|
@TESTPERSONAL is attempting to deploy a commit to the NexusCore Team on Vercel. A member of the Team first needs to authorize it. |
CI Report
✅ All checks passed
Test Results
✅ All 6542 tests passed; 97 test(s) skipped
Generated by CI |
xkonjin left a comment
I found one correctness issue in the new scanner.
sourceSinkScan() never loads or passes a node language, and patternMatches() only applies the language filter when a language value is present:
if (entry.languages && language && !entry.languages.includes(language)) return false;
With language === undefined, every language-specific pattern is treated as eligible. In practice, that means Java, Go, Ruby, Rust, PHP, etc. source/sink signatures can all match any function body that happens to contain the same token text, which will inflate findings considerably.
Example: a TypeScript function containing params[ or system( can be classified using the Rails-only rules, and a JS file containing Command::new in a string/comment could be treated as a Rust sink.
I think this needs one of these fixes before relying on the results:
- return false for language-scoped entries when the node language is unknown, or
- fetch/store the language on Function/Method nodes and pass it into getMatchingSources()/getMatchingSinks().
I'd also add a regression test that proves a language-scoped pattern does not match when scanning content from a different language.
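The first option above could look roughly like the sketch below. It is only an illustration: the CatalogEntry shape, the plain substring matching, and the name patternMatchesStrict are assumptions made for this example, not the scanner's actual code.

```typescript
// Hypothetical catalog entry shape; the real catalog's fields may differ.
interface CatalogEntry {
  pattern: string;       // plain substring to look for in the function body
  languages?: string[];  // when set, the entry applies only to these languages
}

// Strict variant of the guard: a language-scoped entry is rejected when the
// node language is unknown, instead of silently matching every function.
function patternMatchesStrict(
  entry: CatalogEntry,
  content: string,
  language?: string,
): boolean {
  if (entry.languages) {
    if (!language || !entry.languages.includes(language)) return false;
  }
  return content.includes(entry.pattern);
}

const rustSink: CatalogEntry = { pattern: "Command::new", languages: ["rust"] };
const jsBody = 'const note = "see Command::new in the Rust docs";';

// Regression-style checks: the Rust-only entry must not match JS content
// or content whose language is unknown, but must still match Rust content.
const crossMatch = patternMatchesStrict(rustSink, jsBody, "javascript");
const unknownMatch = patternMatchesStrict(rustSink, jsBody, undefined);
const rustMatch = patternMatchesStrict(rustSink, 'Command::new("ls")', "rust");
```

Rejecting on an unknown language trades some recall for precision: files with unrecognized extensions would only be checked against language-agnostic entries.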
|
@claude is it ready to be merged? |
|
Claude finished @magyargergo's task in 1m 18s
Merge Readiness Assessment
Short answer: Not yet. There is one open correctness bug that needs to be fixed first.
Blocking Issue: Language Filter is Never Applied
The reviewer [...] In [...]
// local-backend.ts:2971-2981
MATCH (n:Function)
WHERE n.id IS NOT NULL
RETURN n.id AS id, n.name AS name, n.filePath AS filePath, n.content AS content
As a result, [...]
// local-backend.ts:3022, 3037
const matchedSources = getMatchingSources(content, undefined, compiledSources);
const matchedSinks = getMatchingSinks(content, undefined, compiledSinks);
And the guard in [...]
// catalogs.ts:577
if (entry.languages && language && !entry.languages.includes(language)) return false;
//                     ^^^^^^^^ always undefined → filter never fires
This means every language-scoped pattern applies to every function, regardless of the actual file language. A TypeScript file containing [...] The fix is straightforward (add [...]
Everything Else Looks Good
The coverage drop is minor and expected given the new code paths. The architecture (BFS over CALLS, catalog-based matching, risk ranking) is solid. The language filter bug is the only thing standing between this and a merge. |
|
Fixed the language filter bug — good catch. What changed:
Ready for merge. |
Add source_sink MCP tool that scans for security-relevant data paths. Identifies source functions (user input, env vars, file reads) and sink functions (SQL queries, command exec, file writes) using configurable pattern catalogs, then uses BFS over CALLS edges to find reachable paths. Supports OWASP category filtering, user-defined catalog extensions via .gitnexus/security.json, and risk-ranked findings (critical/high/medium). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
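The BFS over CALLS edges mentioned in the commit message can be pictured with a small sketch. Everything here is an assumption made for illustration: the in-memory CallGraph map, the findSinkPaths name, and the maxDepth bound are hypothetical, and the real tool queries a graph database rather than a Map.

```typescript
// Hypothetical in-memory call graph: caller id -> callee ids (CALLS edges).
type CallGraph = Map<string, string[]>;

// Breadth-first search from one source function. Returns the shortest call
// path to each reachable sink, exploring at most maxDepth edges.
function findSinkPaths(
  graph: CallGraph,
  source: string,
  sinks: Set<string>,
  maxDepth = 5,
): string[][] {
  const paths: string[][] = [];
  const visited = new Set<string>([source]);
  let frontier: string[][] = [[source]];

  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[][] = [];
    for (const path of frontier) {
      const current = path[path.length - 1]!;
      for (const callee of graph.get(current) ?? []) {
        if (visited.has(callee)) continue; // first visit is the shortest path
        visited.add(callee);
        const extended = [...path, callee];
        if (sinks.has(callee)) paths.push(extended);
        next.push(extended);
      }
    }
    frontier = next;
  }
  return paths;
}

const graph = new Map<string, string[]>([
  ["handleRequest", ["parseBody", "logRequest"]],
  ["parseBody", ["buildQuery"]],
  ["buildQuery", ["runSql"]],
  ["logRequest", []],
]);

const paths = findSinkPaths(graph, "handleRequest", new Set(["runSql"]));
// With a depth bound of 2 the sink (3 edges away) is out of reach.
const shallow = findSinkPaths(graph, "handleRequest", new Set(["runSql"]), 2);
```

Because each node is visited once, the search reports one shortest path per reachable sink rather than every possible path, which keeps output bounded on dense call graphs.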
Derive language from filePath via getLanguageFromFilename and pass it to getMatchingSources/getMatchingSinks. Without this, language-scoped catalog patterns (e.g. Python-only request.data, PHP-only $_GET) matched all files regardless of language, inflating findings. Adds 3 regression tests proving language-scoped patterns don't cross-match. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
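As a rough illustration of why deriving the language from the file path makes the existing guard fire, consider the sketch below. languageFromPath is a hypothetical stand-in for the project's getLanguageFromFilename (whose real signature and extension table are not shown in this thread), and isEligible reproduces only the guard condition quoted in the review.

```typescript
// Hypothetical stand-in for getLanguageFromFilename; the extension table
// below is an assumption made for this example.
const EXTENSION_LANGUAGES: Record<string, string> = {
  ".ts": "typescript",
  ".js": "javascript",
  ".py": "python",
  ".php": "php",
  ".rs": "rust",
};

function languageFromPath(filePath: string): string | undefined {
  const dot = filePath.lastIndexOf(".");
  if (dot < 0) return undefined;
  return EXTENSION_LANGUAGES[filePath.slice(dot).toLowerCase()];
}

// The guard quoted in the review, unchanged: it filters only when a
// language value is actually supplied.
function isEligible(entryLanguages: string[] | undefined, language?: string): boolean {
  if (entryLanguages && language && !entryLanguages.includes(language)) return false;
  return true;
}

const phpOnly = ["php"];

// Before the fix: language is never passed, so the PHP-only entry is eligible everywhere.
const beforeFix = isEligible(phpOnly, undefined);
// After the fix: language is derived from the path, so a .ts file is filtered out.
const afterFix = isEligible(phpOnly, languageFromPath("src/app.ts"));
// A real PHP file remains eligible.
const samePhp = isEligible(phpOnly, languageFromPath("web/index.php"));
```

Note that with this guard a file whose extension is not recognized still yields an undefined language and therefore stays eligible for every entry; whether that is acceptable depends on how the catalog is meant to treat unknown files.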
47198bd to e6a295d
|
Please submit a new PR if this is still relevant |
Summary
- source_sink MCP tool for security-relevant data path scanning
- OWASP category filtering (A03-injection, A07-xss, A10-ssrf)
- User-defined catalog extensions via .gitnexus/security.json
Split from #561. This PR contains only the source-sink scanning feature. Parameter data flow tracking (PASSES_TO, DATA_FLOWS_TO) will be addressed separately as part of the PDG subsystem (#567).
Test plan
🤖 Generated with Claude Code