feat(wiki): support local Claude and Codex providers#1769

Merged
magyargergo merged 9 commits into
abhigyanpatwari:mainfrom
lianliankan007:feature/gitnexus-wiki-support-local-CLI
May 25, 2026

@lianliankan007

Contributor

Summary

Motivation / context

Areas touched

  • gitnexus/ (CLI / core / MCP server)
  • gitnexus-web/ (Vite / React UI)
  • .github/ (workflows, actions)
  • eval/ or other tooling
  • Docs / agent config only (AGENTS.md, CLAUDE.md, .cursor/, llms.txt, etc.)

Scope & constraints

In scope

Explicitly out of scope / not done here

Implementation notes

Testing & verification

  • cd gitnexus && npm test
  • cd gitnexus && npm run test:integration (if core/indexing/MCP paths changed)
  • cd gitnexus && npx tsc --noEmit
  • cd gitnexus-web && npm test (if web changed)
  • cd gitnexus-web && npx tsc -b --noEmit (if web changed)
  • Manual / Playwright E2E (note environment — see gitnexus-web/e2e/)

Risk & rollout

Checklist

  • PR body meets repo minimum length (workflow may label short descriptions)
  • If AGENTS.md / overlays changed: headers, scope block, and changelog updated per project conventions
  • No secrets, tokens, or machine-specific paths committed

Commands

  • gitnexus wiki --provider claude
  • gitnexus wiki --provider claude --model claude-sonnet-4-6
  • gitnexus wiki --provider codex
  • gitnexus wiki --provider codex --model gpt-5.4

@vercel

vercel Bot commented May 22, 2026

Someone is attempting to deploy a commit to the NexusCore Team on Vercel.

A member of the Team first needs to authorize it.

@magyargergo

Collaborator

@lianliankan007 can you please rebase your changes on the tip of the main branch? 🙏

@magyargergo

Collaborator

There's no need to make changes in the Express routes, as it's been fixed recently.

@magyargergo

Collaborator

@lianliankan007 Can you please resolve the merge conflicts?

@lianliankan007 lianliankan007 force-pushed the feature/gitnexus-wiki-support-local-CLI branch from 07e11f0 to 349950c Compare May 25, 2026 01:37
@lianliankan007

Contributor Author

@magyargergo
Thanks for the review!
I've rebased this branch on the latest main and resolved the merge conflicts.

@github-actions

github-actions Bot commented May 25, 2026

Contributor

Claude finished @magyargergo's task in 5m 48s


PR #1769 — Production-Readiness Review

Tasks:

  • Post initial comment
  • Read required docs (DoD.md, GUARDRAILS.md, AGENTS.md)
  • Inspect git diff and branch hygiene
  • Read all changed files
  • Read related existing files (cursor-client.ts, types.ts, llm-client.ts, repo-manager.ts, generator.ts, wiki.ts)
  • Run Unicode/bidi hygiene checks
  • Assess CLI contracts (Claude + Codex invocation)
  • Assess security and subprocess safety
  • Assess timeout/retry/progress semantics
  • Assess config/model persistence
  • Assess test quality
  • Assess CI status
  • Write final review

Review bar for this PR

Applied against: DoD.md v2.0.0, GUARDRAILS.md, and the PR-specific DoD in the trigger comment.

Required for merge: TypeScript correctness, unit+integration tests on the real changed path, CI (quality+tests gate), no subprocess hang risk, no injection surfaces, no contract drift between providers, and branch hygiene.


Problem being solved

Extend gitnexus wiki to accept --provider claude and --provider codex, routing LLM calls to locally authenticated Claude Code CLI and OpenAI Codex CLI respectively, mirroring the existing --provider cursor pattern. Users with local CLI auth avoid needing API keys. The PR adds local-cli-client.ts (290 lines), extends config types, and adds tests.


Current PR state

  • One commit (349950c) on top of a clean rebase of main.
  • Changed files are all under gitnexus/ — no Express route changes, no web/server churn.
  • Branch hygiene is clean. No merge commits, no unrelated files.

Merge status and mergeability

CI did not run. This is a fork PR (lianliankan007:feature/... → abhigyanpatwari:main). GitHub Actions requires explicit authorization for first-time fork contributors before workflows execute. The "0 checks" on the Checks tab is the result of workflows being skipped due to fork permissions, not path filters. The required quality + tests + e2e + scope-parity CI gate did not run for this PR.


Repository history considered


Branch hygiene assessment

Clean. One feature commit. Only gitnexus/ files. No generated artifacts, lockfile churn, version bumps, or unrelated cleanup. No hidden Unicode/bidi controls detected in any changed file.


Understanding of the change

The PR adds local-cli-client.ts which:

  1. Detects Claude/Codex binaries via execFileSync ... --version with a module-level cache (same pattern as cursor-client.ts)
  2. Builds command arrays and spawns subprocesses via spawn() with stdin piped
  3. Passes the full prompt (system + user) via stdin, avoiding command-line injection
  4. For Codex, writes --output-last-message to a temp file and reads it back
  5. Uses CI=1 env override to suppress interactive prompts
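
Step 1 above (binary detection with a module-level cache) can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `isCLIAvailable`, `detectionCache`, and the 5-second probe timeout are hypothetical names and values chosen for the example.

```typescript
import { execFileSync } from 'node:child_process';

// Module-level cache: binary name -> whether `<binary> --version` succeeded.
// (Hypothetical name; the PR's actual cache shape is not shown in the review.)
const detectionCache = new Map<string, boolean>();

/**
 * Returns true when `<binary> --version` runs and exits 0.
 * The result is cached so repeated LLM calls do not re-probe the binary.
 */
export function isCLIAvailable(binary: string): boolean {
  const cached = detectionCache.get(binary);
  if (cached !== undefined) return cached;

  let available: boolean;
  try {
    // execFileSync does not go through a shell, so `binary` is never
    // interpolated into a command line.
    execFileSync(binary, ['--version'], { stdio: 'ignore', timeout: 5000, windowsHide: true });
    available = true;
  } catch {
    // Covers a missing binary (ENOENT), a non-zero exit, and a probe timeout.
    available = false;
  }
  detectionCache.set(binary, available);
  return available;
}
```

Because this sketch collapses every failure into `false`, it cannot tell "not installed" from "installed but failing"; a later commit in this PR separates those two cases.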

Config changes add claudeModel and codexModel to CLIConfig, with correct per-provider save/load routing in wiki.ts via localModelConfigKey(). resolveLLMConfig reads the right saved model per provider. Interactive setup correctly enumerates only available local CLIs and assigns incrementing choice numbers.

The changes to generator.ts, llm-client.ts, and the type union in LLMProvider are minimal and correct.


Findings


Finding 1: Local CLI subprocess has no timeout — --timeout is silently ignored

Risk: callClaudeLLM and callCodexLLM both call runLocalCLI, which spawns a subprocess with no timeout mechanism. If the CLI hangs (network stall, model loading, interactive prompt leak in non-CI environments), gitnexus wiki hangs indefinitely. The --timeout flag translates to llmConfig.requestTimeoutMs (wiki.ts:440-441), which is respected by the HTTP path via AbortSignal.timeout() (llm-client.ts:273-276), but LocalCLIConfig has no requestTimeoutMs field and runLocalCLI never sets a kill timer. The options?.onChunk callback is forwarded for progress reporting but carries no timeout information.

Evidence: local-cli-client.ts:153-221 (no AbortController, no setTimeout kill), llm-client.ts:273-276 (HTTP path uses AbortSignal.timeout), wiki.ts:440-445 (timeout set on llmConfig not forwarded to local path).

Recommended fix: Add a requestTimeoutMs field to LocalCLIConfig, pass it from invokeLLM when provider is claude or codex, and in runLocalCLI set a setTimeout that calls child.kill() and rejects with a timeout error matching the existing HTTP timeout message format.

Blocks merge: yes — local providers can hang the wiki process indefinitely with no user-visible error.
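
A minimal sketch of the kill-timer approach recommended in Finding 1, assuming a simplified runner. `runWithTimeout` and the wording of the timeout message are illustrative assumptions; the real `runLocalCLI` signature and the HTTP timeout message format are not reproduced here.

```typescript
import { spawn } from 'node:child_process';

/**
 * Spawns `command`, writes `stdinText` to its stdin, and resolves with stdout.
 * When `timeoutMs` is set, the child is killed and the promise rejects once
 * the deadline passes. All names here are hypothetical.
 */
export function runWithTimeout(
  command: string,
  args: string[],
  stdinText: string,
  timeoutMs?: number,
): Promise<string> {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args, { windowsHide: true });
    let stdout = '';
    let settled = false;
    let timer: ReturnType<typeof setTimeout> | undefined;

    // Settle exactly once and always clear the kill timer.
    const settle = (action: () => void): void => {
      if (settled) return;
      settled = true;
      if (timer !== undefined) clearTimeout(timer);
      action();
    };

    // Only arm the timer when a timeout was actually requested.
    if (timeoutMs !== undefined && timeoutMs > 0) {
      timer = setTimeout(() => {
        child.kill();
        settle(() => reject(new Error(`${command} timed out after ${timeoutMs}ms`)));
      }, timeoutMs);
    }

    child.stdout.setEncoding('utf8');
    child.stdout.on('data', (chunk: string) => { stdout += chunk; });
    child.on('error', (err) => settle(() => reject(err)));
    child.on('close', (code) => {
      settle(() =>
        code === 0 ? resolve(stdout) : reject(new Error(`${command} exited with code ${code}`)),
      );
    });

    // The child may exit before reading stdin; ignore the resulting pipe error.
    child.stdin.on('error', () => {});
    child.stdin.end(stdinText);
  });
}
```

Note that `child.kill()` only signals the direct child. On Windows a wrapper such as `cmd.exe` can leave grandchildren running, which is what the later `killChildTree` commit in this PR addresses.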


Finding 2: windowsHide: true missing from spawn() — Windows UX regression vs Cursor

Risk: On Windows, every LLM call to Claude or Codex during wiki generation flashes a console window. cursor-client.ts:109 explicitly sets windowsHide: true in spawn options. local-cli-client.ts:154 omits it. Windows is a confirmed supported platform (cross-platform test matrix runs windows-latest with cli-e2e.test.ts in SPAWN_CLI).

Evidence: cursor-client.ts:109 has windowsHide: true; local-cli-client.ts:154-161 does not.

Recommended fix: Add windowsHide: true to the spawn() options in runLocalCLI.

Blocks merge: maybe — Windows platform is supported; this causes visible UX regression on every wiki generation.


Finding 3: GITNEXUS_MODEL env var overrides provider-specific saved model

Risk: In resolveLLMConfig (llm-client.ts:81-84), precedence for model is: CLI --model override → process.env.GITNEXUS_MODEL → savedLocalModel (claudeModel/codexModel). If a user has GITNEXUS_MODEL=gpt-4o set for their normal OpenAI workflow and then runs gitnexus wiki --provider claude, gpt-4o is passed as --model gpt-4o to the Claude CLI. The Claude CLI will reject unsupported OpenAI model names, producing an unhelpful error. The HTTP path also uses GITNEXUS_MODEL but it controls the OpenAI model name directly, so there the behavior is intentional. For local providers, the env var cross-contaminates.

Evidence: llm-client.ts:81-84:

model:
  overrides?.model ||
  process.env.GITNEXUS_MODEL ||
  savedLocalModel ||
  (localProvider ? '' : savedConfig.model || 'minimax/minimax-m2.5'),

Recommended fix: For local providers, skip GITNEXUS_MODEL in the precedence chain (or introduce a provider-specific env var), so the precedence is: --model CLI flag → savedLocalModel → '' (CLI default).

Blocks merge: maybe — requires a common mixed-use environment to trigger, but when it does, the error message doesn't explain why.
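
The precedence proposed in Finding 3 could look roughly like the pure function below. It is a sketch under assumptions: `resolveModel`, `ModelInputs`, and the three-member `Provider` type are hypothetical simplifications of `resolveLLMConfig`, and only the default model string is taken from the quoted snippet.

```typescript
// Simplified provider set for illustration; the real LLMProvider union has more members.
type Provider = 'openai' | 'claude' | 'codex';

interface ModelInputs {
  provider: Provider;
  cliModel?: string;        // value of --model
  envModel?: string;        // value of GITNEXUS_MODEL
  savedLocalModel?: string; // claudeModel or codexModel from saved config
  savedModel?: string;      // saved model for the HTTP path
}

// Default taken from the snippet quoted in the finding.
const HTTP_DEFAULT_MODEL = 'minimax/minimax-m2.5';

export function resolveModel(inputs: ModelInputs): string {
  const isLocal = inputs.provider === 'claude' || inputs.provider === 'codex';
  if (isLocal) {
    // Local CLIs: GITNEXUS_MODEL is skipped on purpose; '' lets the CLI use its own default.
    return inputs.cliModel || inputs.savedLocalModel || '';
  }
  // HTTP path keeps the existing chain, including the env var.
  return inputs.cliModel || inputs.envModel || inputs.savedModel || HTTP_DEFAULT_MODEL;
}
```

With this shape, `GITNEXUS_MODEL=gpt-4o` no longer reaches the Claude or Codex CLI, while the HTTP path behaves as before.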


Finding 4: No subprocess-contract tests — tests don't validate argv, stdin, or exit behavior

Risk: All tests in wiki-flags.test.ts mock child_process at the module level. They verify type correctness, caching, and config routing, but not the actual subprocess contract. No test asserts:

  • The exact argv sent to claude or codex (e.g., that -p --output-format text --no-session-persistence is present, that --model appears only when set)
  • That stdin receives the full prompt (systemPrompt + --- separator + userPrompt)
  • That a non-zero exit code produces a non-zero process.exitCode and actionable error message
  • That empty stdout produces a useful error (currently { content: '' } is returned via resolveOnce)
  • That temp file cleanup runs even when readFile fails (callCodexLLM uses .catch(() => '') which silently swallows read errors)
  • That the --output-last-message path is correctly written and read for Codex

The two cli-e2e.test.ts additions only check that API key: is absent in output — they don't assert exit code, process behavior, or correct provider routing. Per DoD §2.7: "Tests cover the real changed path — they would fail if behavior, wiring, or contracts were broken, not only if a mock were misconfigured."

Evidence: wiki-flags.test.ts uses vi.doMock('child_process', ...) throughout; cli-e2e.test.ts:1200-1215 checks string absence only.

Recommended fix: Add at least one test per provider that spawns a fake binary (a script that asserts its argv and stdin and exits 0 with known output), placed in SPAWN_CLI of cross-platform-tests.ts. These tests prove the subprocess contract and catch future arg drift.

Blocks merge: yes — the runtime contract for Claude and Codex invocation is entirely unverified by tests. A broken flag or wrong stdin behavior would ship undetected.
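
One way to sketch the fake-binary idea from Finding 4: a tiny Node script stands in for the CLI and echoes back the argv and stdin it received, so a test can assert on the real process boundary instead of a mock. `buildClaudeArgs`, `runAgainstFakeCLI`, and the script itself are hypothetical; the flags are the ones quoted in this review, not verified against the Claude CLI.

```typescript
import { spawnSync } from 'node:child_process';
import { mkdtempSync, rmSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Fake CLI: prints the argv and stdin it received as a single JSON document.
const FAKE_CLI_SOURCE = `
let stdin = '';
process.stdin.setEncoding('utf8');
process.stdin.on('data', (c) => { stdin += c; });
process.stdin.on('end', () => {
  process.stdout.write(JSON.stringify({ argv: process.argv.slice(2), stdin }));
});
`;

/** Claude print-mode args as quoted in the review; --model only when set. */
export function buildClaudeArgs(model?: string): string[] {
  const args = ['-p', '--output-format', 'text', '--no-session-persistence'];
  if (model) args.push('--model', model);
  return args;
}

export interface Observed {
  argv: string[];
  stdin: string;
  status: number | null;
}

/** Runs the fake CLI with the given args and stdin, returning what it observed. */
export function runAgainstFakeCLI(args: string[], stdinText: string): Observed {
  const dir = mkdtempSync(join(tmpdir(), 'fake-cli-'));
  const script = join(dir, 'fake-cli.cjs');
  try {
    writeFileSync(script, FAKE_CLI_SOURCE);
    const result = spawnSync(process.execPath, [script, ...args], {
      input: stdinText,
      encoding: 'utf8',
    });
    const parsed = JSON.parse(result.stdout) as { argv: string[]; stdin: string };
    return { ...parsed, status: result.status };
  } finally {
    // Clean up the temp directory even if parsing fails.
    rmSync(dir, { recursive: true, force: true });
  }
}
```

A real contract test would route the production client to the fake binary (for example through PATH) rather than calling a helper directly; this sketch only shows the echo-and-assert mechanism.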


Finding 5: Codex CLI command contract is unverified

Risk: The constructed Codex command codex exec --cd <dir> --sandbox read-only -c approval_policy="never" --color never --output-last-message <path> [--model <model>] - relies on specific Codex CLI subcommand and flag names. The -c approval_policy="never" syntax (short flag + key="value") is non-standard and may not match the current @openai/codex CLI. The --output-last-message <path> flag also needs verification. If either flag has changed or is named differently, the feature silently fails on first real use despite tests passing.

Evidence: local-cli-client.ts:101-118.

Recommended fix: Verify the current @openai/codex CLI flag contract (subcommand exec, -c config format, --output-last-message) against current CLI documentation or a real installed binary. Lock the contract with a subprocess-level test (see Finding 4).

Blocks merge: yes if contract is wrong — unverifiable from source alone; requires real CLI validation or a stub that enforces the contract.


Finding 6: Claude CLI -p sends prompt via stdin — correct but untested

The Claude implementation uses stdin (child.stdin.end(stdinText)) rather than a command-line argument, which is correct (avoids argument injection). The flags -p --output-format text --no-session-persistence look consistent with current Claude Code CLI print-mode behavior. However, like Finding 5, no test enforces this contract. If --no-session-persistence is renamed or removed in a future Claude Code version, nothing catches the regression.

Evidence: local-cli-client.ts:77-83.

Recommended fix: Covered by the subprocess-level tests from Finding 4.

Blocks merge: no (flags appear correct) — but should be locked by tests before merge.


Finding 7: CI gate did not run

The required CI gate (quality + tests) has not executed for this PR. No test results, no TypeScript check output, and no coverage evidence exist for these changes. Per DoD §4.2, tsc --noEmit and npm test are required for any gitnexus/ change. Without CI evidence, correctness of even basic type-checking is unconfirmed.

Evidence: PR Checks tab shows 0 checks. Fork PR authorization was not granted. Confirming from workflow: ci.yml triggers on pull_request: branches: [main] — fork PRs require explicit authorization in the GitHub UI before workflows run.

Action required: Maintainer must either authorize the workflow run in the PR UI or provide evidence of local validation (npx tsc --noEmit + npm test outputs for the current HEAD).

Blocks merge: yes.


Finding 8: Empty stdout from Claude produces silent empty-content response

Risk: In runLocalCLI, if the subprocess exits 0 but writes nothing to stdout, resolveOnce({ content: '' }) is called. The caller receives { content: '' } with no error. In the wiki pipeline, this produces an empty markdown page for that module — silently corrupting the wiki output with no user-visible error.

Evidence: local-cli-client.ts:213: resolveOnce({ content: stdout.trim() }). No empty-content guard exists for local providers, unlike the HTTP path (llm-client.ts:331: if (!choice?.message?.content) { throw new Error('LLM returned empty response') }).

Recommended fix: Add a guard in runLocalCLI or in callClaudeLLM/callCodexLLM: if stdout.trim() is empty after a successful exit, throw new Error(`${provider} CLI returned empty output`).

Blocks merge: maybe — produces corrupted (silent, empty) wiki pages with no actionable error.
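
The guard from Finding 8 is small enough to sketch as a standalone helper. `requireNonEmptyOutput` is a hypothetical name; only the error message comes from the recommendation above.

```typescript
/**
 * Returns the trimmed CLI output, or throws when a successful exit produced
 * nothing, so the wiki pipeline never writes an empty page silently.
 */
export function requireNonEmptyOutput(provider: string, stdout: string): string {
  const content = stdout.trim();
  if (content === '') {
    throw new Error(`${provider} CLI returned empty output`);
  }
  return content;
}
```

Keeping the check in a per-provider caller rather than in the shared runner leaves room for Codex, whose answer may arrive through the --output-last-message file while stdout stays empty.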


PR-specific assessment sections

1. Provider contract and scope: Correctly scoped to CLI wiki. Issue #97 (web bridge, Gemini) should remain open.

2. CLI parsing and help text: --provider claude|codex appears consistently in cli/index.ts, i18n/en.ts, and i18n/zh-CN.ts. Help text verified by tests.

3. Config resolution and persistence: CLIConfig type correctly adds claudeModel / codexModel. localModelConfigKey() routes saves correctly. resolveLLMConfig reads savedLocalModel correctly per provider. Gap: GITNEXUS_MODEL precedes savedLocalModel (Finding 3). No migration needed — new optional fields default to undefined.

4. Local CLI detection: Cache behavior is correct and consistent with Cursor. findWindowsCommand correctly uses where.exe. Windows executable resolution logic is reasonable. Gap: windowsHide: true missing from spawn (Finding 2).

5. Claude invocation: Prompt via stdin is correct. Flags look correct. Gap: No subprocess-contract test (Finding 4); no empty-stdout guard (Finding 8).

6. Codex invocation: Read-only sandbox won't block tmpdir writes. Output file path correctly uses os.tmpdir(). Cleanup is in finally. Gap: Command contract unverified (Finding 5); no subprocess-contract test (Finding 4).

7. Security and trust boundaries: Prompt content flows via stdin, not command args (good). maskPromptArgs masks stdin marker in logs. No shell injection risk (no shell: true). No secrets logged. verboseLog only logs on GITNEXUS_VERBOSE=1. workingDirectory comes from repoPath (trusted internal value). Temp dir uses os.tmpdir(). No obvious injection surfaces. Gap: windowsHide missing (Finding 2 — minor visual exposure).

8. Runtime reliability: Missing binary throws fast with an actionable message. Non-zero exit rejects with stderr+stdout context. stdinError is tracked and surfaced. Gaps: no timeout, so a hung process is never killed (Finding 1); empty stdout is accepted silently (Finding 8).

9. Timeout, retry, rate-limit, progress: onChunk forwarded to local providers and wired to progress bar. Gap: requestTimeoutMs not forwarded — --timeout is silently ignored for local providers (Finding 1). No retry semantics for local providers (explicitly different from HTTP, but not documented or tested).

10. Existing provider regression: resolveLLMConfig type union is additive. generator.ts provider check order is explicit. HTTP path is untouched. Cursor path is untouched. No regression risk from the type change.

11. Test quality: See Finding 4. Tests are too mocked to validate runtime subprocess behavior.

12. CI and branch protection: See Finding 7. CI did not run.


Back-and-forth avoided by verifying

  • Express route changes: not present in this diff — confirmed clean.
  • Merge conflicts: HEAD is a clean rebase — no conflict artifacts.
  • Unicode/bidi: scanned all changed files — none found.
  • hasSavedConfig logic: correctly treats local providers as "has config" (line 220: isLocalProvider(savedConfig.provider)), so non-interactive mode works.
  • Interactive invalid choice: falls to OpenAI path with API key prompt and abort — acceptable UX, not a silent wrong-provider bug.

Open questions

None that can't be answered by inspecting source or running the CLI.


Final verdict

Not production-ready.

Five confirmed blockers: (1) Local CLI subprocesses have no timeout, causing gitnexus wiki --provider claude/codex to hang indefinitely when the CLI stalls — the --timeout flag is silently ignored for local providers. (2) Tests never validate the subprocess contract (argv, stdin, exit codes) and are fully mocked — the Codex and Claude invocation could be broken at runtime without any test catching it. (3) Empty stdout from a local CLI produces a silent empty page with no error. (4) The Codex CLI command contract (codex exec -c approval_policy="never" --output-last-message) is unverified against the current @openai/codex release. (5) CI did not run due to fork authorization, and no local validation evidence (tsc --noEmit output, npm test results) has been provided.

The implementation approach is well-structured and follows the Cursor precedent closely, but the missing timeout, the absence of real subprocess-level tests, and the empty-output silent failure are runtime risks that block merge under the project's own DoD bar.

@magyargergo

Collaborator

@lianliankan007 could you please look into Claude's findings above?

Repository owner deleted a comment from github-actions Bot May 25, 2026
@magyargergo

Collaborator

/autofix

@github-actions

Contributor

✅ Applied autofix and pushed a commit. (apply run)

@github-actions

github-actions Bot commented May 25, 2026

Contributor

CI Report

All checks passed

Pipeline Status

Stage Status Details
✅ Typecheck success tsc --noEmit
✅ Tests success unit tests, 3 platforms
✅ E2E success gitnexus-web changes only

Test Results

Tests Passed Failed Skipped Duration
9767 9765 0 2 613s

✅ All 9765 tests passed

2 test(s) skipped — expand for details
  • PHP pipeline benchmark > scales with file count (workers enabled)
  • buildTypeEnv > known limitations (documented skip tests) > Ruby block parameter: users.each { |user| } — closure param inference, different feature

Code Coverage

Tests

Metric Coverage Covered Base Delta Status
Statements 80.07% 33145/41394 N/A% 🟢 ████████████████░░░░
Branches 68.64% 21210/30900 N/A% 🟢 █████████████░░░░░░░
Functions 84.87% 3434/4046 N/A% 🟢 ████████████████░░░░
Lines 83.45% 29843/35758 N/A% 🟢 ████████████████░░░░

📋 View full run · Generated by CI

magyargergo and others added 7 commits May 25, 2026 07:14
- Add subprocess timeout: LocalCLIConfig gains requestTimeoutMs,
  runLocalCLI sets a kill timer that rejects with an actionable error
  matching the HTTP timeout message format. --timeout is no longer
  silently ignored for claude/codex providers.
- Add windowsHide: true to spawn() to prevent console window flash on
  Windows, matching cursor-client.ts behavior.
- Skip GITNEXUS_MODEL env var for local providers so a user's OpenAI
  model name doesn't cross-contaminate claude/codex CLI invocations.
  Precedence for local providers: --model → savedLocalModel → ''.
- Guard against empty stdout: reject with actionable error when CLI
  exits 0 but produces no output, preventing silent empty wiki pages.

- Move empty-output guard from runLocalCLI to per-provider callers so
  Codex can read --output-last-message file even when stdout is empty
- Merge existing config in interactive setup (local + Azure paths) to
  prevent saveCLIConfig from erasing previously saved API keys
- Use StringDecoder for stdout/stderr to handle multi-byte UTF-8 chars
  split across pipe chunk boundaries
- Distinguish ENOENT from non-zero exit in detectLocalCLI so users see
  auth guidance instead of misleading "CLI not found" when the binary
  exists but is not authenticated
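
The StringDecoder change in this commit can be illustrated with a small sketch: decoding each pipe chunk on its own corrupts a multi-byte character that straddles a chunk boundary, while a streaming decoder buffers the incomplete bytes. `decodeChunks` is a hypothetical helper, not the PR's code.

```typescript
import { StringDecoder } from 'node:string_decoder';

/** Decodes a sequence of UTF-8 chunks without corrupting split characters. */
export function decodeChunks(chunks: Buffer[]): string {
  const decoder = new StringDecoder('utf8');
  let text = '';
  for (const chunk of chunks) {
    // Incomplete trailing bytes are held back until the next chunk arrives.
    text += decoder.write(chunk);
  }
  // Flush anything still buffered at end of stream.
  return text + decoder.end();
}

// Usage: '€' is three bytes in UTF-8; split it across two chunks.
const bytes = Buffer.from('a€b', 'utf8');
const split = [bytes.subarray(0, 2), bytes.subarray(2)];
const naive = split.map((c) => c.toString('utf8')).join('');
console.log(decodeChunks(split) === 'a€b'); // streaming decoder keeps the character intact
console.log(naive === 'a€b');               // per-chunk decoding yields replacement characters
```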

Add 21 integration-level tests covering the Claude and Codex subprocess
contracts that wiki-flags.test.ts mocks out:

- Claude argv: -p, --output-format text, --no-session-persistence,
  --model conditional, stdin prompt content, CI=1, windowsHide:true
- Codex argv: exec subcommand, --sandbox read-only, -c approval_policy,
  --output-last-message temp path, --cd, stdin marker, --model
- Timeout: kill timer fires and rejects, no timer when unset
- Codex file fallback: stdout used when file missing, error when both empty
- detectLocalCLI: warn on non-ENOENT, silent on ENOENT
- onChunk: cumulative byte count forwarded

Also register the test in cross-platform-tests.ts SPAWN_CLI section and
fix detectLocalCLI ENOENT detection logic (invert the check so non-ENOENT
errors produce a warning).

- Add killChildTree helper that uses taskkill /T /F /PID on Windows to
  terminate the entire process tree (including cmd.exe grandchildren),
  with fallback to child.kill() if taskkill fails or on non-Windows
- Add Codex CLI flag contract snapshot test that locks the exact spawn
  args — any flag rename, reorder, or removal is caught immediately
- Add Windows taskkill tests: success path asserts taskkill called with
  correct PID and /T /F flags, failure path verifies child.kill() fallback
@magyargergo magyargergo merged commit 5ce448a into abhigyanpatwari:main May 25, 2026
25 of 26 checks passed
2 participants