feat(hooks): auto-reindex after git commit with embeddings preservation #205
feat(hooks): auto-reindex after git commit with embeddings preservation
Add PostToolUse hook that re-runs `gitnexus analyze` after git commit/merge, automatically detecting and preserving embeddings via meta.json stats.
- Persist embeddings count in meta.json stats.embeddings field
- Add PostToolUse handler to both hook variants (cjs + plugin)
- Register PostToolUse hook in setup.ts for Claude Code
- Add "Keeping the Index Fresh" section to generated CLAUDE.md/AGENTS.md
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

revert running gitanalyz

fix: address code review findings for auto-reindex hooks
- Fix hook timeout units: seconds not milliseconds (8000->8, 120000->120)
- Remove unused execFileSync import from gitnexus-hook.cjs
- Remove unused `output` variable in PostToolUse handler
- Remove spurious template interpolation in ai-context.ts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

refactor: clean up gitnexus-hook.cjs per review feedback
- Hoist spawnSync import to module scope
- Add shell: isWin for npx fallback on Windows
- Extract findGitNexusDir helper, reuse in both PreToolUse and PostToolUse
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix(hooks): stricter git regex, proper spawnSync error handling, embeddings in recovery commands
- Tighten commit/merge regex to not match git merge-base (require \s|$ after subcommand)
- Replace try/catch with child.error/signal inspection for spawnSync timeout detection
- Include --embeddings in manual recovery commands when embeddings were detected
- Extract emitPostToolContext helper to reduce duplication
- Apply all fixes to both hook variants (cjs + plugin)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix(hooks): single-launch CLI resolution, guard PreToolUse stderr on failure
- Plugin: detect gitnexus binary via which/where once, then run exactly once (prevents double execution when binary exists but command fails)
- Both hooks: only forward augment stderr as additionalContext when exit code is 0, preventing CLI error output from leaking into agent context
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

docs: update README and CLI skill for PostToolUse auto-reindex
- README: editor support table now shows PreToolUse + PostToolUse
- README: description mentions auto-reindex after commits
- gitnexus-cli skill: document auto-reindex in "When to run" section
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove shell:true from CJS hook npx fallback, use npx.cmd on Windows
- Use sendHookResponse() consistently in both hook variants
- Fix setup.ts path escaping with JSON.stringify for safe interpolation
- Add path.isAbsolute(cwd) guards against crafted stdin input
- Reduce PreToolUse CLI timeout from 8s to 7s
- Truncate debug error messages to 200 chars
- Add 73 regression tests covering shell injection, cwd validation, dispatch routing, staleness detection, and cross-platform spawning
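Three of the items above (the `npx.cmd` name on Windows, the `path.isAbsolute(cwd)` guard, and the 200-character cap on debug messages) can be illustrated with a small sketch. The helper names `npxCommand`, `isSafeCwd`, and `truncateDebug` are hypothetical and are not taken from the PR; this only shows the general shape such guards might take.

```javascript
const path = require('path');

// Hypothetical helper: choose the executable name per platform so that
// no shell is needed to resolve it (the commit uses npx.cmd on Windows).
function npxCommand(platform = process.platform) {
  return platform === 'win32' ? 'npx.cmd' : 'npx';
}

// Hypothetical helper: accept a cwd read from hook stdin only if it is a
// non-empty string holding an absolute path.
function isSafeCwd(cwd) {
  return typeof cwd === 'string' && cwd.length > 0 && path.isAbsolute(cwd);
}

// Hypothetical helper: cap debug messages at 200 characters by default.
function truncateDebug(message, max = 200) {
  const text = String(message);
  return text.length > max ? text.slice(0, max) : text;
}
```

A hook written this way would bail out early when `isSafeCwd` returns false, before any child process is spawned.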
Security & Cross-Platform Hardening (Review Round 2)
After multi-agent code review, this commit addresses 7 findings across both hook variants and the setup installer:
Critical Fix
Medium Fixes
Low-Risk Hardening
Regression Tests
Added 73 new tests in
All 921 tests pass (41 test files), build clean.
Design Note: Notify-Only Staleness Detection (Not Auto-Reindex)
This PR intentionally does not run
Why this approach is better
Running
Notify-only is safer and smarter:
How it works
This pattern follows the same principle as LSP diagnostics: the tool reports, the user (or agent) acts.
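A minimal sketch of a notify-only staleness check, comparing `git rev-parse HEAD` with the commit recorded in `.gitnexus/meta.json`. The field name `lastCommit`, the helper names, and the wording of the notice are assumptions made for illustration; the hook's real field and message may differ.

```javascript
const fs = require('fs');
const path = require('path');
const { spawnSync } = require('child_process');

// Pure comparison. `lastCommit` is an ASSUMED field name in meta.json.
function isIndexStale(headCommit, meta) {
  if (!headCommit) return false; // HEAD unknown: stay silent rather than guess
  const indexed = meta && meta.lastCommit;
  return indexed !== headCommit;
}

// Read the current HEAD hash; returns null if git is unavailable or fails.
function readHead(cwd) {
  const child = spawnSync('git', ['rev-parse', 'HEAD'], {
    cwd,
    encoding: 'utf8',
    timeout: 3000,
  });
  if (child.error || child.status !== 0) return null;
  return child.stdout.trim();
}

// Parse .gitnexus/meta.json; returns null if it is missing or invalid.
function readMeta(cwd) {
  try {
    const file = path.join(cwd, '.gitnexus', 'meta.json');
    return JSON.parse(fs.readFileSync(file, 'utf8'));
  } catch {
    return null;
  }
}

// Report only: returns a notice string for the agent, never runs analyze.
function stalenessNotice(cwd) {
  const meta = readMeta(cwd);
  if (!meta) return null; // no index in this repository
  if (!isIndexStale(readHead(cwd), meta)) return null;
  return 'GitNexus index is stale. Run `gitnexus analyze` to refresh it.';
}
```

Because the check only reads a file and asks git for one hash, it avoids opening KuzuDB and finishes quickly, which matches the motivation given in the commit message for this change.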
Thank you for your contribution @L1nusB!
…bhigyanpatwari#205) Adds PostToolUse hook that detects stale GitNexus index after git mutations (commit, merge, rebase, cherry-pick, pull) and notifies the agent to reindex. Uses lightweight staleness check (git rev-parse HEAD vs meta.json) instead of running gitnexus analyze synchronously, avoiding KuzuDB corruption and 120s blocks. Security and cross-platform hardening: remove shell:true from all spawnSync calls, use .cmd extensions on Windows, add path.isAbsolute(cwd) guards, fix setup.ts path escaping with JSON.stringify, use sendHookResponse() consistently. Includes 73 regression tests.
Summary
- Adds a PostToolUse hook that re-runs `gitnexus analyze` after `git commit` or `git merge`, keeping the knowledge graph index fresh without manual intervention
- Reads `.gitnexus/meta.json` to check if the previous index included embeddings (`stats.embeddings > 0`) and passes `--embeddings` accordingly, preventing accidental deletion of expensive vector embeddings
- Persists the embeddings count in `meta.json` so hooks and external tools can detect previous embeddings without opening KuzuDB
Why
The GitNexus index becomes stale after every commit. Currently, staleness is detected reactively — MCP tools warn the agent, and the CLAUDE.md says "run analyze if stale." This has two problems:
- `analyze` without `--embeddings` wipes previously generated embeddings because KuzuDB is fully rebuilt every time (lines 196-200 in analyze.ts). There was no way to detect whether embeddings existed before, so a naive reindex would lose them.
This PR solves both by automating reindex at the right moment (post-commit) with the right flags (auto-detecting embeddings).
How It Works
1. A `Bash` tool execution completes
2. The command is matched against `/\bgit\s+(commit|merge)(\s|$)/`; everything else is skipped instantly (including `git merge-base` and similar subcommands)
3. `.gitnexus/meta.json` is read to check whether `stats.embeddings > 0`
4. `gitnexus analyze [--embeddings]` runs synchronously (120s timeout)
5. `additionalContext` is returned to the agent confirming the index was updated
On failure or timeout, the hook returns a recovery command that includes `--embeddings` when appropriate, so the user can run it manually.
For non-Claude-Code integrations (Cursor, Windsurf, etc.), the new "Keeping the Index Fresh" section in AGENTS.md provides equivalent guidance as instructions the agent can follow after committing.
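The matching and flag-selection steps described above can be sketched as follows. The regex is the one quoted in this description; the function names `shouldReindex` and `buildAnalyzeArgs` are hypothetical and only illustrate the logic, not the hook's actual code.

```javascript
// Regex quoted in the PR description: `git commit` or `git merge`,
// followed by whitespace or end of string.
const GIT_COMMIT_OR_MERGE = /\bgit\s+(commit|merge)(\s|$)/;

// Hypothetical helper: decide whether a Bash command should trigger a reindex.
function shouldReindex(command) {
  return typeof command === 'string' && GIT_COMMIT_OR_MERGE.test(command);
}

// Hypothetical helper: build the CLI arguments from the parsed contents of
// .gitnexus/meta.json, adding --embeddings only when the previous index
// recorded a positive embeddings count.
function buildAnalyzeArgs(meta) {
  const count = meta && meta.stats && meta.stats.embeddings;
  const args = ['analyze'];
  if (typeof count === 'number' && count > 0) args.push('--embeddings');
  return args;
}
```

Requiring `(\s|$)` after the subcommand is what keeps `git merge-base` and `git commit-tree` from matching, since the character after `merge` or `commit` there is a hyphen.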
Changes
- gitnexus/src/cli/analyze.ts: `CodeEmbedding` count after indexing, persist in `meta.json` `stats.embeddings`
- gitnexus/hooks/claude/gitnexus-hook.cjs: `handlePostToolUse()` for auto-reindex; refactor into modular functions
- gitnexus/src/cli/setup.ts: `~/.claude/settings.json` during `gitnexus setup`
- gitnexus/src/cli/ai-context.ts
- gitnexus-claude-plugin/hooks/gitnexus-hook.js
- gitnexus-claude-plugin/hooks/hooks.json
- README.md
- gitnexus/skills/gitnexus-cli.md
Design Decisions
Why PostToolUse hook instead of git post-commit hook?
- `core.hooksPath` or husky)
Why blocking (synchronous) instead of background?
Why 120s timeout?
Why persist embeddings count in meta.json?
- `MATCH (e:CodeEmbedding) RETURN count(e)`), which requires loading the native module
- `stats.embeddings` field already existed in the `RepoMeta` interface but was never populated
Robustness
Several hardening measures were applied during review:
- `/\bgit\s+(commit|merge)(\s|$)/` prevents false triggers on `git merge-base`, `git commit-tree`, etc.
- `child.error`/`child.signal` instead of relying on try/catch (spawnSync doesn't throw on timeout)
- `additionalContext` when exit code is 0; CLI errors no longer leak into agent context
- `gitnexus` binary via `which`/`where` once, then runs exactly once (prevents double execution)
- `shell: true` for npx fallback on Windows where `.cmd` files need a shell
🤖 Generated with Claude Code
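The `child.error`/`child.signal` inspection mentioned in the hardening list can be sketched as below. `spawnSync` reports a timeout through the returned object (an `error` whose `code` is `ETIMEDOUT`, plus the kill signal) rather than by throwing. The names `classifyResult` and `runWithTimeout` and the outcome labels are hypothetical, chosen for illustration only.

```javascript
const { spawnSync } = require('child_process');

// Classify a spawnSync result without relying on try/catch.
function classifyResult(child) {
  if (child.error) {
    // Timeouts surface here as an Error with code 'ETIMEDOUT';
    // other codes (e.g. ENOENT) mean the process could not be spawned.
    return child.error.code === 'ETIMEDOUT' ? 'timeout' : 'spawn-error';
  }
  if (child.signal) return 'killed'; // terminated by a signal
  return child.status === 0 ? 'ok' : 'failed';
}

// Hypothetical wrapper: run a command with a time limit and report the outcome.
function runWithTimeout(cmd, args, timeoutMs) {
  const child = spawnSync(cmd, args, { encoding: 'utf8', timeout: timeoutMs });
  return { outcome: classifyResult(child), stdout: child.stdout || '' };
}
```

A caller could then forward `stdout` to the agent only when `outcome` is `'ok'`, and fall back to a manual recovery command for every other outcome.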