feat(search): per-phase timing instrumentation for the query pipeline#953
Conversation
The eval harness already measures search-pipeline latency per phase, but the *product* query() tool has no timing visibility. That leaves production latency opaque: - Is BM25 the tail, or vector search? - How much Promise.all overlap do concurrent searches actually save? - Does symbol_lookup dominate when per-symbol Cypher round-trips pile up? None of this is answerable from the outside, which blocks the latency-quality Pareto work tracked in abhigyanpatwari#546 / abhigyanpatwari#553. Changes: * New PhaseTimer class at src/core/search/phase-timer.ts. Supports three APIs: - start(phase) / stop() for sequential phases (per issue spec) - mark(phase, durationMs) for pre-measured durations - time(phase, promise) to wrap a promise inside Promise.all The issue's original spec was sequential-only, which doesn't work for BM25 + vector inside Promise.all — the second start() would auto-stop the first and only one phase would get timed. The mark() and time() variants resolve that without changing the sequential API for the other phases. * local-backend.ts query() instrumented across seven phase markers: bm25, vector (concurrent via timer.time inside Promise.all) merge (RRF reciprocal-rank-fusion) symbol_lookup (per-symbol process + cohesion + content Cypher) ranking (in-memory priority sort) formatting (response object construction + dedup) wall (end-to-end; separate mark so callers can compare sum(phases) vs wall and see Promise.all savings) * logQueryTiming() helper next to logQueryError(), same console-based pattern (repo has no structured logger). Emits GitNexus [query:timing] query="..." totalMs=N phases={...} to stdout — greppable prefix, JSON-parseable payload, no new deps. * timing: Record<string, number> added as a top-level field on the query() response. Strict superset of the previous shape — existing tests only assert field presence, so no regression. Other MCP tools use the same top-level-metadata convention (status, row_count, warning) rather than a nested _meta wrapper. Tests: - 6 new unit tests for PhaseTimer covering start/stop, implicit stop-on-start, additive mark(), Promise.all-safe time(), negative/NaN rejection, and totalMs auto-stop. - 3 new assertions on the existing query integration test verifying timing.wall is a non-negative number and at least one of bm25/vector fired. Verification: npx vitest run test/unit/phase-timer.test.ts -> 6 pass npx vitest run test/unit/calltool-dispatch.test.ts -> 65 pass npx vitest run test/integration/local-backend-calltool.test.ts -> 18 pass npm run test:unit -> 3777 pass (4 pre-existing env failures unchanged: skip-git-cli needs built dist/, git-utils tmpdir on Windows worktree) npx tsc --noEmit -> clean Scope declined for v1: - In-process histogram aggregation — the log line is enough for external tooling - Pareto curve generation — issue asks to enable it, not generate it - Sub-phases of symbol_lookup (process vs cohesion vs content) — issue lists them under one bucket; can split later if demand surfaces Closes abhigyanpatwari#553
|
@azizur100389 is attempting to deploy a commit to the NexusCore Team on Vercel. A member of the Team first needs to authorize it. |
CI Report✅ All checks passed Pipeline Status
Test Results
✅ All 6513 tests passed 98 test(s) skipped — expand for details
Code CoverageTests
📋 View full run · Generated by CI |
…ontract CI (abhigyanpatwari#953) failed the `query: JSON appears on stdout, not stderr` e2e test in test/integration/cli-e2e.test.ts with: SyntaxError: Unexpected token 'G', "GitNexus [..." is not valid JSON Root cause: my initial logQueryTiming() in 63fbdc4 used console.log, which writes to stdout. The MCP stdio transport uses stdout exclusively for JSON-RPC responses (abhigyanpatwari#324), and the CLI e2e test guards that contract by asserting stdout parses as JSON on every tool invocation. The "GitNexus [query:timing] ..." line was interleaving with the response JSON and breaking the parse. Fix: route logQueryTiming through console.error instead. stderr is the correct channel for human-readable diagnostics and it is what the sibling logQueryError already uses for the same reason. The log line format is otherwise unchanged -- still greppable, still JSON-parseable payload. Verification (local, with dist built): npx vitest run test/integration/cli-e2e.test.ts -t "query: JSON" -> now passes (was failing across ubuntu/windows/macos in CI) npx tsc --noEmit -> clean Two unrelated pre-existing failures on non-git directory handling remain (same on upstream/main). Closes the CI regression introduced in 63fbdc4.
|
@claude Act as a senior reviewer for GitNexus. Your job is to determine whether this PR is production-ready for this repo, not to give a generic code review. You are reviewing a PR in the GitNexus monorepo:
Your task has 2 phases, in this exact order: PHASE 1 — DEFINE THE BAR PHASE 2 — REVIEW THE PR AGAINST THAT BAR CONTEXT TO LOAD FIRST
Additional context:
PRIMARY OBJECTIVE Do not optimize for completeness at the expense of signal. REVIEW RULES
For each finding, assign one severity:
REPO-SPECIFIC REVIEW CHECKLIST 1. Correctness & functional completenessCheck:
2. Code clarity & clean codeCheck:
3. Test coverage & change safetyEvaluate against TESTING.md:
4. PerformanceInspect for:
5. Operational riskCheck:
6. MaintainabilityCheck:
NON-NEGOTIABLES TO VERIFY
OUTPUT FORMAT Production-ready bar for this PRGive a concise repo-specific definition in 5–10 bullets. FindingsFor each finding, use this structure exactly: [BLOCKING|NON-BLOCKING|NIT] Short title
Only include findings that are real and supported.
What I could not fully verifyList any behavior you could not prove from the diff, tests, or available context. Final verdictChoose exactly one:
If verdict is not ready, list the blocking items in priority order. IMPORTANT REVIEW BEHAVIOR
|
|
Claude finished @magyargergo's task in 4m 46s —— View job PR Review: feat(search): per-phase timing instrumentation for the query pipelineTasks:
Production-ready bar for this PRGrounded in AGENTS.md, GUARDRAILS.md, CONTRIBUTING.md, TESTING.md, ARCHITECTURE.md for a change in
Findings1. Correctness & functional completenessNo issues found with correctness of the phase wiring or data flow. All seven phases from the PR description are instrumented as claimed. Verified:
NIT —
NIT —
2. Code clarity & clean codeNo issues found in this area based on the diff and reviewed context.
3. Test coverage & change safetyNON-BLOCKING — Integration test does not independently assert all 7 phases are present
NON-BLOCKING —
Unit tests for CI confirms: 4. PerformanceNo issues found in this area based on the diff and reviewed context.
5. Operational riskNIT — PR description says "stdout" but code correctly uses stderr
All other operational risk vectors: ✅
6. MaintainabilityNo issues found in this area based on the diff and reviewed context.
Non-negotiables check:
What I could not fully verify
Final verdict✅ Ready to merge — no blocking issues. The implementation is correct, CI is green across 3 platforms, the MCP contract is strictly preserved, and the |
|
Nice work! Thanks for you PR! |
…abhigyanpatwari#953) * feat(search): per-phase timing instrumentation for the query pipeline The eval harness already measures search-pipeline latency per phase, but the *product* query() tool has no timing visibility. That leaves production latency opaque: - Is BM25 the tail, or vector search? - How much Promise.all overlap do concurrent searches actually save? - Does symbol_lookup dominate when per-symbol Cypher round-trips pile up? None of this is answerable from the outside, which blocks the latency-quality Pareto work tracked in abhigyanpatwari#546 / abhigyanpatwari#553. Changes: * New PhaseTimer class at src/core/search/phase-timer.ts. Supports three APIs: - start(phase) / stop() for sequential phases (per issue spec) - mark(phase, durationMs) for pre-measured durations - time(phase, promise) to wrap a promise inside Promise.all The issue's original spec was sequential-only, which doesn't work for BM25 + vector inside Promise.all — the second start() would auto-stop the first and only one phase would get timed. The mark() and time() variants resolve that without changing the sequential API for the other phases. * local-backend.ts query() instrumented across seven phase markers: bm25, vector (concurrent via timer.time inside Promise.all) merge (RRF reciprocal-rank-fusion) symbol_lookup (per-symbol process + cohesion + content Cypher) ranking (in-memory priority sort) formatting (response object construction + dedup) wall (end-to-end; separate mark so callers can compare sum(phases) vs wall and see Promise.all savings) * logQueryTiming() helper next to logQueryError(), same console-based pattern (repo has no structured logger). Emits GitNexus [query:timing] query="..." totalMs=N phases={...} to stdout — greppable prefix, JSON-parseable payload, no new deps. * timing: Record<string, number> added as a top-level field on the query() response. Strict superset of the previous shape — existing tests only assert field presence, so no regression. Other MCP tools use the same top-level-metadata convention (status, row_count, warning) rather than a nested _meta wrapper. Tests: - 6 new unit tests for PhaseTimer covering start/stop, implicit stop-on-start, additive mark(), Promise.all-safe time(), negative/NaN rejection, and totalMs auto-stop. - 3 new assertions on the existing query integration test verifying timing.wall is a non-negative number and at least one of bm25/vector fired. Verification: npx vitest run test/unit/phase-timer.test.ts -> 6 pass npx vitest run test/unit/calltool-dispatch.test.ts -> 65 pass npx vitest run test/integration/local-backend-calltool.test.ts -> 18 pass npm run test:unit -> 3777 pass (4 pre-existing env failures unchanged: skip-git-cli needs built dist/, git-utils tmpdir on Windows worktree) npx tsc --noEmit -> clean Scope declined for v1: - In-process histogram aggregation — the log line is enough for external tooling - Pareto curve generation — issue asks to enable it, not generate it - Sub-phases of symbol_lookup (process vs cohesion vs content) — issue lists them under one bucket; can split later if demand surfaces Closes abhigyanpatwari#553 * fix(search): route query:timing log to stderr to preserve stdio MCP contract CI (abhigyanpatwari#953) failed the `query: JSON appears on stdout, not stderr` e2e test in test/integration/cli-e2e.test.ts with: SyntaxError: Unexpected token 'G', "GitNexus [..." is not valid JSON Root cause: my initial logQueryTiming() in 63fbdc4 used console.log, which writes to stdout. The MCP stdio transport uses stdout exclusively for JSON-RPC responses (abhigyanpatwari#324), and the CLI e2e test guards that contract by asserting stdout parses as JSON on every tool invocation. The "GitNexus [query:timing] ..." line was interleaving with the response JSON and breaking the parse. Fix: route logQueryTiming through console.error instead. stderr is the correct channel for human-readable diagnostics and it is what the sibling logQueryError already uses for the same reason. The log line format is otherwise unchanged -- still greppable, still JSON-parseable payload. Verification (local, with dist built): npx vitest run test/integration/cli-e2e.test.ts -t "query: JSON" -> now passes (was failing across ubuntu/windows/macos in CI) npx tsc --noEmit -> clean Two unrelated pre-existing failures on non-git directory handling remain (same on upstream/main). Closes the CI regression introduced in 63fbdc4.
Closes #553.
Problem
The eval harness already measures search-pipeline latency per phase, but the product `query()` tool has no timing visibility. That leaves production latency opaque:
None of this is answerable from the outside today, which blocks the latency-quality Pareto work tracked in the parent RFC (#546 / #553).
What changed
New `PhaseTimer` class at gitnexus/src/core/search/phase-timer.ts:
Deviation from issue spec
The issue's original spec was sequential-only (`start` auto-stops the previous phase). That doesn't work for BM25 + vector inside `Promise.all` — the second `start` would auto-stop the first and only one phase would get timed. I added `mark(phase, durationMs)` + `time(phase, promise)` variants that record wall time without touching the current phase slot, so both concurrent searches get independently timed. The sequential `start`/`stop` API is unchanged and is still used for the other four phases.
`query()` instrumentation — seven phase markers in gitnexus/src/mcp/local/local-backend.ts:
`logQueryTiming()` helper next to `logQueryError()`, same plain-`console` pattern (repo has no structured logger). Emits:
```
GitNexus [query:timing] query="..." totalMs=402 phases={"bm25":42,"vector":315,"merge":2,"symbol_lookup":18,"ranking":5,"formatting":2,"wall":320}
```
Greppable prefix, JSON-parseable payload, no new dependencies.
Response shape — `timing: Record<string, number>` added as a top-level field alongside `processes` / `process_symbols` / `definitions` / `warning`. Strict superset; existing fields unchanged.
Backward compat
Scope declined for v1
Test plan