feat(search): per-phase timing instrumentation for the query pipeline by azizur100389 · Pull Request #953 · abhigyanpatwari/GitNexus

azizur100389 · 2026-04-18T11:29:46Z

Closes #553.

Problem

The eval harness already measures search-pipeline latency per phase, but the product `query()` tool has no timing visibility. That leaves production latency opaque:

Is BM25 the tail, or vector search?
How much `Promise.all` overlap do concurrent searches actually save?
Does `symbol_lookup` dominate when per-symbol Cypher round-trips pile up?

None of this is answerable from the outside today, which blocks the latency-quality Pareto work tracked in the parent RFC (#546 / #553).

What changed

New `PhaseTimer` class at gitnexus/src/core/search/phase-timer.ts:

const t = new PhaseTimer();
t.start('merge'); ...; t.stop();                 // sequential phases
await Promise.all([
  t.time('bm25', bm25Search(...)),               // concurrent phases
  t.time('vector', semanticSearch(...)),
]);
t.mark('wall', performance.now() - start);       // pre-measured durations
t.summary();  // -> { bm25, vector, merge, wall, ... } rounded to 0.1ms

Deviation from issue spec

The issue's original spec was sequential-only (`start` auto-stops the previous phase). That doesn't work for BM25 + vector inside `Promise.all` — the second `start` would auto-stop the first and only one phase would get timed. I added `mark(phase, durationMs)` + `time(phase, promise)` variants that record wall time without touching the current phase slot, so both concurrent searches get independently timed. The sequential `start`/`stop` API is unchanged and is still used for the other four phases.

`query()` instrumentation — seven phase markers in gitnexus/src/mcp/local/local-backend.ts:

Phase	Measured with	What it covers
`bm25`	`timer.time` (concurrent)	BM25 FTS search
`vector`	`timer.time` (concurrent)	Semantic/embedding search
`merge`	`timer.start/stop`	Inline RRF reciprocal-rank fusion
`symbol_lookup`	`timer.start/stop`	Per-symbol process + cohesion + content Cypher round-trips
`ranking`	`timer.start/stop`	In-memory process priority sort
`formatting`	`timer.start/stop`	Response object construction + dedup
`wall`	`timer.mark`	End-to-end wall time; compare vs sum(phases) to see `Promise.all` overlap savings

`logQueryTiming()` helper next to `logQueryError()`, same plain-`console` pattern (repo has no structured logger). Emits:

```
GitNexus [query:timing] query="..." totalMs=402 phases={"bm25":42,"vector":315,"merge":2,"symbol_lookup":18,"ranking":5,"formatting":2,"wall":320}
```

Greppable prefix, JSON-parseable payload, no new dependencies.

Response shape — `timing: Record<string, number>` added as a top-level field alongside `processes` / `process_symbols` / `definitions` / `warning`. Strict superset; existing fields unchanged.

Backward compat

Response shape: strict superset. Existing callers ignore the new `timing` field. Existing integration test asserted field presence only — no regression.
No new dependencies. `performance.now()` is a Node builtin.
No behavioural change to search results. PhaseTimer is passive — observes phase boundaries, never alters what they return.
Log output is additive. New `GitNexus [query:timing] …` lines on stdout alongside existing error lines on stderr; trivial to filter.
Rollback: pure code revert, no side effects, no data migration.

Scope declined for v1

In-process histogram aggregation — the structured log line is enough for external tooling to aggregate.
Pareto curve generation — issue asks to enable it, not generate it.
Sub-phases of `symbol_lookup` (process vs cohesion vs content) — the issue lists them under one bucket; can split later if demand surfaces.

Test plan

`npx vitest run test/unit/phase-timer.test.ts` → 6 pass (new — sequential / Promise.all / mark additivity / NaN rejection / auto-stop)
`npx vitest run test/unit/calltool-dispatch.test.ts` → 65 pass (no regression on mocked dispatch)
`npx vitest run test/integration/local-backend-calltool.test.ts` → 18 pass (includes 3 new `timing` assertions on the existing `query` test)
`npm run test:unit` → 3777 pass (same 4 pre-existing env failures: `skip-git-cli.test.ts` needs built `dist/`, `git-utils.test.ts` Windows worktree tmpdir)
`npx tsc --noEmit` → clean
Pre-commit hook (lint-staged + eslint + prettier + tsc) → passed

The eval harness already measures search-pipeline latency per phase, but the *product* query() tool has no timing visibility. That leaves production latency opaque: - Is BM25 the tail, or vector search? - How much Promise.all overlap do concurrent searches actually save? - Does symbol_lookup dominate when per-symbol Cypher round-trips pile up? None of this is answerable from the outside, which blocks the latency-quality Pareto work tracked in abhigyanpatwari#546 / abhigyanpatwari#553. Changes: * New PhaseTimer class at src/core/search/phase-timer.ts. Supports three APIs: - start(phase) / stop() for sequential phases (per issue spec) - mark(phase, durationMs) for pre-measured durations - time(phase, promise) to wrap a promise inside Promise.all The issue's original spec was sequential-only, which doesn't work for BM25 + vector inside Promise.all — the second start() would auto-stop the first and only one phase would get timed. The mark() and time() variants resolve that without changing the sequential API for the other phases. * local-backend.ts query() instrumented across seven phase markers: bm25, vector (concurrent via timer.time inside Promise.all) merge (RRF reciprocal-rank-fusion) symbol_lookup (per-symbol process + cohesion + content Cypher) ranking (in-memory priority sort) formatting (response object construction + dedup) wall (end-to-end; separate mark so callers can compare sum(phases) vs wall and see Promise.all savings) * logQueryTiming() helper next to logQueryError(), same console-based pattern (repo has no structured logger). Emits GitNexus [query:timing] query="..." totalMs=N phases={...} to stdout — greppable prefix, JSON-parseable payload, no new deps. * timing: Record<string, number> added as a top-level field on the query() response. Strict superset of the previous shape — existing tests only assert field presence, so no regression. Other MCP tools use the same top-level-metadata convention (status, row_count, warning) rather than a nested _meta wrapper. Tests: - 6 new unit tests for PhaseTimer covering start/stop, implicit stop-on-start, additive mark(), Promise.all-safe time(), negative/NaN rejection, and totalMs auto-stop. - 3 new assertions on the existing query integration test verifying timing.wall is a non-negative number and at least one of bm25/vector fired. Verification: npx vitest run test/unit/phase-timer.test.ts -> 6 pass npx vitest run test/unit/calltool-dispatch.test.ts -> 65 pass npx vitest run test/integration/local-backend-calltool.test.ts -> 18 pass npm run test:unit -> 3777 pass (4 pre-existing env failures unchanged: skip-git-cli needs built dist/, git-utils tmpdir on Windows worktree) npx tsc --noEmit -> clean Scope declined for v1: - In-process histogram aggregation — the log line is enough for external tooling - Pareto curve generation — issue asks to enable it, not generate it - Sub-phases of symbol_lookup (process vs cohesion vs content) — issue lists them under one bucket; can split later if demand surfaces Closes abhigyanpatwari#553

vercel · 2026-04-18T11:29:51Z

@azizur100389 is attempting to deploy a commit to the NexusCore Team on Vercel.

A member of the Team first needs to authorize it.

github-actions · 2026-04-18T11:41:46Z

CI Report

✅ All checks passed

Pipeline Status

Stage	Status	Details
✅ Typecheck	`success`	tsc --noEmit
✅ Tests	`success`	unit tests, 3 platforms
✅ E2E	`success`	gitnexus-web changes only

Test Results

Tests	Passed	Failed	Skipped	Duration
6612	6513	0	98	248s

✅ All 6513 tests passed

98 test(s) skipped — expand for details

withTestLbugDB(java-class-impact) > Bug 2 fix: context() on Class shows non-empty incoming > contrast: context() on a Method still works (regression check)
withTestLbugDB(java-class-impact) > Bug 2 fix: context() on Class shows non-empty incoming > RankPermissionHandler: incoming shows caller and importers
Swift MethodExtractor > isTypeDeclaration > recognizes class_declaration
Swift MethodExtractor > isTypeDeclaration > recognizes protocol_declaration
Swift MethodExtractor > isTypeDeclaration > rejects import_declaration
Swift MethodExtractor > visibility > extracts public method
Swift MethodExtractor > visibility > extracts private method
Swift MethodExtractor > visibility > defaults to internal when no modifier
Swift MethodExtractor > protocol methods > marks protocol method as abstract
Swift MethodExtractor > static and class methods > detects static func as isStatic
Swift MethodExtractor > static and class methods > detects class func as isStatic
Swift MethodExtractor > parameters > extracts parameters with types and default values
Swift MethodExtractor > return type > extracts return type from -> annotation
Swift MethodExtractor > annotations > extracts @objc attribute
Swift MethodExtractor > isFinal > detects final func
Swift MethodExtractor > isFinal > is false when not final
Swift MethodExtractor > isAsync > detects async func
Swift MethodExtractor > isOverride > detects override method
buildTypeEnv > constructor inference (Tier 1 fallback) > lookupClassByName regression coverage > Swift lookupClassByName regression coverage > Swift cross-file constructor inference uses lookupClassByName
buildTypeEnv > constructor inference (Tier 1 fallback) > lookupClassByName regression coverage > Swift lookupClassByName regression coverage > Swift explicit init inference uses lookupClassByName
buildTypeEnv > constructor inference (Tier 1 fallback) > lookupClassByName regression coverage > Swift lookupClassByName regression coverage > Swift cross-file constructor inference does not bind plain functions
buildTypeEnv > known limitations (documented skip tests) > Ruby block parameter: users.each { |user| } — closure param inference, different feature
Swift constructor-inferred type resolution > detects User and Repo classes, both with save methods
Swift constructor-inferred type resolution > resolves user.save() to Models/User.swift via constructor-inferred type
Swift constructor-inferred type resolution > resolves repo.save() to Models/Repo.swift via constructor-inferred type
Swift constructor-inferred type resolution > emits exactly 2 save() CALLS edges (one per receiver type)
Swift self resolution > detects User and Repo classes, each with a save function
Swift self resolution > resolves self.save() inside User.process to User.save, not Repo.save
Swift parent resolution > detects BaseModel and User classes plus Serializable protocol
Swift parent resolution > emits EXTENDS edge: User → BaseModel
Swift parent resolution > emits IMPLEMENTS edge: User → Serializable (protocol conformance)
Swift cross-file User.init() inference > resolves user.save() via User.init(name:) inference
Swift cross-file User.init() inference > resolves user.greet() via User.init(name:) inference
Swift return type inference > detects User class and getUser function
Swift return type inference > detects save function on User (Swift class methods are Function nodes)
Swift return type inference > resolves user.save() to User#save via return type of getUser() -> User
Swift return-type inference via function return type > resolves user.save() to User#save via return type of getUser()
Swift return-type inference via function return type > user.save() does NOT resolve to Repo#save
Swift return-type inference via function return type > resolves repo.save() to Repo#save via return type of getRepo()
Swift implicit imports (cross-file visibility) > detects UserService class in Models.swift
Swift implicit imports (cross-file visibility) > resolves UserService() constructor call across files (no explicit import)
Swift implicit imports (cross-file visibility) > resolves service.fetchUser() member call across files
Swift implicit imports (cross-file visibility) > creates IMPORTS edges between files in the same module
Swift extension deduplication > detects Product class
Swift extension deduplication > resolves Product() constructor despite extension creating duplicate class node
Swift extension deduplication > resolves product.save() to Product.swift (primary definition)
Swift constructor call fallback (no new keyword) > resolves OCRService() as constructor call across files
Swift constructor call fallback (no new keyword) > resolves ocr.recognize() member call via constructor-inferred type
Swift export visibility (internal vs private) > resolves PublicService() constructor across files
Swift export visibility (internal vs private) > resolves internalHelper() across files (internal = module-scoped)
Swift if let / guard let binding resolution > detects User and Repo classes
Swift if let / guard let binding resolution > resolves user.save() inside if-let to User#save
Swift if let / guard let binding resolution > resolves repo.save() inside guard-let to Repo#save
Swift if let / guard let binding resolution > user.save() in if-let does NOT resolve to Repo#save
Swift await / try expression unwrapping > resolves user.save() via await fetchUser() return type
Swift await / try expression unwrapping > resolves repo.save() via try parseRepo() return type
Swift await / try expression unwrapping > detects fetchUser and parseRepo as functions
Swift for-in loop element type inference > detects User and Repo classes
Swift for-in loop element type inference > creates implicit import edges between files
Swift field-type resolution > detects classes and their properties
Swift field-type resolution > emits HAS_PROPERTY edges from class to field
Swift field-type resolution > resolves field-chain call user.address.save() → Address#save
Swift field-type resolution > emits ACCESSES edges for field reads in chains
Swift field-type resolution > populates field metadata (visibility, declaredType) on Property nodes
Swift call-result binding > resolves call-result-bound method call user.save() → User#save
Swift call-result binding > getUser() is present as a defined function
Swift call-result binding > emits processUser -> getUser CALLS edge for let-assigned free function call
Swift method enrichment > detects Animal protocol and Dog class
Swift method enrichment > emits IMPLEMENTS edge Dog -> Animal
Swift method enrichment > emits HAS_METHOD edges for Dog methods
Swift method enrichment > marks protocol Animal.speak as isAbstract
Swift method enrichment > marks Dog.speak as NOT isAbstract
Swift method enrichment > marks breathe as isFinal
Swift method enrichment > marks classify as isStatic
Swift method enrichment > captures @objc annotation on breathe
Swift method enrichment > populates parameterTypes for classify(_ name: String)
Swift method enrichment > records parameterCount for classify
Swift method enrichment > records returnType for speak
Swift method enrichment > resolves dog.speak() CALLS edge
Swift method enrichment > resolves Dog.classify("dog") CALLS edge
Swift abstract dispatch > detects Repository protocol and SqlRepository class
Swift abstract dispatch > emits IMPLEMENTS edge SqlRepository -> Repository
Swift abstract dispatch > emits HAS_METHOD edges for Repository.find and Repository.save
Swift abstract dispatch > emits HAS_METHOD edges for SqlRepository.find and SqlRepository.save
Swift abstract dispatch > marks base Repository.find as isAbstract
Swift abstract dispatch > marks base Repository.save as isAbstract
Swift abstract dispatch > marks concrete SqlRepository.find as NOT isAbstract
Swift abstract dispatch > resolves repo.find(id: 42) CALLS edge
Swift abstract dispatch > resolves repo.save(entity: user) CALLS edge
Swift abstract dispatch > populates parameterTypes for Repository.find
Swift abstract dispatch > populates parameterTypes for Repository.save
Swift abstract dispatch > records returnType for SqlRepository.find
Swift abstract dispatch > emits METHOD_IMPLEMENTS edges from SqlRepository methods → Repository protocol methods
Swift overloaded method disambiguation > detects 2 distinct find Method nodes on SqlRepository
Swift overloaded method disambiguation > emits METHOD_IMPLEMENTS edges for both find overloads
Swift overloaded method disambiguation > emits METHOD_IMPLEMENTS edge for save
Swift overloaded method disambiguation > emits exactly 3 METHOD_IMPLEMENTS edges total
Swift Child extends Parent — inherited method resolution (SM-9) > detects Parent and Child classes
Swift Child extends Parent — inherited method resolution (SM-9) > resolves c.parentMethod() to Parent.parentMethod via first-wins MRO walk

Code Coverage

Tests

Metric	Coverage	Covered	Base	Delta	Status
Statements	73.25%	17893/24424	73.22%	📈 +0.0	🟢 ██████████████░░░░░░
Branches	62.25%	11350/18232	62.24%	📈 +0.0	🟢 ████████████░░░░░░░░
Functions	77.85%	1695/2177	77.82%	📈 +0.0	🟢 ███████████████░░░░░
Lines	75.82%	16224/21396	75.78%	📈 +0.0	🟢 ███████████████░░░░░

_{📋 View full run · Generated by CI}

…ontract CI (abhigyanpatwari#953) failed the `query: JSON appears on stdout, not stderr` e2e test in test/integration/cli-e2e.test.ts with: SyntaxError: Unexpected token 'G', "GitNexus [..." is not valid JSON Root cause: my initial logQueryTiming() in 63fbdc4 used console.log, which writes to stdout. The MCP stdio transport uses stdout exclusively for JSON-RPC responses (abhigyanpatwari#324), and the CLI e2e test guards that contract by asserting stdout parses as JSON on every tool invocation. The "GitNexus [query:timing] ..." line was interleaving with the response JSON and breaking the parse. Fix: route logQueryTiming through console.error instead. stderr is the correct channel for human-readable diagnostics and it is what the sibling logQueryError already uses for the same reason. The log line format is otherwise unchanged -- still greppable, still JSON-parseable payload. Verification (local, with dist built): npx vitest run test/integration/cli-e2e.test.ts -t "query: JSON" -> now passes (was failing across ubuntu/windows/macos in CI) npx tsc --noEmit -> clean Two unrelated pre-existing failures on non-git directory handling remain (same on upstream/main). Closes the CI regression introduced in 63fbdc4.

magyargergo · 2026-04-18T15:23:05Z

@claude Act as a senior reviewer for GitNexus. Your job is to determine whether this PR is production-ready for this repo, not to give a generic code review.

You are reviewing a PR in the GitNexus monorepo:

gitnexus/ → CLI + MCP
gitnexus-web/
gitnexus-shared/

Your task has 2 phases, in this exact order:

PHASE 1 — DEFINE THE BAR
Before reviewing the diff, establish a concise repo-specific definition of “production-ready” for GitNexus, based only on the repo docs and the affected area.
Keep this definition practical and reviewable. Do not invent standards that are not grounded in the repo.

PHASE 2 — REVIEW THE PR AGAINST THAT BAR
Review the actual diff only after defining the bar.
Stay tightly scoped to the changed code and its direct consequences.

CONTEXT TO LOAD FIRST
Read these before reviewing:

AGENTS.md
GUARDRAILS.md
CONTRIBUTING.md
TESTING.md
ARCHITECTURE.md

Additional context:

If the diff touches ingestion, also read: ARCHITECTURE.md § Pipeline Phase DAG
PR [group/sync] Fix ManifestExtractor never called — config.links always produced 0 cross-links #827 diff area is primarily: gitnexus/src/core/group/ (sync + manifest extractor) plus associated tests

PRIMARY OBJECTIVE
Decide whether this PR is safe, correct, maintainable, and operationally acceptable to merge into production for GitNexus.

Do not optimize for completeness at the expense of signal.
Do not pad the review.
Do not propose unrelated refactors.
Do not restate the PR description unless needed for verification.

REVIEW RULES

Every finding must be grounded in specific evidence from the diff or directly relevant surrounding code.
Every finding must include path:line.
If you make a behavioral claim, cite the code that proves it.
If you make a performance claim, explain the mechanism.
If something cannot be verified from the diff alone, explicitly say so.
Distinguish clearly between:
- verified issue
- plausible risk
- unverified concern
Avoid vague wording like “might be better” or “could be improved” unless you explain exactly why.
Keep the review focused on this PR’s scope only.

For each finding, assign one severity:

BLOCKING → must be fixed before merge
NON-BLOCKING → valid issue, but merge may still be acceptable
NIT → stylistic/minor, not merge-relevant

REPO-SPECIFIC REVIEW CHECKLIST
Use these exact headings.

1. Correctness & functional completeness

Check:

Does the implementation actually satisfy the PR claim?
- ManifestExtractor is truly invoked
- config.links produces non-zero cross-links where expected
Resolver contracts are preserved:
- resolveSymbol remains exact-match
- label-scoped Cypher remains correct per contract type
- flag any regression toward fuzzy or unscoped matching
Graph schema integrity is preserved:
- no silent changes to node labels
- no silent changes to edge types
- no silent changes to ID generation (generateId)
Call out any missing wiring, partial integration, dead branch, or mismatch between tests and runtime behavior

2. Code clarity & clean code

Check:

naming quality
local cohesion
dead code
unnecessary abstraction
hidden control flow
confusing indirection
adherence to repo conventions:
- direct imports from gitnexus-shared
- no barrel re-export regressions
- no // removed comments
- no unused re-exports
no drive-by refactors outside stated scope per CONTRIBUTING.md and GUARDRAILS.md § Scope

3. Test coverage & change safety

Evaluate against TESTING.md:

Are there unit tests under gitnexus/test/unit/ covering the newly wired path?
Is there a regression guard for 0-link → N-link behavior?
Are assertions meaningful rather than tautological?
Are fixtures realistic for manifest inputs?
If memoization/cache was introduced, is there a test proving hit/miss behavior and correctness?
Is there evidence the expected validation path would pass for staged gitnexus/ files?
- tsc --noEmit
- vitest run --project default
  If not verifiable, say exactly what is missing.

4. Performance

Inspect for:

hot-path overhead in ingestion/group sync
excess allocations per manifest entry
redundant Cypher round-trips
missed batching or missed parallelism (Promise.all) where it materially matters
O(n²) or repeated lookup patterns on large repos
memoization tradeoffs:
- correctness
- invalidation
- bounded vs unbounded memory growth
  Do not speculate casually; explain the mechanism and likely impact.

5. Operational risk

Check:

Windows/cross-platform safety:
- stream lifecycle
- FD/file handle lifecycle
- path separator assumptions
- anything resembling prior ENOTEMPTY-style lifecycle regressions
LadybugDB single-writer invariant is preserved
Embeddings preservation:
- no silent breakage of --embeddings
- .gitnexus/meta.json.stats.embeddings not silently zeroed by changed paths
MCP contracts remain compatible:
- group_*
- query
- context
- impact
- detect_changes
- rename
- cypher
  Flag any schema or contract break without migration note
staleness behavior still triggers correctly (gitnexus/src/mcp/staleness.ts)
rollback safety:
- can this PR be reverted safely without re-analyze?
- if not, explain why

6. Maintainability

Check:

Does the change respect Pipeline Phase DAG boundaries?
Is cross-phase coupling being introduced?
Are manifest-extractor entry points discoverable from group/sync.ts?
Would the next contributor know where to extend this?
Is shared logic placed appropriately:
- in gitnexus-shared/
- or gitnexus/src/core/**/utils.ts
- and not duplicated between CLI and web bridge (server/api.ts)?

NON-NEGOTIABLES TO VERIFY
From GUARDRAILS.md / CONTRIBUTING.md:

no secrets, tokens, or machine-specific paths committed
no blind find-and-replace renames
if a symbol rename occurred, rename MCP tool or equivalent careful review was used
upstream impact was considered for modified shared symbols
no HIGH/CRITICAL impact left unaddressed
PR title follows conventional commit style + [area] Short description
PolyForm Noncommercial license headers/metadata untouched

OUTPUT FORMAT

Production-ready bar for this PR

Give a concise repo-specific definition in 5–10 bullets.

Findings

For each finding, use this structure exactly:

[BLOCKING|NON-BLOCKING|NIT] Short title

Evidence: path:line, path:line
Why it matters: ...
Recommendation: ...
Confidence: High / Medium / Low

Only include findings that are real and supported.
If there are no findings in a section, say:

No issues found in this area based on the diff and reviewed context.

What I could not fully verify

List any behavior you could not prove from the diff, tests, or available context.

Final verdict

Choose exactly one:

✅ Ready to merge — no blocking issues.
⚠️ Merge after addressing blocking items.
❌ Not ready — fundamental concerns.

If verdict is not ready, list the blocking items in priority order.

IMPORTANT REVIEW BEHAVIOR

Be strict on correctness and contracts.
Be conservative about operational risk.
Be explicit about uncertainty.
Do not suggest unrelated refactors.
Do not turn nits into blockers.
Do not reward intent; review actual implementation.

github-actions · 2026-04-18T15:23:19Z

Claude finished @magyargergo's task in 4m 46s —— View job

PR Review: feat(search): per-phase timing instrumentation for the query pipeline

Tasks:

Initialize review
Read repo docs: AGENTS.md, GUARDRAILS.md, CONTRIBUTING.md, TESTING.md, ARCHITECTURE.md
Read changed files: phase-timer.ts, local-backend.ts, test files
Define production-ready bar (Phase 1)
Review PR against that bar (Phase 2)
Post final verdict

Production-ready bar for this PR

Grounded in AGENTS.md, GUARDRAILS.md, CONTRIBUTING.md, TESTING.md, ARCHITECTURE.md for a change in gitnexus/src/core/search/ (new utility class) + gitnexus/src/mcp/local/local-backend.ts (MCP query() handler):

TypeScript compiles cleanly (tsc --noEmit), unit + integration tests pass in CI
MCP query tool response contract is strictly additive — no existing fields removed or type-changed
Log output is routed to stderr, not stdout, to preserve the MCP stdio JSON-RPC transport (ARCHITECTURE.md § MCP; #324)
No new runtime dependencies; performance.now() is a Node.js ≥16 global (project requires ≥20)
Change is scoped to stated purpose — no drive-by refactors (GUARDRAILS.md § Scope, CONTRIBUTING.md)
No secrets, tokens, or machine-specific paths committed
PR title follows conventional-commit format (feat(scope): subject)
PhaseTimer is passive — never mutates the data flowing through query(), only observes phase boundaries
Rollback is a pure code revert: no data migration, no schema change, no side effects

Findings

1. Correctness & functional completeness

No issues found with correctness of the phase wiring or data flow. All seven phases from the PR description are instrumented as claimed. Verified:

bm25 / vector via timer.time() inside Promise.all — correct choice; avoids the single-slot conflict that start/stop would have ✅
merge, symbol_lookup, ranking, formatting via start/stop — called unconditionally, so they always appear in summary() ✅
wall via timer.mark(...) before summary() — correctly recorded last ✅
timing added to return object as a strict superset (local-backend.ts:799) ✅

NIT — totalMs is redundant in the log line

Evidence: local-backend.ts:175 — const totalMs = phases.wall ?? Object.values(phases).reduce(...). Since wall is unconditionally marked before summary(), phases.wall is always defined, so totalMs always equals phases.wall. The log emits totalMs=X phases={...,wall:X}, making totalMs a duplicate of wall in the output.
Why it matters: Log consumers expecting totalMs to represent the arithmetic sum of phases (to compute overlap savings) will get wall instead. The reduce fallback path is dead.
Recommendation: Either (a) compute totalMs as Object.values(phases).reduce(...) excluding wall, or (b) remove totalMs from the log line and document that phases.wall is the end-to-end figure. Not merge-blocking.
Confidence: High

NIT — time() misses the synchronous preamble of each async helper

Evidence: local-backend.ts:573-575 — JS evaluates this.bm25Search(...) (argument) before calling timer.time(...), so the async function's synchronous pre-await lines execute before t0 = performance.now() is captured in phase-timer.ts:74.
Why it matters: The timer slightly under-reports both phases. In practice the preamble is negligible (a handful of lines before the first DB I/O), so this does not materially affect the Pareto analysis the feature is designed for.
Recommendation: No code change needed; document the caveat in the time() JSDoc if future maintainers extend the helper to phases with heavier synchronous setup.
Confidence: High

2. Code clarity & clean code

No issues found in this area based on the diff and reviewed context.

PhaseTimer lives in src/core/search/ — correct placement per ARCHITECTURE.md (Search ranking concern)
logQueryTiming follows the exact pattern of logQueryError directly above it (same prefix scheme, same stderr, same file)
No dead code from the PR (the totalMs() method is exercised in phase-timer.test.ts:63)
No // removed comments, no unused re-exports, no barrel regressions

3. Test coverage & change safety

NON-BLOCKING — Integration test does not independently assert all 7 phases are present

Evidence: test/integration/local-backend-calltool.test.ts:108-114 — asserts timing defined, wall is a number, and bm25 ?? vector >= 0. The sequential phases (merge, symbol_lookup, ranking, formatting) are never asserted.
Why it matters: If a future refactor silently drops a phase timer call (e.g., wraps the merge block in an early-return path), the integration test would not catch it. The unit tests cover PhaseTimer mechanics exhaustively, but the integration-level wiring of all 7 phases has no assertion.
Recommendation: Add expect(typeof result.timing.merge).toBe('number') etc. for the four sequential phases — one line each. Merge-acceptable without this.
Confidence: High

NON-BLOCKING — bm25 ?? vector assertion doesn't independently verify both

Evidence: local-backend-calltool.test.ts:114 — expect(result.timing.bm25 ?? result.timing.vector).toBeGreaterThanOrEqual(0). Because bm25 is always defined (it always fires), the ?? vector branch is never exercised by this assertion.
Why it matters: If vector were accidentally omitted, this assertion would not catch it.
Recommendation: Use expect(result.timing.bm25).toBeGreaterThanOrEqual(0) and expect(result.timing.vector).toBeGreaterThanOrEqual(0) as separate assertions.
Confidence: High

Unit tests for PhaseTimer itself are solid: 6 cases covering all API surface including concurrent usage, additive marking, NaN/negative rejection, and auto-stop behaviour. ✅

CI confirms: tsc --noEmit clean, 6513 tests pass across 3 platforms. ✅

4. Performance

No issues found in this area based on the diff and reviewed context.

PhaseTimer overhead per query() call: ~7 performance.now() calls + ~7 Map.set/get operations. Negligible vs the sequential Cypher round-trips in symbol_lookup that dominate the pipeline (up to ~100 DB round-trips for searchLimit=50 results). The pre-existing sequential-per-symbol Cypher pattern is outside scope of this PR; notably, this timer is precisely the tool that will expose it in production.

5. Operational risk

NIT — PR description says "stdout" but code correctly uses stderr

Evidence: local-backend.ts:177 uses console.error; PR body says "New GitNexus [query:timing] … lines on stdout". The code is correct (commit f605a26 fixed an initial console.log). The description was not updated after the fix.
Why it matters: No runtime impact. Misleads future readers of the PR description into thinking the log lands on stdout, which would corrupt the MCP stdio transport if replicated.
Recommendation: Update the PR description for accuracy before merge (cosmetic only).
Confidence: High

All other operational risk vectors: ✅

Windows/cross-platform: performance.now() and Map are cross-platform. No path separators, no file handles, no stream lifecycle.
LadybugDB single-writer: Not touched.
Embeddings: Not touched.
MCP contracts: query response is a strict superset. group_*, context, impact, detect_changes, rename, cypher are not touched.
Staleness (staleness.ts): Not touched.
Rollback: Pure code revert, zero side effects.

6. Maintainability

No issues found in this area based on the diff and reviewed context.

PhaseTimer is self-contained with clear JSDoc covering all three usage patterns. A future contributor extending the pipeline can drop in timer.start/stop or timer.time with no additional plumbing.
No cross-phase DAG coupling: PhaseTimer is local to query() and never touches PipelinePhase or pipeline-phase DAG types.
Correctly placed in src/core/search/ (not gitnexus-shared/ — search-pipeline-specific, not a shared type), not duplicated in server/api.ts.

Non-negotiables check:

No secrets, tokens, or machine-specific paths ✅
No blind find-and-replace renames ✅
No symbol renames (additive only) ✅
No HIGH/CRITICAL impact unaddressed ✅
PR title feat(search): ... — conventional-commit compliant ✅
PolyForm license untouched ✅

What I could not fully verify

Whether downstream consumers outside this repo (e.g. eval harness, external integrations parsing query responses) access the response by field and could be affected by the new timing key. Grep found no hits in gitnexus-web/ for process_symbols/definitions as typed consumers; the web UI does not appear to consume the MCP query tool response directly. Low confidence this is a risk.
Accurate bm25/vector phase reporting in production — whether the ~microsecond synchronous preamble miss in timer.time() is perceptible in flame-graph-style aggregation. Almost certainly not, but unverifiable without profiling data.
The logQueryTiming log line's totalMs vs sum(phases) divergence in practice — specifically, whether any log-scraping pipeline built on the documented format will misinterpret totalMs as the sum rather than wall.

Final verdict

✅ Ready to merge — no blocking issues.

The implementation is correct, CI is green across 3 platforms, the MCP contract is strictly preserved, and the PhaseTimer class is a clean, testable abstraction that matches repo conventions. The two nits on test assertions (phases not fully enumerated in the integration test) and the totalMs/wall redundancy are acceptable for an additive observability feature. The PR description's "stdout" claim is stale and should be corrected, but does not affect the code.

magyargergo · 2026-04-18T15:30:01Z

Nice work! Thanks for you PR!

…abhigyanpatwari#953) * feat(search): per-phase timing instrumentation for the query pipeline The eval harness already measures search-pipeline latency per phase, but the *product* query() tool has no timing visibility. That leaves production latency opaque: - Is BM25 the tail, or vector search? - How much Promise.all overlap do concurrent searches actually save? - Does symbol_lookup dominate when per-symbol Cypher round-trips pile up? None of this is answerable from the outside, which blocks the latency-quality Pareto work tracked in abhigyanpatwari#546 / abhigyanpatwari#553. Changes: * New PhaseTimer class at src/core/search/phase-timer.ts. Supports three APIs: - start(phase) / stop() for sequential phases (per issue spec) - mark(phase, durationMs) for pre-measured durations - time(phase, promise) to wrap a promise inside Promise.all The issue's original spec was sequential-only, which doesn't work for BM25 + vector inside Promise.all — the second start() would auto-stop the first and only one phase would get timed. The mark() and time() variants resolve that without changing the sequential API for the other phases. * local-backend.ts query() instrumented across seven phase markers: bm25, vector (concurrent via timer.time inside Promise.all) merge (RRF reciprocal-rank-fusion) symbol_lookup (per-symbol process + cohesion + content Cypher) ranking (in-memory priority sort) formatting (response object construction + dedup) wall (end-to-end; separate mark so callers can compare sum(phases) vs wall and see Promise.all savings) * logQueryTiming() helper next to logQueryError(), same console-based pattern (repo has no structured logger). Emits GitNexus [query:timing] query="..." totalMs=N phases={...} to stdout — greppable prefix, JSON-parseable payload, no new deps. * timing: Record<string, number> added as a top-level field on the query() response. Strict superset of the previous shape — existing tests only assert field presence, so no regression. Other MCP tools use the same top-level-metadata convention (status, row_count, warning) rather than a nested _meta wrapper. Tests: - 6 new unit tests for PhaseTimer covering start/stop, implicit stop-on-start, additive mark(), Promise.all-safe time(), negative/NaN rejection, and totalMs auto-stop. - 3 new assertions on the existing query integration test verifying timing.wall is a non-negative number and at least one of bm25/vector fired. Verification: npx vitest run test/unit/phase-timer.test.ts -> 6 pass npx vitest run test/unit/calltool-dispatch.test.ts -> 65 pass npx vitest run test/integration/local-backend-calltool.test.ts -> 18 pass npm run test:unit -> 3777 pass (4 pre-existing env failures unchanged: skip-git-cli needs built dist/, git-utils tmpdir on Windows worktree) npx tsc --noEmit -> clean Scope declined for v1: - In-process histogram aggregation — the log line is enough for external tooling - Pareto curve generation — issue asks to enable it, not generate it - Sub-phases of symbol_lookup (process vs cohesion vs content) — issue lists them under one bucket; can split later if demand surfaces Closes abhigyanpatwari#553 * fix(search): route query:timing log to stderr to preserve stdio MCP contract CI (abhigyanpatwari#953) failed the `query: JSON appears on stdout, not stderr` e2e test in test/integration/cli-e2e.test.ts with: SyntaxError: Unexpected token 'G', "GitNexus [..." is not valid JSON Root cause: my initial logQueryTiming() in 63fbdc4 used console.log, which writes to stdout. The MCP stdio transport uses stdout exclusively for JSON-RPC responses (abhigyanpatwari#324), and the CLI e2e test guards that contract by asserting stdout parses as JSON on every tool invocation. The "GitNexus [query:timing] ..." line was interleaving with the response JSON and breaking the parse. Fix: route logQueryTiming through console.error instead. stderr is the correct channel for human-readable diagnostics and it is what the sibling logQueryError already uses for the same reason. The log line format is otherwise unchanged -- still greppable, still JSON-parseable payload. Verification (local, with dist built): npx vitest run test/integration/cli-e2e.test.ts -t "query: JSON" -> now passes (was failing across ubuntu/windows/macos in CI) npx tsc --noEmit -> clean Two unrelated pre-existing failures on non-git directory handling remain (same on upstream/main). Closes the CI regression introduced in 63fbdc4.

magyargergo merged commit ac14861 into abhigyanpatwari:main Apr 18, 2026
12 of 13 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(search): per-phase timing instrumentation for the query pipeline#953

feat(search): per-phase timing instrumentation for the query pipeline#953
magyargergo merged 2 commits into
abhigyanpatwari:mainfrom
azizur100389:feat/search-phase-timer

azizur100389 commented Apr 18, 2026

Uh oh!

vercel Bot commented Apr 18, 2026

Uh oh!

github-actions Bot commented Apr 18, 2026 •

edited

Loading

Uh oh!

magyargergo commented Apr 18, 2026

Uh oh!

github-actions Bot commented Apr 18, 2026 •

edited

Loading

Uh oh!

magyargergo commented Apr 18, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

azizur100389 commented Apr 18, 2026

Problem

What changed

Deviation from issue spec

Backward compat

Scope declined for v1

Test plan

Uh oh!

vercel Bot commented Apr 18, 2026

Uh oh!

github-actions Bot commented Apr 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

CI Report

Pipeline Status

Test Results

Code Coverage

Tests

Uh oh!

magyargergo commented Apr 18, 2026

1. Correctness & functional completeness

2. Code clarity & clean code

3. Test coverage & change safety

4. Performance

5. Operational risk

6. Maintainability

Production-ready bar for this PR

Findings

[BLOCKING|NON-BLOCKING|NIT] Short title

What I could not fully verify

Final verdict

Uh oh!

github-actions Bot commented Apr 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: feat(search): per-phase timing instrumentation for the query pipeline

Production-ready bar for this PR

Findings

1. Correctness & functional completeness

2. Code clarity & clean code

3. Test coverage & change safety

4. Performance

5. Operational risk

6. Maintainability

What I could not fully verify

Final verdict

Uh oh!

magyargergo commented Apr 18, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

github-actions Bot commented Apr 18, 2026 •

edited

Loading

github-actions Bot commented Apr 18, 2026 •

edited

Loading