perf(inmemory): index-based deep run search + benchmark #82

Merged: hoangsnowy merged 3 commits into main from perf/inmemory-deep-search-index on May 24, 2026

@hoangsnowy
Owner

Summary

Follow-up to #81. A new BenchmarkDotNet case (RunSearchBenchmarks) revealed that in-memory deep run search was quadratic. The deep branch of InMemoryFlowRunStore.MatchesRunSearch scanned the global _steps dictionary once per candidate run and filtered by RunId inside the scan, so a search cost O(runs × total_steps). Because total_steps itself grows with the number of runs, that cost is quadratic in run history.

  • Fix: the deep branch now enumerates each run's own steps via the existing _stepKeysByRun index and looks each one up directly in _steps, which costs O(runs × steps_in_run) and mirrors GetRunDetailAsync. Behaviour is identical (same matches); only the scan strategy changed.
  • Benchmark: added tests/benchmarks/.../RunSearchBenchmarks.cs (in-process, [MemoryDiagnoser]) and a results doc.
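
To make the change in scan strategy concrete, here is a minimal sketch of the two approaches. It is not the code from InMemoryFlowRunStore: the StepRecord type, the DeepSearchSketch class, the method names, and the shapes of both collections are assumptions made for illustration. Only the roles of _steps and _stepKeysByRun come from the description above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a stored step; the real store's types are not shown in this PR.
public sealed record StepRecord(string RunId, string StepKey, string Content);

public sealed class DeepSearchSketch
{
    // Every step of every run, keyed by (runId, stepKey). Plays the role of _steps.
    private readonly Dictionary<(string RunId, string StepKey), StepRecord> _steps = new();

    // Per-run index of step keys. Plays the role of _stepKeysByRun.
    private readonly Dictionary<string, HashSet<string>> _stepKeysByRun = new();

    public void AddStep(string runId, string stepKey, string content)
    {
        _steps[(runId, stepKey)] = new StepRecord(runId, stepKey, content);
        if (!_stepKeysByRun.TryGetValue(runId, out var keys))
        {
            keys = new HashSet<string>();
            _stepKeysByRun[runId] = keys;
        }
        keys.Add(stepKey);
    }

    // Before: visits every step in the store for each candidate run,
    // so checking all runs costs O(runs x total_steps).
    public bool MatchesByGlobalScan(string runId, string search) =>
        _steps.Values.Any(s => s.RunId == runId
            && s.Content.Contains(search, StringComparison.OrdinalIgnoreCase));

    // After: visits only the keys indexed under this run and looks each step up directly,
    // so checking all runs costs O(runs x steps_in_run).
    public bool MatchesByIndex(string runId, string search)
    {
        if (!_stepKeysByRun.TryGetValue(runId, out var keys))
        {
            return false; // no steps recorded for this run
        }

        foreach (var key in keys)
        {
            if (_steps.TryGetValue((runId, key), out var step)
                && step.Content.Contains(search, StringComparison.OrdinalIgnoreCase))
            {
                return true;
            }
        }
        return false;
    }
}
```

Both methods return the same answer for any run. The difference is that MatchesByGlobalScan touches every stored step on each call, while MatchesByIndex touches only the steps that belong to the run being tested.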

Measured (InMemoryFlowRunStore, GetRunsPageAsync(search, take:20))

Runs     Deep BEFORE (time / alloc)   Deep AFTER (time / alloc)   Gain
1,000    91 ms / 46 MB                1.46 ms / 263 KB            ~62× faster
10,000   ~24,966 ms / 4.6 GB          ~24 ms / 2.6 MB             ~1,040× faster, ~1,800× less alloc

Deep scaling from 1,000 to 10,000 runs: time grew 274× before the fix (quadratic) and ~16× after it (roughly linear).

Note: stale copies of the benchmark project under .claude/worktrees/ break BenchmarkDotNet's default toolchain because they duplicate the project name, so the benchmark pins the in-process toolchain to sidestep the problem. Running git worktree remove on those abandoned worktrees would tidy this up.
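
For readers unfamiliar with the in-process option, the sketch below shows one way a BenchmarkDotNet class can opt into it, using the library's [InProcess] and [MemoryDiagnoser] attributes. It is not the PR's RunSearchBenchmarks: the class name, the data setup, and the workload are hypothetical, and the PR may pin the toolchain through a config object instead of an attribute. It needs the BenchmarkDotNet package and a Release build.

```csharp
using System.Collections.Generic;
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

// Sketch only: names and workload are hypothetical, not taken from the PR.
[InProcess]        // run benchmarks inside the host process instead of a generated project
[MemoryDiagnoser]  // report allocated memory alongside timings
public class DeepSearchScalingSketch
{
    private Dictionary<string, List<string>> _stepsByRun = new();

    // Two sizes, so the growth between them can be compared.
    [Params(1_000, 10_000)]
    public int Runs { get; set; }

    [GlobalSetup]
    public void Setup()
    {
        // Two steps per run; only the run count varies between cases.
        _stepsByRun = Enumerable.Range(0, Runs).ToDictionary(
            i => $"run-{i}",
            i => new List<string> { $"step output {i}", "done" });
    }

    // Counts the runs that have at least one step containing the search text.
    [Benchmark]
    public int IndexedDeepSearch() =>
        _stepsByRun.Count(run => run.Value.Any(content => content.Contains("output 42")));
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<DeepSearchScalingSketch>();
}
```

Because the in-process toolchain does not generate and build a separate benchmark project, it avoids the project lookup that the duplicate copies confuse. The trade-off is weaker isolation between the benchmark and its host process.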

Test plan

  • dotnet build 0/0 (net8/9/10)
  • Full unit suite green (InMemory deep/quick parity + scope tests unchanged)
  • Benchmark builds + runs (before/after captured in the results doc)
  • CI: unit + integration

🤖 Generated with Claude Code

hoangsnowy and others added 2 commits May 24, 2026 11:38
InMemoryFlowRunStore deep search (deepSearch:true) matched step content by
scanning the global _steps dictionary once per candidate run, filtering by
RunId inside the scan: O(runs x total_steps), quadratic in run history. It now
enumerates each run's own steps via the existing _stepKeysByRun index and
looks each one up directly in _steps: O(runs x steps_in_run), mirroring
GetRunDetailAsync.

Behaviour is unchanged (same matches); only the scan strategy differs.

Benchmark (tests/benchmarks/.../RunSearchBenchmarks.cs, in-process, MemoryDiagnoser):
at 10,000 runs deep search drops from ~24,966 ms / 4.6 GB to ~24 ms / 2.6 MB
(~1,040x faster, ~1,800x less alloc); deep 1k->10k scaling 274x -> ~16x (linear).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1.27.0 is already published (immutable); the in-memory deep-search index fix
ships as a patch. Bumps VersionPrefix to 1.27.1 and promotes the CHANGELOG
Performance entry into the 1.27.1 section.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread src/FlowOrchestrator.InMemory/InMemoryFlowRunStore.cs Fixed
…issed-where)

The index-based deep-search loop tripped cs/linq/missed-where on the diff.
Rewrite the per-run step scan as stepKeys.Keys.Any(...): the same
O(steps_in_run) index lookup, now with an explicit filter that the analyzer
accepts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
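
The shape of that last rewrite can be sketched as follows. This is an illustration under assumptions, not the committed code: the StepScanSketch class, both method names, and the two dictionary types are hypothetical, and only the stepKeys.Keys.Any(...) form comes from the commit message. The first method is the loop-with-inner-if pattern that rules of this kind commonly flag; the second expresses the same lookups as a single predicate.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;

public static class StepScanSketch
{
    // Loop form: iterate the run's step keys and filter with an if inside the body.
    public static bool AnyStepMatchesLoop(
        ConcurrentDictionary<string, byte> stepKeys,            // hypothetical per-run key set
        ConcurrentDictionary<string, string> stepContentByKey,  // hypothetical step store
        string search)
    {
        foreach (var key in stepKeys.Keys)
        {
            if (stepContentByKey.TryGetValue(key, out var content)
                && content.Contains(search, StringComparison.OrdinalIgnoreCase))
            {
                return true;
            }
        }
        return false;
    }

    // LINQ form: the same per-key lookup, written as one explicit predicate.
    // Any(...) stops at the first match, just as the loop's early return does.
    public static bool AnyStepMatchesLinq(
        ConcurrentDictionary<string, byte> stepKeys,
        ConcurrentDictionary<string, string> stepContentByKey,
        string search) =>
        stepKeys.Keys.Any(key =>
            stepContentByKey.TryGetValue(key, out var content)
            && content.Contains(search, StringComparison.OrdinalIgnoreCase));
}
```

Both forms visit at most the keys of a single run, so the O(steps_in_run) bound per run is unchanged by the rewrite.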
@hoangsnowy hoangsnowy merged commit b89dae3 into main May 24, 2026
10 checks passed
@hoangsnowy hoangsnowy deleted the perf/inmemory-deep-search-index branch May 27, 2026 14:15