
perf(core): Pre-initialize metrics worker pool to overlap tiktoken WASM loading #1302

Merged
yamadashy merged 3 commits into main from perf/pre-init-metrics-worker-pool
Mar 28, 2026
Conversation

Owner

@yamadashy yamadashy commented Mar 26, 2026

Summary

Cherry-picked from #1295. Pre-initializes the metrics worker pool during the file collection phase so that tiktoken WASM loading overlaps with security checks and file processing instead of blocking the metrics stage.
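The overlap idea can be sketched as follows. `createMetricsTaskRunner` is the factory name from this PR, but the internals below (a timer standing in for the WASM load, a length-based token count) are illustrative stand-ins, not the actual worker-pool code:

```typescript
type MetricsTask = { content: string; encoding: string };

// Illustrative factory: the first run() pays a one-time init cost
// (tiktoken WASM loading in the real code, a timer here).
const createMetricsTaskRunner = (numOfTasks: number) => {
  // numOfTasks is a pool-sizing hint in the real code; unused in this sketch.
  let initPromise: Promise<void> | undefined;
  const init = () =>
    (initPromise ??= new Promise<void>((resolve) => setTimeout(resolve, 50)));
  return {
    run: async (task: MetricsTask): Promise<number> => {
      await init();
      return task.content.length; // stand-in for a real token count
    },
    cleanup: async (): Promise<void> => {},
  };
};

const runner = createMetricsTaskRunner(4);
// Warm-up fired during file collection: init overlaps with other pipeline work.
const warmupPromise = runner.run({ content: '', encoding: 'o200k_base' });
// ... security checks and file processing would run here ...
await warmupPromise; // typically already resolved by the time metrics start
const tokenCount = await runner.run({ content: 'hello', encoding: 'o200k_base' });
await runner.cleanup();
```

The payoff is that the expensive initialization runs while the pipeline is busy elsewhere, so the second `run()` call finds a warm runner.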

Changes

  • Pre-warm metrics worker pool (calculateMetrics.ts, packager.ts): Extract createMetricsTaskRunner factory that starts the worker pool and fires a warm-up task to trigger tiktoken WASM initialization in the background. By the time metrics calculation starts, the worker is already warm.
  • Lazy-load Jiti (configLoad.ts): Defer createJiti import to when TypeScript config loading is actually needed.
  • Reorder binary check before stat (fileRead.ts): Check binary extension (no I/O) before calling fs.stat (I/O), avoiding unnecessary stat calls for binary files.
  • O(n) file path regrouping (packager.ts): Replace O(n²) array spread in file path re-grouping with Map-based O(n) approach.
  • Parallelize split output writes (produceOutput.ts): Write multiple split output files concurrently with Promise.all instead of sequentially.
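The Map-based regrouping can be sketched like this (the function and variable names are illustrative; the PR's actual code lives in packager.ts):

```typescript
// Group file paths by root directory in one pass. Appending to a Set is O(1),
// unlike the previous approach of spreading arrays per insertion.
const regroupByRootDir = (
  entries: { rootDir: string; filePath: string }[],
): Map<string, Set<string>> => {
  const grouped = new Map<string, Set<string>>();
  for (const { rootDir, filePath } of entries) {
    let paths = grouped.get(rootDir);
    if (!paths) {
      paths = new Set<string>();
      grouped.set(rootDir, paths);
    }
    paths.add(filePath);
  }
  return grouped;
};

const grouped = regroupByRootDir([
  { rootDir: 'src', filePath: 'src/a.ts' },
  { rootDir: 'src', filePath: 'src/b.ts' },
  { rootDir: 'tests', filePath: 'tests/a.test.ts' },
]);
```

Each entry is visited exactly once and both lookup and insert are constant time, which is what turns the quadratic spread-based version into O(n).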

Benchmark (from automated perf suite)

  • Ubuntu: -4.2% to -8.9%
  • Combined with cumulative optimizations: 2.30s → 2.21s

Checklist

  • Run npm run test
  • Run npm run lint


@coderabbitai
Contributor

coderabbitai bot commented Mar 26, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 77dd6f3b-2131-41a7-b7f4-27396272da38

📝 Walkthrough

Walkthrough

This PR introduces a metrics task runner warm-up system that initializes token-counting tasks before metrics calculation, refactors dependency injection in metrics handling, optimizes file output writes to run concurrently, moves binary-extension checking earlier in file reading, and converts top-level jiti imports to lazy-loaded initialization.

Changes

Cohort / File(s) Summary
Lazy loading optimization
src/config/configLoad.ts
Moved jiti import from top-level to lazy-loaded inside defaultJitiImport to defer TS toolchain resolution until a JS/TS config file is actually loaded; JSON/JSON5/JSONC continue using JSON5.parse.
File reading optimization
src/core/file/fileRead.ts
Moved binary-extension check to the start of readRawFile before filesystem stat I/O; binary files now return immediately without size or read evaluations.
Metrics dependency injection
src/core/metrics/calculateMetrics.ts
Added exported factory createMetricsTaskRunner(numOfTasks) to initialize a task runner; changed calculateMetrics parameter from deps to overrideDeps with internal defaultDeps constant for dependency composition.
Packager core integration
src/core/packager.ts
Integrated metrics task runner with pre-initialization warm-up, replaced rootDir regrouping logic using a Map<rootDir, Set<filePath>>, ensured warm-up completes before skill-generation early return and metrics calculation, and added lifecycle cleanup for the task runner.
Concurrent output writing
src/core/packager/produceOutput.ts
Changed split-output writing from sequential iteration with await inside loop to concurrent execution via Promise.all() mapping, eliminating sequential blocking while preserving per-part configuration.
Test mocks for task runner
tests/core/packager.test.ts, tests/core/packager/diffsFunctionality.test.ts, tests/core/packager/splitOutput.test.ts, tests/integration-tests/packager.test.ts
Added createMetricsTaskRunner mock dependency returning { run, cleanup } functions across all packager test files; updated calculateMetrics call assertions to expect taskRunner in options object.
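The lazy-load pattern described for configLoad.ts can be sketched generically. The `lazy` helper and the stand-in module below are illustrative, not the PR's code; in the real file the loader would be the dynamic `import('jiti')`:

```typescript
// Defer an expensive import until first use, then reuse the cached promise.
const lazy = <T>(load: () => Promise<T>) => {
  let cached: Promise<T> | undefined;
  return () => (cached ??= load());
};

let loadCount = 0;
const loadHeavyModule = lazy(async () => {
  loadCount += 1; // in the real code this cost is pulling in the TS toolchain
  return { parseConfig: (text: string) => text.trim() };
});

// JSON/JSON5 config paths never call this, so they never pay the cost.
const mod1 = await loadHeavyModule();
const mod2 = await loadHeavyModule(); // cached: the module loads only once
```

Callers that stay on the JSON path skip the load entirely, which is where the startup saving comes from.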

Possibly Related PRs

  • PR #928: Directly related; both PRs modify src/config/configLoad.ts to introduce lazy-loaded jiti initialization and dependency-injectable imports.
  • PR #1155: Directly related; both PRs modify src/core/file/fileRead.ts binary-extension and file-decoding logic, with this PR moving the binary check earlier.
  • PR #645: Directly related; both PRs refactor src/core/metrics/calculateMetrics.ts dependency injection patterns, adding/modifying how dependencies are composed and injected into metrics calculation.

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Title check: ✅ Passed. The title accurately summarizes the main optimization: pre-initializing the metrics worker pool to overlap tiktoken WASM loading, which is the primary performance improvement.
  • Description check: ✅ Passed. The description includes all required template sections with a comprehensive summary of changes, detailed breakdown of modifications, benchmarks, and completed checklist items.
  • Docstring Coverage: ✅ Passed. No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.



@github-actions
Contributor

github-actions bot commented Mar 26, 2026

⚡ Performance Benchmark

Latest commit: 67f6bcb chore(core): Add comment explaining unused worker pool in skill path
Status: ✅ Benchmark complete!
Ubuntu: 2.40s (±0.03s) → 2.33s (±0.03s) · -0.07s (-2.8%)
macOS: 1.27s (±0.07s) → 1.27s (±0.10s) · +0.00s (+0.2%)
Windows: 2.89s (±0.02s) → 2.79s (±0.04s) · -0.10s (-3.6%)
Details
  • Packing the repomix repository with node bin/repomix.cjs
  • Warmup: 2 runs (discarded)
  • Measurement: 10 runs / 20 on macOS (median ± IQR)
  • Workflow run
History

59f425e chore(core): Add comment explaining unused worker pool in skill path

Ubuntu: 2.38s (±0.03s) → 2.34s (±0.05s) · -0.03s (-1.4%)
macOS: 1.22s (±0.03s) → 1.17s (±0.08s) · -0.05s (-4.0%)
Windows: 2.96s (±0.12s) → 2.78s (±0.09s) · -0.18s (-6.0%)

e7dd6a0 fix(core): Ensure metrics worker pool cleanup on pipeline errors

Ubuntu: 2.35s (±0.03s) → 2.29s (±0.03s) · -0.06s (-2.6%)
macOS: 1.21s (±0.08s) → 1.18s (±0.02s) · -0.03s (-2.2%)
Windows: 3.23s (±0.21s) → 2.94s (±0.07s) · -0.29s (-9.0%)

4d2bbcf perf(core): Pre-initialize metrics worker pool to overlap tiktoken WASM loading

Ubuntu: 2.35s (±0.03s) → 2.27s (±0.03s) · -0.08s (-3.4%)
macOS: 1.30s (±0.08s) → 1.42s (±0.33s) · +0.12s (+9.3%)
Windows: 2.90s (±0.05s) → 2.79s (±0.01s) · -0.12s (-4.1%)

43c74d0 perf(core): Pre-initialize metrics worker pool to overlap tiktoken WASM loading

Ubuntu: 2.33s (±0.03s) → 2.25s (±0.02s) · -0.08s (-3.4%)
macOS: 1.29s (±0.07s) → 1.26s (±0.07s) · -0.04s (-2.7%)
Windows: 3.12s (±0.08s) → 3.10s (±0.16s) · -0.01s (-0.5%)

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on significant performance enhancements across several core components of the application. By strategically pre-initializing the metrics worker pool, deferring non-essential module loading, optimizing file system interactions, and parallelizing I/O operations, the changes aim to reduce overall execution time and improve efficiency, particularly in scenarios involving large codebases or complex configurations.

Highlights

  • Metrics Worker Pool Pre-initialization: The metrics worker pool is now pre-initialized during the file collection phase. This allows the tiktoken WASM loading to overlap with other pipeline stages like security checks and file processing, preventing it from blocking the metrics calculation phase and improving overall performance.
  • Lazy-loading Jiti: The jiti module, which is used for loading TypeScript configuration files, is now lazy-loaded. This avoids importing its heavy TypeScript toolchain unless it's actually needed, benefiting cases where JSON/JSON5 config files or default configurations are used.
  • Optimized Binary File Check: The check for binary file extensions has been reordered to occur before calling fs.stat. This avoids unnecessary I/O operations for files identified as binary solely by their extension.
  • Efficient File Path Regrouping: The logic for regrouping sorted file paths by root directory has been optimized from an O(n²) array spread approach to a more efficient O(n) Map-based approach, reducing processing time for large numbers of files.
  • Parallelized Output Writes: The process of writing multiple split output files has been parallelized using Promise.all instead of sequential writes, significantly speeding up the output generation phase.
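The binary-check reordering can be sketched as below; `BINARY_EXTENSIONS`, `MAX_FILE_SIZE`, and the null-on-skip return value are illustrative assumptions about fileRead.ts, not its actual shape:

```typescript
import { readFile, stat } from 'node:fs/promises';
import path from 'node:path';

const BINARY_EXTENSIONS = new Set(['.png', '.jpg', '.zip', '.woff2']);
const MAX_FILE_SIZE = 50 * 1024 * 1024;

const readRawFile = async (filePath: string): Promise<string | null> => {
  // Pure string check first: binary files are skipped with zero I/O.
  if (BINARY_EXTENSIONS.has(path.extname(filePath).toLowerCase())) {
    return null;
  }
  // fs.stat (an I/O call) now runs only for files that might be read.
  const { size } = await stat(filePath);
  if (size > MAX_FILE_SIZE) {
    return null;
  }
  return readFile(filePath, 'utf-8');
};

// The binary short-circuit never touches the filesystem:
const skipped = await readRawFile('logo.png');
```

Since the extension test is a cheap in-memory lookup, moving it first removes one syscall per binary file at no cost to the non-binary path.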


@codecov

codecov bot commented Mar 26, 2026

Codecov Report

❌ Patch coverage is 89.58333% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.09%. Comparing base (005616a) to head (67f6bcb).
⚠️ Report is 22 commits behind head on main.

Files with missing lines:
  • src/core/packager.ts: 88.57% patch coverage, 4 lines missing ⚠️
  • src/core/metrics/calculateMetrics.ts: 80.00% patch coverage, 1 line missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1302      +/-   ##
==========================================
- Coverage   87.09%   87.09%   -0.01%     
==========================================
  Files         115      115              
  Lines        4341     4354      +13     
  Branches     1009     1010       +1     
==========================================
+ Hits         3781     3792      +11     
- Misses        560      562       +2     

☔ View full report in Codecov by Sentry.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces several performance optimizations. Key changes include lazy-loading the jiti module, optimizing binary file detection by checking extensions before I/O operations, and parallelizing the writing of split output files using Promise.all. The most significant change is the introduction of a pre-initialized metrics task runner with a warm-up mechanism to overlap tiktoken WASM loading with other pipeline stages, aiming to improve overall performance. A review comment highlights an inefficiency where the metrics task runner is unnecessarily initialized and cleaned up in the skillGenerate path, suggesting conditional initialization to avoid this overhead.

Comment on lines +138 to +139
await warmupPromise;
await metricsTaskRunner.cleanup();

Severity: medium

In the skillGenerate path, the metricsTaskRunner is warmed up and then immediately cleaned up here without being used. This introduces unnecessary overhead from initializing and tearing down the worker pool.

To optimize this, consider making the creation of metricsTaskRunner on line 97 conditional, so it's only initialized when config.skillGenerate is not set. This would avoid the wasted work in this code path while preserving the pre-warming optimization for the main path.
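The suggested guard might look like this sketch. Only the `skillGenerate` flag and the runner's `run`/`cleanup` shape come from the review; the `pack` body, counter, and runner internals are illustrative:

```typescript
type TaskRunner = {
  run: (task: { content: string }) => Promise<number>;
  cleanup: () => Promise<void>;
};

let poolsCreated = 0;
const createMetricsTaskRunner = (numOfTasks: number): TaskRunner => {
  poolsCreated += 1; // counted to show the skill path skips pool creation
  return { run: async (task) => task.content.length, cleanup: async () => {} };
};

const pack = async (config: { skillGenerate?: boolean }): Promise<string> => {
  // Create (and warm) the pool only when metrics will actually run.
  const runner = config.skillGenerate ? undefined : createMetricsTaskRunner(4);
  const warmup = runner?.run({ content: '' });
  try {
    if (config.skillGenerate) {
      return 'skill'; // no pool was created for this path
    }
    await warmup;
    return 'metrics';
  } finally {
    await runner?.cleanup(); // no-op when the pool was never created
  }
};

const skillResult = await pack({ skillGenerate: true }); // creates no pool
const metricsResult = await pack({});
```

The optional chaining keeps the main-path warm-up intact while making the skill path pay nothing.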

@cloudflare-workers-and-pages

cloudflare-workers-and-pages bot commented Mar 26, 2026

Deploying repomix with Cloudflare Pages

Latest commit: 67f6bcb
Status: ✅  Deploy successful!
Preview URL: https://172bbdee.repomix.pages.dev
Branch Preview URL: https://perf-pre-init-metrics-worker.repomix.pages.dev


@claude
Contributor

claude bot commented Mar 26, 2026

Code Review

Nice set of focused performance optimizations! The benchmark numbers look solid. A few items worth addressing before merge:

1. Worker pool leak on pipeline errors (Recommended)

metricsTaskRunner is created at the top of pack() but cleanup() is only called on the two happy paths (skill path and normal path). If any intermediate stage throws (collectFiles, validateFileSafety, processFiles, produceOutput, etc.), the worker threads leak until Tinypool's idle timeout (5s). The warmup promise would also become an unhandled rejection.

Suggested fix: Wrap the section from pool creation through cleanup in a try/finally, similar to what calculateMetrics.ts already does internally (lines 129–133).

Example
const metricsTaskRunner = deps.createMetricsTaskRunner(allFilePaths.length);
const warmupPromise = metricsTaskRunner.run({ content: '', encoding: config.tokenCount.encoding });

try {
  // ... existing pipeline stages ...
} finally {
  await metricsTaskRunner.cleanup();
}

2. Warmup is wasted in the skillGenerate path (Minor)

The worker pool + warmup task are created unconditionally, but the skillGenerate early-exit branch (line ~137) immediately awaits and discards the warmup, then cleans up. Meanwhile, packSkill internally creates its own worker pool via calculateMetrics (without receiving the pre-initialized runner). This means the skill path pays for two pool initializations instead of one.

Consider either:

  • Conditionally creating the pool (skip when config.skillGenerate is set), or
  • Passing metricsTaskRunner through to packSkill so it reuses the warm pool

3. Missing test assertions for the core optimization (Minor)

The tests correctly mock createMetricsTaskRunner, but no test asserts that:

  • run() is called with a warmup task after creation
  • cleanup() is called after metrics calculation
  • The skill-path early-cleanup branch works correctly

These are the core behavioral invariants of this PR. If the warmup or cleanup calls were accidentally removed, no test would catch it.

4. Promise.all fail-fast behavior on split writes (Nit)

Promise.all rejects on the first failure and doesn't wait for in-flight writes to settle. On a write error with many split parts, the output directory could be left in a partially-written state. This is unlikely in practice (few parts, local FS), but Promise.allSettled with a subsequent error check would be strictly safer if you want to be thorough.
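A sketch of that allSettled variant; `writeAllParts` and `writePart` are hypothetical names, not produceOutput.ts code:

```typescript
// Wait for every write to settle, then surface the first failure,
// so no in-flight write is abandoned mid-stream on error.
const writeAllParts = async (
  parts: string[],
  writePart: (part: string) => Promise<void>,
): Promise<void> => {
  const results = await Promise.allSettled(parts.map((p) => writePart(p)));
  const failed = results.find(
    (r): r is PromiseRejectedResult => r.status === 'rejected',
  );
  if (failed) {
    throw failed.reason;
  }
};

// Demonstration with an in-memory writer where one part fails:
const written: string[] = [];
let firstError: unknown;
try {
  await writeAllParts(['part1', 'part2', 'part3'], async (part) => {
    if (part === 'part2') throw new Error('disk full');
    written.push(part);
  });
} catch (err) {
  firstError = err;
}
// Unlike Promise.all's fail-fast, part1 and part3 still completed.
```

The error still propagates to the caller, but only after every started write has either finished or failed, avoiding a partially written directory with writes still in flight.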


Everything else looks clean — the lazy jiti import, binary check reorder, and O(n) regrouping are all straightforward wins with no issues. The Partial<typeof defaultDeps> pattern in calculateMetrics.ts deviates slightly from the inline deps = {...} convention in CLAUDE.md, but it's consistent with how packager.ts and produceOutput.ts already handle deps, so this is fine.

Overall: solid perf work 👍. The try/finally for worker cleanup (item 1) is the most impactful fix.

🤖 Generated with Claude Code

devin-ai-integration[bot]

This comment was marked as resolved.

@yamadashy yamadashy marked this pull request as draft March 26, 2026 13:48
@yamadashy yamadashy marked this pull request as ready for review March 26, 2026 13:53
perf(core): Pre-initialize metrics worker pool to overlap tiktoken WASM loading

Pipeline-level optimizations that produce measurable end-to-end improvement:

- Pre-initialize metrics worker pool during file collection phase so tiktoken
  WASM loading overlaps with security checks and file processing. First token
  count task dropped from 381ms to 22ms (worker already warmed).
- Lazy-load Jiti via dynamic import — only loaded when TS/JS config files are
  detected, saving startup time for the common JSON/default config path.
- Fix O(n²) file path re-grouping in packager by using Map + Set for O(1)
  membership checks instead of .find() + .includes().
- Move binary extension check before fs.stat in fileRead to skip unnecessary
  stat syscalls for binary files.
- Parallelize split output file writes with Promise.all instead of sequential
  for-loop.

Benchmark (15 runs each, median ± IQR, packing repomix repo ~1000 files):

  main branch: 3515ms (P25: 3443, P75: 3581)
  perf branch: 3318ms (P25: 3215, P75: 3383)
  Improvement: -197ms (-5.6%)

Pipeline stage breakdown (instrumented):
  - Metrics first-file init: 381ms → 22ms (worker pre-warmed)
  - Total metrics stage: 793ms → ~450ms

All 1096 tests pass. Lint clean.

https://claude.ai/code/session_01JoNjFe7S2roMfHfNcw6bso
@yamadashy yamadashy force-pushed the perf/pre-init-metrics-worker-pool branch from 43c74d0 to 4d2bbcf on March 27, 2026 16:16
@claude
Contributor

claude bot commented Mar 27, 2026

PR Review — perf(core): Pre-initialize metrics worker pool

Nice set of targeted performance optimizations. The O(n²)→O(n) regrouping, lazy jiti loading, and binary check reordering are all clean wins. The worker pre-warming is a clever idea, but has a few issues worth addressing before merge.

Worker pool leak on error paths

metricsTaskRunner.cleanup() is unreachable if warmupPromise rejects or if any intermediate step (produceOutput, validateFileSafety, etc.) throws between pool creation and the cleanup call.

Details

Normal path — if warmupPromise rejects at line 182, the cleanup() at line 191 is never reached. Similarly, if produceOutput throws, neither await warmupPromise nor cleanup() runs.

Skill path — if warmupPromise rejects at line 138, cleanup() at line 139 is skipped.

Suggested fix: wrap the region from pool creation to cleanup in try/finally:

const metricsTaskRunner = deps.createMetricsTaskRunner(allFilePaths.length);
const warmupPromise = metricsTaskRunner.run({ content: '', encoding: config.tokenCount.encoding });
try {
  // ... existing pipeline stages ...
  await warmupPromise;
  const metrics = await withMemoryLogging('Calculate Metrics', () =>
    deps.calculateMetrics(/* ... */, { taskRunner: metricsTaskRunner }),
  );
} finally {
  await metricsTaskRunner.cleanup();
}

The same pattern applies to the skill-generation early return path.

Floating warmup promise — unhandled rejection window

Between creation (line ~98) and the first await (line ~138 or ~182), warmupPromise is a floating promise. If the worker crashes quickly (e.g., WASM load failure), Node.js emits an unhandledRejection before the await is reached.

Details

A minimal fix is to attach a no-op catch at creation time so the rejection is deferred to the explicit await:

const warmupPromise = metricsTaskRunner.run({ content: '', encoding: config.tokenCount.encoding });
warmupPromise.catch(() => {}); // Rejection will be observed at await

Pre-warming is wasted on the skill-generation path

On the skill path, the pre-warmed runner is cleaned up at line 139, and then packSkill's internal calculateMetrics call creates a brand-new cold pool (since no taskRunner is passed). The warmup work provides zero benefit for this code path: the pool spins up, runs one empty task, and is immediately torn down.

Details

Consider either:

  • Passing the pre-warmed metricsTaskRunner into packSkill so it can forward it to calculateMetrics
  • Or deferring pool creation to after the skill-generation branch check

Minor observations

  • Thread count over-provisioning: createMetricsTaskRunner(allFilePaths.length) uses the pre-security-filter count. After validateFileSafety, the actual processed file count may be smaller. Low impact but worth noting.
  • defaultDeps at module scope in calculateMetrics.ts: The taskRunner: undefined sentinel in a shared module-level object is a bit unusual. Consider Object.freeze(defaultDeps) to prevent accidental mutation.
  • DI pattern inconsistency: The change to overrideDeps: Partial<typeof defaultDeps> in calculateMetrics.ts follows the pattern used in packager.ts/produceOutput.ts, but diverges from the inline deps = { ... } convention documented in CLAUDE.md and used by most other files. Not blocking, but worth aligning one way or the other.

Test coverage gaps

  • No test covers the skill-generation early-exit path in packager.ts (warmup → cleanup → packSkill)
  • No test asserts warmupPromise is actually awaited before calculateMetrics is called (the core behavioral guarantee of this PR)
  • diffsFunctionality.test.ts doesn't assert that taskRunner is passed to calculateMetrics (unlike splitOutput.test.ts which does)
  • The createMetricsTaskRunner factory has no direct unit tests

What looks good

  • Binary extension check before fs.stat — clean, zero-risk optimization
  • Lazy jiti import — correct implementation, good comment explaining why
  • Map/Set-based regrouping — proper O(n) replacement
  • Promise.all for split output writes — sound parallelization
  • Tests properly mock the new createMetricsTaskRunner dep in all affected test files

Reviewed with Claude

Wrap the metrics worker pool lifecycle in try/finally to prevent worker
thread leaks when any pipeline stage throws. Also suppress unhandled
promise rejection on the warmup task with .catch() since it floats
between creation and await.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@claude
Contributor

claude bot commented Mar 28, 2026

Code Review

Nice set of performance optimizations — the pre-warm strategy, O(n) regrouping, lazy jiti, and binary check reorder are all solid improvements. A few findings worth addressing:

1. Skill path does not use the pre-warmed worker pool

In src/core/packager.ts, the skill generation branch (line ~138) awaits warmupPromise but never passes metricsTaskRunner to packSkill. Inside packSkill, calculateMetrics is called without a taskRunner override, so it creates a second cold worker pool internally. The pre-warmed pool sits idle until cleaned up in finally.

This means the entire warmup optimization is wasted on the skill generation path, and two worker pools are allocated instead of one.

Suggestion: Pass metricsTaskRunner through to packSkill so it can forward it to calculateMetrics, or guard pool creation with a condition that skips it when config.skillGenerate is set.

2. Misleading comment on .catch(() => 0)

src/core/packager.ts line ~98:

const warmupPromise = metricsTaskRunner.run({ content: '', encoding: config.tokenCount.encoding }).catch(() => 0);
// Suppress unhandled rejection; errors surface when awaited

The comment says "errors surface when awaited," but .catch(() => 0) converts the rejection into a resolved promise with value 0. When warmupPromise is later awaited, it will always succeed — the error is permanently swallowed, not deferred.

This is arguably fine (real errors will surface when metricsTaskRunner is used for actual work), but the comment is inaccurate. Consider either:

  • Fixing the comment: // Suppress unhandled rejection; errors will surface during actual metrics tasks
  • Or logging the error: .catch((err) => { logger.debug('Metrics warmup failed:', err); return 0; })

3. Tinypool idleTimeout may negate the warmup benefit on large repos

The warmup completes quickly, but worker threads have a 5-second idle timeout. If the pipeline stages between warmup and metrics calculation (file collection → security check → processing → output generation) take longer than 5 seconds on large repos, the warmed workers may be destroyed and need re-initialization anyway.

Worth benchmarking on large repos, or consider whether the idle timeout should be extended for pre-initialized pools.

Minor / informational items
  • Binary check reorder (fileRead.ts): Correct optimization. Subtly changes behavior — oversized binary files now get binary-extension skip reason instead of size-limit. Arguably more correct, but worth noting if any downstream code branches on the skip reason.

  • calculateMetrics deps pattern: Changed from default parameter to overrideDeps: Partial<typeof defaultDeps> with spread merge. This diverges from the CLAUDE.md DI convention but is consistent with the existing pattern in packager.ts and produceOutput.ts. No issue, but CLAUDE.md could be updated to acknowledge both patterns.

  • Test coverage gaps: No tests for the try/finally cleanup-on-error path, no test for warmup failure handling, and no tests covering the skill generation path with the new worker pool. Consider adding a test that rejects an intermediate step and verifies cleanup() is still called.

  • Worker pool sizing: Pool is sized with allFilePaths.length (pre-filter count), but after security checks the actual file count may be smaller. With tokenCountTree disabled (default), only a subset of files get token-counted. Acceptable trade-off since exact count isn't available at creation time.


Overall this is a well-structured PR with meaningful, benchmarked optimizations. The main actionable item is #1 (skill path not using the pre-warmed pool). Items #2 and #3 are lower priority but worth considering.

🤖 Generated with Claude Code

@claude
Contributor

claude bot commented Mar 28, 2026

Incremental Code Review

Previous reviews covered the main items well. The try/finally cleanup has been addressed — nice. Here are new observations not raised before:

1. await warmupPromise in packSkill branch is unnecessary work (Low)

In src/core/packager.ts line ~141, the packSkill path awaits the warmup before returning. Since the warmup result is unused in this path and finally handles cleanup regardless, this await only delays the skill generation response for no benefit. It can be removed — the finally block will still wait for cleanup.

2. Pool creation + warmup dispatch are outside try block (Low)

metricsTaskRunner is created and .run() is called before the try. If .run() threw synchronously (unlikely with Tinypool, but an implicit assumption), the finally cleanup wouldn't execute. Moving const warmupPromise = ... inside the try would make cleanup unconditional regardless of how .run() fails.

Previously flagged items — status check

  • Worker pool leak on error paths: ✅ Fixed (try/finally added)
  • .catch(() => 0) swallows errors permanently (the comment says "errors surface when awaited" but they don't): ⏳ Still present; consider at minimum fixing the comment
  • Warmup wasted on skill path (two pools created): ⏳ Still present; acknowledged as acceptable trade-off
  • Test coverage gaps (cleanup-on-error, warmup failure, skill path): ⏳ Still present
  • overrideDeps: Partial<> DI pattern deviation: ⏳ Noted, not blocking

Overall the PR is in good shape. The remaining items are low severity — the .catch(() => 0) comment fix would be the most impactful small change.

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@yamadashy yamadashy force-pushed the perf/pre-init-metrics-worker-pool branch from 59f425e to 67f6bcb on March 28, 2026 04:50
@claude
Contributor

claude bot commented Mar 28, 2026

Code Review — perf(core): Pre-initialize metrics worker pool

Overall this is a well-structured performance PR with solid benchmarks. The worker pool pre-warming, lazy jiti import, binary check reordering, O(n) regrouping, and parallel split writes are all sensible optimizations. A few observations:

Noteworthy Findings

1. Worker pool created unnecessarily in skill path (Recommended)

In src/core/packager.ts:97, the metrics task runner is always created — even when config.skillGenerate is set and the skill path returns early without calculating metrics. The warmup is awaited (line 140) only to allow graceful shutdown, then the pool is cleaned up in finally without ever being used.

Consider guarding with a conditional or lazily initializing only when metrics are actually needed. The overhead is small, but it's wasted work and adds noise to the skill path.

2. Deps pattern deviation in calculateMetrics (Not needed — informational)

The change from deps = { fn1, fn2, ... } default parameter to overrideDeps: Partial<typeof defaultDeps> = {} with spread merge is cleaner for partial overrides. However, it deviates from the deps = { ... } pattern used elsewhere in the codebase (e.g., packager.ts, produceOutput.ts). This is fine as-is — the new pattern is arguably better — but worth noting for consistency if this becomes the new standard.

Additional observations (minor)
  • Warmup error suppression (packager.ts:98): .catch(() => 0) silently swallows warmup errors. This is acceptable since errors will surface when the task runner is actually used for real work, and the comment explains the intent well.

  • Parallel split writes (produceOutput.ts): The Promise.all change is correct. If one write fails, all others will still attempt but the first rejection propagates. This matches the previous sequential behavior where a failure would stop further writes — slightly different semantics but practically equivalent.

  • Test coverage: Tests properly mock the new createMetricsTaskRunner dep and verify the taskRunner is passed through. No missing critical test paths.

Looks good to merge. Nice performance win! 🎯

🤖 Generated with Claude Code

