feat(core): Improve error handling and performance optimizations by yamadashy · Pull Request #746 · yamadashy/repomix

yamadashy · 2025-07-22T15:08:51Z

Failed to compile template: Invalid string length #739

- Set TASKS_PER_THREAD to 100 for better balance between performance and resource usage
- Add comment explaining that worker initialization is expensive
- Update tests to match new thread allocation logic

🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
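The commit message does not show the allocation formula itself. As a rough sketch only, assuming the thread count is the task count divided by `TASKS_PER_THREAD` and capped by the CPU count, the logic might look like the following. The helper name `getWorkerThreadCount` and the `availableCpus` parameter are hypothetical and not taken from the PR.

```typescript
// Value taken from the commit message above.
const TASKS_PER_THREAD = 100;

// Hypothetical helper: the real allocation logic in processConcurrency.ts may differ.
function getWorkerThreadCount(
  numOfTasks: number,
  availableCpus: number,
): { minThreads: number; maxThreads: number } {
  const minThreads = 1;
  // Allow at most one worker per TASKS_PER_THREAD tasks, and never more than the CPU count.
  const maxThreads = Math.max(
    minThreads,
    Math.min(availableCpus, Math.ceil(numOfTasks / TASKS_PER_THREAD)),
  );
  return { minThreads, maxThreads };
}
```

Under these assumptions a small repository stays on a single worker, which avoids paying the worker initialization cost when there is little to parallelize.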

Add performance monitoring to track how long it takes to initialize the Tinypool worker pool. This helps identify potential bottlenecks during startup. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
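A timing measurement of this kind can be sketched with a small wrapper. This is an illustration, not the PR's code: `measureInit`, its `log` callback, and the injectable `now` clock are all hypothetical names, and `Date.now` is used here only because it needs no imports.

```typescript
// Hypothetical helper: runs `fn` and reports how long it took in milliseconds.
function measureInit<T>(
  label: string,
  fn: () => T,
  log: (message: string) => void,
  now: () => number = Date.now,
): T {
  const startedAt = now();
  const result = fn();
  const elapsedMs = now() - startedAt;
  log(`${label} took ${elapsedMs.toFixed(2)}ms`);
  return result;
}
```

A caller could wrap the pool construction in `measureInit('Tinypool initialization', () => createPool(), logger.debug)`, where `createPool` and `logger.debug` stand in for whatever the codebase actually uses.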

…er log level

Replace the environment variable approach for passing log levels to workers with Tinypool's workerData mechanism, which is more idiomatic for worker thread configuration.

Changes:
- Add setLogLevelByWorkerData() method to handle workerData-based log level setting
- Update Tinypool configuration to use workerData instead of env variables
- Update all 5 worker files to use setLogLevelByWorkerData()
- Remove unused setLogLevelByEnv function and related test mocks
- Update tests to reflect new workerData configuration

This provides better isolation and follows Node.js worker thread best practices.

🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>

Fix setLogLevelByWorkerData to properly handle workerData when it comes as an array format like [{ workerId: 1 }, { logLevel: 3 }]. The logLevel is in the second element of the array. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
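To make the array handling concrete, here is a minimal sketch of how a log level could be pulled out of either shape. `extractLogLevel` is a hypothetical helper written for illustration; the actual `setLogLevelByWorkerData` in `src/shared/logger.ts` reads `workerData` from `node:worker_threads` and may be structured differently.

```typescript
// Hypothetical helper: returns the log level from workerData, or undefined if absent.
function extractLogLevel(workerData: unknown): number | undefined {
  // Array form described in the commit: [{ workerId: 1 }, { logLevel: 3 }].
  // The user-supplied data sits in the second element.
  const candidate: unknown = Array.isArray(workerData) ? workerData[1] : workerData;
  if (typeof candidate !== 'object' || candidate === null) {
    return undefined;
  }
  const logLevel = (candidate as { logLevel?: unknown }).logLevel;
  // Only accept numeric levels; anything else is treated as "not configured".
  return typeof logLevel === 'number' ? logLevel : undefined;
}
```

Returning `undefined` rather than throwing lets the caller fall back to a default log level when a worker is started without any configuration.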

…comments

Move setLogLevelByWorkerData() calls to the top of all worker files (immediately after imports) with clear documentation. This ensures logger configuration is set up before any other code execution in worker threads.

- Move logger initialization to module load time in all 5 worker files
- Add standardized comment explaining the importance of early initialization
- Ensure consistent pattern across all worker implementations

This improves debugging capabilities and ensures proper logging from worker startup.

🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>

…-encoding support

Create tokenCounterFactory.ts to centralize TokenCounter singleton management and add support for multiple encodings using a Map-based cache.

Changes:
- Add tokenCounterFactory.ts with Map<TiktokenEncoding, TokenCounter> cache
- Support multiple encodings simultaneously in worker threads
- Add getTokenCounter() and freeTokenCounter() functions for lifecycle management
- Update fileMetricsWorker.ts and outputMetricsWorker.ts to use factory
- Remove duplicate singleton implementations from worker files
- Add timing measurement to TokenCounter initialization for performance monitoring

This improves code organization, reduces duplication, and enables proper handling of multiple encoding types while maintaining memory efficiency.

🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
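The Map-based cache described in this commit can be sketched as follows. This is an approximation under stated assumptions: `StubTokenCounter` and `EncodingName` are stand-ins invented here so the example runs without tiktoken, while the real factory caches Repomix's `TokenCounter` keyed by `TiktokenEncoding`.

```typescript
// Stand-in for tiktoken's TiktokenEncoding type (assumption for this sketch).
type EncodingName = string;

// Stand-in for the real TokenCounter, which wraps an expensive tiktoken encoder.
class StubTokenCounter {
  encoding: EncodingName;
  freed = false;

  constructor(encoding: EncodingName) {
    this.encoding = encoding;
  }

  // Placeholder: counts whitespace-separated words instead of real tokens.
  countTokens(content: string): number {
    return content.split(/\s+/).filter(Boolean).length;
  }

  free(): void {
    this.freed = true;
  }
}

// Module-level cache: each worker thread loads its own copy of this module,
// so each worker ends up with its own Map.
const tokenCounters = new Map<EncodingName, StubTokenCounter>();

function getTokenCounter(encoding: EncodingName): StubTokenCounter {
  let counter = tokenCounters.get(encoding);
  if (!counter) {
    counter = new StubTokenCounter(encoding);
    tokenCounters.set(encoding, counter);
  }
  return counter;
}

function freeTokenCounter(): void {
  // Release every cached counter, then drop the references.
  tokenCounters.forEach((counter) => counter.free());
  tokenCounters.clear();
}
```

Keying the cache by encoding is what allows two encodings to coexist in one worker, which a single-instance singleton could not do.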

Update dependency injection parameter names to be more descriptive of the actual functionality. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>

Enable error chaining by accepting ErrorOptions parameter in RepomixError and RepomixConfigValidationError constructors. Enhanced error handler to display cause information during debugging. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
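A minimal sketch of this error chaining, assuming an ES2022 runtime and type library where `Error` accepts an options object with `cause`. The `RepomixError` shape below follows the commit description; `describeErrorChain` is a hypothetical formatter added only to show how a handler might display the cause chain.

```typescript
class RepomixError extends Error {
  constructor(message: string, options?: ErrorOptions) {
    // Forwarding options lets the built-in Error constructor record `cause`.
    super(message, options);
    this.name = 'RepomixError';
  }
}

// Hypothetical formatter: walks the cause chain and returns one line per error.
function describeErrorChain(error: unknown): string[] {
  const lines: string[] = [];
  let current: unknown = error;
  let depth = 0;
  // The depth limit guards against accidental cycles in the chain.
  while (current !== undefined && current !== null && depth < 10) {
    const prefix = depth === 0 ? '' : 'Caused by: ';
    if (current instanceof Error) {
      lines.push(`${prefix}${current.name}: ${current.message}`);
      current = current.cause;
    } else {
      lines.push(`${prefix}${String(current)}`);
      current = undefined;
    }
    depth++;
  }
  return lines;
}
```

Wrapping a low-level failure as `new RepomixError('Failed to compile template', { cause: error })` keeps the user-facing message short while preserving the original error for debug output.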

…ssage Add specific error handling for JavaScript string size limit (~512MB) in Handlebars template compilation. Provides actionable guidance to use --include flag for processing specific directories. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>

…re options Update 'Invalid string length' error message to suggest both --include and --ignore flags as solutions for handling large repositories. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
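The detection step might look roughly like the sketch below. It assumes the runtime reports the failure as a `RangeError` whose message is "Invalid string length", as V8 does, and it uses a hypothetical `renderWithSizeGuard` wrapper where `compile` stands in for the Handlebars compile-and-render call. The message wording is illustrative rather than the exact text in `outputGenerate.ts`.

```typescript
// Hypothetical wrapper: `compile` stands in for template compilation and rendering.
function renderWithSizeGuard(compile: () => string): string {
  try {
    return compile();
  } catch (error) {
    // V8 throws RangeError("Invalid string length") when a string would exceed its maximum size.
    if (error instanceof RangeError && error.message.includes('Invalid string length')) {
      throw new Error(
        'Output size exceeds JavaScript string limit. ' +
          'Try --include to process specific directories or --ignore to exclude large files.',
        { cause: error },
      );
    }
    // Any other failure is reported as a generic template error, keeping the original as the cause.
    const detail = error instanceof Error ? error.message : String(error);
    throw new Error(`Failed to compile template: ${detail}`, { cause: error });
  }
}
```

Checking both the error type and the message keeps unrelated `RangeError`s, such as an invalid array length, on the generic path.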

coderabbitai · 2025-07-22T15:08:57Z

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Walkthrough

This update refactors logger initialization across worker modules to use setLogLevelByWorkerData instead of environment variables, ensuring logger configuration occurs at module load. It introduces a centralized tokenCounterFactory for managing TokenCounter instances, replaces singleton logic in metrics workers, and renames the output handler dependency in the packager and related tests. Error handling is enhanced with error causes and improved reporting.

Changes

| File(s) | Change Summary |
| --- | --- |
| `src/core/file/workers/fileCollectWorker.ts`, `src/core/file/workers/fileProcessWorker.ts`, `src/core/security/workers/securityCheckWorker.ts`, `src/core/metrics/workers/fileMetricsWorker.ts`, `src/core/metrics/workers/outputMetricsWorker.ts` | Logger initialization refactored to use `setLogLevelByWorkerData` at module load, replacing environment-based setup. |
| `src/core/metrics/tokenCounterFactory.ts` | New module introduced to manage and cache `TokenCounter` instances per encoding; provides `getTokenCounter` and `freeTokenCounter` functions. |
| `src/core/metrics/TokenCounter.ts` | Logs initialization duration of encoding in the constructor for debugging/performance tracking. |
| `src/core/metrics/workers/fileMetricsWorker.ts`, `src/core/metrics/workers/outputMetricsWorker.ts` | Replaces local singleton `TokenCounter` logic with imports from `tokenCounterFactory` for centralized management and cleanup. |
| `src/core/output/outputGenerate.ts` | Enhanced error handling: attaches error causes to `RepomixError`, and adds specific handling for string size `RangeError` in template output. |
| `src/core/packager.ts`, `tests/core/packager.test.ts`, `tests/core/packager/diffsFunctionality.test.ts`, `tests/integration-tests/packager.test.ts` | Renames dependency property from `handleOutput` to `writeOutputToDisk` in packager and all related tests. |
| `src/shared/errorHandle.ts` | Updates error class constructors to accept `ErrorOptions` and log error causes; improves error reporting and guidance for users. |
| `src/shared/logger.ts` | Replaces `setLogLevelByEnv` with `setLogLevelByWorkerData`, configuring log level from worker data instead of environment variables. |
| `src/shared/processConcurrency.ts` | Introduces `TASKS_PER_THREAD` constant, logs worker pool initialization duration, and passes log level via `workerData` instead of environment variables. |
| `tests/cli/cliRun.test.ts` | Updates logger mock from `setLogLevelByEnv` to `setLogLevelByWorkerData`. |
| `tests/shared/processConcurrency.test.ts` | Updates comments and expected config shapes to reflect new concurrency logic and logger configuration via `workerData`. |

Sequence Diagram(s)

sequenceDiagram
    participant MainProcess
    participant Worker
    participant Logger

    MainProcess->>Worker: Spawn worker with workerData (includes logLevel)
    Worker->>Logger: setLogLevelByWorkerData()
    Logger-->>Worker: Logger configured
    Worker->>TokenCounterFactory: getTokenCounter(encoding)
    TokenCounterFactory->>TokenCounter: (create if not cached)
    TokenCounter-->>TokenCounterFactory: instance
    TokenCounterFactory-->>Worker: TokenCounter instance
    Worker->>MainProcess: Processed results
    Worker->>TokenCounterFactory: freeTokenCounter() on exit
    TokenCounterFactory->>TokenCounter: free()

Estimated code review effort

3 (~45 minutes)

claude · 2025-07-22T15:09:08Z

Claude finished @yamadashy's task

PR Review Analysis

I'll provide a comprehensive review of this PR focusing on error handling improvements and performance optimizations.

Review Tasks:

Analyze error handling improvements in errorHandle.ts
Review TokenCounter factory pattern implementation
Examine performance optimizations in processConcurrency.ts
Check function renaming and code clarity improvements
Review worker file changes
Evaluate test updates
Provide overall assessment and recommendations

Review Feedback

✅ Error Handling Improvements

Excellent work on error chaining! The addition of ErrorOptions support to RepomixError classes (errorHandle.ts:6, errorHandle.ts:13) is a solid improvement that enables proper error cause tracking.

Great specific error handling for the "Invalid string length" case in outputGenerate.ts:122-127. The error message provides actionable guidance mentioning both --include and --ignore options, which directly addresses the user pain point mentioned in the PR description.

Code Quality: The error handling logic in handleError function (errorHandle.ts:19-58) properly handles all error types and provides helpful debugging information.

✅ TokenCounter Factory Pattern

Well-designed factory pattern in tokenCounterFactory.ts. The implementation correctly:

Caches instances per encoding per worker thread (tokenCounterFactory.ts:5)
Provides proper resource cleanup with freeTokenCounter (tokenCounterFactory.ts:24-29)
Uses a clear, documented API

Performance benefit: This addresses the expensive TokenCounter initialization by ensuring only one instance per encoding per worker.

Minor suggestion: Consider adding type guards or validation in getTokenCounter if invalid encodings could be passed.

✅ Performance Optimizations

Great addition of timing measurements in both TokenCounter.ts:8-16 and processConcurrency.ts:33-48. This provides valuable visibility into performance bottlenecks.

Proper worker initialization timing (processConcurrency.ts:26-51): The timing measurements for Tinypool initialization will help with future optimization efforts.

✅ Code Clarity Improvements

Excellent renaming: handleOutput → writeOutputToDisk (packager.ts:36, packager.ts:101) is much more semantically clear and accurately describes the function's purpose.

Worker initialization positioning: Moving setLogLevelByWorkerData() to module load time (fileMetricsWorker.ts:7-9) ensures logging is properly configured before any operations.

🔍 Areas for Consideration

Logger initialization in workers: While moving setLogLevelByWorkerData() to module level works, consider documenting this pattern across all worker files for consistency. Currently I see it in fileMetricsWorker.ts:9 but should verify it's consistent across all workers.
Error cause logging: In errorHandle.ts:31-33, the cause is logged but the formatting could be improved:

// Current
logger.note('Caused by:', error.cause);

// Consider
logger.note('Caused by:', error.cause instanceof Error ? error.cause.message : error.cause);

TokenCounter factory thread safety: While the current implementation is worker-scoped (good!), consider adding a comment clarifying this is safe because each worker has its own Map instance.

✅ Test Updates

The test updates properly reflect the function renaming and maintain coverage. Good attention to keeping tests in sync with implementation changes.

📋 Overall Assessment

This is a well-executed PR that delivers meaningful improvements:

Error handling: Proper error chaining and user-friendly messages
Performance: Smart caching strategy for TokenCounters and timing visibility
Code quality: Clear naming and better organization
Maintainability: Factory pattern reduces complexity and improves resource management

The changes follow the project's coding guidelines, maintain backward compatibility, and include appropriate test updates.

🏆 Recommendation: Approve

This PR successfully addresses the stated goals with clean, well-structured code. The improvements are meaningful and the implementation is solid. No blocking issues identified.

gemini-code-assist

Summary of Changes

Hello @yamadashy, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on making the application more robust and efficient. I've introduced significant improvements to how errors are handled, especially for large data outputs, by providing clearer messages and better error chaining. Concurrently, I've optimized performance by refining worker initialization processes and resource management, ensuring a smoother and faster user experience. A minor refactoring was also included to enhance code readability.

Highlights

Enhanced Error Handling: I've improved the error handling mechanisms, particularly for cases where generated output exceeds JavaScript's string size limits. The system now provides more user-friendly messages with actionable guidance (e.g., suggesting --include or --ignore options). Additionally, RepomixError classes now support ErrorOptions for better error chaining and debugging, allowing the original cause of an error to be preserved and displayed.
Performance Optimizations: I've implemented several performance enhancements. The TokenCounter now uses a factory pattern to ensure efficient management and reuse of instances across worker threads, reducing redundant initializations. I've also added timing measurements for TokenCounter and Tinypool worker pool initializations to better understand and optimize startup performance. Worker logger initialization has been refined to occur earlier in the worker lifecycle.
Code Clarity and Maintainability: For improved code clarity, I've renamed the handleOutput function to writeOutputToDisk. This change provides a more semantically accurate description of the function's purpose, making the codebase easier to understand and maintain.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

codecov · 2025-07-22T15:09:47Z

Codecov Report

Attention: Patch coverage is 51.04167% with 47 lines in your changes missing coverage. Please review.

Project coverage is 89.68%. Comparing base (ee9d71d) to head (3468504).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/core/output/outputGenerate.ts | 24.13% | 22 Missing ⚠️ |
| src/shared/logger.ts | 16.66% | 10 Missing ⚠️ |
| src/shared/errorHandle.ts | 33.33% | 8 Missing ⚠️ |
| src/core/metrics/tokenCounterFactory.ts | 68.75% | 5 Missing ⚠️ |
| src/core/metrics/workers/fileMetricsWorker.ts | 50.00% | 1 Missing ⚠️ |
| src/core/metrics/workers/outputMetricsWorker.ts | 75.00% | 1 Missing ⚠️ |

❌ Your patch check has failed because the patch coverage (51.04%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #746      +/-   ##
==========================================
- Coverage   90.23%   89.68%   -0.56%     
==========================================
  Files         100      101       +1     
  Lines        5768     5806      +38     
  Branches     1205     1203       -2     
==========================================
+ Hits         5205     5207       +2     
- Misses        563      599      +36


cloudflare-workers-and-pages · 2025-07-22T15:09:51Z

Deploying repomix with Cloudflare Pages

Latest commit:	`3468504`
Status:	✅ Deploy successful!
Preview URL:	https://38310027.repomix.pages.dev
Branch Preview URL:	https://feat-perf.repomix.pages.dev

… guidance

Enhanced error message to be more user-friendly:
- Clearly explain the issue (repository contains files too large to process)
- Provide concrete examples for --ignore usage
- Structure solutions in easy-to-follow bullet points
- Add option to process smaller portions of the repository

🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>

gemini-code-assist · 2025-07-22T15:11:45Z

Warning

Gemini encountered an error creating the review. You can try again by commenting /gemini review.

Changed RepomixError stack trace output from logger.note to logger.debug to reduce noise in standard error output. Stack traces are now only shown when verbose flag is used. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>

When output size exceeds JavaScript string limit, now displays the top 5 largest files with their sizes to help users identify which files to exclude. This makes it easier to decide which --ignore patterns to use. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
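As an illustration of the "top 5 largest files" listing, the sketch below sorts processed files by content length and formats the largest ones. The `ProcessedFile` shape, the `formatLargestFiles` name, and the use of character count as a size proxy are assumptions made for this example; the PR may measure and format sizes differently.

```typescript
// Assumed shape of a processed file for this sketch.
interface ProcessedFile {
  path: string;
  content: string;
}

// Hypothetical helper: returns one formatted line per file, largest first.
function formatLargestFiles(files: ProcessedFile[], limit = 5): string[] {
  return files
    .slice() // copy so the caller's array order is not changed
    .sort((a, b) => b.content.length - a.content.length)
    .slice(0, limit)
    .map((file) => {
      // Character count is used as an approximation of size.
      const sizeKb = (file.content.length / 1024).toFixed(1);
      return `  - ${file.path} (${sizeKb} KB)`;
    });
}
```

The resulting lines can be appended to the error message so that users can copy the listed paths straight into an `--ignore` pattern.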

Copilot

Pull Request Overview

This PR improves error handling and implements performance optimizations across the codebase. The changes focus on better error reporting with cause chains, worker thread initialization optimizations, and resource management improvements.

Enhanced error handling with detailed debugging information and better error messages for common issues
Optimized worker thread initialization by switching from environment variables to workerData for configuration
Improved resource management with token counter factory pattern and performance timing metrics

Reviewed Changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `src/shared/errorHandle.ts` | Enhanced error constructors to support error chaining and improved error logging |
| `src/shared/logger.ts` | Replaced environment-based configuration with workerData-based approach for workers |
| `src/shared/processConcurrency.ts` | Added performance timing and switched to workerData for worker configuration |
| `src/core/output/outputGenerate.ts` | Enhanced error handling with better messages for string length limits and added processedFiles parameter |
| `src/core/metrics/tokenCounterFactory.ts` | New factory for managing TokenCounter instances with proper resource cleanup |
| Multiple worker files | Updated to use workerData configuration and improved initialization patterns |
| Multiple test files | Updated function names and mock implementations to match API changes |

Comments suppressed due to low confidence (1)

src/core/output/outputGenerate.ts:138

[nitpick] The error message spans multiple lines but is constructed as a template literal. Consider breaking this into multiple concatenated strings or ensuring consistent formatting across all lines for better readability.

        `Output size exceeds JavaScript string limit. The repository contains files that are too large to process.

src/shared/logger.ts

yamadashy and others added 10 commits July 23, 2025 00:05

refactor(core): Rename handleOutput to writeOutputToDisk for clarity

002bea3

Copilot AI review requested due to automatic review settings July 22, 2025 15:08

gemini-code-assist bot reviewed Jul 22, 2025

This comment was marked as outdated.

coderabbitai bot approved these changes Jul 22, 2025

yamadashy mentioned this pull request Jul 22, 2025

Failed to compile template: Invalid string length #739

Closed

yamadashy requested a review from Copilot July 22, 2025 15:32

Copilot AI reviewed Jul 22, 2025

src/shared/logger.ts (two review comments, both resolved)

yamadashy merged commit 2ecf7a8 into main Jul 22, 2025
53 checks passed

yamadashy deleted the feat/perf branch July 22, 2025 15:34

This was referenced Aug 31, 2025

refactor(core): Consolidate worker pool arguments into WorkerOptions interface #816

Merged

feat(core): Implement worker memory management with Tinypool teardown and reorganize memory testing tools #817

Merged

coderabbitai bot mentioned this pull request Feb 17, 2026

perf(core): Optimize file collection with UTF-8 fast path and promise pool #1155

Merged

2 tasks

coderabbitai bot mentioned this pull request Feb 20, 2026

feat(cli): add --no-color flag and fix color propagation to workers #1163

Open

6 tasks

2 participants
