Skip to content

perf(core): Lazy-load jschardet and iconv-lite in fileRead#1401

Merged
yamadashy merged 2 commits intomainfrom
perf/lazy-load-encoding-libs
Apr 5, 2026
Merged

perf(core): Lazy-load jschardet and iconv-lite in fileRead#1401
yamadashy merged 2 commits intomainfrom
perf/lazy-load-encoding-libs

Conversation

@yamadashy
Copy link
Copy Markdown
Owner

@yamadashy yamadashy commented Apr 5, 2026

Defer importing jschardet and iconv-lite until first encounter of a non-UTF-8 file. The fast UTF-8 path (covers ~99% of source code files) never needs these libraries, saving ~25ms of combined import cost at startup.

Extracted from #1377. Pure performance optimization with no functional changes.

Checklist

  • Run npm run test
  • Run npm run lint

Open with Devin

Defer importing jschardet and iconv-lite until first encounter of a
non-UTF-8 file. The fast UTF-8 path (covers ~99% of source code files)
never needs these libraries, saving ~25ms of combined import cost at
startup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 5, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 085f925e-6beb-4645-ae11-eff42eadecd3

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

The src/core/file/fileRead.ts module is refactored to use lazy-loading for iconv-lite and jschardet dependencies instead of static imports. A getEncodingDeps() function dynamically imports both libraries on-demand and caches them. Subsequent calls to encoding detection and decoding use the cached lazy-loaded references instead of direct static imports.

Changes

Cohort / File(s) Summary
Dependency Lazy-Loading
src/core/file/fileRead.ts
Replaced static imports of iconv-lite and jschardet with a getEncodingDeps() function that dynamically imports and caches dependencies. Updated encoding detection and decoding calls to use lazy-loaded references via encodingDeps.jschardet.detect() and encodingDeps.iconv methods. UTF-8 fast path and fallback control flow remain unchanged.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely describes the main change: lazy-loading two encoding libraries in the fileRead module as a performance optimization.
Description check ✅ Passed The description covers the motivation, implementation details, and confirms both required checklist items were completed, meeting the template requirements.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch perf/lazy-load-encoding-libs

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 5, 2026

⚡ Performance Benchmark

Latest commit:438fd99 refactor(core): Cache Promise instead of resolved values in getEncodingDeps
Status:✅ Benchmark complete!
Ubuntu:1.55s (±0.02s) → 1.53s (±0.01s) · -0.01s (-1.0%)
macOS:0.87s (±0.04s) → 0.86s (±0.02s) · -0.01s (-0.7%)
Windows:1.92s (±0.08s) → 1.88s (±0.07s) · -0.04s (-2.1%)
Details
  • Packing the repomix repository with node bin/repomix.cjs
  • Warmup: 2 runs (discarded), interleaved execution
  • Measurement: 20 runs / 30 on macOS (median ± IQR)
  • Workflow run
History

1bbaa94 perf(core): Lazy-load jschardet and iconv-lite in fileRead

Ubuntu:1.57s (±0.04s) → 1.55s (±0.03s) · -0.02s (-1.2%)
macOS:0.90s (±0.08s) → 0.89s (±0.06s) · -0.01s (-1.3%)
Windows:1.92s (±0.03s) → 1.87s (±0.03s) · -0.04s (-2.3%)

gemini-code-assist[bot]

This comment was marked as resolved.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/core/file/fileRead.ts (1)

9-16: Consider caching the import promise to avoid redundant imports under concurrent calls.

If multiple files fail UTF-8 decoding simultaneously, concurrent calls to getEncodingDeps() could each trigger Promise.all before the cache variables are set. This is a minor inefficiency rather than a correctness bug, but can be avoided by caching the promise itself:

♻️ Optional: Promise-based singleton pattern
 let _jschardet: typeof import('jschardet') | undefined;
 let _iconv: typeof import('iconv-lite') | undefined;
+let _encodingDepsPromise: Promise<{ jschardet: typeof import('jschardet'); iconv: typeof import('iconv-lite') }> | undefined;
+
 const getEncodingDeps = async () => {
-  if (!_jschardet || !_iconv) {
-    [_jschardet, _iconv] = await Promise.all([import('jschardet'), import('iconv-lite')]);
+  if (!_encodingDepsPromise) {
+    _encodingDepsPromise = Promise.all([import('jschardet'), import('iconv-lite')]).then(([jschardet, iconv]) => {
+      _jschardet = jschardet;
+      _iconv = iconv;
+      return { jschardet, iconv };
+    });
   }
-  return { jschardet: _jschardet, iconv: _iconv };
+  return _encodingDepsPromise;
 };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/file/fileRead.ts` around lines 9 - 16, The current getEncodingDeps
function can trigger multiple concurrent dynamic imports because it only caches
loaded modules (_jschardet, _iconv); add a cached promise (e.g.
_encodingDepsPromise) to act as a singleton for the in-flight import and change
getEncodingDeps to return/await that promise when present, creating and
assigning the Promise (which does the Promise.all([import('jschardet'),
import('iconv-lite')]) and sets _jschardet/_iconv) only once; this ensures
`_jschardet`, `_iconv`, and `getEncodingDeps` use the same import promise under
concurrent calls.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/core/file/fileRead.ts`:
- Around line 9-16: The current getEncodingDeps function can trigger multiple
concurrent dynamic imports because it only caches loaded modules (_jschardet,
_iconv); add a cached promise (e.g. _encodingDepsPromise) to act as a singleton
for the in-flight import and change getEncodingDeps to return/await that promise
when present, creating and assigning the Promise (which does the
Promise.all([import('jschardet'), import('iconv-lite')]) and sets
_jschardet/_iconv) only once; this ensures `_jschardet`, `_iconv`, and
`getEncodingDeps` use the same import promise under concurrent calls.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2370b451-d7f0-4847-a212-451e023c95a2

📥 Commits

Reviewing files that changed from the base of the PR and between 611b88a and 1bbaa94.

📒 Files selected for processing (1)
  • src/core/file/fileRead.ts

@codecov
Copy link
Copy Markdown

codecov bot commented Apr 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.41%. Comparing base (611b88a) to head (438fd99).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1401      +/-   ##
==========================================
+ Coverage   87.40%   87.41%   +0.01%     
==========================================
  Files         116      116              
  Lines        4389     4393       +4     
  Branches     1018     1018              
==========================================
+ Hits         3836     3840       +4     
  Misses        553      553              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@cloudflare-workers-and-pages
Copy link
Copy Markdown

cloudflare-workers-and-pages bot commented Apr 5, 2026

Deploying repomix with  Cloudflare Pages  Cloudflare Pages

Latest commit: 438fd99
Status: ✅  Deploy successful!
Preview URL: https://9c13da98.repomix.pages.dev
Branch Preview URL: https://perf-lazy-load-encoding-libs.repomix.pages.dev

View logs

@claude
Copy link
Copy Markdown
Contributor

claude bot commented Apr 5, 2026

Code Review — perf(core): Lazy-load jschardet and iconv-lite in fileRead

Verdict: Approve — Clean, well-scoped performance optimization. One optional suggestion below.

What's good

  • Well-isolated extraction from perf(core): Automated performance tuning by Claude #1377 — easy to review, revert, and merge independently
  • Clear comment block explaining the "why" (~25ms startup cost) and "when" (only non-UTF-8 files)
  • Promise.all correctly parallelizes the two independent imports
  • Existing test suite covers the slow path (should skip file with actual decode errors), providing regression protection

Suggestion: Cache the Promise, not the resolved values

Details

The current guard checks two separate let variables:

if (!_jschardet || !_iconv) {
  [_jschardet, _iconv] = await Promise.all([import('jschardet'), import('iconv-lite')]);
}

If two concurrent readRawFile calls hit the slow path before the first import() resolves (e.g., a repo with many non-UTF-8 files processed in parallel with FILE_COLLECT_CONCURRENCY = 50), both will enter the if body and kick off redundant Promise.all calls. This is harmless in practice — Node.js's module loader deduplicates dynamic import() — but caching the Promise itself would make the intent explicit and watertight:

let _encodingDepsPromise: Promise<{ jschardet: typeof import('jschardet'); iconv: typeof import('iconv-lite') }> | undefined;
const getEncodingDeps = () => {
  _encodingDepsPromise ??= Promise.all([import('jschardet'), import('iconv-lite')])
    .then(([jschardet, iconv]) => ({ jschardet, iconv }));
  return _encodingDepsPromise;
};

This guarantees exactly one Promise.all is ever created regardless of concurrency, and also simplifies the code from 6 lines + 2 variables down to 1 variable + a clean nullish-coalescing assignment.

Notes from detailed review

  • Security: No concerns. Dynamic import() resolves through the same lockfile-validated module cache as static imports. No new attack surface.
  • Performance: The ~25ms claim is plausible for CJS-style encoding libraries. Minor first-call latency addition on the slow path is an inherent and acceptable tradeoff.
  • Test coverage: Adequate for a pure refactor. The slow path is exercised by the decode-error test. A Shift-JIS/Latin-1 success-path test would be a nice future addition but is not a blocker.
  • Conventions: Commit message follows Conventional Commits correctly. File stays well under the 250-line limit at 87 lines.

LGTM 👍

🤖 Generated with Claude Code

Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 2 additional findings.

Open in Devin Review

…ngDeps

Guarantees exactly one import regardless of concurrent slow-path calls.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@yamadashy yamadashy merged commit 2f56197 into main Apr 5, 2026
63 checks passed
@yamadashy yamadashy deleted the perf/lazy-load-encoding-libs branch April 5, 2026 05:59
Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 new potential issue.

View 3 additional findings in Devin Review.

Open in Devin Review

Comment thread src/core/file/fileRead.ts
Comment on lines +13 to +16
_encodingDepsPromise ??= Promise.all([import('jschardet'), import('iconv-lite')]).then(([jschardet, iconv]) => ({
jschardet,
iconv,
}));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Dynamic import of iconv-lite exposes only default, causing encodingExists/decode to be undefined at Node.js runtime

When iconv-lite (a CommonJS module) is loaded via dynamic import(), Node.js wraps module.exports under the .default property. Unlike jschardet (which Node.js can statically analyze for named exports), iconv-lite builds its exports dynamically (iconv = module.exports; iconv.encode = function…), so Node.js does not hoist named exports. The result is that encodingDeps.iconv.encodingExists and encodingDeps.iconv.decode are both undefined at runtime, even though TypeScript's type declarations make them appear valid.

I verified this empirically against the installed iconv-lite@0.7.x:

typeof deps.iconv.encodingExists → undefined
typeof deps.iconv.decode         → undefined
typeof deps.iconv.default.encodingExists → function
typeof deps.iconv.default.decode         → function

This means every non-UTF-8 file (the "slow path" at lines 73-77) will throw a TypeError: encodingDeps.iconv.encodingExists is not a function, which is caught by the outer catch at src/core/file/fileRead.ts:85 and silently returned as { content: null, skippedReason: 'encoding-error' }. Files that were previously decoded correctly (e.g. Shift-JIS, EUC-KR) are now incorrectly skipped. The existing tests pass because Vitest's CJS/ESM interop differs from Node.js runtime behavior, masking the bug.

Suggested change
_encodingDepsPromise ??= Promise.all([import('jschardet'), import('iconv-lite')]).then(([jschardet, iconv]) => ({
jschardet,
iconv,
}));
_encodingDepsPromise ??= Promise.all([import('jschardet'), import('iconv-lite')]).then(([jschardet, iconv]) => ({
jschardet,
iconv: iconv.default ?? iconv,
}));
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant