feat(worker): Add unified worker entry point for bundling support#1056
feat(worker): Add unified worker entry point for bundling support#1056
Conversation
Add a unified worker entry point that enables full bundling support by allowing bundled files to spawn workers using themselves. This is a prerequisite for bundling the website server to improve Cloud Run cold start times. Changes: - Add src/shared/unifiedWorker.ts as single entry point for all workers - Support both worker_threads and child_process runtimes - Add REPOMIX_WORKER_TYPE env var for child_process worker type detection - Add REPOMIX_WORKER_PATH env var for bundled environment worker path - Add REPOMIX_WASM_DIR env var for WASM file location override - Update processConcurrency.ts to use unified worker path - Add debug logging (REPOMIX_DEBUG_WORKER=1) for worker troubleshooting - Export unified worker handler from main index.ts Note: This is work in progress. There's a known issue with child_process runtime where nested worker pools (created inside a worker) may receive incorrect REPOMIX_WORKER_TYPE environment variable, causing task routing issues. Investigation ongoing.
Modify website server entry point to support being used as both server and worker entry in bundled environments: - Re-export unified worker handler from repomix for Tinypool - Add isTinypoolWorker() check to skip server startup when running as worker - Wrap server initialization in conditional block This enables esbuild bundling of the server while maintaining worker functionality for Cloud Run cold start optimization.
Fix regression where fileCollect tasks were incorrectly routed to defaultActionWorker due to REPOMIX_WORKER_TYPE environment variable inheritance in child_process mode. Changes: - Add getWorkerPath() that returns individual worker file paths - Only use unified worker when REPOMIX_WORKER_PATH is explicitly set - Move WorkerType definition to processConcurrency.ts to avoid circular import This ensures the regular CLI works correctly while still supporting bundled environments when REPOMIX_WORKER_PATH is set.
Fix issue where Tinypool reuses child processes across different worker pools in bundled environments, causing tasks to be routed to incorrect handlers. Changes: - Add inferWorkerTypeFromTask() to determine worker type from task structure - Add getWorkerTypeFromWorkerData() to handle Tinypool's array workerData format - Cache handlers by worker type instead of single loadedHandler - Dynamically select handler based on inferred or configured worker type This enables bundled website server to correctly handle all worker types (fileCollect, fileProcess, securityCheck, calculateMetrics, defaultAction) even when child processes are reused.
Remove code that was added for debugging during development: - Remove unused isTinypoolWorker function from unifiedWorker.ts - Remove REPOMIX_DEBUG_WORKER logging from unifiedWorker.ts - Remove debug logging from defaultActionWorker.ts - Remove unused getUnifiedWorkerPath export - Update tests to use workerType instead of workerPath
- Consolidate WorkerType definition to unifiedWorker.ts - Fix inferWorkerTypeFromTask order: check calculateMetrics before securityCheck - Simplify cliOptions handling in defaultActionWorker
Based on multi-agent code review: - Fix fileProcess inference: check for rawFile instead of filePath/content - Fix calculateMetrics inference: check for content/encoding (path is optional) - Fix securityCheck inference: add type field check for specificity - Remove unnecessary type assertion in defaultActionWorker
- Add comprehensive tests for unifiedWorker.ts covering task inference and worker termination cleanup - Unify onWorkerTermination to async signature across all worker files for consistency (fileCollect, securityCheck, calculateMetrics)
Ensure all onWorkerTermination exports have consistent `: Promise<void>` type annotations for better type safety.
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. 📝 WalkthroughWalkthroughThe PR transitions the worker system from file-path-based identification ( Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Task/Client
participant UW as UnifiedWorker
participant Disp as Type Dispatcher
participant Handler as Per-Type Handler<br/>(cached)
participant SpecWkr as Specific Worker<br/>(fileCollect/fileProcess/etc.)
rect rgb(200, 220, 255)
Note over Client,SpecWkr: Old Flow (workerPath-based)
Client->>Client: Task with workerPath:<br/>'./path/to/fileCollectWorker.js'
Client->>SpecWkr: Load & execute worker
SpecWkr->>SpecWkr: Process task
end
rect rgb(220, 255, 220)
Note over Client,SpecWkr: New Flow (workerType-based via UnifiedWorker)
Client->>UW: Task with workerType<br/>from env/workerData
UW->>Disp: inferWorkerTypeFromTask(task)
Disp-->>UW: Determined worker type<br/>(e.g., 'fileCollect')
UW->>Handler: Dynamic import cached<br/>per workerType
Handler->>SpecWkr: Handler delegates<br/>to specific worker logic
SpecWkr->>SpecWkr: Process task payload
SpecWkr-->>Handler: Task result
Handler-->>UW: Return result
UW-->>Client: Task output
end
rect rgb(255, 240, 200)
Note over UW,SpecWkr: Cleanup (onWorkerTermination)
Client->>UW: Termination signal
UW->>Handler: onWorkerTermination()
Handler->>SpecWkr: Per-type cleanup hook
SpecWkr->>SpecWkr: Cleanup (e.g., free<br/>token counters)
SpecWkr-->>Handler: Cleanup complete
Handler-->>UW: Promise<void> resolved
UW->>UW: Clear handler cache
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello @yamadashy, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the application's bundling capabilities, particularly for the website server, by introducing a unified worker entry point. This change aims to reduce Cloud Run cold start times by allowing all worker logic to be consolidated into a single bundled file. It refactors how worker tasks are dispatched, moving from explicit worker file paths to a type-based system that dynamically loads the correct handler. Additionally, it provides flexibility for WASM file locations in bundled contexts and maintains full backward compatibility for standard CLI operations. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1056 +/- ##
==========================================
- Coverage 90.34% 90.24% -0.11%
==========================================
Files 123 124 +1
Lines 9457 9605 +148
Branches 1709 1754 +45
==========================================
+ Hits 8544 8668 +124
- Misses 913 937 +24 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Deploying repomix with
|
| Latest commit: |
534b238
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://de73bc30.repomix.pages.dev |
| Branch Preview URL: | https://feat-unified-worker-bundling.repomix.pages.dev |
PR Review: feat(worker): Add unified worker entry point for bundling supportOverall AssessmentThis is a well-designed PR that solves a real problem - enabling bundling support for the website server to improve Cloud Run cold start times. The architecture is sound and backward compatible. ✅ Strengths
|
| Failure Scenario | Risk | Mitigation |
|---|---|---|
| Task inference misclassifies a task | Medium | Covered by the explicit property checks, but adding tests for ambiguous task structures would help |
| Environment variables not propagated in child_process | Low | Already handled via REPOMIX_WORKER_TYPE env var |
| WASM files not found in bundled env | Medium | Error message is clear, but docs should specify expected file layout |
| Tinypool worker reuse causes wrong handler | Low | Task inference takes precedence over workerData, mitigating this |
📋 Minor Suggestions
-
Type safety: The
biome-ignorefornoExplicitAnyis reasonable given the varying worker signatures, but consider using a generic or discriminated union in the future -
Documentation: Consider adding a brief section to
CLAUDE.mdor README about bundled environment setup if this feature is intended for external use
Verdict
Approve - The PR is well-implemented with good test coverage. The design is clean and the implementation is backward compatible. The minor issues noted above don't block merging.
🤖 Reviewed by Claude
There was a problem hiding this comment.
Code Review
This PR introduces a unified worker entry point to support bundling, which is a great improvement for deployment scenarios like Cloud Run. The implementation is well-executed, using a dynamic dispatcher in unifiedWorker.ts that correctly routes tasks to their respective handlers. The changes to support custom WASM paths and the fallback mechanism for inferring worker types show great attention to detail. The refactoring is applied consistently across the codebase, and the new tests for the unified worker are comprehensive.
I have one suggestion regarding the website/server/src/index.ts file to maintain testability by ensuring the Hono app instance is always exported.
| // Re-export unified worker for bundled environment | ||
| // When this file is used as a Tinypool worker, it needs to export the handler | ||
| export { unifiedWorkerHandler as default, unifiedWorkerTermination as onWorkerTermination } from 'repomix'; | ||
|
|
||
| // Log server metrics on startup | ||
| logInfo('Server starting', { | ||
| metrics: { | ||
| processConcurrency: getProcessConcurrency(), | ||
| }, | ||
| }); | ||
| // Check if running as a Tinypool worker (bundled environment) | ||
| // In bundled mode, this file is used both as server entry and worker entry | ||
| const isTinypoolWorker = (): boolean => { | ||
| const tinypoolState = (process as NodeJS.Process & { __tinypool_state__?: { isTinypoolWorker?: boolean } }) | ||
| .__tinypool_state__; | ||
| return tinypoolState?.isTinypoolWorker ?? false; | ||
| }; | ||
|
|
||
| // Skip server initialization if running as a Tinypool worker | ||
| if (!isTinypoolWorker()) { | ||
| const API_TIMEOUT_MS = 35_000; | ||
|
|
||
| // Log initial memory usage | ||
| logMemoryUsage('Server startup', { | ||
| processConcurrency: getProcessConcurrency(), | ||
| }); | ||
| // Log server metrics on startup | ||
| logInfo('Server starting', { | ||
| metrics: { | ||
| processConcurrency: getProcessConcurrency(), | ||
| }, | ||
| }); | ||
|
|
||
| const app = new Hono(); | ||
| // Log initial memory usage | ||
| logMemoryUsage('Server startup', { | ||
| processConcurrency: getProcessConcurrency(), | ||
| }); | ||
|
|
||
| // Configure CORS | ||
| app.use('/*', corsMiddleware); | ||
| const app = new Hono(); | ||
|
|
||
| // Enable compression | ||
| app.use(compress()); | ||
| // Configure CORS | ||
| app.use('/*', corsMiddleware); | ||
|
|
||
| // Set timeout for API routes | ||
| app.use('/api', timeout(API_TIMEOUT_MS)); | ||
| // Enable compression | ||
| app.use(compress()); | ||
|
|
||
| // Setup custom logger | ||
| app.use('*', cloudLoggerMiddleware()); | ||
| // Set timeout for API routes | ||
| app.use('/api', timeout(API_TIMEOUT_MS)); | ||
|
|
||
| // Apply rate limiting to API routes | ||
| app.use('/api/*', rateLimitMiddleware()); | ||
| // Setup custom logger | ||
| app.use('*', cloudLoggerMiddleware()); | ||
|
|
||
| // Health check endpoint | ||
| app.get('/health', (c) => c.text('OK')); | ||
| // Apply rate limiting to API routes | ||
| app.use('/api/*', rateLimitMiddleware()); | ||
|
|
||
| // Main packing endpoint | ||
| app.post('/api/pack', bodyLimitMiddleware, packAction); | ||
| // Health check endpoint | ||
| app.get('/health', (c) => c.text('OK')); | ||
|
|
||
| // Start server | ||
| const port = process.env.PORT ? Number.parseInt(process.env.PORT, 10) : 3000; | ||
| logInfo(`Server starting on port ${port}`); | ||
| // Main packing endpoint | ||
| app.post('/api/pack', bodyLimitMiddleware, packAction); | ||
|
|
||
| serve({ | ||
| fetch: app.fetch, | ||
| port, | ||
| }); | ||
| // Start server | ||
| const port = process.env.PORT ? Number.parseInt(process.env.PORT, 10) : 3000; | ||
| logInfo(`Server starting on port ${port}`); | ||
|
|
||
| // Export app for testing | ||
| export default app; | ||
| serve({ | ||
| fetch: app.fetch, | ||
| port, | ||
| }); | ||
| } |
There was a problem hiding this comment.
The recent changes have removed the export of the app instance, which was previously available for testing purposes. While the logic to differentiate between server and worker modes is correct, removing this export can break testing setups that rely on importing the app instance directly.
To maintain testability, I suggest refactoring to ensure the app instance is always created and exported, while the server-specific logic (like starting the server with serve) remains conditional.
import { serve } from '@hono/node-server';
import { Hono } from 'hono';
import { compress } from 'hono/compress';
import { timeout } from 'hono/timeout';
import { packAction } from './actions/packAction.js';
import { bodyLimitMiddleware } from './middlewares/bodyLimit.js';
import { cloudLoggerMiddleware } from './middlewares/cloudLogger.js';
import { corsMiddleware } from './middlewares/cors.js';
import { rateLimitMiddleware } from './middlewares/rateLimit.js';
import { logInfo, logMemoryUsage } from './utils/logger.js';
import { getProcessConcurrency } from './utils/processConcurrency.js';
// Re-export unified worker for bundled environment
// When this file is used as a Tinypool worker, it needs to export the handler
export { unifiedWorkerHandler as default, unifiedWorkerTermination as onWorkerTermination } from 'repomix';
// Check if running as a Tinypool worker (bundled environment)
// In bundled mode, this file is used both as server entry and worker entry
const isTinypoolWorker = (): boolean => {
const tinypoolState = (process as NodeJS.Process & { __tinypool_state__?: { isTinypoolWorker?: boolean } })
.__tinypool_state__;
return tinypoolState?.isTinypoolWorker ?? false;
};
const app = new Hono();
// Only configure and run the server if not in worker mode.
if (!isTinypoolWorker()) {
const API_TIMEOUT_MS = 35_000;
// Log server metrics on startup
logInfo('Server starting', {
metrics: {
processConcurrency: getProcessConcurrency(),
},
});
// Log initial memory usage
logMemoryUsage('Server startup', {
processConcurrency: getProcessConcurrency(),
});
// Configure CORS
app.use('/*', corsMiddleware);
// Enable compression
app.use(compress());
// Set timeout for API routes
app.use('/api', timeout(API_TIMEOUT_MS));
// Setup custom logger
app.use('*', cloudLoggerMiddleware());
// Apply rate limiting to API routes
app.use('/api/*', rateLimitMiddleware());
// Health check endpoint
app.get('/health', (c) => c.text('OK'));
// Main packing endpoint
app.post('/api/pack', bodyLimitMiddleware, packAction);
// Start server
const port = process.env.PORT ? Number.parseInt(process.env.PORT, 10) : 3000;
logInfo(`Server starting on port ${port}`);
serve({
fetch: app.fetch,
port,
});
}
// Export app for testing purposes
export default app;There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/cli/actions/workers/defaultActionWorker.ts (1)
12-17: Type definition inconsistency: cliOptions should be optional.The
DefaultActionTaskinterface declarescliOptions: CliOptions(not optional), but line 59 treats it as potentially undefined withcliOptions ?? {}. This creates a type system mismatch where TypeScript won't catch cases wherecliOptionsis actually undefined.🔎 Proposed fix
export interface DefaultActionTask { directories: string[]; cwd: string; config: RepomixConfigMerged; - cliOptions: CliOptions; + cliOptions?: CliOptions; stdinFilePaths?: string[]; }
🧹 Nitpick comments (2)
src/cli/actions/workers/defaultActionWorker.ts (1)
46-56: Consider consistency with other worker validation patterns.The task validation logic added here (checking for object type and array type) is not present in other workers like
fileProcessWorker.tsorfileCollectWorker.ts. While defensive validation is beneficial, consider whether this pattern should be applied consistently across all workers or if it's specifically needed for this worker due to the child_process runtime.website/server/src/index.ts (1)
17-23: Consider adding a comment about Tinypool internal API usage.The
isTinypoolWorkerfunction accesses Tinypool's internal__tinypool_state__property. While this works, it could break if Tinypool changes its internals. The safe fallback (?? false) is good defensive coding.🔎 Suggested documentation
// Check if running as a Tinypool worker (bundled environment) // In bundled mode, this file is used both as server entry and worker entry +// Note: Uses Tinypool's internal __tinypool_state__ property; may need updates if Tinypool changes const isTinypoolWorker = (): boolean => {
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (23)
src/cli/actions/defaultAction.tssrc/cli/actions/workers/defaultActionWorker.tssrc/cli/cliSpinner.tssrc/core/file/fileCollect.tssrc/core/file/fileProcess.tssrc/core/file/workers/fileCollectWorker.tssrc/core/file/workers/fileProcessWorker.tssrc/core/metrics/calculateMetrics.tssrc/core/metrics/workers/calculateMetricsWorker.tssrc/core/security/securityCheck.tssrc/core/security/workers/securityCheckWorker.tssrc/core/treeSitter/loadLanguage.tssrc/index.tssrc/shared/processConcurrency.tssrc/shared/unifiedWorker.tstests/cli/actions/defaultAction.test.tstests/core/metrics/calculateGitDiffMetrics.test.tstests/core/metrics/calculateGitLogMetrics.test.tstests/core/metrics/calculateOutputMetrics.test.tstests/core/metrics/calculateSelectiveFileMetrics.test.tstests/shared/processConcurrency.test.tstests/shared/unifiedWorker.test.tswebsite/server/src/index.ts
🧰 Additional context used
🧬 Code graph analysis (10)
src/core/file/workers/fileCollectWorker.ts (4)
src/core/file/workers/fileProcessWorker.ts (1)
onWorkerTermination(25-27)src/core/metrics/workers/calculateMetricsWorker.ts (1)
onWorkerTermination(48-50)src/core/security/workers/securityCheckWorker.ts (1)
onWorkerTermination(88-90)src/index.ts (1)
onWorkerTermination(66-66)
src/core/file/workers/fileProcessWorker.ts (3)
src/cli/actions/workers/defaultActionWorker.ts (1)
onWorkerTermination(119-122)src/core/file/workers/fileCollectWorker.ts (1)
onWorkerTermination(54-56)src/core/security/workers/securityCheckWorker.ts (1)
onWorkerTermination(88-90)
src/core/treeSitter/loadLanguage.ts (1)
src/index.ts (1)
setWasmBasePath(27-27)
src/cli/cliSpinner.ts (1)
src/cli/types.ts (1)
CliOptions(4-72)
src/core/security/workers/securityCheckWorker.ts (3)
src/core/file/workers/fileProcessWorker.ts (1)
onWorkerTermination(25-27)src/core/file/workers/fileCollectWorker.ts (1)
onWorkerTermination(54-56)src/core/metrics/workers/calculateMetricsWorker.ts (1)
onWorkerTermination(48-50)
tests/shared/processConcurrency.test.ts (1)
src/shared/processConcurrency.ts (2)
createWorkerPool(66-113)initTaskRunner(141-147)
src/core/metrics/workers/calculateMetricsWorker.ts (2)
src/core/file/workers/fileCollectWorker.ts (1)
onWorkerTermination(54-56)src/core/security/workers/securityCheckWorker.ts (1)
onWorkerTermination(88-90)
website/server/src/index.ts (7)
website/server/src/utils/logger.ts (2)
logInfo(34-39)logMemoryUsage(64-72)src/shared/processConcurrency.ts (1)
getProcessConcurrency(48-50)website/server/src/middlewares/cors.ts (1)
corsMiddleware(3-21)website/server/src/middlewares/cloudLogger.ts (1)
cloudLoggerMiddleware(20-98)website/server/src/middlewares/rateLimit.ts (1)
rateLimitMiddleware(6-20)website/server/src/middlewares/bodyLimit.ts (1)
bodyLimitMiddleware(5-12)website/server/src/actions/packAction.ts (1)
packAction(72-159)
tests/shared/unifiedWorker.test.ts (3)
src/core/metrics/workers/calculateMetricsWorker.ts (2)
task(43-45)onWorkerTermination(48-50)src/core/file/workers/fileCollectWorker.ts (1)
onWorkerTermination(54-56)src/core/security/workers/securityCheckWorker.ts (1)
onWorkerTermination(88-90)
src/shared/processConcurrency.ts (2)
src/shared/unifiedWorker.ts (1)
WorkerType(15-15)src/index.ts (1)
WorkerType(67-67)
🪛 GitHub Actions: autofix.ci
src/index.ts
[error] 15-15: Module 'repomix' has no exported member 'unifiedWorkerHandler'.
website/server/src/index.ts
[error] 15-15: Module 'repomix' has no exported member 'unifiedWorkerHandler'.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: Test (ubuntu-latest, 20.x)
- GitHub Check: Test (windows-latest, 22.x)
- GitHub Check: Build and run (ubuntu-latest, 22.x)
- GitHub Check: Build and run (windows-latest, 25.x)
- GitHub Check: Test (windows-latest, 25.x)
- GitHub Check: Test (macos-latest, 24.x)
- GitHub Check: Test (windows-latest, 20.x)
- GitHub Check: Build and run (macos-latest, 20.x)
- GitHub Check: Build and run (macos-latest, 25.x)
- GitHub Check: Test (windows-latest, 24.x)
- GitHub Check: Build and run (windows-latest, 24.x)
- GitHub Check: claude-review
- GitHub Check: Build and run with Bun (windows-latest, latest)
- GitHub Check: Cloudflare Pages
🔇 Additional comments (33)
src/core/file/workers/fileCollectWorker.ts (1)
54-56: LGTM! Termination signature standardized.The conversion of
onWorkerTerminationto an async function returningPromise<void>aligns with the unified worker termination pattern applied across all workers in this PR.src/core/security/workers/securityCheckWorker.ts (1)
88-90: LGTM! Termination signature standardized.The async termination hook signature is consistent with other workers and enables unified termination handling.
src/core/treeSitter/loadLanguage.ts (2)
8-28: Well-designed WASM path configuration for bundled environments.The custom WASM base path configuration with environment variable fallback provides flexibility for bundled deployments while maintaining backward compatibility with standard node_modules resolution.
44-54: LGTM! Clean path resolution with proper fallback.The WASM path resolution correctly prioritizes custom paths for bundled environments while falling back to standard
require.resolvefor node_modules.src/core/metrics/calculateMetrics.ts (1)
42-48: LGTM! Worker identification migrated to type-based approach.The migration from
workerPathtoworkerType: 'calculateMetrics'is consistent with the unified worker pattern.src/cli/actions/defaultAction.ts (1)
108-112: LGTM! Worker identification migrated to type-based approach.The migration to
workerType: 'defaultAction'is consistent with the unified worker pattern. Note that this worker useschild_processruntime rather thanworker_threads, which aligns with the defaultAction worker's requirements.src/core/metrics/workers/calculateMetricsWorker.ts (1)
48-50: LGTM! Termination signature standardized.The async termination hook signature is consistent with other workers. The
freeTokenCounters()call is synchronous based on the implementation pattern.tests/core/metrics/calculateSelectiveFileMetrics.test.ts (2)
39-40: LGTM! Test updated to reflect new workerType API.The test correctly uses
workerType: 'calculateMetrics'instead of the previousworkerPathapproach, aligning with the unified worker dispatch mechanism.
60-61: LGTM! Consistent test update.Both test cases now use the
workerTypeapproach consistently.src/core/security/securityCheck.ts (1)
58-62: LGTM! Worker identification migrated to type-based approach.The switch from
workerPathtoworkerType: 'securityCheck'aligns with the unified worker dispatch mechanism. The worker type is properly registered insrc/shared/unifiedWorker.ts(WorkerType union and handler dispatch) and correctly mapped insrc/shared/processConcurrency.ts.src/core/file/fileCollect.ts (1)
28-28: LGTM! Clean migration to workerType-based configuration.The change from
workerPathtoworkerType: 'fileCollect'aligns with the unified worker dispatch pattern introduced in this PR.src/core/file/workers/fileProcessWorker.ts (1)
25-27: LGTM! Consistent async termination signature.The updated
onWorkerTerminationsignature aligns with the unified worker termination pattern used across all workers in this PR.tests/cli/actions/defaultAction.test.ts (1)
150-154: LGTM! Test correctly reflects the new workerType-based API.The test expectation now validates
workerType: 'defaultAction'instead of checkingworkerPath, which is consistent with the API changes across the codebase.tests/core/metrics/calculateGitLogMetrics.test.ts (1)
70-74: LGTM! Consistent test setup with workerType-based configuration.The mock task runner initialization correctly uses
workerType: 'calculateMetrics', aligning with the unified worker dispatch pattern.tests/core/metrics/calculateGitDiffMetrics.test.ts (1)
70-74: LGTM! Consistent workerType-based test configuration.The test setup correctly mirrors the workerType-based initialization pattern used throughout the codebase.
src/core/file/fileProcess.ts (1)
26-26: LGTM! Clean migration to type-based worker selection.The change to
workerType: 'fileProcess'is consistent with the unified worker dispatch pattern.src/cli/actions/workers/defaultActionWorker.ts (1)
59-63: LGTM! Safe defaults and async termination.The introduction of
safeCliOptionsappropriately handles potentially undefinedcliOptionsin bundled environments, and the asynconWorkerTerminationsignature is consistent with other workers.Also applies to: 69-70, 119-122
src/cli/cliSpinner.ts (1)
15-20: LGTM! Appropriate optional parameter with safe defaults.Making
cliOptionsoptional and using optional chaining provides safe operation in bundled worker environments where options may be undefined. The fallback tofalseensures consistent behavior.tests/core/metrics/calculateOutputMetrics.test.ts (1)
27-27: LGTM!The test mocks correctly updated to use
workerType: 'calculateMetrics'instead ofworkerPath, aligning with the newWorkerOptionsinterface. Theruntime: 'worker_threads'addition is consistent across all test cases.Also applies to: 38-38, 62-62, 74-74, 85-85, 113-113, 138-138, 164-168
src/shared/processConcurrency.ts (2)
17-43: LGTM - well-structured worker path resolution.The
getWorkerPathfunction cleanly separates bundled vs. non-bundled environments. The exhaustive switch with a default throw ensures compile-time and runtime safety for unknown worker types.
66-105: LGTM - consistent workerType propagation.The
createWorkerPoolfunction correctly propagatesworkerTypethrough bothworkerData(forworker_threads) and environment variables (forchild_process), ensuring the unified worker can determine the appropriate handler in both runtime modes.tests/shared/unifiedWorker.test.ts (3)
1-36: LGTM - comprehensive mock setup.The test setup correctly mocks all worker modules and
worker_threads, enabling isolated testing of the unified worker's dispatch and inference logic. Usingvi.resetModules()inbeforeEachensures the handler cache is cleared between tests.
38-136: LGTM - thorough inference testing.The tests cover all
WorkerTypeinference paths and validate error handling for invalid task structures. Good coverage of edge cases includingnulland non-object tasks.
138-170: LGTM - termination flow properly validated.The tests correctly verify that
onWorkerTerminationtriggers cleanup on cached handlers and clears the cache, ensuring fresh handler loading on subsequent tasks.src/index.ts (1)
27-27: LGTM - useful export for bundled environments.Exporting
setWasmBasePathenables consumers to configure Tree-sitter WASM file locations, which is essential for bundled deployments.tests/shared/processConcurrency.test.ts (2)
74-113: LGTM - createWorkerPool tests correctly updated.The tests properly validate that:
workerTypeis passed tocreateWorkerPool- Tinypool is initialized with the correct filename derived from
workerTypeworkerDataincludesworkerTypefor worker identification
128-150: LGTM - initTaskRunner tests properly updated.Tests correctly verify that
initTaskRunnerpassesworkerTypethrough tocreateWorkerPooland that the resultingworkerDatacontains the correctworkerTypevalue.src/shared/unifiedWorker.ts (5)
1-23: LGTM - clean module structure and type definitions.The documentation clearly explains the module's purpose. The
WorkerTypeunion type provides type safety, and the handler cache usingMapis appropriate for this use case.
30-74: LGTM - efficient handler loading with caching.The
loadWorkerHandlerfunction correctly:
- Checks cache before dynamic imports
- Uses exhaustive switch with typed cases
- Caches both handler and cleanup functions
- Throws on unknown worker types for safety
81-119: Good inference logic with important ordering consideration.The inference function correctly distinguishes worker types using unique property combinations. The comment at line 108 about checking
calculateMetricsbeforesecurityCheckis critical since both havecontentbut differ in other required properties (encodingvstype).
155-176: LGTM - robust task routing with inference fallback.The default handler correctly prioritizes task-based inference over
workerData/env configuration, which is essential for bundled environments where Tinypool may reuse child processes across different worker pools. The descriptive error message helps with debugging.
182-189: LGTM - proper cleanup with cache clearing.The
onWorkerTerminationfunction correctly iterates through cached handlers, calls their cleanup functions (awaiting async cleanups), and clears the cache to ensure fresh state on next use.website/server/src/index.ts (1)
26-72: LGTM - clean server/worker dual-mode pattern.Wrapping the entire server initialization in the
!isTinypoolWorker()guard is an effective pattern for files that serve as both server entry points and Tinypool worker modules in bundled environments.
This allows CI to pass by using the PR branch which has the new exports. Should be changed back to #main after merging.
Code Review SummaryThis PR introduces a unified worker entry point that enables full bundling support for the website server, targeting Cloud Run cold start optimization. The implementation is well-structured and maintains backward compatibility. Overall Assessment: ✅ ApproveThe PR demonstrates solid engineering with careful attention to edge cases and backward compatibility. The architecture cleanly separates bundled vs. non-bundled environments through environment variable detection. Strengths
Potential Issues & Recommendations1. Website server package.json - Temporary branch referenceFile: website/server/package.json:19 The repomix dependency currently points to the PR branch. This is explicitly noted in the commit message as temporary. Reminder to change back to main after merging. 2. Task inference order sensitivityFile: src/shared/unifiedWorker.ts:81-118 The inferWorkerTypeFromTask function relies on checking specific field combinations in a particular order. The ordering is correct (e.g., calculateMetrics before securityCheck since both have content), but this coupling to task structure could be fragile if task interfaces change. Recommendation: Consider adding a comment documenting why the order matters, or add a discriminator field to tasks for more robust routing in future iterations. 3. Handler cache lifetimeFile: src/shared/unifiedWorker.ts:23 The handlerCache is a module-level Map that persists across the worker lifetime. This is intentional for performance. The cache is properly cleared in onWorkerTermination. This is fine as-is. Premortem Analysis
Minor Suggestions
Test VerificationThe test changes appropriately update from workerPath to workerType in mock expectations. The new unifiedWorker.test.ts provides good coverage of:
LGTM 👍 Ready to merge after ensuring the website server package.json is updated post-merge. 🤖 Generated with Claude Code |
Build and link local repomix package before running website server lint. This ensures the latest exports are available during type checking.
Code Review: PR #1056 - feat(worker): Add unified worker entry point for bundling supportSummaryThis PR introduces a unified worker entry point to enable full bundling support for the website server, which should improve Cloud Run cold start times. The implementation is well-structured and maintains backward compatibility. Overall Assessment: ✅ Approve with Minor SuggestionsThe PR is well-designed with clear separation of concerns. The approach of inferring worker types from task structure is clever for handling Tinypool's process reuse in bundled environments. ✅ Strengths
|
| Scenario | Risk Level | Mitigation |
|---|---|---|
| Task structure changes break inference | Medium | Tests cover current task structures; changes would fail tests |
| Handler cache memory leak | Low | Cache is cleared in onWorkerTermination; size is bounded by 5 worker types |
| WASM path misconfiguration in bundled env | Low | fs.access check provides clear error message with path |
| Circular import with WorkerType | Low | Already resolved by re-exporting from processConcurrency.ts |
| Tinypool process reuse edge cases | Medium | Task inference prioritized over env/workerData provides resilience |
📝 Code Quality Notes
- Code follows Airbnb style guide ✅
- No security vulnerabilities detected ✅
- Proper error handling with descriptive messages ✅
- The
biome-ignorecomment foranytypes is appropriate given the varying worker signatures ✅ - Good use of TypeScript discriminated unions and type guards ✅
Questions for Author
-
Post-merge action: The commit message mentions "Should be changed back to #main after merging" for the website server dependency. Is there a follow-up task tracked for this?
-
Performance testing: Have you measured the actual cold start improvement in Cloud Run with the bundled server?
🤖 Generated with Claude Code
- Update Dockerfile for bundled production mode with esbuild - Copy WASM files and tinypool for runtime - Set environment variables for bundled worker path - Update compose.yml to disable bundled mode for local development
PR Review: feat(worker): Add unified worker entry point for bundling supportThis PR introduces a unified worker entry point system to enable full bundling support for the website server, primarily aimed at improving Cloud Run cold start times. The implementation is well-designed with good backward compatibility considerations. Overall AssessmentRecommendation: Approve with minor suggestions The PR is well-structured and the implementation shows careful thought about the challenges of bundling workers. The approach of using task-structure inference to handle Tinypool's child process reuse is clever and practical. Code Quality & Best PracticesDetailsStrengths:
Minor observations:
Potential Issues & Edge CasesDetails
Performance ConsiderationsDetailsPositive:
Note:
Security ConsiderationsDetailsReviewed areas:
Test CoverageDetailsThe test coverage is good:
Consider adding:
Premortem Analysis: Potential Failure Scenarios
Suggested Improvements (Non-blocking)
SummaryThis is a solid PR that enables important bundling functionality for Cloud Run deployment optimization. The implementation:
Changes look good to merge! 🚀 |
Use specific commit hash in package-lock.json for reproducibility.
Code Review: feat(worker): Add unified worker entry point for bundling supportThis is a well-architected PR that enables full bundling support for the website server, which should significantly improve Cloud Run cold start times. The implementation is clean and maintains backward compatibility. Overall Assessment: ✅ Approved with minor suggestionsStrengths
Code Quality ReviewDetailsunifiedWorker.ts (src/shared/unifiedWorker.ts)
processConcurrency.ts (src/shared/processConcurrency.ts)
loadLanguage.ts (src/core/treeSitter/loadLanguage.ts)
website/server/src/index.ts
Minor SuggestionsDetails
Premortem AnalysisPotential Failure Scenarios & Mitigations
Security Considerations
Test Coverage
Checklist Verification
Note: Remember to change the repomix dependency in 🤖 Review by Claude |
Summary
This PR adds a unified worker entry point that enables full bundling support for the website server, improving Cloud Run cold start times.
Key Changes
src/shared/unifiedWorker.ts): Single entry point for all worker types that dynamically routes tasks to appropriate handlers based on task structure inferenceREPOMIX_WORKER_PATH,REPOMIX_WASM_DIR) enable bundled environments to override default pathsHow It Works
Files Changed
src/shared/unifiedWorker.tssrc/shared/processConcurrency.tssrc/core/treeSitter/loadLanguage.tssrc/index.tswebsite/server/src/index.tsTesting
unifiedWorker.tsChecklist
npm run testnpm run lint