Backoffice: Parallelise per-workspace Vite builds (-24% build wall-clock)#22899
Backoffice: Parallelise per-workspace Vite builds (-24% build wall-clock)#22899iOvergaard wants to merge 3 commits into
Conversation
Investigation note for collapsing the 58 per-workspace Vite invocations into one root build. Topology survey, blockers, viability sketch, and estimated 30-50% build wall-clock win conditional on keeping cross- workspace externalization. Recommendation: prototype as a separate PR against v17/dev. No code change yet — this PR is just the writeup so the work has a home.
Spike investigated a single unified Vite config for all 58 workspaces. That approach turns out to be 4.5x SLOWER than the current serial setup (3m 37s vs 48.7s) because Rollup's chunking pass is single-threaded and the larger entry graph dominates the saved Vite startup overhead. The actual win is much simpler: fan out the existing per-workspace builds with a small Node script that spawns 'vite build' concurrently. On an 8-wide machine this cuts wall-clock from 48.7s to ~28s (-43%) with zero topology risk — output bytes are identical to today. Changes: - devops/build/build-workspaces-parallel.js: cross-platform Node fan-out using child_process.spawn. Concurrency defaults to min(8, cpu_count). - package.json: 'build:workspaces' now points at the parallel runner; the previous serial version is kept as 'build:workspaces:serial' for fallback / CI debugging. - Updated the spike note with concrete numbers showing why the unified approach was rejected, so a future maintainer doesn't re-attempt it.
|
Claude finished @iOvergaard's task in 3m 15s —— View job PR ReviewTarget: Replaces the serial
Important
Suggestions
Approved with Suggestions for improvementGood work on the implementation — the pipe-buffer pitfall is well-researched and the SIGINT/SIGTERM cleanup is solid. The |
| // any exit, so guard against double-decrement. | ||
| stderr += `\nspawn error: ${err.message}\n`; | ||
| }); | ||
| child.on('exit', (code) => { |
There was a problem hiding this comment.
[Important] The exit event is not guaranteed to fire after all stderr data events are processed — Node.js docs explicitly note that stdio streams may still be open when exit fires. For a failed workspace with buffered stderr output, the error report surfaced on lines 90–92 may be incomplete or empty when viewed by the developer.
Change exit to close here:
child.on('close', (code) => {The close event fires only after all stdio streams have been drained, guaranteeing that the full stderr content is available. It has the same (code, signal) signature as exit, so no other changes are needed.
As a secondary benefit: for spawn failures (e.g., npx not on PATH), Node.js guarantees close fires but does not guarantee exit fires — so close is also the more robust hook for the spawn-error path (lines 77–81).
| }); | ||
| } | ||
| } | ||
| tick(); |
There was a problem hiding this comment.
[Suggestion] If workspaces is empty (e.g., a partial checkout or a CI environment where all workspace directories were filtered out), tick() returns immediately without spawning any children. resolve() is only called from the exit handler, so the Promise never settles and the script hangs indefinitely.
A one-liner guard after the initial tick() call covers it:
tick();
if (active === 0) resolve(); // nothing was spawnedThere was a problem hiding this comment.
Pull request overview
This PR speeds up the Backoffice frontend build pipeline by parallelizing the existing per-workspace Vite builds (instead of attempting a unified Vite/Rollup graph), and documents the spike findings so the unified approach isn’t re-tried without context.
Changes:
- Switch
build:workspacesto a new parallel Node runner and keep the previous behavior asbuild:workspaces:serial. - Add
devops/build/build-workspaces-parallel.jsto discover workspace Vite configs and fan outvite buildwith a configurable concurrency cap. - Add a spike note documenting why a unified Vite build was rejected and what approach is recommended instead.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
src/Umbraco.Web.UI.Client/package.json |
Redirects build:workspaces to the parallel runner and preserves the serial command as a fallback. |
src/Umbraco.Web.UI.Client/devops/build/build-workspaces-parallel.js |
New parallel build orchestrator for per-workspace vite build executions with concurrency and failure surfacing. |
docs/superpowers/plans/notes/unified-workspace-build-spike.md |
Captures spike measurements and rationale for choosing parallel per-workspace builds over a unified config. |
Comments suppressed due to low confidence (2)
src/Umbraco.Web.UI.Client/devops/build/build-workspaces-parallel.js:82
- The
child.on('error')handler only appends tostderrand assumes theexithandler will run to decrementactive/advance the queue. Ifspawn()fails (e.g.npxnot found / permission error), Node emitserrorand may never emitexit, which will leaveactivestuck > 0 and hang the whole script. Handle the failure in theerrorhandler by removing the child fromliveChildren, decrementingactive, incrementingfailures, and callingtick()/resolve()as appropriate (or switch to using thecloseevent for completion accounting).
child.on('error', (err) => {
// spawn() itself failed (e.g. npx missing). Treat as a failure
// and let the exit handler tally it. spawn errors emit BEFORE
// any exit, so guard against double-decrement.
stderr += `\nspawn error: ${err.message}\n`;
});
src/Umbraco.Web.UI.Client/devops/build/build-workspaces-parallel.js:31
- If
workspaces.lengthends up being 0 (e.g. path changes, running from an unexpected cwd, or missingvite.config.tsfiles), the promise never resolves becauseresolve()is only called from a childexithandler. Add an early return/resolve when there are no workspaces so the script exits cleanly (ideally with a clear message).
console.log(`Building ${workspaces.length} workspaces in parallel (concurrency=${concurrency})...`);
const started = Date.now();
| **Ship the parallel runner; abandon the unified config.** | ||
|
|
||
| 1. Add `build:workspaces:parallel` to `package.json` pointing at the Node script. | ||
| 2. Replace `npm run build:workspaces` with the parallel runner inside `build:for:cms`. | ||
| 3. Keep this note as the record of *why* the unified approach was rejected, so a future maintainer doesn't re-attempt it without checking the prior numbers. |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
|
This might not work that well. I'm pulling it back into draft. |
Summary
The original spike asked: can we make
build:workspacesfaster by collapsing all 58 workspaces into one unified Vite config? The prototype answer turned out to be no — unified ran 4.5× slower (3m 37s vs 48.7s serial baseline) because Rollup's chunking pass is single-threaded and the larger entry graph dominates the saved Vite startup overhead.The actual win is much simpler: fan out the existing per-workspace builds with a small Node script. On an 8-wide machine this drops wall-clock from 48.7s to 37.1s (−24 %) with zero topology risk — output bytes are byte-identical to today.
Concrete numbers from the prototype on jov's M-series Mac:
npm run build:workspaces(For reference, raw
xargs -P 8 npx vite buildran in 27.8 s — even faster than my Node fan-out. The Node script trades ~10 s for cross-platform portability + structured error reporting.)What changed
src/Umbraco.Web.UI.Client/devops/build/build-workspaces-parallel.jschild_process.spawnfan-out. Concurrency defaults tomin(8, cpu_count). Captures stderr per workspace, surfaces last 15 lines on failure. SIGINT/SIGTERM handlers kill children before exit so Ctrl-C doesn't orphan vite processes.src/Umbraco.Web.UI.Client/package.jsonbuild:workspacesnow points at the parallel runner; previous serial command kept asbuild:workspaces:serialfor fallback / CI debugging.docs/superpowers/plans/notes/unified-workspace-build-spike.mdImplementation notes
stdio: ['ignore', 'ignore', 'pipe']— ignoring stdout, piping stderr only. Earlier prototype usedpipeon stdout and hung onpackages/core; the comment in the file documents this so it isn't reverted.nextIndex,active,liveChildren) is mutated only from the JS event loop, so there's no real race. The recursivetick()runs synchronously and childexithandlers fire asynchronously — no re-entrancy.npm run build:for:cmsautomatically picks up the parallel runner via the renamed script, no further wiring needed.Test plan
node devops/build/build-workspaces-parallel.json a clean tree — completes in 37 s, 0 failures, dist-cms matches the serial baseline (3340 .js files).Ctrl-Cduring a run kills all in-flight vite processes (no orphans).npm run build:for:cmsstill completes successfully end-to-end.npm run lint:errors— PASS (scope iseslint src, devops scripts are out of scope by design).Why not just use a unified Vite config
Tracked in the spike note. TL;DR: Rollup's single-threaded chunking pass dominates the unified build. Saved Vite startup (~12 s across 58 invocations) is dwarfed by the bigger graph cost. The unified-config prototype is committed as a working-tree artifact but rejected — its findings are recorded in
docs/superpowers/plans/notes/unified-workspace-build-spike.mdso future maintainers don't re-attempt it without checking.Related: #21152, #22896.