Build: Improve NX Cloud performance and flakiness with enterprise plan by kasperpeulen · Pull Request #34122 · storybookjs/storybook

kasperpeulen · 2026-03-13T07:51:10Z

Closes #

What I did

We're trying out NX Cloud as an alternative to Circle CI; we have the free enterprise plan for a month to evaluate it. This PR improves the NX Cloud CI pipeline.

Removed uncacheable `prepare-sandbox` NX target

The sandbox preparation step (copying from cache, waiting for verdaccio, running yarn install) is now inlined into each downstream task's run() function. The `prepare-sandbox` targets were uncacheable because they moved cached output into the working directory, and NX Cloud reruns work best when every target in the graph is cacheable. Inlining the step also makes NX Cloud runs significantly faster.

Dynamic changeset-based agent distribution

Now that we're on the NX enterprise plan for a month, we can use more agents. Agents now scale based on changeset size (small/medium/large/extra-large) instead of a fixed 20-agent pool. A separate daily distribution config uses even more agents for the larger daily sandbox set — the daily run now takes less than 10 minutes.
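As a rough illustration of changeset-based scaling, the sketch below maps the share of affected projects to one of the four size buckets and then to an agent count. The thresholds, the agent counts, and the names `classifyChangeset` and `agentsFor` are assumptions made for this example; the real values live in the NX Cloud distribution config and are not listed in this description.

```typescript
type ChangesetSize = 'small' | 'medium' | 'large' | 'extra-large';

// Hypothetical agent counts per bucket; not taken from the PR.
const AGENTS_BY_SIZE: Record<ChangesetSize, number> = {
  small: 4,
  medium: 10,
  large: 20,
  'extra-large': 30,
};

// Assumption: the bucket is derived from the share of affected projects,
// split into four equal ranges.
function classifyChangeset(affected: number, total: number): ChangesetSize {
  if (total <= 0 || affected < 0 || affected > total) {
    throw new RangeError(`invalid project counts: ${affected}/${total}`);
  }
  const share = affected / total;
  if (share <= 0.25) return 'small';
  if (share <= 0.5) return 'medium';
  if (share <= 0.75) return 'large';
  return 'extra-large';
}

// Number of agents to request for a given changeset.
function agentsFor(affected: number, total: number): number {
  return AGENTS_BY_SIZE[classifyChangeset(affected, total)];
}
```

With these made-up numbers, a PR touching 30 of 100 projects lands in the medium bucket, while a PR touching every project gets the extra-large allocation.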

Reduced agent disk usage

NX Cloud agents were running out of disk space. To fix this:

- Switched to the global yarn cache (enableGlobalCache: true) so each sandbox doesn't duplicate its own .yarn/cache directory.
- Removed node_modules from the NX Cloud cache paths (the yarn global cache is sufficient).
- Removed Cypress browser cache and installed only Playwright Chromium instead of all browsers.

Because the global cache is shared across sandboxes, stale @storybook/* packages (same version, different contents from verdaccio) can cause yarn install to pick up published versions instead of local builds. A cleanup step purges these entries before each run to force fresh copies from verdaccio.
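A minimal sketch of how the stale entries could be selected, assuming the global cache stores scoped packages as archives whose file names start with the scope and package name joined by a dash (for example `@storybook-react-npm-...zip`). The function name `selectStaleStorybookEntries` is hypothetical, and the actual cleanup step in this PR may work differently; deleting the returned files is left to the caller.

```typescript
// Assumption: cache archive names for @storybook/* packages start with
// "@storybook-" and end with ".zip". Unscoped packages are left alone.
function selectStaleStorybookEntries(cacheFileNames: string[]): string[] {
  return cacheFileNames.filter(
    (name) => name.startsWith('@storybook-') && name.endsWith('.zip')
  );
}

// Example listing of a cache directory (file names are made up).
const listing = [
  '@storybook-react-npm-10.4.0-aaaaaaaaaa-bbbbbbbbbb.zip',
  '@storybook-addon-docs-npm-10.4.0-cccccccccc-dddddddddd.zip',
  'react-npm-19.0.0-eeeeeeeeee-ffffffffff.zip',
  'storybook-npm-10.4.0-0000000000-1111111111.zip',
];

const toPurge = selectStaleStorybookEntries(listing);
console.log(toPurge.length); // 2
```

Keeping the selection pure makes it easy to check which entries would be purged before anything is removed from disk.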

Core compile DTS retry logic

Found a way to make the core compile task less flaky. Parallel DTS generation processes can race when one reads another's partially-written .d.ts output (TS2306). Instead of immediately bailing and killing all processes on the first failure, failed entries are now retried sequentially after all parallel processes complete. By that point all dependency .d.ts files are fully written, so the retry almost always succeeds. This adds ~8s only when the race actually occurs.
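The retry strategy described above can be sketched as follows. This is an illustration of the general pattern (run everything in parallel, collect failures, retry them one at a time), not the code in scripts/build/utils/generate-types.ts; the name `runParallelThenRetry` and the task shape are assumptions.

```typescript
type Task<T> = () => Promise<T>;

interface RetryOutcome<T> {
  results: Map<string, T>;
  retried: string[]; // names of entries that failed in the parallel phase
}

async function runParallelThenRetry<T>(
  tasks: Record<string, Task<T>>
): Promise<RetryOutcome<T>> {
  const names = Object.keys(tasks);

  // Phase 1: start every task at once and wait for all of them to settle,
  // instead of bailing out on the first rejection.
  const settled = await Promise.allSettled(
    names.map((name) => Promise.resolve().then(() => tasks[name]()))
  );

  const results = new Map<string, T>();
  const failed: string[] = [];
  settled.forEach((outcome, i) => {
    if (outcome.status === 'fulfilled') results.set(names[i], outcome.value);
    else failed.push(names[i]);
  });

  // Phase 2: retry failures sequentially. Nothing else is writing output by
  // now, so a race on partially written files can no longer occur. A second
  // failure is a real error and is allowed to propagate.
  for (const name of failed) {
    results.set(name, await tasks[name]());
  }

  return { results, retried: failed };
}
```

The sequential phase only costs time when something actually failed in the parallel phase, which matches the observation that the retry adds time only when the race occurs.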

NX Cloud remote caching disabled on Circle CI (temporary)

NX Cloud remote caching is disabled on Circle CI so we can measure the true cost of NX Cloud without also getting cache-hit costs from Circle CI runs. This will be reverted once the NX Cloud experiment concludes.

E2E test stabilization

- Playwright setup project. Added a dedicated Playwright setup project that waits for Storybook readiness through /index.json, opens example-button--primary, verifies the manager selection, and waits for the preview to settle before the browser tests start.
- Onboarding test fixes. Tightened E2E synchronization by clearing the checklist cache and using trySelectStory for reliable story selection after creation.
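The readiness wait on /index.json could look roughly like the polling loop below. It is a sketch under assumptions: `waitForStorybookIndex`, the `Probe` type, and the default attempt count and delay are invented for this example and do not come from code/e2e-tests/storybook.setup.ts.

```typescript
// Anything that can fetch a URL and report success; the global fetch fits.
type Probe = (url: string) => Promise<{ ok: boolean }>;

// Polls <baseUrl>/index.json until it responds successfully and returns the
// number of attempts that were needed.
async function waitForStorybookIndex(
  baseUrl: string,
  probe: Probe,
  attempts = 30,
  delayMs = 1000
): Promise<number> {
  const url = `${baseUrl.replace(/\/$/, '')}/index.json`;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      const response = await probe(url);
      if (response.ok) return attempt;
    } catch {
      // Connection refused or similar: the server is not up yet.
    }
    if (attempt < attempts) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error(`Storybook index not ready at ${url} after ${attempts} attempts`);
}
```

In a setup test this might be called as `await waitForStorybookIndex(storybookUrl, (url) => fetch(url))`, where `storybookUrl` is whatever base URL the suite already uses, before navigating to the first story.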

Misc

- Removed cypress from ALL_TASKS in the NX workflow. We can likely remove cypress from CI altogether, but for now it's removed from the NX experiment.
- Upgraded NX from 22.1.3 to 22.6.1.
- Added isNxTaskExecution() helper replacing scattered NX_CLI_SET checks.
- Added $schema references to project.json files.
- Added Playwright result outputs to NX task cache configuration.
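For the isNxTaskExecution() helper mentioned in the list above, one possible shape is sketched here. The signature and the comparison against the string 'true' are assumptions; the PR only states that the helper replaces scattered NX_CLI_SET checks, so the real implementation may differ.

```typescript
type Env = Record<string, string | undefined>;

// Assumption: the NX CLI marks its child processes by setting NX_CLI_SET to
// the string 'true'. The environment is passed in to keep the check testable.
function isNxTaskExecution(env: Env): boolean {
  return env.NX_CLI_SET === 'true';
}

console.log(isNxTaskExecution({ NX_CLI_SET: 'true' })); // true
console.log(isNxTaskExecution({})); // false
```

Callers would typically pass `process.env`. Centralizing the check means a change in how NX marks its task processes only needs to be handled in one place.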

Checklist for Contributors

Testing

The changes in this PR are covered in the following automated tests:

stories
unit tests
integration tests
end-to-end tests

Manual testing

Verify in the NX Cloud dashboard that runs are faster and less flaky compared to previous runs.

Documentation

Add or update documentation reflecting your changes
If you are deprecating/removing a feature, make sure to update MIGRATION.MD

Checklist for Maintainers

When this PR is ready for testing, make sure to add ci:normal, ci:merged or ci:daily GH label to it to run a specific set of sandboxes. The particular set of sandboxes can be found in code/lib/cli-storybook/src/sandbox-templates.ts
Make sure this PR contains one of the labels below:
Available labels
- bug: Internal changes that fix incorrect behavior.
- maintenance: User-facing maintenance tasks.
- dependencies: Upgrading (sometimes downgrading) dependencies.
- build: Internal-facing build tooling & test updates. Will not show up in release changelog.
- cleanup: Minor cleanup style change. Will not show up in release changelog.
- documentation: Documentation only changes. Will not show up in release changelog.
- feature request: Introducing a new feature.
- BREAKING CHANGE: Changes that break compatibility in some way with current major version.
- other: Changes that don't fit in the above categories.

🦋 Canary release

This PR does not have a canary release associated. You can request a canary release of this pull request by mentioning the @storybookjs/core team here.

core team members can create a canary release here or locally with gh workflow run --repo storybookjs/storybook publish.yml --field pr=<PR_NUMBER>

coderabbitai · 2026-03-13T07:53:06Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.


📝 Walkthrough


Adds a new QA GitHub Actions workflow that runs Nx/Playwright e2e attempts, collects per-attempt artifacts and summaries; updates an onboarding e2e test to wait for story navigation and perform survey interactions inside the "Storybook user survey" dialog.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| E2E Test Flow Adjustment `code/e2e-tests/addon-onboarding.spec.ts` | After creating a story, wait for and validate story URL and "Last" control visibility. Move subsequent user-input interactions (checkboxes, referrer selection, submit) into the "Storybook user survey" dialog, asserting dialog visibility before interacting. |
| QA CI Workflow `.github/workflows/qa.yml` | Add new "QA" workflow that prepares an attempts matrix and runs a matrix job per attempt: checkout, Node setup, install deps and Playwright deps, start Nx Cloud distributed run, run e2e target with production config, extract Nx Cloud run URL, upload nx-output.log and conditional failure artifacts, and write a QA summary to the workflow step summary. |

Sequence Diagram(s)

sequenceDiagram
    participant GH as GitHub Actions
    participant Matrix as Matrix Coordinator
    participant Job as QA Job (attempt)
    participant Runner as Job Runner
    participant NxCloud as Nx Cloud
    participant Artifact as Artifact Storage

    GH->>Matrix: trigger QA workflow
    Matrix->>Job: spawn attempt jobs
    loop per attempt
      Job->>Runner: checkout + setup node + install deps + playwright deps
      Runner->>NxCloud: start nx distributed run (nx run-many ...)
      NxCloud-->>Runner: return nx_cloud_run link & status
      Runner->>Artifact: upload nx-output.log
      alt failure
        Runner->>Artifact: upload test-results and playwright artifacts
      end
      Runner->>GH: emit per-attempt qa-status/result.json
    end
    GH->>Artifact: collect per-attempt artifacts
    GH->>GH: write workflow step summary with links

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Onboarding: Fix navigation to first story when configure-your-project entry missing #33559: Modifies the same onboarding e2e test to relocate survey interactions into a dialog and relax post-survey navigation assertions.

Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/qa.yml:
- Line 155: The f-string passed to handle.write currently uses shell-style
`${...}` so it will not interpolate; update the string in the handle.write call
that references results (the expression "handle.write(f\"- Target:
`${results[0]['template']}:{results[0]['target']}`\\n\" if results else \"-
Target: unknown\\n\")") to use Python curly-brace interpolation (e.g.
`{results[0]['template']}` and `{results[0]['target']}`) so the template and
target values are correctly formatted when results is non-empty.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ebecf822-b5a5-4044-b71d-4a1017c020ea

📥 Commits

Reviewing files that changed from the base of the PR and between a8e8078 and 0e9c2ce.

📒 Files selected for processing (1)

.github/workflows/qa.yml

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

.github/workflows/qa.yml (1)

59-59: Avoid nx@latest in CI to keep runs reproducible.

Line 59 pulls whatever Nx is latest at runtime, which can introduce sudden breakage or behavior drift across attempts.

♻️ Suggested change

-          npx nx@latest start-ci-run
+          yarn nx start-ci-run

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.github/workflows/qa.yml at line 59, The CI step currently runs "npx
nx@latest start-ci-run", which makes runs non-reproducible; update that step to
pin NX to a specific version or use the repository's pinned local binary (for
example replace "npx nx@latest start-ci-run" with a pinned invocation like "npx
nx@<SPECIFIC_VERSION> start-ci-run" or call the project script/local binary such
as "npm run start-ci-run" / "pnpm run start-ci-run" to ensure deterministic CI
behavior.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/qa.yml:
- Around line 36-43: Job-level expressions use the env context incorrectly
(QA_TEMPLATE, QA_TARGET, QA_MAX_PARALLEL) and will fail evaluation; update the
job "name" and the "strategy.max-parallel" and "matrix.attempt" expressions to
use a valid context (e.g., vars.*, inputs.*, or needs.*) or hardcoded values:
replace `${{ env.QA_TEMPLATE }}`, `${{ env.QA_TARGET }}`, and `${{
fromJSON(env.QA_MAX_PARALLEL) }}` with appropriate `vars`/`inputs`/`needs`
references and ensure the matrix `attempt` source
(`needs.prepare.outputs.matrix`) is parsed in a valid context (use needs.prepare
outputs or vars instead of env). Locate these in the job definition where "name"
and "strategy" are defined and update the expressions accordingly.

---

Nitpick comments:
In @.github/workflows/qa.yml:
- Line 59: The CI step currently runs "npx nx@latest start-ci-run", which makes
runs non-reproducible; update that step to pin NX to a specific version or use
the repository's pinned local binary (for example replace "npx nx@latest
start-ci-run" with a pinned invocation like "npx nx@<SPECIFIC_VERSION>
start-ci-run" or call the project script/local binary such as "npm run
start-ci-run" / "pnpm run start-ci-run" to ensure deterministic CI behavior.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 72dc380b-11f8-4b4c-b9b1-632012377286

📥 Commits

Reviewing files that changed from the base of the PR and between 0e9c2ce and 5ed7705.

📒 Files selected for processing (1)

.github/workflows/qa.yml

nx-cloud · 2026-03-13T08:17:36Z

View your CI Pipeline Execution ↗ for commit a471ffa

| Command | Status | Duration | Result |
| --- | --- | --- | --- |
| `nx run-many -t compile,check,knip,test,pretty-d...` | ✅ Succeeded | 9m 50s | View ↗ |

☁️ Nx Cloud last updated this comment at 2026-03-23 13:11:29 UTC

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/qa.yml:
- Around line 4-8: The workflow trigger is hard-coded to a feature branch
("push" -> "branches" -> kasper/stabilize-onboarding-e2e-waits); update the
trigger so the workflow runs on the intended long-lived branch(s) instead of the
feature branch—for example replace or extend the branches list under on: push:
to include "next" (or "main"/"develop" as appropriate) and/or add a manual
trigger (workflow_dispatch) so the job runs post-merge; modify the on: push:
branches array in the QA workflow accordingly.
- Around line 35-39: Replace the runtime-installed Nx invocation "npx nx@latest
start-ci-run" with the workspace-pinned runner "yarn nx start-ci-run" so the CI
uses the repository's pinned Nx version (matching the later step that uses "yarn
nx"); keep the existing flags (--distribute-on and --stop-agents-after)
unchanged so the behavior is identical except for using the pinned Nx version.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a1316a82-88e3-4ef4-8386-4755fc5fc3b9

📥 Commits

Reviewing files that changed from the base of the PR and between 5ed7705 and 928a5bb.

📒 Files selected for processing (1)

.github/workflows/qa.yml

storybook-app-bot · 2026-03-13T15:32:25Z

Package Benchmarks

Commit: a471ffa, ran on 23 March 2026 at 13:10:30 UTC

No significant changes detected, all good. 👏


valentinpalkovic

Pair-reviewed with @kasperpeulen. LGTM!

coderabbitai Bot reviewed Mar 13, 2026


Comment thread .github/workflows/qa.yml Outdated

coderabbitai Bot reviewed Mar 13, 2026


Comment thread .github/workflows/qa.yml Outdated

coderabbitai Bot reviewed Mar 13, 2026


Comment thread .github/workflows/qa.yml Outdated

Comment thread .github/workflows/qa.yml Outdated

kasperpeulen changed the title ~~Stabilize onboarding e2e waits~~ Onboarding: Stabilize survey wait in E2E flow Mar 13, 2026

kasperpeulen added the build (Internal-facing build tooling & test updates) and ci:merged (Run the CI jobs that normally run when merged.) labels Mar 13, 2026

kasperpeulen changed the title ~~Onboarding: Stabilize survey wait in E2E flow~~ Build: Stabilize Playwright readiness and onboarding waits Mar 13, 2026

kasperpeulen requested review from valentinpalkovic and yannbf March 14, 2026 06:02

kasperpeulen changed the title ~~Build: Stabilize Playwright readiness and onboarding waits~~ Onboarding: Stabilize E2E waits and Playwright setup Mar 14, 2026

kasperpeulen changed the title ~~Onboarding: Stabilize E2E waits and Playwright setup~~ Build: Improve NX flakiness Mar 14, 2026

valentinpalkovic reviewed Mar 14, 2026


Comment thread code/e2e-tests/storybook.setup.ts

valentinpalkovic approved these changes Mar 14, 2026


kasperpeulen added the ci:daily label (Run the CI jobs that normally run in the daily job.) and removed the ci:merged label (Run the CI jobs that normally run when merged.) Mar 14, 2026

kasperpeulen added 5 commits March 21, 2026 17:44

Stabilize onboarding e2e waits

613551d

Stabilize onboarding survey wait

ae59f99

Use Playwright setup project for Storybook readiness

e859b15

Fix Playwright setup test linting

10dfee3

Wrap Playwright setup in describe

dbbfabf

kasperpeulen force-pushed the kasper/stabilize-onboarding-e2e-waits branch from 7930d4d to dbbfabf Compare March 21, 2026 11:44

kasperpeulen added 6 commits March 21, 2026 18:47

Use dynamic changeset distribution for NX Cloud agents

bbc86af

Replace fixed 19-agent default with changeset-based scaling so smaller PRs use fewer agents and larger PRs get more.

Disable nx cache on circle CI for NX experiment

a237a28

Add daily distribution config and dynamic selection logic

3e326ab

Try more

71f5ffc

Upgrade NX

1ce147f

Clean up unused dependencies in yarn.lock

eaf17fd

kasperpeulen added 12 commits March 23, 2026 16:25

chore: bust nx cache 2026-03-23T09:25:08Z

794f617

chore: bust nx cache 2026-03-23T09:40:11Z

9bdfa46

chore: bust nx cache 2026-03-23T09:55:15Z

b93fd6f

chore: bust nx cache 2026-03-23T10:10:19Z

236ec3f

chore: bust nx cache 2026-03-23T10:25:23Z

cc43f3e

update story selection logic and clear cache in addon-onboarding.spec.ts

ef2a2d0

use void for async calls, simplify cache clearing

d6a34dc

chore: bust nx cache 2026-03-23T10:56:49Z

275e8d8

Fix

59c70da

chore: bust nx cache 2026-03-23T11:26:58Z

2c92a67

simplify checklist cache clearing in addon-onboarding tests

d6bd85e

chore: bust nx cache 2026-03-23T11:42:04Z

af4d1b1

kasperpeulen commented Mar 23, 2026


Comment thread .nx/workflows/agents.yaml

kasperpeulen commented Mar 23, 2026


Comment thread scripts/build/utils/generate-types.ts Outdated

chore: bust nx cache 2026-03-23T11:57:07Z

e789350

kasperpeulen commented Mar 23, 2026


Comment thread scripts/tasks/compile.ts

chore: bust nx cache 2026-03-23T12:12:11Z

502dd93

kasperpeulen changed the title ~~Build: Improve NX flakiness~~ Build: Improve NX Cloud performance and flakiness with enterprise plan Mar 23, 2026

kasperpeulen added 4 commits March 23, 2026 19:22

write comments

598486d

Try to improve perf

f762ba8

chore: bust nx cache 2026-03-23T12:27:15Z

f959fd2

chore: bust nx cache 2026-03-23T12:42:18Z

e2205c2

kasperpeulen added the ci:normal label and removed the ci:daily label (Run the CI jobs that normally run in the daily job.) Mar 23, 2026

kasperpeulen added 2 commits March 23, 2026 19:54

Merge remote-tracking branch 'origin/next' into kasper/stabilize-onbo…

63b8c09

…arding-e2e-waits

Merge remote-tracking branch 'origin/next' into kasper/stabilize-onbo…

a471ffa

…arding-e2e-waits # Conflicts: # scripts/build/utils/generate-types.ts

valentinpalkovic approved these changes Mar 23, 2026


kasperpeulen merged commit 2649513 into next Mar 23, 2026
123 checks passed

kasperpeulen deleted the kasper/stabilize-onboarding-e2e-waits branch March 23, 2026 13:54

github-actions Bot mentioned this pull request Mar 23, 2026

Release: Prerelease 10.4.0-alpha.3 #34230

Merged

17 tasks
