Build: NX Cloud vs CircleCI evaluation experiment #34282

Draft
kasperpeulen wants to merge 114 commits into next from kasper/nx-ai

@kasperpeulen
Member

@kasperpeulen kasperpeulen commented Mar 23, 2026

Closes #

What I did

Run a controlled experiment comparing NX Cloud vs CircleCI on identical workloads to evaluate flakiness, speed, and cost.

Changes in this PR:

  • NX_EXPERIMENT flag in scripts/ci/main.ts — disables chromatic, benchmark, Windows, and init-empty jobs in CircleCI so both systems run the exact same workload
  • Downgrade NX Cloud linux-browsers-js agents from extra_large+ (60 credits/min) to medium+ (15 credits/min) — 4x cost reduction per agent-minute
  • Add concurrency: cancel-in-progress: false to the NX GitHub Actions workflow so rapid pushes queue instead of canceling in-progress runs
  • Fix tag cadence: react-vitest-3 → ci:merged (was ci:normal), yarn-pnp → ci:daily (was ci:normal)
  • scripts/evaluate-ci.ts — comparison dashboard with SQLite caching, using exact credits from CircleCI Insights API and NX Cloud dashboard API
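A minimal sketch of how a flag like NX_EXPERIMENT could gate jobs in a config generator. The excluded job names follow the first bullet above; the `CiJob` type, the `selectJobs` helper, the `build` job, and the `skipInExperiment` field are hypothetical and are not taken from scripts/ci/main.ts.

```typescript
// Hypothetical job shape; the real generator in scripts/ci/main.ts may differ.
type CiJob = { name: string; skipInExperiment: boolean };

const ciJobs: CiJob[] = [
  { name: "build", skipInExperiment: false }, // placeholder for the shared workload
  { name: "chromatic", skipInExperiment: true },
  { name: "benchmark", skipInExperiment: true },
  { name: "windows", skipInExperiment: true },
  { name: "init-empty", skipInExperiment: true },
];

// Drop the experiment-excluded jobs only when the flag is on.
function selectJobs(all: CiJob[], experimentEnabled: boolean): string[] {
  return all
    .filter((job) => !(experimentEnabled && job.skipInExperiment))
    .map((job) => job.name);
}

// Stand-in for reading NX_EXPERIMENT from the environment.
const experimentEnabled = true;
console.log(selectJobs(ciJobs, experimentEnabled));
```

With the flag off, all five jobs are kept, so the same generator can serve both the experiment and the regular pipeline.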

Depends on:

Evaluation PRs (target this branch):

Early results from next branch (100 merged runs):

| Metric | CircleCI | NX Cloud | NX if medium+ |
| --- | --- | --- | --- |
| Flake rate | 22.0% | 2.2% | |
| Avg duration | 19m | 4m18s | |
| P50 duration | 19m | 1m19s | |
| Avg cost/run | $2.99 | $6.05 | $1.92 |

NX Cloud with medium+ agents would be 36% cheaper, 4x faster, and 10x less flaky.
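The headline numbers can be re-derived from the table. The sketch below uses only the figures reported above; the variable names are illustrative. It assumes the measured NX Cloud duration and flake rate would carry over to medium+ agents, which the table does not show.

```typescript
// Figures copied from the results table above.
const circleCiStats = { flakeRatePct: 22.0, avgSeconds: 19 * 60, costPerRun: 2.99 };
const nxCloudStats = { flakeRatePct: 2.2, avgSeconds: 4 * 60 + 18 };
const nxMediumPlusCostPerRun = 1.92; // projected "NX if medium+" column

// Relative saving of the projected NX cost against CircleCI.
const costSavingPct = (1 - nxMediumPlusCostPerRun / circleCiStats.costPerRun) * 100;
// Ratio of average durations.
const speedupFactor = circleCiStats.avgSeconds / nxCloudStats.avgSeconds;
// Ratio of flake rates.
const flakeReductionFactor = circleCiStats.flakeRatePct / nxCloudStats.flakeRatePct;

console.log(`${costSavingPct.toFixed(0)}% cheaper`);
console.log(`${speedupFactor.toFixed(1)}x faster`);
console.log(`${flakeReductionFactor.toFixed(0)}x less flaky`);
```

The average-duration ratio comes out slightly above 4, so "4x faster" is a rounded-down figure.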

Checklist for Contributors

Testing

The changes in this PR are covered in the following automated tests:

  • stories
  • unit tests
  • integration tests
  • end-to-end tests

Manual testing

Run the evaluation script to verify data collection:

  1. source ~/.config/secrets.sh
  2. yarn jiti scripts/evaluate-ci.ts --workflow "next:merged" --limit 10 --show-runs
  3. Second run should show "X cached, 0 new" and complete in <3s
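The "X cached, 0 new" behaviour in step 3 is what an incremental sync would produce. The sketch below shows the idea with an in-memory Map standing in for the SQLite cache; `RunCache`, `RunRecord`, and `fetchRun` are hypothetical names, not the script's actual API.

```typescript
// In-memory stand-in for the SQLite cache; all names are hypothetical.
type RunRecord = { id: string; durationSeconds: number };

class RunCache {
  private runs = new Map<string, RunRecord>();

  // Fetch details only for run ids that are not cached yet and report the counts.
  sync(ids: string[], fetchRun: (id: string) => RunRecord): { cached: number; fresh: number } {
    let cached = 0;
    let fresh = 0;
    for (const id of ids) {
      if (this.runs.has(id)) {
        cached++;
        continue;
      }
      this.runs.set(id, fetchRun(id));
      fresh++;
    }
    return { cached, fresh };
  }
}

const runCache = new RunCache();
// Fake API call; a real sync would hit the CircleCI or NX Cloud API here.
const fakeFetchRun = (id: string): RunRecord => ({ id, durationSeconds: 60 });

const firstSync = runCache.sync(["run-1", "run-2"], fakeFetchRun);
const secondSync = runCache.sync(["run-1", "run-2"], fakeFetchRun);
console.log(`${firstSync.cached} cached, ${firstSync.fresh} new`);
console.log(`${secondSync.cached} cached, ${secondSync.fresh} new`);
```

Because the second sync makes no API calls, it is bounded by local reads, which is consistent with the sub-3-second expectation.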

Documentation

  • Add or update documentation reflecting your changes
  • If you are deprecating/removing a feature, make sure to update
    MIGRATION.MD

Checklist for Maintainers

  • When this PR is ready for testing, make sure to add ci:normal, ci:merged or ci:daily GH label to it to run a specific set of sandboxes. The particular set of sandboxes can be found in code/lib/cli-storybook/src/sandbox-templates.ts

  • Make sure this PR contains one of the labels below:

    Available labels
    • build: Internal-facing build tooling & test updates. Will not show up in release changelog.

🦋 Canary release

This PR does not have a canary release associated. You can request a canary release of this pull request by mentioning the @storybookjs/core team here.

core team members can create a canary release here or locally with gh workflow run --repo storybookjs/storybook publish.yml --field pr=<PR_NUMBER>

@kasperpeulen added the documentation, ci:docs (Run the CI jobs for documentation checks only.), and ci:normal labels and removed the ci:docs label Mar 23, 2026
@nx-cloud

nx-cloud Bot commented Mar 23, 2026

View your CI Pipeline Execution ↗ for commit 245a5b4

| Command | Status | Duration | Result |
| --- | --- | --- | --- |
| nx run-many -t compile,check,knip,test,lint,fmt... | ✅ Succeeded | 9m 39s | View ↗ |

☁️ Nx Cloud last updated this comment at 2026-04-17 10:00:09 UTC

The test target on the code project was using "default" inputs which
only includes files owned by the code project itself, not child
projects. This caused stale cache hits when test files in child
projects (e.g. addon-a11y) were modified.
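The stale cache hit described above can be illustrated with a simplified cache key. This is not Nx's actual hashing algorithm, only a sketch of the principle that files outside a target's inputs cannot change its key. The file paths, `cacheKey`, and the two input filters are hypothetical.

```typescript
import { createHash } from "node:crypto";

// Hypothetical file contents keyed by path; not the real repository layout.
type FileSet = Record<string, string>;

// Hash only the files whose paths pass the `inputs` filter, in sorted order.
function cacheKey(files: FileSet, inputs: (path: string) => boolean): string {
  const hash = createHash("sha256");
  const entries = Object.entries(files).sort(([a], [b]) => a.localeCompare(b));
  for (const [path, content] of entries) {
    if (inputs(path)) hash.update(path).update("\0").update(content);
  }
  return hash.digest("hex");
}

// "default"-style inputs: only files owned by the project itself.
const ownFilesOnly = (path: string) => path.startsWith("code/core/");
// Wider inputs that also cover child projects.
const withChildProjects = (path: string) => path.startsWith("code/");

const filesBefore: FileSet = {
  "code/core/index.ts": "export {}",
  "code/addons/a11y/a11y.test.ts": "test v1",
};
const filesAfter: FileSet = { ...filesBefore, "code/addons/a11y/a11y.test.ts": "test v2" };

// Same key despite the changed child test file: a stale cache hit.
console.log(cacheKey(filesBefore, ownFilesOnly) === cacheKey(filesAfter, ownFilesOnly)); // true
// Different key once child files are part of the inputs: a cache miss.
console.log(cacheKey(filesBefore, withChildProjects) === cacheKey(filesAfter, withChildProjects)); // false
```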
@kasperpeulen added the ci:daily label (Run the CI jobs that normally run in the daily job.) and removed the ci:normal label Mar 23, 2026
@storybook-app-bot

storybook-app-bot Bot commented Mar 23, 2026

Package Benchmarks

Commit: ec04fcf, ran on 26 March 2026 at 12:48:49 UTC

The following packages have significant changes to their size or dependencies:

storybook

| | Before | After | Difference |
| --- | --- | --- | --- |
| Dependency count | 50 | 50 | 0 |
| Self size | 20.47 MB | 20.46 MB | 🎉 -12 KB 🎉 |
| Dependency size | 16.55 MB | 16.55 MB | 0 B |
| Bundle Size Analyzer | Link | Link | |

@storybook/nextjs-vite

| | Before | After | Difference |
| --- | --- | --- | --- |
| Dependency count | 92 | 92 | 0 |
| Self size | 1.12 MB | 1.12 MB | 0 B |
| Dependency size | 22.76 MB | 22.73 MB | 🎉 -33 KB 🎉 |
| Bundle Size Analyzer | Link | Link | |

@storybook/react-native-web-vite

| | Before | After | Difference |
| --- | --- | --- | --- |
| Dependency count | 121 | 121 | 0 |
| Self size | 30 KB | 30 KB | 🚨 +18 B 🚨 |
| Dependency size | 23.83 MB | 23.80 MB | 🎉 -32 KB 🎉 |
| Bundle Size Analyzer | Link | Link | |

@storybook/react-vite

| | Before | After | Difference |
| --- | --- | --- | --- |
| Dependency count | 82 | 82 | 0 |
| Self size | 35 KB | 35 KB | 0 B |
| Dependency size | 20.54 MB | 20.51 MB | 🎉 -33 KB 🎉 |
| Bundle Size Analyzer | Link | Link | |

@storybook/cli

| | Before | After | Difference |
| --- | --- | --- | --- |
| Dependency count | 184 | 184 | 0 |
| Self size | 780 KB | 780 KB | 🎉 -27 B 🎉 |
| Dependency size | 67.69 MB | 67.68 MB | 🎉 -11 KB 🎉 |
| Bundle Size Analyzer | Link | Link | |

@storybook/codemod

| | Before | After | Difference |
| --- | --- | --- | --- |
| Dependency count | 177 | 177 | 0 |
| Self size | 32 KB | 32 KB | 🚨 +24 B 🚨 |
| Dependency size | 66.22 MB | 66.20 MB | 🎉 -12 KB 🎉 |
| Bundle Size Analyzer | Link | Link | |

create-storybook

| | Before | After | Difference |
| --- | --- | --- | --- |
| Dependency count | 51 | 51 | 0 |
| Self size | 1.04 MB | 1.04 MB | 0 B |
| Dependency size | 37.03 MB | 37.01 MB | 🎉 -12 KB 🎉 |
| Bundle Size Analyzer | node | node | |

- Uncomment NX Linux job in nx.yml, comment out NX Windows
- Add NX_EXPERIMENT flag to CircleCI config to disable chromatic,
  benchmark, and Windows jobs for fair comparison
- Add evaluate-ci.ts script with exact credits from both APIs
- Merge next branch
@kasperpeulen added the ci:normal label and removed the ci:daily label (Run the CI jobs that normally run in the daily job.) Apr 16, 2026
- Fix react-vitest-3 tag: ci:normal → ci:merged (match CircleCI)
- Fix yarn-pnp tag: ci:normal → ci:daily (match CircleCI)
- Disable init-empty/init-features in both systems via NX_EXPERIMENT
- Remove init-empty/init-features from NX ALL_TASKS
- Add concurrency group with cancel-in-progress: false to nx.yml
- Fix ts-expect-error for ini module in evaluate-ci.ts
Keep only experiment-related changes on this branch:
- NX_EXPERIMENT flag in CircleCI config
- medium+ agent downgrade
- concurrency: cancel-in-progress: false
- evaluate-ci.ts dashboard script
- tag fixes for react-vitest-3 and yarn-pnp

The NX porting work (sandbox project.json files, nx.json target
defaults, agents.yaml, Windows CI, codemod fix, etc) will be moved
to a separate child PR.
@kasperpeulen kasperpeulen changed the title Build: Port remaining CircleCI jobs to NX Cloud Build: NX Cloud vs CircleCI evaluation experiment Apr 16, 2026
Prepares the evaluation tooling to compare measured NX runs across many
workflow variants (next:*, *:prs wild-branches, base nx-ai) without
re-fetching the CircleCI/NX APIs on every report:

- scripts/evaluate-ci.ts: SQLite cache (ci-eval.db), incremental run sync,
  --since / --days / --report-only / --flaky-range flags, new EVAL_BRANCHES
  keys (base (nx-ai), next:merged, next:daily, normal:prs, merged:prs,
  daily:prs) with MEDIUM_PLUS_BRANCHES filter for wild-branch workflows.
- scripts/ci-eval.db.README.md: schema + example queries.
- scripts/generate-canvas-data.ts: regenerates the inline data constants
  in nx-vs-circleci-findings.canvas.tsx from the DB.
- scripts/investigate-nx-cache.ts, scripts/run-backfill-only.ts: one-off
  helpers for cache field discovery and idempotent backfills.
Removes CircleCI entirely from kasper/nx-ai so every child eval branch
inherits a clean NX-only setup:

- Delete .circleci/config.yml. With no config file present, CircleCI
  webhooks arrive at an empty pipeline and do nothing (no billable
  compute, no confusing "canceled" entries in insights).
- Drop NX_EXPERIMENT flag from scripts/ci/main.ts and inline the
  !NX_EXPERIMENT branches. Windows, chromatic, benchmark and init-empty
  jobs are now unconditional in the generated config. This file becomes
  dead code (no config.yml to trigger it) but stays self-consistent for
  readability.

The scripts/ci/* generator code is intentionally left in place — widening
the scope to delete it belongs in a follow-up.
Verifies that after deleting .circleci/config.yml CircleCI no longer runs
on this branch — only NX Cloud should react to this push.
Deletes .github/workflows/trigger-circle-ci-workflow.yml so pushes to
kasper/nx-ai (and the rc-sweep child branches) no longer POST to
CircleCI's /pipeline endpoint. Previously that call timed out at 5s
and left a "failure" check on every commit, plus a red CircleCI
Pipeline commit status — all noise, since .circleci/config.yml was
already removed in the prior commit.

Only NX Cloud runs on these branches from now on.
The --with-deps flag forces apt-get update which frequently races against
Ubuntu mirror sync (1622846 vs 1622826 byte mismatch on Packages.gz) and
fails the CIPE. The NX base image ubuntu22.04-node20.19-v2 already has
enough X/audio/font libraries for Chromium to launch, so --with-deps
adds mostly redundant work.

If Chromium starts failing with missing .so at runtime, revert or switch
to a pre-baked image (mcr.microsoft.com/playwright:*-jammy).
Adds 12 workflow keys to EVAL_BRANCHES / WORKFLOW_NAMES / NX_TAG_MAP:
  rc:{xlarge-plus,xlarge,large-plus,large,medium-plus,medium}
  rc2:{xlarge-plus,xlarge,large-plus,large,medium-plus,medium}

All rc:* PRs carry the ci:normal label, so they map to ci:normal in
NX_TAG_MAP. Each workflow filters to its single dedicated branch.
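As a rough illustration, the 12 keys are the cross product of two prefixes and six agent classes. The sketch below builds them and maps each to ci:normal. The shape of the real NX_TAG_MAP is assumed, and mapping the rc2 keys to ci:normal as well is an assumption, since the message above only states it for rc:*.

```typescript
const rcPrefixes = ["rc", "rc2"];
const agentClasses = ["xlarge-plus", "xlarge", "large-plus", "large", "medium-plus", "medium"];

// Build "rc:xlarge-plus", ..., "rc2:medium".
const rcWorkflowKeys: string[] = [];
for (const prefix of rcPrefixes) {
  for (const agentClass of agentClasses) {
    rcWorkflowKeys.push(`${prefix}:${agentClass}`);
  }
}

// Hypothetical stand-in for the rc entries of NX_TAG_MAP.
const rcTagMap: Record<string, string> = {};
for (const key of rcWorkflowKeys) {
  rcTagMap[key] = "ci:normal"; // assumed for rc2:* as well
}

console.log(rcWorkflowKeys.length, rcTagMap["rc:medium-plus"]);
```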

Also adds RC_BRANCH_LINUX_BROWSERS_CPM: a hard-coded map from rc branch
name to the linux-browsers-js credit multiplier that branch's agents.yaml
sets. When the dashboard /cipes/{id}/analysis endpoint returns an empty
computeCreditUsages (observed ~50% of the time for the cheaper classes),
the fallback credit computation uses this per-branch rate instead of the
local worktree's agents.yaml — otherwise cross-branch syncs would always
bill linux-browsers-js at whatever class kasper/nx-ai is on.
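A minimal sketch of the fallback described above: use the reported credits when the analysis endpoint returns any, otherwise bill agent-minutes at the branch's own rate. The two rates (60 and 15 credits/min) come from the PR description. The branch names, the `CreditUsage` shape, and the function name are hypothetical, and treating xlarge-plus as the extra_large+ class is an assumption.

```typescript
// Credits per agent-minute for linux-browsers-js, keyed by branch.
// Branch names are hypothetical; only the two rates come from the PR description.
const RATE_BY_BRANCH: Record<string, number> = {
  "rc/xlarge-plus": 60, // assumed to correspond to extra_large+
  "rc/medium-plus": 15, // medium+
};

// Assumed shape of one entry of computeCreditUsages.
type CreditUsage = { credits: number };

function linuxBrowsersCredits(
  branch: string,
  agentMinutes: number,
  reportedUsages: CreditUsage[],
): number | undefined {
  // Prefer the exact figures when the endpoint returned any.
  if (reportedUsages.length > 0) {
    return reportedUsages.reduce((sum, usage) => sum + usage.credits, 0);
  }
  // Fallback: the per-branch rate, never the local worktree's agents.yaml.
  const rate = RATE_BY_BRANCH[branch];
  return rate === undefined ? undefined : rate * agentMinutes;
}

console.log(linuxBrowsersCredits("rc/medium-plus", 10, []));
console.log(linuxBrowsersCredits("rc/medium-plus", 10, [{ credits: 42 }]));
```

Returning undefined for an unknown branch keeps a missing rate visible instead of silently billing at the wrong class.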