ci: scope turbo cache by dependency graph to fix stale snapshot replay #2726

Closed
upupming wants to merge 1 commit into main from ci/fix-turbo-cache-staleness

@upupming
Collaborator

@upupming upupming commented May 27, 2026

Symptom

The Vitest jobs fail with 227 BackgroundSnapshot not found: __snapshot_* errors across 51 unrelated test files (51 failed | 262 passed), e.g. on #2712 — a PR that only adds the @lynx-js/genui packages and touches no test logic.

That error is thrown from packages/react/runtime/src/snapshot/snapshot/backgroundSnapshot.ts:158 only when a compiled __snapshot_* id is never registered into snapshotManager — i.e. the snapshot registration and reference were generated with different ids. It is a build-artifact consistency problem, not a logic regression.

Root cause: snapshot ids are derived from the file path, but the turbo cache is only invalidated by *.rs

The compiled snapshot id is generated in packages/react/transform/crates/swc_plugin_snapshot/lib.rs:

// lib.rs
filename_hash: calc_hash(&cfg.filename),           // sha1(filename)[0..5]
content_hash:  "test",                             // constant in test mode
let snapshot_uid = format!("__snapshot_{}_{}_{}", filename_hash, content_hash, snapshot_counter);
// swc_plugins_shared/utils.rs
pub fn calc_hash(s: &str) -> String {
  let mut hasher = Sha1::new();
  hasher.update(s.as_bytes());
  hex::encode(hasher.finalize())[0..5].to_string()
}

So an id like __snapshot_ed6f7_test_1 is sha1(<module filename>)[0..5] + test + per-file counter. The id is a function of the module's file path, not of the transform (*.rs) logic.
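As a rough illustration of why the id tracks the file path, the sketch below rebuilds the id format in plain Rust. It is not the plugin's code: `stand_in_hash` is a hypothetical helper that uses the standard library's `DefaultHasher` in place of SHA-1 so the example needs no external crates, and `snapshot_uid` is a name introduced here for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in for `calc_hash`: NOT SHA-1, just a deterministic std hasher,
/// truncated to 5 hex characters like the original.
fn stand_in_hash(s: &str) -> String {
    let mut hasher = DefaultHasher::new();
    s.hash(&mut hasher);
    let hex = format!("{:016x}", hasher.finish());
    hex[0..5].to_string()
}

/// Builds an id in the same shape as the plugin: path hash, the constant
/// "test" content hash, and a per-file counter.
fn snapshot_uid(filename: &str, snapshot_counter: u32) -> String {
    format!("__snapshot_{}_{}_{}", stand_in_hash(filename), "test", snapshot_counter)
}
```

Calling `snapshot_uid` twice with the same path and counter gives the same id, while two different paths for the same module will almost certainly give different ids, because nothing but the path string feeds the hash.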

pnpm stores every dependency (and workspace package) under node_modules/.pnpm/<name>@<version>_<peer-hash>/.... The failing stacks all run through such a path:

node_modules/.pnpm/@lynx-js+react@0.121.0_@lynx-js+types@3.7.0_@types+react@19.2.14/.../testing-library/dist/pure.js
                                  └────────────── pnpm peer-deps hash ──────────────┘

The _<peer-hash> segment changes whenever the dependency graph changes. #2712 adds 9 packages and changes 394 lines of pnpm-lock.yaml, which shifts those .pnpm/... paths → the same module now hashes to a different filename_hash → a different snapshot id — even though no *.rs changed.

How the stale cache turns this into a failure

.turbo is produced by the build-all job and handed off (via fail-on-cache-miss: true) to every downstream test job at the same github.sha. The key was:

turbo-v4-${{ runner.os }}-${{ hashFiles('packages/**/src/**/*.rs') }}-${{ github.sha }}

with restore-keys falling back to the merge-base, the PR base, any cache with the same *.rs hash, and finally any cache for the same OS.

On the failing run, build-all (attempt #4) missed the exact github.sha key and restored the base/main cache via a restore-key (...-62c1af39..., the base commit without genui). Turbo replayed compiled modules whose ids were computed from the old .pnpm paths, while other modules were recompiled under the new paths → registration/reference ids diverge → BackgroundSnapshot not found. Re-runs kept failing because the poisoned cache was re-saved under this commit's github.sha and restored exactly on every retry.

The *.rs hash was never the right invalidation signal: the snapshot id depends on the resolved file path, which is determined by the dependency graph (pnpm-lock.yaml), not by the transform sources.

Fix

  • Add hashFiles('pnpm-lock.yaml') to the cache key. This is the input that actually determines the .pnpm/... paths the ids are hashed from, so a changed dependency graph no longer inherits a cache built against different paths.
  • Drop the over-broad restore-keys fallbacks (the *.rs-only line and the OS-only line) that allowed inheriting a cache from an unrelated graph. The most permissive remaining fallback requires the same *.rs and lockfile hash.
  • Align the glob in the consumer workflows (workflow-test/website/bench/bundle-analysis) from **/packages/**/src/**/*.rs to packages/**/src/**/*.rs so each key is byte-for-byte identical to workflow-build.yml — required by the fail-on-cache-miss handoff.
  • Bump turbo-v4 -> turbo-v5 to discard the currently-poisoned caches.
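The restore-keys fallback described above can be modelled in a few lines. This is a simplified sketch, not the cache action's implementation: it assumes an exact match on the primary key is tried first, then each restore-key is tried in order as a prefix, with the most recently saved match winning, and it ignores branch scoping. The key strings used in the usage note are shortened, hypothetical placeholders.

```rust
/// Models the lookup order: exact primary key first, then each
/// restore-key as a prefix, preferring the most recently saved match.
/// `saved` is assumed to be ordered oldest to newest.
fn restore<'a>(saved: &[&'a str], primary: &str, restore_keys: &[&str]) -> Option<&'a str> {
    if let Some(hit) = saved.iter().copied().find(|k| *k == primary) {
        return Some(hit);
    }
    for prefix in restore_keys {
        // Scan newest-first so the latest matching cache is returned.
        if let Some(hit) = saved.iter().rev().copied().find(|k| k.starts_with(*prefix)) {
            return Some(hit);
        }
    }
    None
}
```

With only `turbo-v4-Linux-rs1-base` saved, a request for `turbo-v4-Linux-rs1-pr` with an OS-only restore-key `turbo-v4-Linux-` falls back to the base cache. Once the lockfile hash is part of every remaining prefix (for example `turbo-v5-Linux-rs1-lockB-`), a cache saved under a different lockfile hash no longer matches and the lookup returns `None`, which is the cold build described in the trade-off.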

Trade-off

PRs that change pnpm-lock.yaml will cold-build the monorepo once (no warm turbo cache inherited across a dependency-graph change). This is intentional — correctness over reuse for dep-changing PRs. PRs that don't touch deps keep warm-cache reuse exactly as before.

Deeper follow-up (not in this PR)

The cleaner fix is to make the snapshot id stable across dependency-graph changes (e.g. derive filename_hash from a workspace-relative path that excludes the pnpm peer-hash segment) and/or to track the transform's effective inputs in turbo's hashing. This PR hardens the CI cache layer so the handoff stops producing inconsistent artifacts.
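One possible reading of "a path that excludes the pnpm peer-hash segment" is sketched below. Both function names are hypothetical, and the sketch assumes that the store directory has the shape `<name>@<version>_<peer-hash>` and that the version itself never contains `_`; it is not taken from the transform's source.

```rust
/// Drops the `_<peer-hash>` suffix from a `.pnpm` store directory name,
/// e.g. `name@1.0.0_peer@2.0.0` -> `name@1.0.0`.
/// Assumes the version never contains `_`.
fn strip_peer_suffix(store_dir: &str) -> &str {
    // A leading '@' belongs to a scoped package name, not to the version.
    let search_from = if store_dir.starts_with('@') { 1 } else { 0 };
    let Some(at) = store_dir[search_from..].find('@').map(|i| i + search_from) else {
        return store_dir;
    };
    match store_dir[at..].find('_') {
        Some(underscore) => &store_dir[..at + underscore],
        None => store_dir,
    }
}

/// Rewrites only the segment that directly follows `.pnpm`, leaving
/// every other path segment untouched.
fn normalize_pnpm_path(path: &str) -> String {
    let mut previous_was_store = false;
    let segments: Vec<&str> = path
        .split('/')
        .map(|segment| {
            let cleaned = if previous_was_store { strip_peer_suffix(segment) } else { segment };
            previous_was_store = segment == ".pnpm";
            cleaned
        })
        .collect();
    segments.join("/")
}
```

Two paths that differ only in the peer-hash suffix normalize to the same string, so a hash over the normalized path would stay stable across that kind of dependency-graph change. Paths outside `.pnpm` pass through unchanged.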

Evidence

  • Failing run 26498982771, build / Build (Ubuntu) attempt #4: Unable to find cache with primary key: ...-9f1158750... → Cache restored from key: ...-62c1af3921... (base/main, pre-genui) → Cache saved with key: ...-9f1158750.... Attempt #5 then restored ...-9f1158750... exactly (the already-poisoned cache).
  • 9f1158750... is the PR merge commit (Merge ba716689 into 62c1af39); 62c1af39... is its base parent on main.
  • Build step was 58 cached / 61 total; the react runtime and all failing fixtures were cache hit, replaying logs.
  • #2712 (refactor(genui): add @lynx-js/genui) changes pnpm-lock.yaml (394 +/-) but no *.rs, so the old key did not invalidate.

The turbo cache key was only namespaced by the Rust transform sources
(packages/**/src/**/*.rs) and github.sha, with restore-keys that fell
back to any cache for the same OS. Turbo's per-task hash does not fully
capture the compiled ReactLynx snapshot output, so a PR that only changes
the dependency graph (e.g. adding a package) without touching *.rs would
inherit a base-branch .turbo and replay stale compiled snapshots. The
main-thread snapshot registrations and background-thread references then
came from different transform states, producing widespread
'BackgroundSnapshot not found: __snapshot_*' failures in the Vitest jobs.

- Add hashFiles('pnpm-lock.yaml') to the cache key so caches are scoped
  by dependency graph as well.
- Drop the over-broad restore-keys fallbacks (the *.rs-only and OS-only
  lines) that allowed inheriting a cache from an unrelated graph.
- Align the glob in consumer workflows to packages/**/src/**/*.rs so the
  key stays byte-for-byte identical to workflow-build.yml (required by
  fail-on-cache-miss).
- Bump turbo-v4 -> turbo-v5 to discard currently-poisoned caches.

Change-Id: I5fe87bae2346ff379340410aad4cc25be8001d85
@changeset-bot

changeset-bot Bot commented May 27, 2026

⚠️ No Changeset found

Latest commit: 1b83b5e

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

@coderabbitai
Contributor

coderabbitai Bot commented May 27, 2026

Walkthrough

GitHub Actions workflows across the repository were updated from TurboCache v4 to v5 cache keys. The cache key structure now standardizes on runner OS, Rust source file hashes (packages/**/src/**/*.rs), the pnpm lockfile hash, and the commit SHA. Restore-keys logic was aligned across workflows to ensure consistent cache behavior and prevent stale cache hits when Rust sources or dependencies change.

Changes

TurboCache v5 standardization

| Layer / File(s) | Summary |
| --- | --- |
| TurboCache cache key contract and cross-workflow alignment: .github/workflows/workflow-build.yml, .github/workflows/workflow-bench.yml, .github/workflows/workflow-bundle-analysis.yml, .github/workflows/workflow-test.yml, .github/workflows/workflow-website.yml | Build workflow establishes the turbo-v5 cache key contract with runner OS, Rust source hashes, pnpm lockfile, and commit SHA; bench, bundle-analysis, test, and website workflows updated to match the same key structure and hash inputs for consistency. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 Cache keys align, from v4 to v5,
Turbo builds dance, workflows alive!
Rust sources hashed, lockfiles in sync,
No stale caches, workflows blink—

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly summarizes the main change: updating the Turbo cache configuration to include dependency graph hashing (pnpm-lock.yaml) to resolve stale snapshot replay issues. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |


@codecov

codecov Bot commented May 27, 2026

❌ 1 Tests Failed:

| Tests completed | Failed | Passed | Skipped |
| --- | --- | --- | --- |
| 5283 | 1 | 5282 | 123 |
View the top 1 failed test(s) by shortest run time
packages/rspeedy/plugin-react/test/lazy.test.ts > Lazy > inlines lazy bundle background when inlineScripts is disabled
Stack Traces | 1.56s run time
Error: Rspack build failed.
 ❯ ../../../node_modules/.pnpm/@rsbuild+core@2.0.6_core-js@3.48.0/node_modules/@.../core/dist/753.js:4298:141
 ❯ ../../../node_modules/.pnpm/@rspack+core@2.0.3_@swc+helpers@0.5.21/node_modules/@.../core/dist/index.js:11563:54


@codspeed-hq

codspeed-hq Bot commented May 27, 2026

Merging this PR will improve performance by 7.5%

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

⚡ 1 improved benchmark
✅ 80 untouched benchmarks
⏩ 26 skipped benchmarks [1]

Performance Changes

| Benchmark | BASE | HEAD | Efficiency |
| --- | --- | --- | --- |
| transform 1000 view elements | 43.3 ms | 40.2 ms | +7.5% |


Comparing ci/fix-turbo-cache-staleness (1b83b5e) with main (e9c8fb4)

Footnotes

  1. 26 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@github-actions
Contributor

UI Judge

GEQI weighted score: 61.8 / 100 across 8 examples.
Average visual-correctness score: 3.4 / 5.

| Dimension | Weight | Average | Results | Status |
| --- | --- | --- | --- | --- |
| Usability & Interaction | 30% | 3 / 5 | 8 | OK |
| Visual & Aesthetics | 25% | 3.1 / 5 | 8 | OK |
| Consistency & Standards | 15% | 3.4 / 5 | 8 | OK |
| Architecture & UX Writing | 15% | 3.1 / 5 | 8 | OK |
| Accessibility & Performance | 15% | 2.9 / 5 | 8 | OK |

| # | Example | Visual Correctness | Usability & Interaction (30%) | Visual & Aesthetics (25%) | Consistency & Standards (15%) | Architecture & UX Writing (15%) | Accessibility & Performance (15%) | GEQI | Page | Status |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | recs | 2 / 5 | 2 / 5 | 3 / 5 | 3 / 5 | 2 / 5 | 2 / 5 | 48 / 100 | preview | OK |
| 2 | cast-grid | 5 / 5 | 3 / 5 | 4 / 5 | 5 / 5 | 5 / 5 | 4 / 5 | 80 / 100 | preview | OK |
| 3 | citywalk-list | 2 / 5 | 2 / 5 | 3 / 5 | 3 / 5 | 2 / 5 | 2 / 5 | 48 / 100 | preview | OK |
| 4 | fridge-search | 4 / 5 | 4 / 5 | 3 / 5 | 4 / 5 | 3 / 5 | 3 / 5 | 69 / 100 | preview | OK |
| 5 | trip-planner | 2 / 5 | 2 / 5 | 2 / 5 | 2 / 5 | 2 / 5 | 2 / 5 | 40 / 100 | preview | OK |
| 6 | weather-current | 5 / 5 | 5 / 5 | 4 / 5 | 4 / 5 | 4 / 5 | 4 / 5 | 86 / 100 | preview | OK |
| 7 | product-card | 5 / 5 | 4 / 5 | 4 / 5 | 4 / 5 | 5 / 5 | 4 / 5 | 83 / 100 | preview | OK |
| 8 | workout-plan | 2 / 5 | 2 / 5 | 2 / 5 | 2 / 5 | 2 / 5 | 2 / 5 | 40 / 100 | preview | OK |
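The per-example GEQI values appear consistent with a plain weighted sum of the five dimension scores, rescaled from a 5-point scale to 100. The sketch below is inferred from the reported numbers only, not from the UI Judge tool's source; `WEIGHTS_PERCENT` and `geqi` are names introduced here for illustration.

```rust
/// Weights in percent, in the order the table lists the dimensions:
/// Usability, Visual, Consistency, Architecture, Accessibility.
const WEIGHTS_PERCENT: [u32; 5] = [30, 25, 15, 15, 15];

/// Weighted GEQI on a 0-100 scale from five 1-5 dimension scores.
/// The formula is an assumption inferred from the reported values.
fn geqi(scores: [u32; 5]) -> u32 {
    let weighted: u32 = scores
        .iter()
        .zip(WEIGHTS_PERCENT.iter())
        .map(|(score, weight)| score * weight)
        .sum();
    // `weighted` is out of 500 (maximum score 5 times 100%), so divide by 5.
    weighted / 5
}
```

For cast-grid, the scores 3, 4, 5, 5, 4 give 90 + 100 + 75 + 75 + 60 = 400, and 400 / 5 = 80, matching the table. The headline 61.8 likewise matches the mean of the eight per-example values (494 / 8 = 61.75).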
Details

Result 1

  • Example: recs
  • Dimension: visual-correctness
  • Visual correctness: 2 / 5
  • GEQI dimensions:
    • Usability & Interaction: 2 / 5 (30%)
    • Visual & Aesthetics: 3 / 5 (25%)
    • Consistency & Standards: 3 / 5 (15%)
    • Architecture & UX Writing: 2 / 5 (15%)
    • Accessibility & Performance: 2 / 5 (15%)
  • Task: The A2UI playground preview should show date-night dining recommendations for Moonlight Terrace, Pinewood Bistro, and Sea Breeze Kitchen.

Result 2

  • Example: cast-grid
  • Dimension: visual-correctness
  • Visual correctness: 5 / 5
  • GEQI dimensions:
    • Usability & Interaction: 3 / 5 (30%)
    • Visual & Aesthetics: 4 / 5 (25%)
    • Consistency & Standards: 5 / 5 (15%)
    • Architecture & UX Writing: 5 / 5 (15%)
    • Accessibility & Performance: 4 / 5 (15%)
  • Task: The A2UI playground preview should show a cast grid for the short film Night Notes, including Lin Xia and Zhou Ning cast cards.

Result 3

  • Example: citywalk-list
  • Dimension: visual-correctness
  • Visual correctness: 2 / 5
  • GEQI dimensions:
    • Usability & Interaction: 2 / 5 (30%)
    • Visual & Aesthetics: 3 / 5 (25%)
    • Consistency & Standards: 3 / 5 (15%)
    • Architecture & UX Writing: 2 / 5 (15%)
    • Accessibility & Performance: 2 / 5 (15%)
  • Task: The A2UI playground preview should show weekend citywalk coffee picks with Rooftop Brew Room, Corner Canvas Lab, and Late Sun Roastery.

Result 4

  • Example: fridge-search
  • Dimension: visual-correctness
  • Visual correctness: 4 / 5
  • GEQI dimensions:
    • Usability & Interaction: 4 / 5 (30%)
    • Visual & Aesthetics: 3 / 5 (25%)
    • Consistency & Standards: 4 / 5 (15%)
    • Architecture & UX Writing: 3 / 5 (15%)
    • Accessibility & Performance: 3 / 5 (15%)
  • Task: The A2UI playground preview should show refrigerator search results with Siemens, Hualing, Haier, and Midea product cards.

Result 5

  • Example: trip-planner
  • Dimension: visual-correctness
  • Visual correctness: 2 / 5
  • GEQI dimensions:
    • Usability & Interaction: 2 / 5 (30%)
    • Visual & Aesthetics: 2 / 5 (25%)
    • Consistency & Standards: 2 / 5 (15%)
    • Architecture & UX Writing: 2 / 5 (15%)
    • Accessibility & Performance: 2 / 5 (15%)
  • Task: The A2UI playground preview should show a Kyoto 48-hour trip planner with Day 1 and Day 2 itinerary sections, including Monkey Park Viewpoint.

Result 6

  • Example: weather-current
  • Dimension: visual-correctness
  • Visual correctness: 5 / 5
  • GEQI dimensions:
    • Usability & Interaction: 5 / 5 (30%)
    • Visual & Aesthetics: 4 / 5 (25%)
    • Consistency & Standards: 4 / 5 (15%)
    • Architecture & UX Writing: 4 / 5 (15%)
    • Accessibility & Performance: 4 / 5 (15%)
  • Task: The A2UI playground preview should show the current weather for Austin, TX, including clear skies with light breeze.

Result 7

  • Example: product-card
  • Dimension: visual-correctness
  • Visual correctness: 5 / 5
  • GEQI dimensions:
    • Usability & Interaction: 4 / 5 (30%)
    • Visual & Aesthetics: 4 / 5 (25%)
    • Consistency & Standards: 4 / 5 (15%)
    • Architecture & UX Writing: 5 / 5 (15%)
    • Accessibility & Performance: 4 / 5 (15%)
  • Task: The A2UI playground preview should show a Wireless Headphones Pro product card with a visible Add to Cart action.

Result 8

  • Example: workout-plan
  • Dimension: visual-correctness
  • Visual correctness: 2 / 5
  • GEQI dimensions:
    • Usability & Interaction: 2 / 5 (30%)
    • Visual & Aesthetics: 2 / 5 (25%)
    • Consistency & Standards: 2 / 5 (15%)
    • Architecture & UX Writing: 2 / 5 (15%)
    • Accessibility & Performance: 2 / 5 (15%)
  • Task: The A2UI playground preview should show a weekly workout plan with five days from Monday Ramp-Up through Friday Conditioning.

@upupming
Collaborator Author

Closing: the root cause is NOT turbo cache staleness. A clean, cache-free local build of #2712 reproduces the exact failure (51 failed / 262 passed), while main passes in the same environment. The real cause is a duplicate @lynx-js/react copy: genui's published @lynx-js/lynx-ui-*@3.133.0 deps pull a published @lynx-js/react@0.121.0 alongside the workspace copy, producing two snapshotManager singletons -> 'BackgroundSnapshot not found'. Fix belongs in dependency resolution (dedupe @lynx-js/react), not the CI cache key.
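The duplicate-copy failure mode can be sketched independently of the real runtime. In the hypothetical model below, each loaded copy of the package owns its own registry (`SnapshotRegistry` is an illustrative stand-in, not the actual `snapshotManager`), so an id registered through one copy is invisible to a lookup made through the other.

```rust
use std::collections::HashMap;

/// Illustrative stand-in for a per-module-copy snapshot registry.
#[derive(Default)]
struct SnapshotRegistry {
    entries: HashMap<String, String>,
}

impl SnapshotRegistry {
    /// Records a snapshot definition under its compiled id.
    fn register(&mut self, id: &str, definition: &str) {
        self.entries.insert(id.to_string(), definition.to_string());
    }

    /// Looks up an id, failing in the same style as the reported error.
    fn lookup(&self, id: &str) -> Result<&str, String> {
        self.entries
            .get(id)
            .map(String::as_str)
            .ok_or_else(|| format!("BackgroundSnapshot not found: {id}"))
    }
}
```

If one registry stands for the workspace copy and a second for the published copy, registering `__snapshot_ed6f7_test_1` in the first and looking it up in the second returns the "not found" error even though the id itself is correct. That is why deduplicating the package, rather than changing the cache key, addresses the failure.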

@upupming upupming closed this May 27, 2026