ci: scope turbo cache by dependency graph to fix stale snapshot replay#2726
ci: scope turbo cache by dependency graph to fix stale snapshot replay#2726upupming wants to merge 1 commit into
Conversation
The turbo cache key was only namespaced by the Rust transform sources
(packages/**/src/**/*.rs) and github.sha, with restore-keys that fell
back to any cache for the same OS. Turbo's per-task hash does not fully
capture the compiled ReactLynx snapshot output, so a PR that only changes
the dependency graph (e.g. adding a package) without touching *.rs would
inherit a base-branch .turbo and replay stale compiled snapshots. The
main-thread snapshot registrations and background-thread references then
came from different transform states, producing widespread
'BackgroundSnapshot not found: __snapshot_*' failures in the Vitest jobs.
- Add hashFiles('pnpm-lock.yaml') to the cache key so caches are scoped
by dependency graph as well.
- Drop the over-broad restore-keys fallbacks (the *.rs-only and OS-only
lines) that allowed inheriting a cache from an unrelated graph.
- Align the glob in consumer workflows to packages/**/src/**/*.rs so the
key stays byte-for-byte identical to workflow-build.yml (required by
fail-on-cache-miss).
- Bump turbo-v4 -> turbo-v5 to discard currently-poisoned caches.
Change-Id: I5fe87bae2346ff379340410aad4cc25be8001d85
|
📝 WalkthroughWalkthroughGitHub Actions workflows across the repository were updated from TurboCache v4 to v5 cache keys. The cache key structure now standardizes on runner OS, Rust source file hashes ( ChangesTurboCache v5 standardization
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
❌ 1 Tests Failed:
View the top 1 failed test(s) by shortest run time
To view more test analytics, go to the Test Analytics Dashboard |
Merging this PR will improve performance by 7.5%
|
| Benchmark | BASE |
HEAD |
Efficiency | |
|---|---|---|---|---|
| ⚡ | transform 1000 view elements |
43.3 ms | 40.2 ms | +7.5% |
Tip
Curious why this is faster? Comment @codspeedbot explain why this is faster on this PR, or directly use the CodSpeed MCP with your agent.
Comparing ci/fix-turbo-cache-staleness (1b83b5e) with main (e9c8fb4)
Footnotes
-
26 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. ↩
UI JudgeGEQI weighted score: 61.8 / 100 across 8 examples.
DetailsResult 1
Result 2
Result 3
Result 4
Result 5
Result 6
Result 7
Result 8
|
|
Closing: the root cause is NOT turbo cache staleness. A clean, cache-free local build of #2712 reproduces the exact failure (51 failed / 262 passed), while main passes in the same environment. The real cause is a duplicate @lynx-js/react copy: genui's published @lynx-js/lynx-ui-*@3.133.0 deps pull a published @lynx-js/react@0.121.0 alongside the workspace copy, producing two snapshotManager singletons -> 'BackgroundSnapshot not found'. Fix belongs in dependency resolution (dedupe @lynx-js/react), not the CI cache key. |
Symptom
The Vitest jobs fail with 227
BackgroundSnapshot not found: __snapshot_*errors across 51 unrelated test files (51 failed | 262 passed), e.g. on #2712 — a PR that only adds the@lynx-js/genuipackages and touches no test logic.That error is thrown from
packages/react/runtime/src/snapshot/snapshot/backgroundSnapshot.ts:158only when a compiled__snapshot_*id is never registered intosnapshotManager— i.e. the snapshot registration and reference were generated with different ids. It is a build-artifact consistency problem, not a logic regression.Root cause: snapshot ids are derived from the file path, but the turbo cache is only invalidated by
*.rsThe compiled snapshot id is generated in
packages/react/transform/crates/swc_plugin_snapshot/lib.rs:So an id like
__snapshot_ed6f7_test_1issha1(<module filename>)[0..5]+test+ per-file counter. The id is a function of the module's file path, not of the transform (*.rs) logic.pnpm stores every dependency (and workspace package) under
node_modules/.pnpm/<name>@<version>_<peer-hash>/.... The failing stacks all run through such a path:The
_<peer-hash>segment changes whenever the dependency graph changes. #2712 adds 9 packages and changes 394 lines ofpnpm-lock.yaml, which shifts those.pnpm/...paths → the same module now hashes to a differentfilename_hash→ a different snapshot id — even though no*.rschanged.How the stale cache turns this into a failure
.turbois produced by thebuild-alljob and handed off (viafail-on-cache-miss: true) to every downstream test job at the samegithub.sha. The key was:with
restore-keysfalling back to the merge-base, the PR base, any cache with the same*.rshash, and finally any cache for the same OS.On the failing run,
build-all(attempt #4) missed the exactgithub.shakey and restored the base/main cache via a restore-key (...-62c1af39..., the base commit without genui). Turbo replayed compiled modules whose ids were computed from the old.pnpmpaths, while other modules were recompiled under the new paths → registration/reference ids diverge →BackgroundSnapshot not found. Re-runs kept failing because the poisoned cache was re-saved under this commit'sgithub.shaand restored exactly on every retry.The
*.rshash was never the right invalidation signal: the snapshot id depends on the resolved file path, which is determined by the dependency graph (pnpm-lock.yaml), not by the transform sources.Fix
hashFiles('pnpm-lock.yaml')to the cache key. This is the input that actually determines the.pnpm/...paths the ids are hashed from, so a changed dependency graph no longer inherits a cache built against different paths.restore-keysfallbacks (the*.rs-only line and the OS-only line) that allowed inheriting a cache from an unrelated graph. The most permissive remaining fallback requires the same*.rsand lockfile hash.workflow-test/website/bench/bundle-analysis) from**/packages/**/src/**/*.rstopackages/**/src/**/*.rsso each key is byte-for-byte identical toworkflow-build.yml— required by thefail-on-cache-misshandoff.turbo-v4->turbo-v5to discard the currently-poisoned caches.Trade-off
PRs that change
pnpm-lock.yamlwill cold-build the monorepo once (no warm turbo cache inherited across a dependency-graph change). This is intentional — correctness over reuse for dep-changing PRs. PRs that don't touch deps keep warm-cache reuse exactly as before.Deeper follow-up (not in this PR)
The cleaner fix is to make the snapshot id stable across dependency-graph changes (e.g. derive
filename_hashfrom a workspace-relative path that excludes the pnpm peer-hash segment) and/or to track the transform's effective inputs in turbo's hashing. This PR hardens the CI cache layer so the handoff stops producing inconsistent artifacts.Evidence
build / Build (Ubuntu)attempt chore: remove wrong changeset #4:Unable to find cache with primary key: ...-9f1158750...→Cache restored from key: ...-62c1af3921...(base/main, pre-genui) →Cache saved with key: ...-9f1158750.... Attempt docs: Polish README.md #5 then restored...-9f1158750...exactly (the already-poisoned cache).9f1158750...is the PR merge commit (Merge ba716689 into 62c1af39);62c1af39...is its base parent onmain.58 cached / 61 total; the react runtime and all failing fixtures werecache hit, replaying logs.pnpm-lock.yaml(394 +/-) but no*.rs, so the old key did not invalidate.