ci(bench): move CodSpeed to physical-exclusive runner and add walltime mode #2793
upupming wants to merge 6 commits into
…e mode Switch the benchmark job to the new physical-exclusive runner, plumb the OSS cache credentials (ACCESS_KEY / SECRET_KEY / ENDPOINT / BUCKET_NAME / REGION) at job level so lynx-infra/cache can authenticate, and add a second codspeed step in walltime mode alongside the existing simulation run. Draft — exploring whether the physical runner gives stabler walltime numbers than the xlarge VM.
Codecov Report
✅ All modified and coverable lines are covered by tests.
The previous commit set ACCESS_KEY/SECRET_KEY/ENDPOINT/BUCKET_NAME/REGION
as job env from ${{ secrets.* }}, but workflow-bench.yml is invoked via
workflow_call from test.yml without a `secrets:` block, so the secrets
resolved to empty strings and lynx-infra/cache failed with
`lack params: accessKeyId, accessKeySecret, region`.
Declare the five secrets as optional inputs on the called workflow and
forward them explicitly from test.yml so the OSS cache restore can
authenticate on the new physical-exclusive runner (the xlarge image had
these credentials baked in; the physical runner doesn't).
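
The failure mode described above (secrets silently resolving to empty strings) can be caught before the cache step runs. The following is a minimal sketch, not part of this PR: the function name `missing_cache_secrets` is hypothetical, and it assumes the five credentials are exposed to the step as environment variables with the names used in the commit message.

```shell
#!/bin/sh
# Hypothetical pre-flight helper: list which OSS cache credentials are empty.
# Variable names come from the commit message; the function name is illustrative.
missing_cache_secrets() {
  missing=""
  for name in ACCESS_KEY SECRET_KEY ENDPOINT BUCKET_NAME REGION; do
    # Indirect variable lookup that also works in plain POSIX sh.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  # Print the space-separated names without the leading space.
  echo "${missing# }"
}
```

A step could call `missing_cache_secrets` and exit non-zero when the output is non-empty, which would surface a missing `secrets:` forward as a clear message rather than a `lack params` error from the cache action.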
UI Judge
GEQI weighted score: 60.6 / 100 across 8 examples.
The build job runs on lynx-ubuntu-24.04-xlarge whose OSS credentials point to a different bucket than the new physical-exclusive runner used by this benchmark job. With fail-on-cache-miss: true the cache restore hard-fails, since the cache key written by build can't be read by bench. Drop fail-on-cache-miss so the restore step is best-effort and let `pnpm turbo build` rebuild locally on the physical machine instead. Bump timeout-minutes to 45 to absorb the cold rust compile.
The physical-exclusive runner image doesn't ship cargo, so `pnpm turbo
build` fails at @lynx-js/swc-plugin-reactlynx-compat#build with
`/bin/sh: 1: cargo: not found` when it invokes the package's build.js
(which shells out to `cargo build` to compile the SWC plugin's Rust
crate). The lynx-ubuntu-24.04-xlarge image had Rust pre-installed; the
new physical runner doesn't.
Add the repo's ./.github/actions/rustup composite action (same one used
by workflow-build.yml) ahead of TurboCache. Reuse the action's
save-if='${{ github.ref_name == 'main' }}' gate so the rustup cache
isn't written from PR runs.
…runner
The reusable ./.github/actions/rustup action wrote `${HOME:-/.cargo}/bin`
to GITHUB_PATH, but `$HOME` is not propagated between steps on the
physical-exclusive self-hosted runner (visible in setup-uv's
`Added undefined/.local/bin to the path`), so the resulting PATH entry
is empty and the next step fails with `rustup: command not found`.
Install rustup directly with explicit HOME=/root (the runner's true
home, evidenced by the existing /root/.rustup/settings.toml the
installer found) and pin /root/.cargo/bin onto GITHUB_PATH. Also export
HOME=/root in the simulation/walltime steps so
`. "$HOME/.cargo/env"` resolves correctly to pick up the codspeed CLI
that the prepare step installed.
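
As an illustration of the fix described above, the sketch below appends the cargo bin directory to the file named by `GITHUB_PATH` only when an explicit home directory is given. The helper name `add_cargo_bin_to_path` is hypothetical and is not the code in this PR; `/root` as the home directory is taken from the commit message.

```shell
#!/bin/sh
# Hypothetical helper: append <home>/.cargo/bin to the file named by GITHUB_PATH.
# It refuses to write when the home directory argument is empty, so an unset
# HOME cannot produce a broken PATH entry.
add_cargo_bin_to_path() {
  home_dir="$1"
  if [ -z "$home_dir" ]; then
    echo "refusing to write a PATH entry for an empty home directory" >&2
    return 1
  fi
  # Each line appended to $GITHUB_PATH is added to PATH for later steps.
  echo "$home_dir/.cargo/bin" >> "$GITHUB_PATH"
}
```

On the runner this would be called as `add_cargo_bin_to_path /root`, passing the home directory explicitly instead of relying on `$HOME` being set.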
… build After fixing PATH propagation, the next failure was @lynx-js/web-core#build:wasm aborting in rustup with `could not rename '...partial' file ... No such file or directory` (os error 2). The previous step installed the `stable` toolchain, but rust-toolchain.toml pins 1.92.0 and adds wasm32-unknown-unknown / wasm32-wasip1 targets. With turbo running many cargo invocations concurrently, multiple processes each tried to sync the 1.92.0 channel and races on /root/.rustup/downloads/*.partial corrupted the install. Install rustup with --default-toolchain none, then explicitly install the pinned toolchain and add the wasm targets up front so every parallel cargo invocation downstream sees a complete, ready-to-use install.
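
The install order described above can be sketched as follows. This is an assumption-laden illustration, not the workflow step from this PR: the `run` and `install_pinned_rust` helpers and the `/tmp/rustup-init.sh` path are hypothetical, the toolchain version and targets are the ones named in the commit message, and the sketch assumes `rustup` is already reachable on `PATH` once the installer finishes.

```shell
#!/bin/sh
# Toolchain version as pinned by rust-toolchain.toml per the commit message.
TOOLCHAIN="1.92.0"

# Run a command, or only print it when DRY_RUN is set (handy for reviewing the plan).
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

install_pinned_rust() {
  # Download the installer to a file so every step is a plain command.
  run curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs -o /tmp/rustup-init.sh || return 1
  # Install rustup itself without pulling any toolchain.
  run sh /tmp/rustup-init.sh -y --default-toolchain none || return 1
  # Install the pinned toolchain once, before any parallel cargo invocation.
  run rustup toolchain install "$TOOLCHAIN" || return 1
  # Add the wasm targets up front for the same reason.
  run rustup target add --toolchain "$TOOLCHAIN" wasm32-unknown-unknown wasm32-wasip1
}
```

Doing the toolchain and target installation serially in one step means the concurrent cargo processes started by turbo find a complete install and have no reason to download the channel themselves.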
Merging this PR will not alter performance
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| 🆕 | WallTime | transform 1000 view elements | N/A | 5.7 ms | N/A |
| 🆕 | WallTime | basic-performance-large-css | N/A | 1.1 ms | N/A |
| 🆕 | WallTime | basic-performance-nest-level-100 | N/A | 945.8 µs | N/A |
| 🆕 | WallTime | basic-performance-small-css | N/A | 972.2 µs | N/A |
| 🆕 | WallTime | basic-performance-div-1000 | N/A | 8.8 ms | N/A |
| 🆕 | WallTime | basic-performance-image-100 | N/A | 1.3 ms | N/A |
| 🆕 | WallTime | basic-performance-div-10000 | N/A | 30.8 ms | N/A |
| 🆕 | WallTime | basic-performance-div-100 | N/A | 1 ms | N/A |
| 🆕 | WallTime | basic-performance-text-200 | N/A | 2 ms | N/A |
| 🆕 | WallTime | basic-performance-scroll-view-100 | N/A | 1.5 ms | N/A |
Comparing ci/codspeed-physical-exclusive-walltime (89ede06) with main (e33c08f)
Footnotes
1. 26 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.
dc05dc8 to 89ede06
Summary
- Switch the benchmark job from `lynx-ubuntu-24.04-xlarge` to the new `physical-exclusive` runner.
- Plumb the OSS cache credentials (`ACCESS_KEY` / `SECRET_KEY` / `ENDPOINT` / `BUCKET_NAME` / `REGION`) at job level so `lynx-infra/cache` can authenticate on the new machine (the xlarge VM had these baked into the image; the physical runner doesn't).
- Add a second `codspeed run --mode walltime` step alongside the existing simulation run, so we get real-world timings on top of the deterministic instruction-count data.

Draft: exploring whether the physical runner gives more stable walltime numbers than the shared VM. If the walltime variance is small enough, we can start treating it as a reportable signal next to simulation.
Test plan
- `Benchmark / nodejs-benchmark` job runs to completion on the new runner (no `lack params: accessKeyId/...` errors from `lynx-infra/cache`).