Rework CI to use Chopsticks servers to reduce RPC load#504

Merged
xlc merged 17 commits into
masterfrom
rework-ci
Feb 1, 2026

@rockbmb rockbmb commented Jan 28, 2026

WIP, still need to create an issue describing what this solves.

Closes #506 .

TL;DR: a Chopsticks DB is created in CI and shared with all jobs, so that it caches some data and avoids RPC timeouts.

@rockbmb rockbmb self-assigned this Jan 28, 2026
@rockbmb rockbmb added e2e tests Related to end-to-end tests ci labels Jan 28, 2026
Comment thread .github/workflows/ci.yml Fixed
Comment thread .github/workflows/ci.yml Fixed

@github-actions github-actions Bot left a comment

The CI rework effectively groups Asset Hub tests to share Chopsticks instances, which is good for resource usage. However, running tests sequentially within these groups is inefficient compared to Vitest's native parallelism. There is also a potential edge case in the test filtering logic that could cause some tests to be skipped entirely.

Comment thread .github/workflows/ci.yml Outdated
Comment thread .github/workflows/ci.yml Outdated
Comment thread .github/workflows/ci.yml Outdated
Comment thread .github/workflows/ci.yml Outdated
@rockbmb rockbmb changed the title Rework ci Rework CI to use Chopsticks servers to reduce RPC load Jan 28, 2026
Test matrix changes reverted, but chopsticks still used as a cache.
Instead of having a separate matrix for PAH/KAH, each job in the original
matrix will build its own chopsticks fork from a DB created in an early step
(only applies to PAH/KAH tests).
Previously, the call used was `system.applyAuthorizedUpgrade`, but this led to
the occasional spurious error when PET's CI workflows were scheduled to
run at or near a runtime upgrade.

Now, `system.setHeapPages` is used: it requires a root origin and will
thus fail, but recall that the purpose of the test is to check that proxy
call filters work, and that call filters are applied *before* origin
checks, which only occur within the extrinsic's execution.
@xlc xlc merged commit 5c49130 into master Feb 1, 2026
113 checks passed
@xlc xlc deleted the rework-ci branch February 1, 2026 20:36
@rockbmb rockbmb added this to the Refactors & redesigns milestone Feb 2, 2026
fellowship-merge-bot Bot pushed a commit to polkadot-fellows/runtimes that referenced this pull request Feb 19, 2026
Closes #1010 .

This PR will rework the PET job in `.github/workflows/test.yml`.

- [x] use a test matrix with individual retry/timeout configurations
rather than just `yarn test` to avoid restarting the whole suite on a
single suite's failure
- [x] To also avoid RPC timeouts due to throttling/stalling by test
network's production RPC endpoints, consider using
https://github.com/AcalaNetwork/subway or Chopsticks (as in
open-web3-stack/polkadot-ecosystem-tests#504)
- [x] Filter end-to-end test suites from PET to be used in finalized
workflow
- This repository is only concerned with the runtimes that are under the
Polkadot Fellowship's purview, while PET contains end-to-end tests for
other ecosystem chains; as such, it makes sense to filter what is
included in CI to further reduce load.
- [x] Re-enable parts of `test.yml` that were disabled to test PET
changes in isolation

# Review notes

Main changes:
* new folder in `.github/subway-configs`; it contains
[Subway](https://github.com/AcalaNetwork/subway) configs for the RPC
proxies that will be created during CI: one for each Polkadot/Kusama
relay + system parachain.
* PET reenabled in `.github/workflows/test.yml`, and completely
refactored
- it uses a matrix strategy with two jobs, one for each network:
Polkadot and Kusama

## Details

Each of the above P/K jobs:
- Fetches Rust toolchain + Subway, builds its binary, and starts a
Subway proxy for each of that network's relay/SP
       - Polkadot: 1 proxy for relay + 5 proxies for SPs
       - Kusama: 1 proxy for relay + 4 proxies for SPs
- Downloads WASM artifacts from a previous step of the CI pipeline
- Overrides PET's `.env` with said runtimes + proxies' endpoints
- That means PET will run using runtimes built against the code of
whichever PR triggered CI, and cache RPC responses, which are shared
between all chains' tests
- Gets `yarn`, checks out PET repo, builds it
- Runs PET's known block number command from within it, to use the
latest state in combination with built WASM runtimes
- Runs all end-to-end test suites for that network's chains
- only runs tests related to relay + SPs; ecosystem chains are skipped,
as this repo's purview is only for Fellowship-maintained runtimes
- Cleans up Subway proxies at the end
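
The start/clean-up lifecycle described above can be sketched as follows. This is only an illustration under stated assumptions: `start_proxy` and `stop_proxies` are hypothetical helper names, the chain names and ports are placeholders, and `sleep` stands in for the real Subway binary so that the sketch runs anywhere.

```shell
# Sketch: start one background process per chain, record its PID,
# and stop every recorded process at the end of the job.
# NOTE: `sleep 300` is a stand-in for the real proxy binary (assumption).

PROXY_PIDS=""

start_proxy() {
  chain="$1"
  port="$2"
  # A real workflow would launch the proxy with that chain's config here.
  sleep 300 &
  PROXY_PIDS="$PROXY_PIDS $!"
  echo "started proxy for $chain on port $port (pid $!)"
}

stop_proxies() {
  for pid in $PROXY_PIDS; do
    # Ignore errors: the process may already have exited.
    kill "$pid" 2>/dev/null || true
    wait "$pid" 2>/dev/null || true
  done
  PROXY_PIDS=""
}
```

In a workflow step, registering `trap stop_proxies EXIT` right after the helpers are defined would make the clean-up run even when an earlier command fails.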

In particular, the critical change:

```bash
          yarn test -u $TEST_FILES --pool=threads --maxWorkers=3 --retry=3
          # `$TEST_FILES` contains all E2E modules for the network: relay + SPs included
          # 1. `-u` updates snapshots so that they don't fail spuriously e.g. proof size/ref time changes
          # 2. `--pool=threads` runs test files from `$TEST_FILES` in parallel using worker threads. Default
          #    is `--pool=forks`, which uses child processes - heavier.
          # 3. `--maxWorkers=3` combined with the above means at most 3 test files running concurrently, each in its
          #    own thread
          # 4. `--retry=3` retries a failing test up to 3 times. This operates at the level of individual tests, not modules,
          #    or the whole test suite; one failure will *not* require a full rerun.
```
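
One way `$TEST_FILES` could be assembled is sketched below. The helper name `select_test_files` and the `<chain>.<suite>.e2e.test.ts` file-naming scheme are assumptions made for illustration, not something taken from the workflow itself; the idea is simply to keep the files whose chain prefix belongs to the network under test and to skip everything else.

```shell
# Sketch: print the E2E test files in a directory that belong to the
# given chains, assuming files are named `<chain>.<suite>.e2e.test.ts`
# (hypothetical naming scheme).
select_test_files() {
  dir="$1"
  shift
  for chain in "$@"; do
    for f in "$dir"/"$chain".*e2e.test.ts; do
      # An unmatched glob is left as a literal pattern; skip it.
      if [ -e "$f" ]; then
        printf '%s\n' "$f"
      fi
    done
  done
}
```

A step could then set `TEST_FILES=$(select_test_files <dir> <chains...> | tr '\n' ' ')` and pass it unquoted to `yarn test`, so that each path becomes a separate argument.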

---

- [x] Does not require a CHANGELOG entry

Labels

ci e2e tests Related to end-to-end tests

Development

Successfully merging this pull request may close these issues.

Use chopsticks as cache to avoid RPC stalling in CI workflow

3 participants