chore: deflake epochs mbps test by mrzeszutko · Pull Request #21003 · AztecProtocol/aztec-packages

mrzeszutko · 2026-03-02T14:23:43Z

Summary

Fixes the flaky epochs_mbps.parallel test by adding a retryUntil poll to assertMultipleBlocksPerSlot, closing a race condition between two independently syncing archivers.

Details

The epochs_mbps.parallel test has been flaking in CI (9 recent failures across PRs 20562-20868) on the "checkpointed block" test case. The root cause is a race condition:

1. waitForTx polls the initial setup node's archiver and returns when it sees the tx as CHECKPOINTED.
2. assertMultipleBlocksPerSlot then queries the first validator node's (nodes[0]) archiver via archiver.getCheckpoints().
3. These are different nodes with independent L1 polling cycles (~50ms interval each).
4. The first validator's archiver may not have indexed the latest checkpoint yet (~200-400ms race window).

CI logs confirm: the checkpoint with the expected block count is always produced and published to L1, but the first validator's archiver hasn't indexed it when the assertion runs.

Fix

Added a retryUntil poll at the start of assertMultipleBlocksPerSlot that waits (up to L2_SLOT_DURATION_IN_S * 3 = 108s, polling every 0.5s) for nodes[0]'s archiver to index a checkpoint with at least targetBlockCount blocks. Once found, the existing validation logic runs as before.
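The polling approach described here can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `pollUntil`, `CheckpointLike`, `findCheckpointWithBlocks`, and `waitForMultiBlockCheckpoint` are hypothetical names, the checkpoint shape is simplified, and the real `retryUntil` helper and archiver types in the repository may differ. The slot duration of 36s is inferred from the stated 108s timeout divided by 3.

```typescript
// Hypothetical, simplified view of a checkpoint; the real archiver type differs.
interface CheckpointLike {
  number: number;
  blockCount: number;
}

const sleep = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms));

// Illustrative stand-in for a retryUntil-style helper (not the repository's implementation).
// Calls fn until it returns a defined value, or throws once the timeout has elapsed.
async function pollUntil<T>(
  fn: () => Promise<T | undefined>,
  name: string,
  timeoutSec: number,
  intervalSec: number,
): Promise<T> {
  const deadline = Date.now() + timeoutSec * 1000;
  while (true) {
    const result = await fn();
    if (result !== undefined) {
      return result;
    }
    if (Date.now() >= deadline) {
      throw new Error(`Timeout waiting for ${name}`);
    }
    await sleep(intervalSec * 1000);
  }
}

// Returns the first checkpoint holding at least targetBlockCount blocks, if any.
function findCheckpointWithBlocks(
  checkpoints: CheckpointLike[],
  targetBlockCount: number,
): CheckpointLike | undefined {
  return checkpoints.find(c => c.blockCount >= targetBlockCount);
}

// Inferred from the description: 108s timeout / 3 = 36s per L2 slot.
const L2_SLOT_DURATION_IN_S = 36;

// Waits until the queried archiver has indexed a checkpoint that is large enough.
// getCheckpoints is an assumed accessor standing in for the validator's archiver query.
async function waitForMultiBlockCheckpoint(
  getCheckpoints: () => Promise<CheckpointLike[]>,
  targetBlockCount: number,
  timeoutSec: number = L2_SLOT_DURATION_IN_S * 3,
  intervalSec: number = 0.5,
): Promise<CheckpointLike> {
  return pollUntil(
    async () => findCheckpointWithBlocks(await getCheckpoints(), targetBlockCount),
    `checkpoint with at least ${targetBlockCount} blocks`,
    timeoutSec,
    intervalSec,
  );
}
```

Polling the same archiver that the assertion later reads from is what closes the race: the wait ends only when that node has caught up, regardless of what the setup node has already seen.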

Fixes A-594

spalladino

Good catch again!

BEGIN_COMMIT_OVERRIDE
fix: track last seen nonce in case of stale fallback L1 RPC node (#20855)
feat: Validate num txs in block proposals (#20850)
fix(archiver): enforce checkpoint boundary on rollbackTo (#20908)
fix: tps zero metrics (#20656)
fix: handle scientific notation in bigintConfigHelper (#20929)
feat(aztec): node enters standby mode on genesis root mismatch (#20938)
fix: logging of class instances (#20807)
feat(slasher): make slash grace period relative to rollup upgrade time (#20942)
chore: add script to find PRs to backport (#20956)
chore: remove unused prover-node dep (#20955)
fix: increase minFeePadding in e2e_bot bridge resume tests and harden GasFees.mul() (#20962)
feat(sequencer): (A-526) rotate publishers when send fails (#20888)
chore: (A-554) bump reth version 1.6.0 -> 1.11.1 for eth devnet (#20889)
chore: metric on how many epochs validator has been on committee (#20967)
fix: set wallet minFeePadding in BotFactory constructor (#20992)
chore: deflake epoch invalidate block test (#21001)
chore(sequencer): e2e tests for invalid signature recovery in checkpoint attestations (#20971)
chore: deflake duplicate proposals and attestations (#20990)
chore: deflake epochs mbps test (#21003)
feat: reenable function selectors in txPublicSetupAllowList (#20909)
fix: limit offenses when voting in tally slashing mode by slashMaxPayloadSize (#20683)
fix(spartan): wire SEQ_L1_PUBLISHING_TIME_ALLOWANCE_IN_SLOT env var (#21017)
END_COMMIT_OVERRIDE


chore: deflake epochs mbps test

332963b

spalladino approved these changes Mar 2, 2026


spalladino enabled auto-merge (squash) March 2, 2026 14:28

spalladino merged commit 1ccca76 into merge-train/spartan Mar 2, 2026
10 checks passed

spalladino deleted the mr/deflake-epochs-mbps branch March 2, 2026 14:39

AztecBot mentioned this pull request Mar 2, 2026

feat: merge-train/spartan #20899

Merged
