Skip to content

op-acceptance-tests: add test to base gate for unsafe gossip / head progression#18031

Closed
geoknee wants to merge 5 commits intodevelopfrom
gk/gossip-base-at
Closed

op-acceptance-tests: add test to base gate for unsafe gossip / head progression#18031
geoknee wants to merge 5 commits intodevelopfrom
gk/gossip-base-at

Conversation

@geoknee
Copy link
Copy Markdown
Contributor

@geoknee geoknee commented Oct 27, 2025

We recently realised that acceptance tests do not give us good coverage for unsafe head progression via p2p gossip. See, for example, #17940 (a bug which wasn't caught by ATs) as well as a recent op-reth bug we only caught on devnet.

This test adds that coverage, and simply asserts that a) there are some non-sequencer verifier nodes and b) they catch up to the sequencer within a timeout period.

The test disables the sysgo batcher, because otherwise safe chain derivation would allow the test to pass even if gossip was broken. Therefore this test should not run against external networks.

I confirmed manually that introducing a bug, where op-node just rejects gossip payloads unconditionally, does indeed cause this test to fail.

Comment on lines +21 to +23
// Stop batcher so there is no sync from deriving from L1
// This won't work on external networks.
sys.L2Batcher.Stop()
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this test doesn't make sense to run on external networks, how does the acceptance test framework know to skip it? Is there some helper we should call, or preset we should use for that? @scharissis

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'd do this with presets.WithCompatibleTypes(compat.SysGo) in your TestMain.
Good call-out.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be done per test or only per package?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe only per package currently.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok in that case I should probably pull this out into it's own package, since I don't want to avoid running the neighbouring tests on external devnets.

Copy link
Copy Markdown
Contributor

@joshklop joshklop Oct 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What prevents it from being run on an external devnet?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the control of the batcher, see other thread #18031 (comment). I think we could potentially work towards it running on an external network, if we can be sure that it is safe (in case someone points it at a production network) and won't interfere with other tests.

func TestMain(m *testing.M) {
presets.DoMain(m,
presets.WithMinimal(),
presets.WithSingleChainMultiNode(),
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be using this for all base tests? Does it give us extra coverage in existing tests?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quite possibly, yes!

Also, this is how you'd mark this test to be skipped on external devnets:

Suggested change
presets.WithSingleChainMultiNode(),
presets.WithSingleChainMultiNode(),
presets.WithCompatibleTypes(compat.SysGo),

@geoknee geoknee added the H-jovian Hardfork: change is planned for jovian upgrade label Oct 27, 2025
@geoknee geoknee changed the title acceptance test for unsafe gossip op-acceptance-tests: add test to base gate for unsafe gossip / head progression Oct 27, 2025
func TestMain(m *testing.M) {
presets.DoMain(m,
presets.WithMinimal(),
presets.WithSingleChainMultiNode(),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quite possibly, yes!

Also, this is how you'd mark this test to be skipped on external devnets:

Suggested change
presets.WithSingleChainMultiNode(),
presets.WithSingleChainMultiNode(),
presets.WithCompatibleTypes(compat.SysGo),

Comment on lines +21 to +23
// Stop batcher so there is no sync from deriving from L1
// This won't work on external networks.
sys.L2Batcher.Stop()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'd do this with presets.WithCompatibleTypes(compat.SysGo) in your TestMain.
Good call-out.

indexOfSequencer = i
break
}
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Instead of doing that check manually, can't we add a special case in the preset for the L2ELSequencer and the L2ELValidator?
No need to do it in this PR, I don't think it's worth blocking that change

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

match.WithSequencerActive+match.EngineFor is probably what we want here.

verifierUnsafeHead := el.ChainSyncStatus(l2chainID, types.LocalUnsafe)
if verifierUnsafeHead.Number < initialUnsafeHeadNumber+NUM_UNSAFE_BLOCKS {
return false
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use el.AdvancedFn or el.ReachedFn here, and wait in parallel using the CheckAll method?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quite possibly! I can have a play with those utils.

sys := presets.NewSingleChainMultiNode(t)

// Stop batcher so there is no sync from deriving from L1
// This won't work on external networks.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can tell, it calls admin_stopBatcher. If we enable the admin namespace, can we run this on external networks?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In principle, yes -- but I think that admin_stopBatcher is quite destructive and the batcher would need to be manually restarted (to restore the chain to a state where it can run additional ATs) with the current implementation.

const TIMEOUT = 5 * time.Second // time to wait for all verifier nodes to catch up to the sequencer

// In order for this test to be valid, we need at least 2 L2 EL nodes.
t.Require().Greater(len(sys.L2Chain.L2ELNodes()), 1, "expected at least 2 L2 EL nodes")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should already be guaranteed by the preset when DEVNET_EXPECT_PRECONDITIONS_MET is true (the default).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I'm tempted to leave this in as a sanity check / protect against regressions though.

indexOfSequencer = i
break
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

match.WithSequencerActive+match.EngineFor is probably what we want here.

continue
}
verifierUnsafeHead := el.ChainSyncStatus(l2chainID, types.LocalUnsafe)
if verifierUnsafeHead.Number < initialUnsafeHeadNumber+NUM_UNSAFE_BLOCKS {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we check if the hash matches? Not sure if it matters in this case.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have Matched and MatchedFn that checks both hash and number. We can just use it.

@opgitgovernance opgitgovernance added the S-stale Status: Will be closed unless there is activity label Nov 14, 2025
@opgitgovernance
Copy link
Copy Markdown
Contributor

This pr has been automatically marked as stale and will be closed in 5 days if no updates

@geoknee geoknee removed the S-stale Status: Will be closed unless there is activity label Nov 19, 2025
@opgitgovernance opgitgovernance added the S-stale Status: Will be closed unless there is activity label Dec 3, 2025
@opgitgovernance
Copy link
Copy Markdown
Contributor

This pr has been automatically marked as stale and will be closed in 5 days if no updates

@geoknee geoknee removed the S-stale Status: Will be closed unless there is activity label Dec 8, 2025
@opgitgovernance opgitgovernance added the S-stale Status: Will be closed unless there is activity label Dec 22, 2025
@opgitgovernance
Copy link
Copy Markdown
Contributor

This pr has been automatically marked as stale and will be closed in 5 days if no updates

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

H-jovian Hardfork: change is planned for jovian upgrade S-stale Status: Will be closed unless there is activity

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants