op-node: fix l1 origin selector getting stuck on reorg #18233
sebastianst merged 3 commits into ethereum-optimism:develop
|
Hey @bearpebble thanks for this fix. I'll take a look at it in more detail soon. |
|
@sebastianst the reason I implemented it this way is that the sequencer has code specifically for detecting a reorg, and that code is currently unreachable without the change from this PR: optimism/op-node/rollup/sequencing/sequencer.go, lines 503 to 519 in a6b8553 |
|
This pr has been automatically marked as stale and will be closed in 5 days if no updates |
|
@axelKingsley @pcw109550 anyone? |
sebastianst
left a comment
Upon deeper inspection I realize that the L1 origin selector already doesn't guarantee that the L1 origins returned by it are on the same L1 chain. So the fix is fine.
The whole L1 origin selector code has grown into quite the mess, to be honest, so this is the most straightforward way to fix this situation without a larger refactor.
|
@sebastianst thanks for the review. I believe I addressed all comments |
|
/ci authorize 60ce112 |
|
@bearpebble needs linting |
done |
|
/ci authorize 566f834 |
|
@bearpebble You may need to merge in/rebase onto latest |
if current origin was reorged out
566f834 to 101655a
|
/ci authorize 101655a |
1316212
|
@bearpebble Could you elaborate a bit here - why is |
Yes, 0 is not the default. It was just the setting where we noticed the bug, and likely the only setting where you would currently notice it, since reorg depth seems to be limited to 1 block in practice post-merge. See https://etherscan.io/blocks_forked |
|
@bearpebble ok, thanks, that makes sense! |
Description
An L1 reorg sometimes led to the sequencer getting stuck on the same origin until hitting the max sequencer drift.
What happens in this situation is that there is a small reorg on L1, usually a single block. The conf depth was 0 in our case, although this can technically happen with any conf depth; it is just very unlikely given what reorgs currently look like on Ethereum.
The sequencer picks up a block that is later reorged out and builds on top of it. Then the reorg happens and the block is no longer part of the chain.
The code will try to get the next block by number. This block is part of the post-reorg chain, and therefore the check
nextOrigin.ParentHash == los.currentOrigin.Hash
fails and it never sets the origin. As a result, the sequencer can never even notice that it is building on top of a wrong origin and realize that the chain reorged.
This is only problematic in the case where the L1 derivation misses the reorg, which can essentially only happen for very short reorgs (influenced by the conf depth). The origin selector queries for a new origin on every L2 block (2s), while the L1 traversal usually runs every 12 seconds. If the L1 traversal misses the reorg, the sequencer won't get reset and gets stuck in this loop.
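The failure mode described above can be reproduced with a toy model. The sketch below is not the op-node code: Block, Chain, Selector and all of their methods are hypothetical names made up for illustration, and AdvanceReorgAware is only one assumed way of letting the selector move on, not necessarily what this PR does.

```go
package main

import "fmt"

// Block is a hypothetical, stripped-down L1 block reference.
type Block struct {
	Number     uint64
	Hash       string
	ParentHash string
}

// Chain is a toy view of the canonical L1 chain, indexed by block number.
type Chain map[uint64]Block

// Selector is a toy origin selector (an assumption, not the op-node type).
type Selector struct {
	current Block
}

// AdvanceStrict mimics the check quoted in the description: the next block
// by number is adopted only if it builds directly on the current origin.
func (s *Selector) AdvanceStrict(l1 Chain) bool {
	next, ok := l1[s.current.Number+1]
	if !ok || next.ParentHash != s.current.Hash {
		return false
	}
	s.current = next
	return true
}

// CurrentReorgedOut reports whether the canonical block at the current
// origin's height now has a different hash than the origin we hold.
func (s *Selector) CurrentReorgedOut(l1 Chain) bool {
	canonical, ok := l1[s.current.Number]
	return ok && canonical.Hash != s.current.Hash
}

// AdvanceReorgAware is an assumed remedy: if the parent-hash check fails
// because the current origin was reorged out, the next origin is adopted
// anyway so that the caller can observe the mismatch and react to it.
func (s *Selector) AdvanceReorgAware(l1 Chain) bool {
	next, ok := l1[s.current.Number+1]
	if !ok {
		return false
	}
	if next.ParentHash != s.current.Hash && !s.CurrentReorgedOut(l1) {
		return false
	}
	s.current = next
	return true
}

// reorgedChain returns the canonical chain after a one-block reorg at
// height 101: the old block "b1" was replaced by "b2".
func reorgedChain() Chain {
	return Chain{
		100: {Number: 100, Hash: "a", ParentHash: "genesis"},
		101: {Number: 101, Hash: "b2", ParentHash: "a"},
		102: {Number: 102, Hash: "c2", ParentHash: "b2"},
	}
}

func main() {
	l1 := reorgedChain()
	// The sequencer picked up "b1" before it was reorged out.
	stale := Block{Number: 101, Hash: "b1", ParentHash: "a"}

	strict := &Selector{current: stale}
	for i := 0; i < 3; i++ { // one poll per L2 block
		fmt.Println("strict advanced:", strict.AdvanceStrict(l1))
	}

	aware := &Selector{current: stale}
	fmt.Println("reorg-aware advanced:", aware.AdvanceReorgAware(l1))
	fmt.Println("new origin:", aware.current.Hash)
}
```

In this model the strict selector never leaves the stale origin, no matter how often it polls, because every candidate at the next height belongs to the post-reorg chain. The reorg-aware variant hands back an origin from the new chain, which is what would give a caller the chance to notice the mismatch.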
Tests
Test to ensure the selector won't get stuck on an L1 origin due to reorgs.