op-node: fix l1 origin selector getting stuck on reorg #18233

Merged

sebastianst merged 3 commits into ethereum-optimism:develop from bearpebble:fix-sequencer-stuck-on-reorgs on Dec 3, 2025

Conversation

@bearpebble
Contributor

Description

An L1 reorg sometimes led to the sequencer getting stuck on the same L1 origin until it hit the max sequencer drift.

What happens in this situation is that there is a small reorg on L1, usually a single block. Conf depth was 0, although this can technically happen with any conf depth; it's just very unlikely given what reorgs currently look like on Ethereum. The sequencer picks up the block that later gets reorged out and builds on top of it; then the reorg happens and that block is no longer part of the canonical chain. The selector tries to fetch the next block by number, but that block is part of the post-reorg chain, so the check nextOrigin.ParentHash == los.currentOrigin.Hash fails and the origin is never advanced. As a result, the sequencer never notices that it is building on top of a non-canonical origin and never realizes the chain reorged.

This is only problematic when the L1 derivation misses the reorg, which can essentially only happen for very short reorgs (influenced by the conf depth). The origin selector queries for a new origin on every L2 block (every 2s), while L1 traversal usually happens only every 12 seconds. If the L1 traversal misses the reorg, the sequencer never gets reset and stays stuck in this loop. A minimal sketch of this failure mode follows below.
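
To make the failure mode concrete, here is a minimal, self-contained sketch of the pre-fix selection logic. This is simplified and assumed, not the actual op-node code; BlockRef, L1Client, and nextOriginStuck are illustrative names:

package sketch

import (
	"context"
	"fmt"
)

// BlockRef is a simplified stand-in for an L1 block reference.
type BlockRef struct {
	Hash       string
	ParentHash string
	Number     uint64
}

// L1Client fetches block references from the current canonical L1 chain.
type L1Client interface {
	BlockRefByNumber(ctx context.Context, num uint64) (BlockRef, error)
}

// nextOriginStuck mirrors the pre-fix behavior: after an L1 reorg, the block
// at currentOrigin.Number+1 belongs to the post-reorg chain, so its parent
// hash never matches the reorged-out current origin, and the selector keeps
// returning the stale origin on every attempt.
func nextOriginStuck(ctx context.Context, l1 L1Client, currentOrigin BlockRef) (BlockRef, error) {
	nextOrigin, err := l1.BlockRefByNumber(ctx, currentOrigin.Number+1)
	if err != nil {
		return BlockRef{}, err
	}
	if nextOrigin.ParentHash != currentOrigin.Hash {
		// Stuck: keep the old origin; only the max sequencer drift
		// eventually breaks the loop.
		return currentOrigin, fmt.Errorf("next origin %s does not extend current origin %s",
			nextOrigin.Hash, currentOrigin.Hash)
	}
	return nextOrigin, nil
}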

Tests

Test to ensure the selector won't get stuck on an L1 origin due to reorgs.

@bearpebble bearpebble requested review from a team and sebastianst November 10, 2025 14:54
@sebastianst
Member

Hey @bearpebble thanks for this fix. I'll take a look at it in more detail soon.
But from a first look, I'm wondering if that's the right place to fix it. It doesn't seem right that we'd just select the next L1 origin if its parent isn't the current origin by hash. Wouldn't that mean that we'd create an invalid L1 origin chain? If this was detected, wouldn't this rather mean that we should perform an L2 reorg to the last canonical L1 origin?

@bearpebble
Contributor Author

@sebastianst the reason I implemented it this way is that the sequencer already has code to specifically detect a reorg, which is currently unreachable without the change from this PR:

l1Origin, err := d.l1OriginSelector.FindL1Origin(ctx, l2Head)
if err != nil {
	d.nextAction = d.timeNow().Add(time.Second)
	d.nextActionOK = d.active.Load()
	d.log.Error("Error finding next L1 Origin", "err", err)
	d.emitter.Emit(d.ctx, rollup.L1TemporaryErrorEvent{Err: err})
	return
}
if !(l2Head.L1Origin.Hash == l1Origin.ParentHash || l2Head.L1Origin.Hash == l1Origin.Hash) {
	d.metrics.RecordSequencerInconsistentL1Origin(l2Head.L1Origin, l1Origin.ID())
	d.emitter.Emit(d.ctx, rollup.ResetEvent{
		Err: fmt.Errorf("cannot build new L2 block with L1 origin %s (parent L1 %s) on current L2 head %s with L1 origin %s",
			l1Origin, l1Origin.ParentHash, l2Head, l2Head.L1Origin),
	})
	return
}
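
For context, here is a rough sketch of the direction the fix takes, inferred from this discussion rather than the exact PR diff; it reuses the illustrative BlockRef and L1Client types from the sketch in the description. By returning the post-reorg block even when its parent hash doesn't match the current origin, the selector lets the consistency check above fail, so the sequencer emits a ResetEvent and reorgs L2 back onto the canonical L1 chain instead of stalling:

// nextOriginFixed sketches the assumed post-fix behavior: hand back the next
// block by number even on a parent-hash mismatch, so the caller's check
// (l2Head.L1Origin.Hash == l1Origin.ParentHash || l2Head.L1Origin.Hash == l1Origin.Hash)
// can fail for a reorged-out head and trigger a reset instead of a stall.
func nextOriginFixed(ctx context.Context, l1 L1Client, currentOrigin BlockRef) (BlockRef, error) {
	nextOrigin, err := l1.BlockRefByNumber(ctx, currentOrigin.Number+1)
	if err != nil {
		return BlockRef{}, err
	}
	return nextOrigin, nil
}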

@opgitgovernance opgitgovernance added the S-stale (Status: Will be closed unless there is activity) label on Dec 1, 2025
@opgitgovernance
Contributor

This PR has been automatically marked as stale and will be closed in 5 days if there are no updates.

@bearpebble
Contributor Author

@axelKingsley @pcw109550 anyone?

@pcw109550 pcw109550 removed the S-stale (Status: Will be closed unless there is activity) label on Dec 2, 2025
Member

@sebastianst sebastianst left a comment


Upon deeper inspection I realize that the L1 origin selector already doesn't guarantee that the L1 origins returned by it are on the same L1 chain. So the fix is fine.

The whole L1 origin selector code has grown into quite the mess, to be honest, so this is the most straightforward way to fix this situation without a larger refactor.

Review comment threads: three on op-node/rollup/sequencing/origin_selector_test.go (all outdated) and three on op-node/rollup/sequencing/origin_selector.go (one outdated).
@bearpebble
Contributor Author

@sebastianst thanks for the review. I believe I addressed all comments.

@sebastianst
Member

/ci authorize 60ce112

@sebastianst sebastianst self-assigned this Dec 2, 2025
@sebastianst
Member

@bearpebble needs linting

@bearpebble
Contributor Author

@bearpebble needs linting

done

@sebastianst
Member

/ci authorize 566f834

@sebastianst sebastianst enabled auto-merge December 2, 2025 16:52
@sebastianst
Member

sebastianst commented Dec 2, 2025

@bearpebble You may need to merge in/rebase onto the latest develop changes. Hopefully that fixes the unrelated test failures.

@sebastianst sebastianst disabled auto-merge December 2, 2025 17:09
@bearpebble bearpebble force-pushed the fix-sequencer-stuck-on-reorgs branch from 566f834 to 101655a on December 2, 2025 17:34
@sebastianst
Member

/ci authorize 101655a

@sebastianst sebastianst added this pull request to the merge queue Dec 3, 2025
Merged via the queue into ethereum-optimism:develop with commit 1316212 Dec 3, 2025
84 checks passed
@nonsense
Contributor

nonsense commented Dec 3, 2025

@bearpebble: "Conf depth was 0, although this can technically happen with any conf depth; it's just very unlikely given what reorgs currently look like on Ethereum."

Could you elaborate a bit here: why was the conf depth equal to 0? Isn't it set to 2 by default across the codebase?

@bearpebble
Contributor Author

@nonsense: "Could you elaborate a bit here: why was the conf depth equal to 0? Isn't it set to 2 by default across the codebase?"

Yes, 0 is not the default. It was just the setting where we noticed the bug, and likely the only setting where you would currently notice it, since reorg depth seems to be limited to 1 block in practice post-merge. See https://etherscan.io/blocks_forked

@nonsense
Contributor

nonsense commented Dec 3, 2025

@bearpebble ok, thanks, that makes sense!
