op-node: Light CL: Follow Source by pcw109550 · Pull Request #18365 · ethereum-optimism/optimism

pcw109550 · 2025-11-24T13:35:13Z

Description

This PR introduces a new upstream sync loop (Driver → Engine Controller) that (1) detects L1 reorgs in unsafe-only mode and resets the node appropriately, and (2) optionally mirrors external L2 safe/finalized state when --l2.follow.source is enabled. It replaces the derivation-pipeline-based reorg logic for unsafe-only nodes and ensures L2 always tracks L1 and external safe/finalized sources correctly.

~~This PR builds on #18290 and~~ implements two features:

Task A – Follow an external L2 source for safe and finalized blocks
Enabled with --l2.follow.source (requires --l2.unsafe-only).
Task B – Trigger an L2 reset when an L1 reorg occurs
Enabled with --l2.unsafe-only.

Note: --l2.follow.source can only be used when --l2.unsafe-only is enabled.
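
The flag dependency in the note can be sketched as a small validation step. This is a minimal illustration, not the op-node implementation: the `Config` struct, its field names, the error value, and the placeholder endpoint are all hypothetical, and the real flag type for --l2.follow.source may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// Config is a hypothetical, trimmed-down view of the two flags discussed above.
type Config struct {
	UnsafeOnly   bool   // stands in for --l2.unsafe-only
	FollowSource string // stands in for --l2.follow.source; empty means disabled
}

// ErrFollowSourceNeedsUnsafeOnly is returned when a follow source is set without unsafe-only mode.
var ErrFollowSourceNeedsUnsafeOnly = errors.New("--l2.follow.source requires --l2.unsafe-only")

// Check enforces the constraint from the note: follow source only with unsafe-only.
func (c Config) Check() error {
	if c.FollowSource != "" && !c.UnsafeOnly {
		return ErrFollowSourceNeedsUnsafeOnly
	}
	return nil
}

func main() {
	// The endpoint value is a placeholder, not a real address.
	bad := Config{FollowSource: "http://follow-source.invalid"}
	good := Config{UnsafeOnly: true, FollowSource: "http://follow-source.invalid"}
	fmt.Println("follow source without unsafe-only:", bad.Check())
	fmt.Println("follow source with unsafe-only:", good.Check())
}
```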

A new control-flow path is added:

Driver (op-node/rollup/driver/driver.go) → Engine Controller (op-node/rollup/engine/engine_controller.go)

`Driver` changes

Both tasks need a periodic ticker to drive them. Similar to altSyncTicker, which periodically tries to close the unsafe gap, I added upstreamSyncTicker in op-node/rollup/driver/driver.go to perform these tasks. upstreamSyncTicker is only enabled when --l2.unsafe-only is enabled.

The ticker periodically calls the followUpstream() method. The method implements Task B itself and calls the Engine Controller to perform Task A, in the following steps:

We do not interfere with the initial EL sync. The CL and EL are left to finish the initial EL sync first.
We first check for an L1 reorg by verifying that the unsafe head's L1Origin still exists on L1 and has a matching hash. If not, we trigger a reset to fetch valid heads.
- Originally, reorg detection was done in the derivation pipeline, at
  
  optimism/op-node/rollup/derive/pipeline.go
  
  Line 221 in ef97055
  
  return nil, dp.traversal.AdvanceL1Block(ctx)
- The reorg logic inside derivation is described at https://github.com/ethereum-optimism/optimism/tree/develop/op-node#l1-reorg.
- Because we do not run derivation, we do not have the CurrentL1 view inside the CL, so we cannot perform the same L1 reorg check.
- I chose to check the L1Origin of the unsafe head instead, because it still satisfies the goal: detect an L1 reorg and, if one is detected, trigger a reset.
(Optional) After the L1 sanity check, if --l2.follow.source is enabled, followUpstream() fetches the external L2 info (eSafe, eFinalized) and tries to apply it to the Engine Controller by calling s.SyncDeriver.Engine.FollowSource(eSafe, eFinalized).
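
One tick of this loop can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in driver.go: the `L1Source`, `L2Source`, and `Engine` interfaces, the `BlockID` / `L2Ref` types, and the in-memory fakes are all hypothetical stand-ins. Only the order of the steps (skip during initial EL sync, check the unsafe head's L1 origin, then optionally follow the external source) comes from the description above.

```go
package main

import (
	"errors"
	"fmt"
)

// BlockID and L2Ref are simplified stand-ins for op-node's block reference types.
type BlockID struct {
	Number uint64
	Hash   string
}

type L2Ref struct {
	BlockID
	L1Origin BlockID
}

var errNotFound = errors.New("block not found")

// The interfaces below are hypothetical; they only name what the steps need.
type L1Source interface{ BlockByNumber(num uint64) (BlockID, error) }
type L2Source interface{ SafeAndFinalized() (safe, finalized L2Ref, err error) }
type Engine interface {
	InitialELSyncDone() bool
	UnsafeHead() L2Ref
	Reset(reason string)
	FollowSource(eSafe, eFinalized L2Ref)
}

// followUpstream runs one tick of the upstream sync loop.
// A nil src models --l2.follow.source being disabled.
func followUpstream(l1 L1Source, src L2Source, eng Engine) error {
	// Step 1: never interfere with the initial EL sync.
	if !eng.InitialELSyncDone() {
		return nil
	}
	// Step 2: the unsafe head's L1 origin must still be canonical on L1.
	origin := eng.UnsafeHead().L1Origin
	canonical, err := l1.BlockByNumber(origin.Number)
	switch {
	case errors.Is(err, errNotFound):
		eng.Reset("L1 origin of unsafe head no longer exists")
		return nil
	case err != nil:
		return fmt.Errorf("failed to fetch L1 block %d: %w", origin.Number, err)
	case canonical.Hash != origin.Hash:
		eng.Reset("L1 origin of unsafe head has a different hash")
		return nil
	}
	// Step 3 (optional): mirror the external safe/finalized heads.
	if src == nil {
		return nil
	}
	eSafe, eFinalized, err := src.SafeAndFinalized()
	if err != nil {
		return fmt.Errorf("failed to fetch external safe/finalized: %w", err)
	}
	eng.FollowSource(eSafe, eFinalized)
	return nil
}

// In-memory fakes used only to exercise the sketch.
type fakeL1 map[uint64]string

func (f fakeL1) BlockByNumber(num uint64) (BlockID, error) {
	hash, ok := f[num]
	if !ok {
		return BlockID{}, errNotFound
	}
	return BlockID{Number: num, Hash: hash}, nil
}

type fakeSource struct{ safe, finalized L2Ref }

func (s fakeSource) SafeAndFinalized() (L2Ref, L2Ref, error) { return s.safe, s.finalized, nil }

type fakeEngine struct {
	synced   bool
	unsafe   L2Ref
	resets   []string
	followed []L2Ref
}

func (e *fakeEngine) InitialELSyncDone() bool { return e.synced }
func (e *fakeEngine) UnsafeHead() L2Ref       { return e.unsafe }
func (e *fakeEngine) Reset(reason string)     { e.resets = append(e.resets, reason) }
func (e *fakeEngine) FollowSource(s, f L2Ref) { e.followed = append(e.followed, s, f) }

func main() {
	head := L2Ref{BlockID: BlockID{Number: 100, Hash: "0xl2"}, L1Origin: BlockID{Number: 10, Hash: "0xa"}}
	eng := &fakeEngine{synced: true, unsafe: head}
	// L1 block 10 now has a different hash, so this tick requests a reset.
	_ = followUpstream(fakeL1{10: "0xb"}, nil, eng)
	fmt.Println("resets:", eng.resets)
}
```

Returning early after a reset keeps the two tasks ordered: the external safe/finalized state is only applied once the L1 sanity check has passed.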

`Engine Controller` Changes

FollowSource(eSafe, eFinalized) is implemented in the Engine Controller and implements Task A. The Engine Controller is responsible for updating its internal state based on the injected external state. It performs the mirroring in the following steps:

First, we check whether external safe > local unsafe. If so, we update the local unsafe head to the external safe and send an FCU to the EL, which may advance the safe and unsafe heads together. The underlying EL may still have been syncing to the previous local unsafe head, in which case we bump the EL sync target to the external safe. This covers the situation where CLP2P is down and op-node advances its unsafe head together with its safe head. This does no harm, since the safe head must take priority.
Otherwise, external safe <= local unsafe. In this case, we query the local EL by the external safe's block number. If the EL sync is complete, this block must be found: once the EL sync is complete, op-node's unsafe label and the EL's actual unsafe head match, and the queried number (the external safe's block number) is at or below the unsafe head.
- If the block at the external safe's number is not found, the EL is still EL syncing to the unsafe head. We do not interrupt it, i.e. we do NOT bump op-node's unsafe head. In most cases this EL sync completes shortly, because the follow source is only applied after the initial EL sync (the long EL sync that must not be interrupted) is complete.
- If the block is found, the EL has finished EL syncing, and the CL and EL are in sync.
We now have the block fetched from the local EL at the external safe's block number, so we can compare the external and local block refs. If they match, consolidate. If not, trigger a reorg and set both local safe and local unsafe to the external safe.
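
The decision logic in these steps can be condensed into a pure function. This is a sketch under assumptions, not the code in engine_controller.go: `BlockRef`, `Action`, `decideFollow`, and the `localByNumber` lookup are hypothetical names, the finalized head is left out, and the real controller performs the FCU and state updates rather than returning a label.

```go
package main

import "fmt"

// BlockRef is a simplified block reference (number and hash only).
type BlockRef struct {
	Number uint64
	Hash   string
}

// Action labels the outcome of one FollowSource decision.
type Action string

const (
	AdvanceUnsafe Action = "advance unsafe and safe to external safe"
	WaitForELSync Action = "wait: EL is still syncing to the unsafe head"
	Consolidate   Action = "consolidate: local block matches external safe"
	Reorg         Action = "reorg: set local safe and unsafe to external safe"
)

// decideFollow mirrors the steps above. localByNumber is a hypothetical lookup
// into the local EL; ok=false means the EL does not have that block yet.
func decideFollow(eSafe, localUnsafe BlockRef, localByNumber func(uint64) (BlockRef, bool)) Action {
	// External safe is ahead of everything we know locally.
	if eSafe.Number > localUnsafe.Number {
		return AdvanceUnsafe
	}
	// External safe <= local unsafe: the EL should already have this height.
	local, ok := localByNumber(eSafe.Number)
	if !ok {
		return WaitForELSync
	}
	if local.Hash == eSafe.Hash {
		return Consolidate
	}
	return Reorg
}

// chain builds a lookup over a small in-memory chain for the examples.
func chain(blocks ...BlockRef) func(uint64) (BlockRef, bool) {
	return func(num uint64) (BlockRef, bool) {
		for _, b := range blocks {
			if b.Number == num {
				return b, true
			}
		}
		return BlockRef{}, false
	}
}

func main() {
	localUnsafe := BlockRef{Number: 12, Hash: "0xc"}
	el := chain(BlockRef{Number: 10, Hash: "0xa"}, BlockRef{Number: 11, Hash: "0xb"}, localUnsafe)

	fmt.Println(decideFollow(BlockRef{Number: 15, Hash: "0xf"}, localUnsafe, el))
	fmt.Println(decideFollow(BlockRef{Number: 11, Hash: "0xb"}, localUnsafe, el))
	fmt.Println(decideFollow(BlockRef{Number: 11, Hash: "0xz"}, localUnsafe, el))
}
```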

Tests

Four tests are added for --l2.follow.source:

TestFollowL2_ReorgRecovery: checks that the sequencer and verifier reorg under follow source when L1 reorgs.
TestFollowL2_SafeAndFinalized: happy case; the external safe and finalized heads are mirrored when the local unsafe head is ahead of them.
TestFollowL2_WithoutCLP2P: CLP2P is down, so the follow source is the only data source for safe / finalized. Unsafe and safe advance together.
TestSyncTesterFollowL2ReachTips: uses the sync tester to check that op-node can mirror the safe / finalized heads of op-sepolia.

One test was added for --l2.unsafe-only:

TestUnsafeOnly_ReorgRecovery: checks that the unsafe-only sequencer and verifier reorg when L1 reorgs.

Additional context

Before this PR, when --l2.unsafe-only was enabled, the node did not reorg L2 when L1 reorged. We now always track L1 reorgs via the newly added upstreamSyncTicker and perform a proper L2 reorg via reset.

Metadata

Part of [M2] op-node: Light CL: Follow External Source #18317

op-node/rollup/engine/engine_controller.go

op-acceptance-tests/tests/sync/unsafe_only/sync_test.go

karlfloersch

Didn't have too much but some thoughts

I didn't think about the fact that we can test kona-node light CL with the same sync tester tests!
Tests seem good, testing reorgs, p2p enabled and disabled. Tried to but couldn't think of extra cases
Great that we can reuse the sync tester ext configs
The reorg detection logic is MUCH cleaner now. Do you feel good about it?

Generally looks pretty dang good! I'll throw an approval even tho I think a second pair of eyes is probably good. But the logic looked pretty solid to me

axelKingsley

Testing is very comprehensive and for the most part I like the way the solution looks!

The main thing to solve is our ongoing reliance on an L1 source. Removing the L1 source from the node altogether is a huge payoff for a lite-node, and even though they're not deriving with it, they are still fully reliant on a highly available L1 source.

Rather, they should be able to utilize their L2 source to supply all derivation aspects of the Safe chain, which would fully eliminate the L1 connection.

How easy do you think it would be to eliminate the L1 connection through this feature?

EDIT: I see that this PR introduces a CL based approach which eliminates the L1: #18500

But @pcw109550 what is the reason for starting with EL based following only? Seems like a feature we would not want to use since it still requires L1.

Actually -- is there a reason we don't just say that sync_status is the required API to do safe following? This would enable safe following on basically all CLs by default.

axelKingsley · 2025-12-04T20:45:15Z

op-acceptance-tests/tests/sync/follow_l2/init_test.go

+		presets.WithReqRespSyncDisabled(),
+		presets.WithNoDiscovery(),
+		presets.WithCompatibleTypes(compat.SysGo),
+		presets.WithUnsafeOnly(),


Does WithUnsafeOnly only modify one of the two Verifiers, then?

WithUnsafeOnly modifies every op-node CL to use unsafe-only mode.

func WithUnsafeOnly() stack.CommonOption {
	return stack.MakeCommon(
		sysgo.WithGlobalL2CLOption(sysgo.L2CLOptionFn(
			func(_ devtest.P, id stack.L2CLNodeID, cfg *sysgo.L2CLConfig) {
				cfg.SequencerUnsafeOnly = true
				cfg.VerifierUnsafeOnly = true
			})))
}

This is a global CL option and applies to every CL. However, if no CL performs derivation, there is no safe source. Therefore we need at least one CL that performs actual derivation.

To make this preset, I implemented DefaultSingleChainTwoVerifiersFollowL2System at

optimism/op-devstack/sysgo/system_singlechain_twoverifiers.go

Line 50 in 2202d0e

func DefaultSingleChainTwoVerifiersFollowL2System(dest *DefaultSingleChainTwoVerifiersSystemIDs) stack.Option[*Orchestrator] {

with the first verifier as

optimism/op-devstack/sysgo/system_singlechain_twoverifiers.go

Lines 71 to 73 in 2202d0e

	// Specific options are applied after global options
	// this means unsafeOnly is always disabled for the first verifier
	opt.Add(WithL2CLNode(ids.L2CLB, ids.L1CL, ids.L1EL, ids.L2ELB, L2CLVerifierDisableUnsafeOnly()))

disabling unsafe-only mode so that it performs derivation. This works because the global option (WithUnsafeOnly) is applied first and the node-specific option (L2CLVerifierDisableUnsafeOnly()) is applied afterwards.
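
The ordering argument can be illustrated with a stripped-down option pipeline. This is only a sketch of the "global first, node-specific last" idea: `L2CLConfig` here is a one-field stand-in, and `Option`, `applyOptions`, `withUnsafeOnly`, and `disableUnsafeOnly` are hypothetical names rather than the op-devstack API.

```go
package main

import "fmt"

// L2CLConfig is a trimmed-down stand-in for the devstack CL config.
type L2CLConfig struct {
	VerifierUnsafeOnly bool
}

// Option mutates a node's config; hypothetical simplification of the devstack option types.
type Option func(cfg *L2CLConfig)

// applyOptions applies global options first and node-specific options last,
// so a node-specific option overrides a global one that touches the same field.
func applyOptions(global, specific []Option) L2CLConfig {
	var cfg L2CLConfig
	for _, opt := range global {
		opt(&cfg)
	}
	for _, opt := range specific {
		opt(&cfg)
	}
	return cfg
}

func withUnsafeOnly() Option {
	return func(cfg *L2CLConfig) { cfg.VerifierUnsafeOnly = true }
}

func disableUnsafeOnly() Option {
	return func(cfg *L2CLConfig) { cfg.VerifierUnsafeOnly = false }
}

func main() {
	global := []Option{withUnsafeOnly()}

	// The first verifier opts out and keeps deriving; the second inherits the global setting.
	deriving := applyOptions(global, []Option{disableUnsafeOnly()})
	following := applyOptions(global, nil)

	fmt.Println("first verifier unsafe-only:", deriving.VerifierUnsafeOnly)
	fmt.Println("second verifier unsafe-only:", following.VerifierUnsafeOnly)
}
```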

axelKingsley · 2025-12-04T20:47:16Z

op-acceptance-tests/tests/sync/follow_l2/sync_test.go

+	// Make sure L1 reorged
+	sys.L1EL.WaitForBlockNumber(l1BlockBeforeReorg.Number)
+	l1BlockAfterReorg := sys.L1EL.BlockRefByNumber(l1BlockBeforeReorg.Number)
+	logger.Info("Triggered L1 reorg", "l1", l1BlockAfterReorg)
+	require.NotEqual(l1BlockAfterReorg.Hash, l1BlockBeforeReorg.Hash)
+
+	// Need to poll until the L2CL detects L1 Reorg and trigger L2 Reorg
+	// What happens:
+	//  L2CL detects L1 Reorg and reset the pipeline. op-node example logs: "reset: detected L1 reorg"
+	//  L2ELB detects L2 Reorg and reorgs. op-geth example logs: "Chain reorg detected"
+	sys.L2ELB.ReorgTriggered(l2BlockBeforeReorg, 30)


Can we verify that L2ELB does not process the reorg on its own somehow?

The L2ELB never progresses or reorgs independently; it is always driven by the L2CLB via the Engine API. So the L2ELB cannot reorg on its own.

ReorgTriggered only checks that the canonical block at the divergence height has changed (same parent, different hash). This can only happen if the CL has processed the L1 reorg and sent a forkchoice update with the new head to the EL.
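
The condition described here (same parent, different hash at the divergence height) can be sketched as a predicate plus a polling loop. This is an illustration only and not the devstack DSL: `Block`, `reorged`, and `waitForReorg` are hypothetical, and the `attempts` and `interval` parameters are assumptions rather than a statement about what the second argument of ReorgTriggered means.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Block is a simplified block reference.
type Block struct {
	Number     uint64
	Hash       string
	ParentHash string
}

// reorged reports whether the canonical block at the divergence height changed:
// same height and parent as before, but a different hash.
func reorged(before, current Block) bool {
	return current.Number == before.Number &&
		current.ParentHash == before.ParentHash &&
		current.Hash != before.Hash
}

// waitForReorg polls the canonical block at before.Number until it diverges
// or the attempts run out. byNumber is a hypothetical EL lookup.
func waitForReorg(before Block, attempts int, interval time.Duration, byNumber func(uint64) (Block, error)) error {
	for i := 0; i < attempts; i++ {
		current, err := byNumber(before.Number)
		if err == nil && reorged(before, current) {
			return nil
		}
		time.Sleep(interval)
	}
	return errors.New("no reorg observed at divergence height")
}

func main() {
	before := Block{Number: 7, Hash: "0xold", ParentHash: "0xp"}
	after := Block{Number: 7, Hash: "0xnew", ParentHash: "0xp"}

	// The fake EL keeps the old block for two polls, then switches to the new one.
	polls := 0
	el := func(uint64) (Block, error) {
		polls++
		if polls < 3 {
			return before, nil
		}
		return after, nil
	}
	fmt.Println("err:", waitForReorg(before, 30, time.Millisecond, el), "polls:", polls)
}
```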

axelKingsley · 2025-12-04T21:34:14Z

op-node/rollup/driver/driver.go

+	// In this mode, the node does not derive from L1; instead, it uses L1 as a mandatory
+	// upstream anchor for its unsafe head, and may optionally import safe/finalized state


There is a lot of infrastructure value to not having these lite nodes use L1 connections at all. I need to read more closely to understand, but this comment leads me to think the L1 connection remains required?

I suggest we don't need a mandatory anchor from the L1, and can instead use the Safe Source for the L1 as well.

Responded at #18365 (comment)

axelKingsley · 2025-12-04T21:35:35Z

op-node/node/node.go

Same comment as earlier - we should be able to follow the L1 source through the L2 Follow Source.

Responded at #18365 (comment)

axelKingsley · 2025-12-04T21:37:44Z

op-node/rollup/driver/follow_source.go

nit: Consider that the naming flipped between the individual and composite interface. I suggest UpstreamFollowSource for the composite.

Fixed at eab29d5

op-node/rollup/driver/driver.go

pcw109550 · 2025-12-05T13:20:01Z

Let me share my thought process on the relation between the L1 source and the light CL feature.

There are two features: --l2.unsafe-only and --l2.follow.source. Both must detect reorgs and follow the eventual canonical head.

As I mentioned in the PR description, L1 reorg detection is embedded in the derivation pipeline:

optimism/op-node/rollup/derive/l1_traversal.go

Lines 60 to 71 in 532d1e4

    
    func (l1t *L1Traversal) AdvanceL1Block(ctx context.Context) error {
    	origin := l1t.block
    	nextL1Origin, err := l1t.l1Blocks.L1BlockRefByNumber(ctx, origin.Number+1)
    	if errors.Is(err, ethereum.NotFound) {
    		l1t.log.Debug("can't find next L1 block info (yet)", "number", origin.Number+1, "origin", origin)
    		return io.EOF
    	} else if err != nil {
    		return NewTemporaryError(fmt.Errorf("failed to find L1 block info by number, at origin %s next %d: %w", origin, origin.Number+1, err))
    	}
    	if l1t.block.Hash != nextL1Origin.ParentHash {
    		return NewResetError(fmt.Errorf("detected L1 reorg from %s to %s with conflicting parent %s", l1t.block, nextL1Origin, nextL1Origin.ParentID()))
    	}

This means that if we turn off derivation (--l2.unsafe-only), we must trigger a manual reset so that an L1 reorg leads to an L2 reorg. This applies to both the verifier and the sequencer.

Case 1: When derivation disabled && follow source disabled.

When an L1 reorg happens but there is no L1 source, the sequencer and the verifier cannot detect it (there is no follow source either). This is why we need the L1 source for correctness.

> There is a lot of infrastructure value to not having these lite nodes use L1 connections at all. I need to read more closely to understand, but this comment leads me to think the L1 connection remains required?

So my take is that the L1 connection remains required for the light CL in Case 1.

Case 2: When derivation disabled && follow source enabled.

Because Case 1 already requires an L1 source that is independent of the follow source, we use the same reorg detection here, relying on the L1 source rather than the follow source.
Technically, we can rely on the follow source alone only if the source is the CL endpoint (syncStatus). This is because syncStatus contains HeadL1, so we can detect the reorg using the algorithm at https://github.com/ethereum-optimism/optimism/tree/develop/op-node#l1-reorg.
I chose to rely on the L1 source because:
- It always works, whether the follow source is disabled, is an EL, or is a CL.
- We can reuse the reorg detection algorithm introduced for Case 1.

Food for thought: if we only allow the combination [derivation disabled && follow source enabled && follow source is the CL endpoint: syncStatus], I agree that we can simplify.

pcw109550 · 2025-12-12T09:22:24Z

Because the follow-up PR #18571 merges the unsafe-only flag and the follow-source flag, it makes sense to consolidate these two PRs. Closing this PR in favor of #18571, and making the originally stacked PR target develop.

pcw109550 force-pushed the pcw109550/light-cl-unsafe-only branch from 10f056f to 59214ce Compare November 25, 2025 13:19

pcw109550 force-pushed the pcw109550/light-cl-follow-source branch from 54fbe8d to 12835bb Compare November 25, 2025 13:24

karlfloersch reviewed Nov 26, 2025

op-node/rollup/engine/engine_controller.go (resolved)

pcw109550 requested a review from axelKingsley November 30, 2025 13:44

pcw109550 marked this pull request as ready for review November 30, 2025 14:44

pcw109550 requested review from a team as code owners November 30, 2025 14:44

pcw109550 requested review from karlfloersch and sebastianst and removed request for a team November 30, 2025 14:44

Base automatically changed from pcw109550/light-cl-unsafe-only to develop December 1, 2025 18:27

karlfloersch reviewed Dec 2, 2025

op-acceptance-tests/tests/sync/unsafe_only/sync_test.go (outdated, resolved)

karlfloersch approved these changes Dec 2, 2025

pcw109550 removed the request for review from sebastianst December 2, 2025 11:15

pcw109550 force-pushed the pcw109550/light-cl-follow-source branch from fe7a184 to e91771e Compare December 2, 2025 11:18

pcw109550 mentioned this pull request Dec 4, 2025

op-node: Light CL: Follow Source using EL or CL #18500

Closed

axelKingsley reviewed Dec 4, 2025

almanax-ai bot reviewed Dec 5, 2025

op-node/rollup/driver/driver.go (resolved)

pcw109550 requested a review from axelKingsley December 5, 2025 16:21

pcw109550 added 7 commits December 11, 2025 20:43

wiring for l2.follow.source

e583288

Follow Safe without handling external safe > local unsafe

4a93d15

safe follow

b20020a

safe follow consider unsafe gap EL Sync

b6658b3

follow finalized

75b721b

Comments

85170b0

cleanup

94bcfef

pcw109550 added 23 commits December 11, 2025 20:43

drop unused interface methods

6ec3dc5

sanity check external finalized

84ee669

Adjust follow source log level

12be36d

op-devstack: Follow Source Support

555d1e9

op-acceptance-tests: Follow Source

6c2ec4c

op-devstack: Follow Source Support

6a8d5c2

simplify labeling

8910c6b

l1 reorg protection: reset

8052175

add reorg tc

d1b80cf

typo

8c70d91

follow source: check unsafe

c8397cc

convention

1c33795

Add unsafe only reorg test

28397c0

devstack/acceptance test : rename FollowSource to FollowL2

3979a79

follow upstream source enabled when derivation disabled, reorg detection

40fe3d7

fix unsafe only reorg sync test comments

f63d24a

rm unused interface method

4c68fde

dsl

05aa278

devstack support for ext sync test + follow l2

3d3539f

op-acceptance-tests: Follow L2 using Ext + SyncTester

20a5b2b

use blockref

2132fb3

fix log msg err

4bb9301

Fix composite interface naming

f17bedd

pcw109550 force-pushed the pcw109550/light-cl-follow-source branch from eab29d5 to f17bedd Compare December 11, 2025 11:44

pcw109550 mentioned this pull request Dec 11, 2025

op-node: Light CL: Always Follow Source using CL #18571

Merged

pcw109550 closed this Dec 12, 2025
