[PoC] op-node: Light CL by pcw109550 · Pull Request #18196 · ethereum-optimism/optimism

pcw109550 · 2025-11-06T10:14:35Z

Description

This PR tracks progress on the light CL work that originated in karlfloersch#6.

This PR is for experimentation and will not be merged as-is. Ideally it will be split into smaller chunks that are merged into develop incrementally.

Testing

Targeting op-sepolia:

sync-tester

./op-sync-tester --config config.yaml --rpc.port=8551 --log.level TRACE

op-node

./op-node \
  --l1="https://sepolia.infura.io/v3/<YOUR_INFURA_KEY>" \
  --l2="http://localhost:8551/chain/11155420/synctest/7b39fb24-4acf-4d45-8920-ab27b6a31ee4?latest=0&safe=0&finalized=0&el_sync_active=true" \
  --l2.jwt-secret="/tmp/jwt.txt" \
  --network op-sepolia \
  --rpc.addr=0.0.0.0 \
  --rpc.port=8547 \
  --l1.beacon="https://beacon-sepolia.ethpandaops.io" \
  --sequencer.enabled=false \
  --p2p.sync.req-resp=false \
  --syncmode=execution-layer \
  --log.level=DEBUG \
  --safe-source.l2-rpc="https://sepolia.optimism.io" \
  --safe-source=l2

Monitor

cast rpc optimism_syncStatus --rpc-url http://localhost:8547 | jq .unsafe_l2.number
cast bn latest --rpc-url https://sepolia.optimism.io
cast rpc optimism_syncStatus --rpc-url http://localhost:8547 | jq .safe_l2.number
cast bn safe --rpc-url https://sepolia.optimism.io
cast rpc optimism_syncStatus --rpc-url http://localhost:8547 | jq .finalized_l2.number
cast bn finalized --rpc-url https://sepolia.optimism.io
cast rpc sync_getSession --rpc-url http://localhost:8551/chain/11155420/synctest/7b39fb24-4acf-4d45-8920-ab27b6a31ee4 | jq

Add comprehensive implementation plan for safe-source=l2 fast follower mode:
- ~640 LOC total (200 core, 225 infrastructure, 215 tests)
- Reuses existing EngineClient infrastructure
- Skip derivation when using L2 safe source
- Query remote safe/finalized via eth_getBlockByNumber tags
- Includes sync tester test design similar to EL sync tests

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

Implement configuration foundation for safe-source=l2 feature:
- Add SafeSource enum (L1/L2) to sync.Config
- Add --safe-source and --safe-source.l2-rpc CLI flags
- Parse and validate safe source configuration
- Validate incompatible flag combinations (e.g., EL sync + L2 source)
- Warn users when trusting remote L2 node

Changes (~90 LOC):
- op-node/rollup/sync/config.go: Add SafeSource type and config fields
- op-node/flags/flags.go: Add CLI flags
- op-node/service.go: Parse and validate config

Tests: ✅ Build passes, packages compile successfully

Part of implementing fast follower mode where op-node can trust another L2 node's safe/finalized heads instead of deriving from L1.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
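The configuration step described above can be pictured roughly as follows. This is a minimal sketch, not the PR's code: it uses a string-backed SafeSource for brevity, StringToSafeSource is named after the helper mentioned in the unit-test commit, and CheckSafeSource together with its rule (an RPC endpoint is required when the L2 source is selected) is a hypothetical assumption.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// SafeSource names where the node takes its safe/finalized heads from.
type SafeSource string

const (
	SafeSourceL1 SafeSource = "l1" // derive safe/finalized heads from L1
	SafeSourceL2 SafeSource = "l2" // trust a remote L2 node's heads
)

// StringToSafeSource parses a flag value, ignoring case.
func StringToSafeSource(s string) (SafeSource, error) {
	switch v := SafeSource(strings.ToLower(s)); v {
	case SafeSourceL1, SafeSourceL2:
		return v, nil
	default:
		return "", fmt.Errorf("unknown safe source: %q", s)
	}
}

// CheckSafeSource is a hypothetical validation step. It assumes the remote
// RPC endpoint is mandatory whenever the L2 source is selected.
func CheckSafeSource(src SafeSource, l2RPC string) error {
	if src == SafeSourceL2 && l2RPC == "" {
		return errors.New("safe-source=l2 requires safe-source.l2-rpc")
	}
	return nil
}

func main() {
	src, err := StringToSafeSource("L2")
	fmt.Println(src, err)
	fmt.Println(CheckSafeSource(src, ""))
}
```

Keeping parsing and cross-flag validation in separate functions makes each rule easy to test on its own.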

Phase 2: Wire up remote L2 client initialization
- Initialize EngineClient for remote L2 when SafeSource=L2
- Add SafeSourceL2Client field to SyncDeriver and EngineController
- Pass client through driver initialization

Phase 3: Add derivation skip logic
- Skip derivation pipeline when using L2 safe source
- Similar to EL sync, but for safe-source=l2 mode

Phase 4: Query remote safe/finalized in forkchoice
- Add fetchRemoteL2BlockHash helper to query remote L2 blocks
- Modify insertUnsafePayload to use remote safe/finalized hashes
- Update ForkchoiceState with remote block hashes

Note: Full payload fetching with NewPayload will be added later. Currently just queries block hashes from remote L2.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

Improvements to Phase 4:
- Add SafeSourceL2Client interface with PayloadByHash/PayloadByLabel
- Implement fetchAndEnsureRemoteL2Block with full payload fetching
- Use L2BlockRefByNumber to check canonical chain (not just any block)
- Detect chain divergence and trigger reorg by setting unsafe head
- Call NewPayload to insert missing blocks into local EL
- Add L2BlockRefByNumber to ExecEngine interface

Key logic:
1. Query remote L2 for safe/finalized block ref
2. Check local EL by block number (canonical chain check)
3. If hash mismatch → reorg detected, set unsafe head to remote
4. If block missing → fetch payload and insert via NewPayload
5. Use remote hash in ForkchoiceUpdate

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
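Steps 2 to 4 of the key logic amount to a three-way decision per remote block. The sketch below illustrates that decision only. BlockRef, Action, decide, and the map standing in for the local EL's canonical chain are all hypothetical stand-ins rather than op-node types, and no engine calls are made.

```go
package main

import "fmt"

// BlockRef is a simplified stand-in for an L2 block reference.
type BlockRef struct {
	Number uint64
	Hash   string
}

// Action is what the node would do for a given remote block.
type Action string

const (
	ActionNone   Action = "none"   // local canonical block already matches
	ActionInsert Action = "insert" // block missing locally: fetch payload and insert
	ActionReorg  Action = "reorg"  // hash mismatch: local chain diverged
)

// decide looks up the remote block's number on the local canonical chain
// (a map here, standing in for a by-number query) and compares hashes.
func decide(localCanonical map[uint64]string, remote BlockRef) Action {
	localHash, ok := localCanonical[remote.Number]
	switch {
	case !ok:
		return ActionInsert
	case localHash != remote.Hash:
		return ActionReorg
	default:
		return ActionNone
	}
}

func main() {
	local := map[uint64]string{100: "0xaaa", 101: "0xbbb"}
	for _, remote := range []BlockRef{{100, "0xaaa"}, {101, "0xccc"}, {102, "0xddd"}} {
		fmt.Println(remote.Number, decide(local, remote))
	}
}
```

Looking up by number rather than by hash is what makes this a canonical-chain check: a block that exists locally only on a side chain is still reported as diverged.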

Phase 5: Unit tests covering:
1. SafeSource config type (op-node/rollup/sync/config_test.go):
   - Test StringToSafeSource parsing (l1, l2, uppercase, invalid)
   - Test SafeSource.String() output
   - Test SafeSource.Set() method
   - Test SafeSource.Clone() method
2. Engine controller safe-source=l2 logic (op-node/rollup/engine/engine_controller_test.go):
   - Test fetchAndEnsureRemoteL2Block when block already exists locally
   - Test fetchAndEnsureRemoteL2Block with chain divergence (triggers reorg)
   - Test fetchAndEnsureRemoteL2Block with missing block (fetches via NewPayload)
   - Mock implementations for SafeSourceL2Client

All tests pass and compilation succeeds.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

- Fix indentation in driver.go syncDeriver struct initialization
- Remove ResetStepBackoff call in safe-source=l2 block to allow unsafe head to progress at max speed. EL sync backoff is already handled by the IsEngineSyncing() check above.

Reset backoff to keep processing steps (especially TryUpdateEngine for unsafe blocks) frequently, similar to EL sync. The backoff mechanism still works for actual errors via TemporaryErrorEvent handlers.

Simplified comment to match the formatting of the IsEngineSyncing block.

Removes 157 lines of tests for boilerplate String/Set/Clone methods. The Mode type in the same file has identical methods without tests, and other config_test.go files test actual validation logic instead.

- Extend L2CLConfig to support SafeSource and SafeSourceL2RPC fields
- Modify WithOpNode to wire SafeSource config into sync.Config
- Add DefaultSingleChainMultiNodeWithSafeSourceL2System preset that configures L2CLB to use L2CL as safe source
- Add basic acceptance test verifying safe head matching

- Add DefaultSimpleSystemWithSyncTesterSafeSourceL2 preset that configures L2CL2 to use L2CL as safe source
- Add SimpleWithSyncTesterSafeSourceL2 preset wrapper
- Add sync_tester test verifying safe head matching with sync tester
- Test verifies L2CL2 follows L2CL's safe head via RPC without performing derivation

Required by devstack test framework to initialize orchestrator.

…sing blocks

Moved safe-source=l2 fetching logic to run in tryUpdateEngineInternal before needFCUCall check, ensuring it executes periodically regardless of block insertion success. This fixes a chicken-and-egg problem where safe-source couldn't run after EL reset because blocks couldn't insert without safe-source data.

Fixed error handling in fetchAndEnsureRemoteL2BlockWithRef to treat any error from L2BlockRefByNumber as "block not found" rather than fatal. The EL returns various error types when blocks are missing, not just ethereum.NotFound.

These changes enable safe-source=l2 to recover when the EL has missing blocks, which is critical for the sync tester scenario and real-world data loss recovery.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

Moved safe-source=l2 sync logic from tryUpdateEngineInternal to a dedicated safeSourceL2Ticker in the driver event loop. This provides:
- Cleaner separation of concerns
- More intuitive code organization
- Ticker fires every 4 seconds regardless of chain progress
- All-or-nothing update pattern for safe and finalized heads
- Only executes when SafeSource == SafeSourceL2

Changes:
- Made FetchAndEnsureRemoteL2BlockWithRef public in engine_controller
- Removed safe-source=l2 logic from tryUpdateEngineInternal
- Added safeSourceL2Ticker to driver.go event loop
- Ticker polls remote L2 and updates both safe/finalized atomically

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
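The ticker-driven, all-or-nothing pattern can be sketched as below. Heads, fetchHeads, runTicker, and errRemote are hypothetical names for illustration, and the fetch callback stands in for whatever RPC the node actually uses. Only the 4-second interval is taken from the commit message.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// pollInterval mirrors the 4-second ticker mentioned in the commit message.
const pollInterval = 4 * time.Second

// errRemote is a placeholder error used by the demo below.
var errRemote = errors.New("remote unavailable")

// Heads holds the pair of block numbers that must be updated together.
type Heads struct{ Safe, Finalized uint64 }

// fetchHeads queries both labels and fails if either query fails, so the
// caller never applies a half-updated pair.
func fetchHeads(fetch func(label string) (uint64, error)) (Heads, error) {
	fin, err := fetch("finalized")
	if err != nil {
		return Heads{}, fmt.Errorf("fetch finalized: %w", err)
	}
	safe, err := fetch("safe")
	if err != nil {
		return Heads{}, fmt.Errorf("fetch safe: %w", err)
	}
	return Heads{Safe: safe, Finalized: fin}, nil
}

// runTicker polls on every tick until ctx is cancelled. A failed fetch is
// skipped, leaving the previously applied heads untouched.
func runTicker(ctx context.Context, interval time.Duration,
	fetch func(string) (uint64, error), apply func(Heads)) {
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-t.C:
			if h, err := fetchHeads(fetch); err == nil {
				apply(h)
			}
		}
	}
}

func main() {
	failing := func(label string) (uint64, error) {
		if label == "safe" {
			return 0, errRemote
		}
		return 90, nil
	}
	_, err := fetchHeads(failing)
	fmt.Println("update skipped:", err != nil)
}
```

A caller would start the loop with something like `go runTicker(ctx, pollInterval, fetch, apply)`. Because the ticker is independent of block insertion, it keeps firing even when the chain is not progressing.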

Consistent with SyncStep behavior, skip safeSourceL2Ticker updates when the execution engine is syncing to avoid unnecessary FCU calls.

Renamed test setup files to use 'safesourcel2' naming for better clarity and consistency with --safe-source=l2 flag.

- Remove ResetStepBackoff from safe-source=l2 block to allow unsafe head to progress at max speed via P2P gossip
- Change "follower node" to "L2 sourced node" for better terminology
- Remove SetUnsafeHead call during reorg to maintain safe <= unsafe invariant (ticker handles setting all heads atomically)
- Improve logging for diverged block insertions

Fetch finalized block first, then safe block to ensure we maintain the invariant finalized <= safe <= unsafe when inserting blocks into the local EL during sync.
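The invariant named above can be written down as a small check. This is an illustrative sketch: checkHeadOrder is a hypothetical helper that operates on bare block numbers, not a function from the PR.

```go
package main

import "fmt"

// checkHeadOrder returns an error when the heads violate
// finalized <= safe <= unsafe, comparing block numbers only.
func checkHeadOrder(finalized, safe, unsafeHead uint64) error {
	if finalized > safe {
		return fmt.Errorf("finalized %d is ahead of safe %d", finalized, safe)
	}
	if safe > unsafeHead {
		return fmt.Errorf("safe %d is ahead of unsafe %d", safe, unsafeHead)
	}
	return nil
}

func main() {
	fmt.Println(checkHeadOrder(90, 95, 100)) // ordered heads
	fmt.Println(checkHeadOrder(90, 95, 93))  // safe ahead of unsafe
}
```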

- Extract safeSourceL2Ticker logic into syncSafeHeadFromL2 function
- Simplify SafeSource from int to string type, reducing boilerplate from 53 to 31 lines
- Update flags to use hardcoded options string

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

- Fetch and update unsafe head from remote L2 to maintain finalized <= safe <= unsafe invariant
- Rename FetchAndEnsureRemoteL2BlockWithRef -> FetchAndInsertRemotePayloadIfMissing for clarity
- Update comments to better describe payload insertion behavior

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

…andling

Address PR review feedback:
- Rename FetchAndEnsureRemoteL2BlockWithRef to FetchAndInsertRemotePayloadIfMissing
- Remove unsafe head fetching from remote L2 (unsafe comes from P2P gossip)
- Add reorg detection: return bool indicating if local chain diverged at safe block
- Trigger unsafe head reorg when local EL has different hash at safe block number
- Simplify error handling with early returns instead of complex multi-error check

The reorg logic now correctly:
1. Inserts any missing payloads
2. Checks if local EL block hash matches remote safe block hash
3. Sets unsafe head to safe head when divergence detected (triggers reorg)

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

Remove NewPayload calls from safe-source=l2 mode and rely on EL sync triggered by forkchoice updates instead.

Changes:
- Simplified syncSafeHeadFromL2 to only fetch block refs (not payloads)
- Check if safe block is on canonical chain by comparing hashes
- Set unsafe head to safe head when block is missing or diverged
- Added FetchRemoteL2BlockRef helper for lightweight block ref fetching
- Added L2BlockRefByNumber wrapper for canonical chain checks

This approach is cleaner and allows EL sync to handle payload insertion naturally through the forkchoice update mechanism.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
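Under this approach the node only has to choose which heads to put into the next forkchoice update. The sketch below shows one way to express that choice. Forkchoice, nextForkchoice, and the map standing in for the local canonical chain are hypothetical, and the actual engine calls are left out.

```go
package main

import "fmt"

// BlockRef is a simplified stand-in for an L2 block reference.
type BlockRef struct {
	Number uint64
	Hash   string
}

// Forkchoice is the triple of heads that would be sent to the local EL.
type Forkchoice struct {
	Head, Safe, Finalized BlockRef
}

// nextForkchoice keeps the current unsafe head when the remote safe block is
// on the local canonical chain. When that block is missing or has a different
// hash, the head is moved to the remote safe block so that the EL can sync
// toward it.
func nextForkchoice(localCanonical map[uint64]string,
	unsafeHead, remoteSafe, remoteFinalized BlockRef) Forkchoice {
	head := unsafeHead
	if h, ok := localCanonical[remoteSafe.Number]; !ok || h != remoteSafe.Hash {
		head = remoteSafe
	}
	return Forkchoice{Head: head, Safe: remoteSafe, Finalized: remoteFinalized}
}

func main() {
	local := map[uint64]string{95: "0xsafe", 100: "0xhead"}
	unsafeHead := BlockRef{Number: 100, Hash: "0xhead"}
	finalized := BlockRef{Number: 90, Hash: "0xfin"}

	same := nextForkchoice(local, unsafeHead, BlockRef{Number: 95, Hash: "0xsafe"}, finalized)
	diverged := nextForkchoice(local, unsafeHead, BlockRef{Number: 95, Hash: "0xother"}, finalized)
	fmt.Println(same.Head.Number, diverged.Head.Number)
}
```

In the diverged case the head equals the safe block, so the safe <= unsafe ordering still holds in the update that is sent.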

…nesis

When safe-source=l2 is enabled, the startup engine reset was causing the first ForkchoiceUpdate to go to genesis instead of the remote L2's safe/finalized heads. This happened because:
1. Driver.Start() immediately triggered an engine reset
2. The reset called FindL2Heads() which read the local EL's current state
3. On a fresh EL, this defaulted to genesis for safe/finalized heads
4. The reset applied these genesis heads and sent FCU(genesis)
5. This caused snap sync to complete immediately at genesis

The fix removes the startup engine reset entirely. The syncSafeHeadFromL2 function, which runs on a 4-second ticker, will handle initialization by fetching the remote safe/finalized heads and setting them as the first forkchoice state. This ensures the first FCU uses the correct remote heads instead of genesis.

Tested with op-acceptance-tests:
- tests/safe_source_l2: PASS
- tests/sync_tester/sync_tester_safesourcel2: PASS

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

pcw109550 · 2025-11-12T05:31:21Z

+		// However, we don't reset the step backoff here because we still want the
+		// unsafe head to progress at max speed via P2P gossip.


We may need a step backoff here after all; if so, this comment must be adjusted to match.

pcw109550 · 2025-11-12T06:04:27Z

+		p2pOpt1.Deploy(orch)
+		p2pOpt1.AfterDeploy(orch)
+
+		p2pOpt2 := WithL2ELP2PConnection(ids.L2EL, ids.L2ELB)


The existing tests may not actually exercise EL sync: they all start at genesis and advance from block 1, so every block can be appended directly on top of genesis. We need a sync tester case that syncs against a chain that has already progressed.

pcw109550 · 2025-11-12T08:49:34Z

 		// there is no need to request L2 blocks when we are syncing already.
 		if head := s.SyncDeriver.Engine.UnsafeL2Head(); head != lastUnsafeL2 || !s.SyncDeriver.Derivation.DerivationReady() {
 			lastUnsafeL2 = head
 			altSyncTicker.Reset(syncCheckInterval)


If derivation is off, !s.SyncDeriver.Derivation.DerivationReady() is always true, so altSyncTicker is reset on every pass and RR (req-resp) sync is effectively disabled. As a result, gaps in the unsafe chain are never filled. This must be mitigated.
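One possible mitigation, shown here only as a sketch and not necessarily what this PR ends up doing, is to consult derivation readiness only when derivation is actually enabled. deferAltSync and its derivationEnabled parameter are hypothetical. The first and third arguments correspond to the two operands of the condition quoted above.

```go
package main

import "fmt"

// deferAltSync reports whether the alt-sync check should be postponed
// (that is, whether the ticker should be reset).
//   - headChanged: the unsafe head moved since the last check
//   - derivationEnabled: hypothetical flag, false when derivation is skipped
//   - derivationReady: the derivation pipeline reports it is ready
func deferAltSync(headChanged, derivationEnabled, derivationReady bool) bool {
	if headChanged {
		return true
	}
	// Readiness only matters when derivation is running at all; otherwise
	// "not ready" would postpone alt-sync forever.
	return derivationEnabled && !derivationReady
}

func main() {
	// Derivation off and the head is stalled: alt-sync is allowed to run.
	fmt.Println(deferAltSync(false, false, false))
	// Derivation on but not ready: alt-sync is postponed, as before.
	fmt.Println(deferAltSync(false, true, false))
}
```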

pcw109550 · 2025-11-12T14:03:59Z

+	logger.Info("### Safe  ", "ver", sys.L2CL2.SafeL2BlockRef(), "seq", sys.L2CL.SafeL2BlockRef())
+	logger.Info("### Unsafe", "ver", sys.L2CL2.UnsafeHead(), "seq", sys.L2CL.UnsafeHead())
+	logger.Info("### Finalized", "ver", sys.L2CL2.SyncStatus().FinalizedL2, "seq", sys.L2CL.SyncStatus().FinalizedL2)
+


The EL never advanced because the underlying sync tester EL did not have EL sync enabled. Fixed.

opgitgovernance · 2025-11-29T16:45:46Z

This pr has been automatically marked as stale and will be closed in 5 days if no updates

pcw109550 added the M-do-not-merge Meta: Do not merge label Nov 6, 2025

opsuperchain and others added 29 commits November 7, 2025 18:19

fix: address PR review comments

ee1789f

fix: add ResetStepBackoff back to safe-source=l2 block

47d6dcd

style: match EL sync comment style for safe-source=l2 block

6eacf92

test: remove config_test.go for SafeSource

b370957

test: add init_test.go for safe_source_l2

86c3c9c

docs: remove implementation planning document

e0daad4

fix: skip safe-source=l2 updates when EL is syncing

5022f92

refactor: rename safesource to safesourcel2 for consistency

e2c9d92

fix: fetch finalized before safe to maintain invariant

3221aa4

Fix unit test?

9541f7f

Rm unused method and unit tests

633779a

Deflake

8be9545

linter happy

8590515

safe source with sequencer

5342a5d

blast every logs

fbecf30

pcw109550 force-pushed the pcw109550/karlfloersch/light-node-poc branch from d1dbfd2 to fbecf30 Compare November 7, 2025 09:19

disable derivation + seq works?

dc42477

pcw109550 commented Nov 7, 2025

Comment thread op-node/rollup/driver/sync_deriver.go

pcw109550 added 7 commits November 11, 2025 23:12

Do not FCU to genesis

2ffc9bd

revert back

f958ffd

Fix seq + light node?

8a9f55e

edge case

b73e69e

try another approach

89c60af

linter

6811966

logs

bda342f