Add comprehensive implementation plan for safe-source=l2 fast follower mode:

- ~640 LOC total (200 core, 225 infrastructure, 215 tests)
- Reuses existing EngineClient infrastructure
- Skip derivation when using L2 safe source
- Query remote safe/finalized via eth_getBlockByNumber tags
- Includes sync tester test design similar to EL sync tests

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
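For reference, querying safe/finalized "via eth_getBlockByNumber tags" means passing a block tag in place of a block number. Below is a minimal sketch of the request body only. The `rpcRequest` type and `blockByTagRequest` helper are hypothetical names for illustration; the PR itself reuses EngineClient rather than building requests by hand.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcRequest is a JSON-RPC 2.0 request envelope (hypothetical helper type).
type rpcRequest struct {
	JSONRPC string `json:"jsonrpc"`
	ID      int    `json:"id"`
	Method  string `json:"method"`
	Params  []any  `json:"params"`
}

// blockByTagRequest builds an eth_getBlockByNumber request for a block tag
// such as "safe" or "finalized". The second parameter is false, so the
// response would carry transaction hashes rather than full transactions.
func blockByTagRequest(id int, tag string) ([]byte, error) {
	return json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "eth_getBlockByNumber",
		Params:  []any{tag, false},
	})
}

func main() {
	// Print one request body per tag; nothing is sent over the network.
	for i, tag := range []string{"finalized", "safe"} {
		body, err := blockByTagRequest(i+1, tag)
		if err != nil {
			panic(err)
		}
		fmt.Println(string(body))
	}
}
```

The sketch only prints the payloads; sending them to the remote L2 RPC endpoint is left out.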
Implement configuration foundation for safe-source=l2 feature:

- Add SafeSource enum (L1/L2) to sync.Config
- Add --safe-source and --safe-source.l2-rpc CLI flags
- Parse and validate safe source configuration
- Validate incompatible flag combinations (e.g., EL sync + L2 source)
- Warn users when trusting remote L2 node

Changes (~90 LOC):

- op-node/rollup/sync/config.go: Add SafeSource type and config fields
- op-node/flags/flags.go: Add CLI flags
- op-node/service.go: Parse and validate config

Tests: ✅ Build passes, packages compile successfully

Part of implementing fast follower mode where op-node can trust another L2 node's safe/finalized heads instead of deriving from L1.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
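A rough sketch of what the parse-and-validate step could look like. This is not the PR's code: `parseSafeSource`, the `Config` fields, and the example URL are assumptions made for illustration, and the string-typed `SafeSource` follows the simplification described in a later commit.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// SafeSource selects where safe/finalized heads come from.
type SafeSource string

const (
	SafeSourceL1 SafeSource = "l1" // derive from L1 (default behaviour)
	SafeSourceL2 SafeSource = "l2" // trust a remote L2 node
)

// parseSafeSource accepts "l1" or "l2" in any letter case (hypothetical helper).
func parseSafeSource(s string) (SafeSource, error) {
	switch v := SafeSource(strings.ToLower(s)); v {
	case SafeSourceL1, SafeSourceL2:
		return v, nil
	default:
		return "", fmt.Errorf("unknown safe source %q", s)
	}
}

// Config is a simplified stand-in for the relevant sync settings.
type Config struct {
	SafeSource      SafeSource
	SafeSourceL2RPC string // assumed to back --safe-source.l2-rpc
	ELSync          bool   // assumed flag for execution-layer sync mode
}

// Validate rejects combinations that cannot work together.
func (c Config) Validate() error {
	if c.SafeSource != SafeSourceL2 {
		return nil
	}
	if c.SafeSourceL2RPC == "" {
		return errors.New("safe-source=l2 requires an L2 RPC endpoint")
	}
	if c.ELSync {
		return errors.New("safe-source=l2 is incompatible with EL sync")
	}
	return nil
}

func main() {
	src, err := parseSafeSource("L2")
	if err != nil {
		panic(err)
	}
	// The endpoint is a placeholder, not a real deployment.
	cfg := Config{SafeSource: src, SafeSourceL2RPC: "http://localhost:8545", ELSync: true}
	fmt.Println(cfg.Validate())
}
```

Keeping validation in one method means the service layer can fail fast at startup, before any client is dialed.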
Phase 2: Wire up remote L2 client initialization

- Initialize EngineClient for remote L2 when SafeSource=L2
- Add SafeSourceL2Client field to SyncDeriver and EngineController
- Pass client through driver initialization

Phase 3: Add derivation skip logic

- Skip derivation pipeline when using L2 safe source
- Similar to EL sync, but for safe-source=l2 mode

Phase 4: Query remote safe/finalized in forkchoice

- Add fetchRemoteL2BlockHash helper to query remote L2 blocks
- Modify insertUnsafePayload to use remote safe/finalized hashes
- Update ForkchoiceState with remote block hashes

Note: Full payload fetching with NewPayload will be added later. Currently just queries block hashes from remote L2.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Improvements to Phase 4:

- Add SafeSourceL2Client interface with PayloadByHash/PayloadByLabel
- Implement fetchAndEnsureRemoteL2Block with full payload fetching
- Use L2BlockRefByNumber to check canonical chain (not just any block)
- Detect chain divergence and trigger reorg by setting unsafe head
- Call NewPayload to insert missing blocks into local EL
- Add L2BlockRefByNumber to ExecEngine interface

Key logic:

1. Query remote L2 for safe/finalized block ref
2. Check local EL by block number (canonical chain check)
3. If hash mismatch → reorg detected, set unsafe head to remote
4. If block missing → fetch payload and insert via NewPayload
5. Use remote hash in ForkchoiceUpdate

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
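The decision in steps 2 to 4 of the key logic can be sketched as a pure function over the remote ref and a local canonical lookup. The `BlockRef`, `Action`, and `classify` names are hypothetical simplifications; the real code works against the engine and client interfaces named in the commit.

```go
package main

import "fmt"

// BlockRef identifies a block by number and hash (simplified, hypothetical type).
type BlockRef struct {
	Number uint64
	Hash   string
}

// Action is what the caller should do before using the remote hash in forkchoice.
type Action int

const (
	UseAsIs       Action = iota // local canonical block matches the remote ref
	InsertPayload               // local EL has no block at that height
	Reorg                       // local canonical block has a different hash
)

func (a Action) String() string {
	switch a {
	case UseAsIs:
		return "use-as-is"
	case InsertPayload:
		return "insert-payload"
	case Reorg:
		return "reorg"
	default:
		return "unknown"
	}
}

// classify compares the remote ref with the local canonical chain.
// localCanonical returns the hash of the local canonical block at a height,
// or false if the local EL has no block there.
func classify(remote BlockRef, localCanonical func(uint64) (string, bool)) Action {
	hash, ok := localCanonical(remote.Number)
	switch {
	case !ok:
		return InsertPayload
	case hash != remote.Hash:
		return Reorg
	default:
		return UseAsIs
	}
}

func main() {
	local := map[uint64]string{10: "0xaaa", 11: "0xbbb"}
	lookup := func(n uint64) (string, bool) { h, ok := local[n]; return h, ok }
	for _, remote := range []BlockRef{{10, "0xaaa"}, {11, "0xccc"}, {12, "0xddd"}} {
		fmt.Printf("block %d: %s\n", remote.Number, classify(remote, lookup))
	}
}
```

Looking up by number rather than by hash is what makes this a canonical chain check: a block that exists locally only on a side chain would still be classified as a divergence.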
Phase 5: Unit tests covering:

1. SafeSource config type (op-node/rollup/sync/config_test.go):
   - Test StringToSafeSource parsing (l1, l2, uppercase, invalid)
   - Test SafeSource.String() output
   - Test SafeSource.Set() method
   - Test SafeSource.Clone() method
2. Engine controller safe-source=l2 logic (op-node/rollup/engine/engine_controller_test.go):
   - Test fetchAndEnsureRemoteL2Block when block already exists locally
   - Test fetchAndEnsureRemoteL2Block with chain divergence (triggers reorg)
   - Test fetchAndEnsureRemoteL2Block with missing block (fetches via NewPayload)
   - Mock implementations for SafeSourceL2Client

All tests pass and compilation succeeds.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Fix indentation in driver.go syncDeriver struct initialization
- Remove ResetStepBackoff call in safe-source=l2 block to allow unsafe head to progress at max speed. EL sync backoff is already handled by the IsEngineSyncing() check above.
Reset backoff to keep processing steps (especially TryUpdateEngine for unsafe blocks) frequently, similar to EL sync. The backoff mechanism still works for actual errors via TemporaryErrorEvent handlers.
Simplified comment to match the formatting of the IsEngineSyncing block.
Removes 157 lines of tests for boilerplate String/Set/Clone methods. The Mode type in the same file has identical methods without tests, and other config_test.go files test actual validation logic instead.
- Extend L2CLConfig to support SafeSource and SafeSourceL2RPC fields
- Modify WithOpNode to wire SafeSource config into sync.Config
- Add DefaultSingleChainMultiNodeWithSafeSourceL2System preset that configures L2CLB to use L2CL as safe source
- Add basic acceptance test verifying safe head matching
- Add DefaultSimpleSystemWithSyncTesterSafeSourceL2 preset that configures L2CL2 to use L2CL as safe source
- Add SimpleWithSyncTesterSafeSourceL2 preset wrapper
- Add sync_tester test verifying safe head matching with sync tester
- Test verifies L2CL2 follows L2CL's safe head via RPC without performing derivation
Required by devstack test framework to initialize orchestrator.
…sing blocks

Moved safe-source=l2 fetching logic to run in tryUpdateEngineInternal before needFCUCall check, ensuring it executes periodically regardless of block insertion success. This fixes a chicken-and-egg problem where safe-source couldn't run after EL reset because blocks couldn't insert without safe-source data.

Fixed error handling in fetchAndEnsureRemoteL2BlockWithRef to treat any error from L2BlockRefByNumber as "block not found" rather than fatal. The EL returns various error types when blocks are missing, not just ethereum.NotFound.

These changes enable safe-source=l2 to recover when the EL has missing blocks, which is critical for the sync tester scenario and real-world data loss recovery.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Moved safe-source=l2 sync logic from tryUpdateEngineInternal to a dedicated safeSourceL2Ticker in the driver event loop. This provides:

- Cleaner separation of concerns
- More intuitive code organization
- Ticker fires every 4 seconds regardless of chain progress
- All-or-nothing update pattern for safe and finalized heads
- Only executes when SafeSource == SafeSourceL2

Changes:

- Made FetchAndEnsureRemoteL2BlockWithRef public in engine_controller
- Removed safe-source=l2 logic from tryUpdateEngineInternal
- Added safeSourceL2Ticker to driver.go event loop
- Ticker polls remote L2 and updates both safe/finalized atomically

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Consistent with SyncStep behavior, skip safeSourceL2Ticker updates when the execution engine is syncing to avoid unnecessary FCU calls.
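The ticker behaviour described in the two commits above (periodic poll, all-or-nothing update, skip while the engine is syncing) can be sketched roughly as follows. All names here (`heads`, `fetchHeads`, `poll`, `errRemoteUnavailable`) are hypothetical, and the callbacks stand in for the remote L2 client and the engine controller.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errRemoteUnavailable is a sentinel for a failed remote query (hypothetical).
var errRemoteUnavailable = errors.New("remote L2 unavailable")

// heads holds the block numbers fetched from the remote L2 in one round.
type heads struct{ Finalized, Safe uint64 }

// fetchHeads is all-or-nothing: if either lookup fails, nothing is returned.
func fetchHeads(fetch func(tag string) (uint64, error)) (heads, error) {
	fin, err := fetch("finalized")
	if err != nil {
		return heads{}, fmt.Errorf("finalized: %w", err)
	}
	safe, err := fetch("safe")
	if err != nil {
		return heads{}, fmt.Errorf("safe: %w", err)
	}
	return heads{Finalized: fin, Safe: safe}, nil
}

// poll runs one fetch per tick until ctx is cancelled. Ticks are skipped while
// the engine reports that it is syncing, and failed rounds apply nothing.
func poll(ctx context.Context, interval time.Duration, engineSyncing func() bool,
	fetch func(tag string) (uint64, error), apply func(heads)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			if engineSyncing() {
				continue
			}
			h, err := fetchHeads(fetch)
			if err != nil {
				continue // keep the previous heads and retry on the next tick
			}
			apply(h)
		}
	}
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	fetch := func(tag string) (uint64, error) {
		if tag == "finalized" {
			return 90, nil
		}
		return 100, nil
	}
	apply := func(h heads) {
		fmt.Printf("finalized=%d safe=%d\n", h.Finalized, h.Safe)
		cancel() // stop after the first successful round in this demo
	}
	// A short interval keeps the demo quick; the commit above uses 4 seconds.
	poll(ctx, 10*time.Millisecond, func() bool { return false }, fetch, apply)
}
```

Applying both heads from a single successful round avoids a state where a new safe head is paired with a stale finalized head from an earlier tick.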
Renamed test setup files to use 'safesourcel2' naming for better clarity and consistency with --safe-source=l2 flag.
- Remove ResetStepBackoff from safe-source=l2 block to allow unsafe head to progress at max speed via P2P gossip
- Change "follower node" to "L2 sourced node" for better terminology
- Remove SetUnsafeHead call during reorg to maintain safe <= unsafe invariant (ticker handles setting all heads atomically)
- Improve logging for diverged block insertions
Fetch finalized block first, then safe block to ensure we maintain the invariant finalized <= safe <= unsafe when inserting blocks into the local EL during sync.
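As an illustration of the invariant mentioned above, here is a small check on block numbers. The `Heads` type and `Validate` method are hypothetical; a real check would also need to confirm that the three blocks lie on the same chain, which block numbers alone cannot show.

```go
package main

import "fmt"

// Heads holds the block numbers of the three forkchoice heads (simplified).
type Heads struct {
	Finalized, Safe, Unsafe uint64
}

// Validate reports whether finalized <= safe <= unsafe holds by block number.
func (h Heads) Validate() error {
	if h.Finalized > h.Safe {
		return fmt.Errorf("finalized %d is ahead of safe %d", h.Finalized, h.Safe)
	}
	if h.Safe > h.Unsafe {
		return fmt.Errorf("safe %d is ahead of unsafe %d", h.Safe, h.Unsafe)
	}
	return nil
}

func main() {
	fmt.Println(Heads{Finalized: 90, Safe: 100, Unsafe: 120}.Validate())
	fmt.Println(Heads{Finalized: 90, Safe: 130, Unsafe: 120}.Validate())
}
```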
- Extract safeSourceL2Ticker logic into syncSafeHeadFromL2 function
- Simplify SafeSource from int to string type, reducing boilerplate from 53 to 31 lines
- Update flags to use hardcoded options string

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Fetch and update unsafe head from remote L2 to maintain finalized <= safe <= unsafe invariant
- Rename FetchAndEnsureRemoteL2BlockWithRef -> FetchAndInsertRemotePayloadIfMissing for clarity
- Update comments to better describe payload insertion behavior

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…andling

Address PR review feedback:

- Rename FetchAndEnsureRemoteL2BlockWithRef to FetchAndInsertRemotePayloadIfMissing
- Remove unsafe head fetching from remote L2 (unsafe comes from P2P gossip)
- Add reorg detection: return bool indicating if local chain diverged at safe block
- Trigger unsafe head reorg when local EL has different hash at safe block number
- Simplify error handling with early returns instead of complex multi-error check

The reorg logic now correctly:

1. Inserts any missing payloads
2. Checks if local EL block hash matches remote safe block hash
3. Sets unsafe head to safe head when divergence detected (triggers reorg)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Remove NewPayload calls from safe-source=l2 mode and rely on EL sync triggered by forkchoice updates instead.

Changes:

- Simplified syncSafeHeadFromL2 to only fetch block refs (not payloads)
- Check if safe block is on canonical chain by comparing hashes
- Set unsafe head to safe head when block is missing or diverged
- Added FetchRemoteL2BlockRef helper for lightweight block ref fetching
- Added L2BlockRefByNumber wrapper for canonical chain checks

This approach is cleaner and allows EL sync to handle payload insertion naturally through the forkchoice update mechanism.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…nesis

When safe-source=l2 is enabled, the startup engine reset was causing the first ForkchoiceUpdate to go to genesis instead of the remote L2's safe/finalized heads. This happened because:

1. Driver.Start() immediately triggered an engine reset
2. The reset called FindL2Heads() which read the local EL's current state
3. On a fresh EL, this defaulted to genesis for safe/finalized heads
4. The reset applied these genesis heads and sent FCU(genesis)
5. This caused snap sync to complete immediately at genesis

The fix removes the startup engine reset entirely. The syncSafeHeadFromL2 function, which runs on a 4-second ticker, will handle initialization by fetching the remote safe/finalized heads and setting them as the first forkchoice state. This ensures the first FCU uses the correct remote heads instead of genesis.

Tested with op-acceptance-tests:

- tests/safe_source_l2: PASS
- tests/sync_tester/sync_tester_safesourcel2: PASS

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
d1dbfd2 to fbecf30
// However, we don't reset the step backoff here because we still want the
// unsafe head to progress at max speed via P2P gossip.
We may need a step backoff here; if so, the comment must be adjusted to match.
p2pOpt1.Deploy(orch)
p2pOpt1.AfterDeploy(orch)

p2pOpt2 := WithL2ELP2PConnection(ids.L2EL, ids.L2ELB)
The existing tests may not benefit from EL sync because they all start at genesis and advance blocks starting from 1, and those blocks are directly appendable to genesis. We need a sync tester case that tests syncing from a chain that has already progressed.
// there is no need to request L2 blocks when we are syncing already.
if head := s.SyncDeriver.Engine.UnsafeL2Head(); head != lastUnsafeL2 || !s.SyncDeriver.Derivation.DerivationReady() {
	lastUnsafeL2 = head
	altSyncTicker.Reset(syncCheckInterval)
If derivation is off, !s.SyncDeriver.Derivation.DerivationReady() will always be true, so the ticker is reset on every pass and RR Sync is effectively disabled. This means the unsafe gap is never filled. This must be mitigated.
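To make the concern concrete, the condition from the quoted diff can be reduced to two booleans. The sketch below also shows one possible guard; it is only an assumption about how this could be mitigated, not the fix adopted in this PR, and `derivationEnabled` is a hypothetical input.

```go
package main

import "fmt"

// deferAltSync mirrors the quoted condition: the alt-sync ticker is reset
// whenever the unsafe head moved or derivation is not ready.
func deferAltSync(headChanged, derivationReady bool) bool {
	return headChanged || !derivationReady
}

// deferAltSyncGuarded is a hypothetical variant that only considers
// derivation readiness when derivation is actually enabled.
func deferAltSyncGuarded(headChanged, derivationReady, derivationEnabled bool) bool {
	return headChanged || (derivationEnabled && !derivationReady)
}

func main() {
	// With derivation off, readiness stays false, so the original condition
	// defers alt sync even when the unsafe head has not moved.
	fmt.Println("original:", deferAltSync(false, false))
	fmt.Println("guarded: ", deferAltSyncGuarded(false, false, false))
}
```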
logger.Info("### Safe ", "ver", sys.L2CL2.SafeL2BlockRef(), "seq", sys.L2CL.SafeL2BlockRef())
logger.Info("### Unsafe", "ver", sys.L2CL2.UnsafeHead(), "seq", sys.L2CL.UnsafeHead())
logger.Info("### Finalzed", "ver", sys.L2CL2.SyncStatus().FinalizedL2, "seq", sys.L2CL.SyncStatus().FinalizedL2)
The EL never advanced because the underlying sync tester EL did not have EL sync enabled. Fixed.
This pr has been automatically marked as stale and will be closed in 5 days if no updates
Description
This PR tracks progress on the light CL work that originated from karlfloersch#6.
The PR will not be merged; it exists for experimentation. Ideally it will be divided into chunks and incrementally merged into develop.
Testing
Targeting op-sepolia:
sync-tester
op-node
Monitor