This repository was archived by the owner on Jan 16, 2026. It is now read-only.

fix(node/engine): audit engine for bugs. rename forkchoice task -> synchronize task. move block building logic to build task #2524

Merged
theochap merged 1 commit into main from theo/ensure-engine-paths-match
Jul 25, 2025

Conversation

@theochap
Member

Description

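Attempt at cleaning up the engine and restoring the behavior from before the refactor in https://github.com/op-rs/kona/pull/2388/files.
In particular:

  • Split up the forkchoice task into separate build and synchronize logic. The synchronize logic is now closer to the pre-#2388 state.
  • Restore the build FCU method in the build task. Ensure that the side effects match the pre-#2388 state.
  • Audit the other tasks and ensure that any composition follows the side effects from #2388.
  • Remove the `Forkchoice` task from the `EngineTask`s.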

Copilot AI review requested due to automatic review settings July 22, 2025 23:56
@theochap theochap self-assigned this Jul 22, 2025
@theochap theochap added K-feature Kind: feature A-node Area: cl node (eq. Go op-node) handles single-chain consensus A-engine Area: engine K-fix Kind: fix and removed K-feature Kind: feature labels Jul 22, 2025
@theochap theochap changed the title feat(node/engine): audit engine for bugs. rename forkchoice task -> synchronize task. move block building logic to build task fix(node/engine): audit engine for bugs. rename forkchoice task -> synchronize task. move block building logic to build task Jul 22, 2025
@theochap theochap moved this to In Review in Project Tracking Jul 22, 2025
Contributor

Copilot AI left a comment

Pull Request Overview

This PR refactors the engine task system by renaming and restructuring the forkchoice update logic. The main purpose is to split forkchoice responsibilities into separate synchronization and build tasks, restoring behavior from pre-refactor state while cleaning up the engine architecture.

  • Renames ForkchoiceTask to SynchronizeTask and removes it from the main EngineTask enum
  • Moves block building logic from forkchoice task into the BuildTask with a new start_build method
  • Updates all task references and error handling to use the new task structure

Reviewed Changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| docs/docs/pages/node/design/engine.mdx | Updates documentation to reflect ForkchoiceTask rename to SynchronizeTask |
| crates/node/engine/src/task_queue/tasks/task.rs | Removes ForkchoiceTask from EngineTask enum and associated error handling |
| crates/node/engine/src/task_queue/tasks/synchronize/task.rs | Creates new SynchronizeTask implementation for forkchoice updates without attributes |
| crates/node/engine/src/task_queue/tasks/build/task.rs | Adds start_build method to handle forkchoice updates with payload attributes |
| Multiple error files | Updates error type references from ForkchoiceTaskError to SynchronizeTaskError |
| Multiple task files | Updates task instantiation calls to use SynchronizeTask instead of ForkchoiceTask |

Comment on lines +276 to +273
Err(InsertTaskError::UnexpectedPayloadStatus(e))
if self.attributes.is_deposits_only() =>
{
error!(target: "engine_builder", error = ?e, "Critical: Deposit-only payload import failed");
return Err(BuildTaskError::DepositOnlyPayloadFailed)
}
// HOLOCENE: Re-attempt payload import with deposits only
Err(InsertTaskError::ForkchoiceUpdateFailed(
ForkchoiceTaskError::InvalidPayloadStatus(e),
)) if self
.cfg
.is_holocene_active(self.attributes.inner().payload_attributes.timestamp) =>
Err(InsertTaskError::UnexpectedPayloadStatus(e))
Copilot AI Jul 22, 2025

The error pattern matching is inconsistent with the error type being matched. The code is matching InsertTaskError::UnexpectedPayloadStatus(e) but based on the context and other files, this should likely be matching a different error variant that actually exists in the InsertTaskError enum.

)) if self
.cfg
.is_holocene_active(self.attributes.inner().payload_attributes.timestamp) =>
Err(InsertTaskError::UnexpectedPayloadStatus(e))
Copilot AI Jul 22, 2025

Same issue as line 276 - the error pattern matching appears to be using a non-existent error variant UnexpectedPayloadStatus on the InsertTaskError enum.

@theochap theochap added this to the [kona-node] Phase 5: Alpha milestone Jul 22, 2025
@codecov
codecov bot commented Jul 22, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.1%. Comparing base (0d43738) to head (eac943c).
⚠️ Report is 1 commits behind head on main.
✅ All tests successful. No failed tests found.


- if let Err(err) = ForkchoiceTask::new(
+ // Retry to synchronize the engine until we succeeds or a critical error occurs.
+ while let Err(err) = SynchronizeTask::new(
Member Author

Motivation: the synchronize task can only fail in the following cases:

  • RPC error (temporary error) -> we should retry to reset
  • finalized ahead of unsafe (critical error) -> should never happen if find_starting_forkchoice is correct
  • unexpected payload status (temporary error) -> this should never happen when we don't provide payload attributes. The synchronize task should only return Valid or EngineSyncing status
  • invalid forkchoice state -> try to reset again

If we don't have a while loop here:

  • we will keep adding tasks with the former engine state
  • this will cause the engine to reset because the tasks will return invalid forkchoice state
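
The retry policy described above can be sketched roughly as follows. This is a simplified, synchronous illustration and not the code merged in this PR: `Severity`, `SyncError`, and `synchronize_until_ok` are hypothetical stand-ins, and the severity mapping follows the bullets in this comment (invalid forkchoice state is treated as retryable here).

```rust
/// Hypothetical severity levels, named after the ones quoted in this thread.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Severity {
    Temporary,
    Critical,
}

/// Hypothetical failure modes of a single synchronize attempt.
#[derive(Debug, Clone, PartialEq, Eq)]
enum SyncError {
    Rpc,
    FinalizedAheadOfUnsafe,
    UnexpectedPayloadStatus,
    InvalidForkchoiceState,
}

impl SyncError {
    /// Assumed mapping: only "finalized ahead of unsafe" is critical.
    fn severity(&self) -> Severity {
        match self {
            SyncError::FinalizedAheadOfUnsafe => Severity::Critical,
            SyncError::Rpc
            | SyncError::UnexpectedPayloadStatus
            | SyncError::InvalidForkchoiceState => Severity::Temporary,
        }
    }
}

/// Calls `attempt` until it succeeds or returns a critical error.
/// On success, returns how many attempts were made.
fn synchronize_until_ok<F>(mut attempt: F) -> Result<u32, SyncError>
where
    F: FnMut() -> Result<(), SyncError>,
{
    let mut attempts = 0;
    loop {
        attempts += 1;
        match attempt() {
            Ok(()) => return Ok(attempts),
            // A critical error ends the loop and is surfaced to the caller.
            Err(err) if err.severity() == Severity::Critical => return Err(err),
            // Anything else is retried with a fresh attempt.
            Err(_) => continue,
        }
    }
}
```

The loop in this sketch is unbounded on purpose, mirroring the `while let Err(..)` in the diff; a real implementation would presumably await between attempts.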

  /// The engine is syncing.
- #[error("Attempting to update forkchoice state while EL syncing")]
+ #[error("The engine is syncing")]
  EngineSyncing,

}
Self::EngineBuildError(EngineBuildError::EngineSyncing) => {
EngineTaskErrorSeverity::Temporary
}
Member Author

Note here that:

}
Self::EngineBuildError(EngineBuildError::UnexpectedPayloadStatus(_)) => {
EngineTaskErrorSeverity::Temporary
}
Member Author

To check: I think that this should instead be

Suggested change
}
Self::EngineBuildError(EngineBuildError::UnexpectedPayloadStatus(_)) => {
EngineTaskErrorSeverity::Critical
}

The forkchoice_updated should never return the Accepted variant

EngineTaskErrorSeverity::Temporary
}
Self::EngineBuildError(EngineBuildError::InvalidPayload(_)) => {
EngineTaskErrorSeverity::Temporary
Member Author

This also should probably be

Suggested change
- EngineTaskErrorSeverity::Temporary
+ EngineTaskErrorSeverity::Critical
  • trying to insert an invalid payload should never happen

Self::NewPayloadFailed(_) => EngineTaskErrorSeverity::Temporary,
Self::HoloceneInvalidFlush => EngineTaskErrorSeverity::Flush,
Self::MissingPayloadId => EngineTaskErrorSeverity::Critical,
Self::UnexpectedPayloadStatus(_) => EngineTaskErrorSeverity::Critical,
Member Author

Some of these variants were either unused or outdated

update
.payload_id
.ok_or(BuildTaskError::EngineBuildError(EngineBuildError::MissingPayloadId))
}

- )) if self.attributes.is_deposits_only() => {
+ Err(InsertTaskError::UnexpectedPayloadStatus(e))
+ if self.attributes.is_deposits_only() =>
+ {
Member Author

This was a bug in the refactor. The error we match on should be the one raised when we first call new_payload inside the InsertTask. See https://github.com/op-rs/kona/pull/2388/files#diff-47bc5e44b8483f9c829c8829f56e6f5b6d9211888f53f7afe36b44c185515ccbR169-R220 for the former build code, and the following for its counterpart inside the InsertTask:

// Check the `engine_newPayload` response.
let response = match response {
    Ok(resp) => resp,
    Err(e) => return Err(InsertTaskError::InsertFailed(e)),
};
if !self.check_new_payload_status(&response.status) {
    return Err(InsertTaskError::UnexpectedPayloadStatus(response.status));
}
let insert_duration = insert_time_start.elapsed();
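
The match arms under discussion amount to a small decision table over the insert error. The sketch below is only an illustration of that table under stated assumptions: `InsertError`, `NextStep`, and `next_step` are hypothetical names, and the `Propagate` fallback for every other case is an assumption, since the thread does not show what the build task does there.

```rust
/// Hypothetical mirror of the two insert errors quoted in this thread.
#[derive(Debug, Clone, PartialEq, Eq)]
enum InsertError {
    /// The `engine_newPayload` call itself failed.
    InsertFailed(String),
    /// `engine_newPayload` answered with a status the task does not accept.
    UnexpectedPayloadStatus(String),
}

/// What the build task does next (names are illustrative only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum NextStep {
    /// Deposits-only import was rejected: treated as a critical failure.
    FailDepositsOnly,
    /// Holocene is active: re-attempt the import with deposits only.
    RetryDepositsOnly,
    /// Assumption: anything else is returned to the caller unchanged.
    Propagate,
}

/// Arm order matters: the deposits-only check comes first, so a failed
/// deposits-only payload is never retried again.
fn next_step(err: &InsertError, deposits_only: bool, holocene_active: bool) -> NextStep {
    match err {
        InsertError::UnexpectedPayloadStatus(_) if deposits_only => NextStep::FailDepositsOnly,
        InsertError::UnexpectedPayloadStatus(_) if holocene_active => NextStep::RetryDepositsOnly,
        _ => NextStep::Propagate,
    }
}
```

The point of the fix is visible in the first two arms: both guard on the status error raised by the new_payload step, not on a forkchoice-update error.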

@@ -134,7 +134,6 @@ impl EngineTaskExt for InsertTask {
safe_head: self.is_payload_safe.then_some(new_unsafe_ref),
..Default::default()
},

s => {
// Other codes are not expected.
Err(SynchronizeTaskError::UnexpectedPayloadStatus(s.clone()))
}
Member Author

I think in this case we should return a Critical error. The forkchoice_updated method shouldn't ever return any of the other status variants if we're not providing payloads.

Contributor

It should only be critical if the error is actually invalid. Accepted is fine. We should be very explicit about what is a critical error.

Member Author

See https://github.com/alloy-rs/alloy/blob/ff758f86622c343457c4fe856a88033819f11d6e/crates/rpc-types-engine/src/payload.rs#L1807-L1830.
I don't think it is dramatic if we use Temporary instead of Critical here, although the error is actually invalid in that case, because the client should never return Accepted inside the forkchoice_update method.

Member Author

Although I am also a bit unsure about returning a critical error here, because we might also be affected by any change to the downstream API (if alloy ever decides to return Accepted inside forkchoice_update).
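
One way to be explicit about which statuses are critical, along the lines the reviewer suggests, is sketched below. It is one possible reading of this thread and not the merged code: `PayloadStatus` is a local stand-in for the status enum linked above, and `SyncOutcome` and `classify_fcu_status` are hypothetical names.

```rust
/// Local stand-in for the payload status returned by the engine.
#[derive(Debug, Clone, PartialEq, Eq)]
enum PayloadStatus {
    Valid,
    Invalid { validation_error: String },
    Syncing,
    Accepted,
}

/// Hypothetical severity levels, named after the ones quoted in this thread.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Severity {
    Temporary,
    Critical,
}

/// Result of a forkchoice update sent without payload attributes.
#[derive(Debug, Clone, PartialEq, Eq)]
enum SyncOutcome {
    /// The engine accepted the new forkchoice state.
    Updated,
    /// The engine is still syncing; the caller can try again later.
    EngineSyncing,
    /// Any other status, tagged with an explicit severity.
    Unexpected { status: PayloadStatus, severity: Severity },
}

/// Every variant is listed by name so that a new status cannot silently
/// inherit a severity through a catch-all arm.
fn classify_fcu_status(status: PayloadStatus) -> SyncOutcome {
    match status {
        PayloadStatus::Valid => SyncOutcome::Updated,
        PayloadStatus::Syncing => SyncOutcome::EngineSyncing,
        // Not expected from a forkchoice update, but tolerated as temporary.
        PayloadStatus::Accepted => SyncOutcome::Unexpected {
            status: PayloadStatus::Accepted,
            severity: Severity::Temporary,
        },
        // Only an actually invalid status is treated as critical.
        invalid @ PayloadStatus::Invalid { .. } => SyncOutcome::Unexpected {
            status: invalid,
            severity: Severity::Critical,
        },
    }
}
```

Listing the variants exhaustively trades a little verbosity for a compile error whenever the status enum gains a variant, which addresses the concern about API changes raised in this thread.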

Self::FinalizedAheadOfUnsafe(_, _) => EngineTaskErrorSeverity::Critical,
Self::UnexpectedPayloadStatus(_) => EngineTaskErrorSeverity::Critical,
Self::ForkchoiceUpdateFailed(_) => EngineTaskErrorSeverity::Temporary,
Self::UnexpectedPayloadStatus(_) => EngineTaskErrorSeverity::Temporary,
Member Author

See below for the UnexpectedPayloadStatus that should be Critical

@theochap theochap force-pushed the theo/ensure-engine-paths-match branch 2 times, most recently from 6877a48 to d8f2d59 Compare July 23, 2025 00:25
#[async_trait]
impl EngineTaskExt for SynchronizeTask {
Contributor

I'm not sure I like how we're still defining this operation as a task.

To me a task is an isolated unit of work that's enqueued, but this is something that is composed within other tasks.

Maybe these operations should be refactored into methods on the EngineState and EngineClient and just inlined. This is definitely worth thinking more about, because it is incredibly confusing for someone to look at the task abstraction and then see this synchronize task not being enqueued like the other tasks, but rather being composed.

Member Author

I agree, I think we can work on that in a follow-up PR. I would do as you said and just add methods on a Synchronize struct and not implement the EngineTaskExt anymore
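
The follow-up proposed here could look roughly like the sketch below: a plain `Synchronize` helper with a method, called from inside a queued task instead of implementing `EngineTaskExt` itself. Everything in it is an assumption made for illustration. Heads are reduced to block numbers, no engine client is involved, and `StateUpdate`, `SynchronizeError`, and the simplified `InsertTask` are hypothetical.

```rust
/// Simplified engine state: each head is just a block number here.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
struct EngineState {
    unsafe_head: u64,
    safe_head: u64,
    finalized_head: u64,
}

/// Heads to overwrite; `None` keeps the current value.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
struct StateUpdate {
    unsafe_head: Option<u64>,
    safe_head: Option<u64>,
    finalized_head: Option<u64>,
}

#[derive(Debug, PartialEq, Eq)]
enum SynchronizeError {
    /// (finalized, unsafe) block numbers of the rejected state.
    FinalizedAheadOfUnsafe(u64, u64),
}

/// Plain helper: not a task, only a function that tasks can call.
struct Synchronize;

impl Synchronize {
    fn apply(state: &mut EngineState, update: StateUpdate) -> Result<(), SynchronizeError> {
        let next = EngineState {
            unsafe_head: update.unsafe_head.unwrap_or(state.unsafe_head),
            safe_head: update.safe_head.unwrap_or(state.safe_head),
            finalized_head: update.finalized_head.unwrap_or(state.finalized_head),
        };
        if next.finalized_head > next.unsafe_head {
            return Err(SynchronizeError::FinalizedAheadOfUnsafe(
                next.finalized_head,
                next.unsafe_head,
            ));
        }
        // A real implementation would send the forkchoice update to the
        // execution layer here and commit the state only on success.
        *state = next;
        Ok(())
    }
}

/// A queued task that composes the helper rather than another task.
struct InsertTask {
    block_number: u64,
    is_payload_safe: bool,
}

impl InsertTask {
    fn execute(&self, state: &mut EngineState) -> Result<(), SynchronizeError> {
        let update = StateUpdate {
            unsafe_head: Some(self.block_number),
            safe_head: self.is_payload_safe.then_some(self.block_number),
            ..Default::default()
        };
        Synchronize::apply(state, update)
    }
}
```

With this shape only units of work that are actually enqueued carry the task abstraction, which is the distinction the reviewer draws above.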

@theochap theochap force-pushed the theo/ensure-engine-paths-match branch 2 times, most recently from 78b2a92 to 8cabd01 Compare July 23, 2025 19:38
…ynchronize task. move block building logic to build task
@theochap theochap force-pushed the theo/ensure-engine-paths-match branch from 8cabd01 to eac943c Compare July 25, 2025 15:22
@theochap theochap enabled auto-merge July 25, 2025 15:24
@theochap theochap disabled auto-merge July 25, 2025 15:24
@theochap theochap enabled auto-merge July 25, 2025 15:32
@theochap theochap added this pull request to the merge queue Jul 25, 2025
Merged via the queue into main with commit c725985 Jul 25, 2025
30 checks passed
@theochap theochap deleted the theo/ensure-engine-paths-match branch July 25, 2025 15:51
@github-project-automation github-project-automation bot moved this from In Review to Done in Project Tracking Jul 25, 2025
theochap added a commit to ethereum-optimism/optimism that referenced this pull request Dec 10, 2025
…nchronize task. move block building logic to build task (op-rs/kona#2524)

## Description

Attempt at cleaning-up the engine and restoring the behavior from
pre-refactor https://github.com/op-rs/kona/pull/2388/files.
In particular:
- Split up the forkchoice task into separated build and synchronize
logic. The synchronize logic is now closer to the pre op-rs/kona#2388 state
- Restore the build FCU method in the build task. Ensure that the
side-effects match the pre-#2388 state
- Audit the other tasks and ensure the composition (if any), follow the
side-effects from op-rs/kona#2388
- Remove the `Forkchoice` task from the `EngineTask`s.
theochap added a commit to ethereum-optimism/optimism that referenced this pull request Jan 14, 2026

Labels

A-engine Area: engine A-node Area: cl node (eq. Go op-node) handles single-chain consensus K-fix Kind: fix

Projects

Status: Done

4 participants