Skip to content
This repository was archived by the owner on Jan 16, 2026. It is now read-only.

fix(engine): Prevent drain starvation#1853

Merged
clabby merged 1 commit intomainfrom
cl/prevent-drain-starvation
May 23, 2025
Merged

fix(engine): Prevent drain starvation#1853
clabby merged 1 commit intomainfrom
cl/prevent-drain-starvation

Conversation

@clabby
Copy link
Contributor

@clabby clabby commented May 23, 2025

Overview

Alternative version of #1851, which keeps the BinaryHeap priority queue and achieves the same goal of not starving the engine actor control loop.

@clabby clabby self-assigned this May 23, 2025
Copilot AI review requested due to automatic review settings May 23, 2025 21:55
@clabby clabby requested a review from refcell as a code owner May 23, 2025 21:55
@clabby clabby added the K-fix Kind: fix label May 23, 2025
@clabby clabby requested review from emhane and theochap as code owners May 23, 2025 21:55
@clabby clabby added W-node Workstream: kona-node A-engine Area: engine labels May 23, 2025
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a mechanism to prevent drain from starving the engine’s control loop by waiting for new tasks before invoking drain. Key changes include:

  • New TaskWaiter future and Engine::wait_for_tasks() to await incoming tasks.
  • Updated NodeActor loop in engine.rs to use wait_for_tasks() instead of unconditional drain.
  • Exposed TaskWaiter in public APIs and adjusted re-exports.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
crates/node/service/src/actors/engine.rs Replaced the final unconditional drain() branch with a wait_for_tasks() select branch, moving drain into that branch.
crates/node/engine/src/task_queue/mod.rs Re-exported TaskWaiter alongside existing types.
crates/node/engine/src/task_queue/core.rs Added TaskWaiter type, Engine::wait_for_tasks() method, and its Future implementation.
crates/node/engine/src/lib.rs Updated top-level exports to include TaskWaiter.
Comments suppressed due to low confidence (2)

crates/node/engine/src/task_queue/core.rs:155

  • The doc comment states it resolves after “more than one task” arrives, but the implementation actually fires when any task exists. Update the comment to accurately reflect that it resolves when the queue becomes non-empty.
/// A [`Future`] that resolves once more than one task has arrived in the inner [`Engine`] task queue.

crates/node/engine/src/task_queue/core.rs:128

  • There are no tests for TaskWaiter to verify its wake behavior. Consider adding unit tests that enqueue tasks and assert TaskWaiter completes only when expected.
pub const fn wait_for_tasks(&self) -> TaskWaiter<'_> {

@codecov
Copy link

codecov bot commented May 23, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.7%. Comparing base (39eb2b5) to head (6297d6c).
Report is 1 commits behind head on main.

✅ All tests successful. No failed tests found.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@clabby clabby force-pushed the cl/prevent-drain-starvation branch 2 times, most recently from a50825b to 0403b70 Compare May 23, 2025 22:31
chore(engine): Move `drain` above `select`
@clabby clabby force-pushed the cl/prevent-drain-starvation branch from 0403b70 to 6297d6c Compare May 23, 2025 22:38
@clabby clabby enabled auto-merge May 23, 2025 22:52
@clabby clabby added this pull request to the merge queue May 23, 2025
Merged via the queue into main with commit f435ea2 May 23, 2025
22 checks passed
@clabby clabby deleted the cl/prevent-drain-starvation branch May 23, 2025 23:13
@github-project-automation github-project-automation bot moved this from In Review to Done in Project Tracking May 23, 2025
github-merge-queue bot pushed a commit that referenced this pull request May 25, 2025
## Description

This PR adds a test to ensure that nodes sync properly the safe heads
and have a consistent view of the state.
For each kona-node, this test:
- Gathers the node's safe head gossip over 10sec
- Checks that all the other nodes that have seen all these safe heads
have a consistent view with the gossiped safe heads.

For now, we only have positive tests so we're lacking some coverage
(what happens with rogue nodes, network failures, reorgs, ...). These
cases will come once we adapt the integration testing suite to use
`in-memory` mode.

## Note

Adds back the real "large network configuration" to CI now that we have
fixed CPU usage of the kona nodes #1853

## Related issues

Close #1463
theochap pushed a commit to ethereum-optimism/optimism that referenced this pull request Dec 10, 2025
## Overview

Alternative version of op-rs/kona#1851, which keeps the `BinaryHeap` priority
queue and achieves the same goal of not starving the engine actor
control loop.
theochap added a commit to ethereum-optimism/optimism that referenced this pull request Dec 10, 2025
## Description

This PR adds a test to ensure that nodes sync properly the safe heads
and have a consistent view of the state.
For each kona-node, this test:
- Gathers the node's safe head gossip over 10sec
- Checks that all the other nodes that have seen all these safe heads
have a consistent view with the gossiped safe heads.

For now, we only have positive tests so we're lacking some coverage
(what happens with rogue nodes, network failures, reorgs, ...). These
cases will come once we adapt the integration testing suite to use
`in-memory` mode.

## Note

Adds back the real "large network configuration" to CI now that we have
fixed CPU usage of the kona nodes op-rs/kona#1853

## Related issues

Close op-rs/kona#1463
theochap pushed a commit to ethereum-optimism/optimism that referenced this pull request Jan 14, 2026
## Overview

Alternative version of #1851, which keeps the `BinaryHeap` priority
queue and achieves the same goal of not starving the engine actor
control loop.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

A-engine Area: engine K-fix Kind: fix W-node Workstream: kona-node

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

3 participants