
perf(l1): set high thread priority on block execution pipeline threads#6300

Closed
ilitteri wants to merge 2 commits into main from perf/thread-priority

Conversation

@ilitteri
Collaborator

@ilitteri ilitteri commented Mar 3, 2026

Motivation

The block execution pipeline (warmer, executor, merkleizer) runs on scoped threads that compete with background work (networking, syncing, RocksDB compaction) for CPU time. Setting high thread priority ensures these latency-critical threads get scheduled preferentially by the OS.

Description

  • Add thread-priority crate dependency
  • Add increase_thread_priority() helper that attempts max priority with a fallback to priority 62/100
  • Call it at the start of all three pipeline threads: block_executor_warmer, block_executor_execution, and block_executor_merkleizer
  • Failures are logged at debug level and silently ignored (non-critical optimization)
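
The two-step fallback described above can be sketched without the crate itself. This is a minimal illustration, not the PR's code: `Level` and the injected `set` closure are hypothetical stand-ins for the `thread-priority` types and the `set_for_current()` call, and `eprintln!` stands in for the `debug!` macro.

```rust
use std::fmt::Debug;

/// Hypothetical stand-in for the priority levels the helper tries.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Level {
    Max,
    /// A value on the 0..100 scale, e.g. 62.
    Moderate(u8),
}

/// Tries `Level::Max` first, then `Level::Moderate(62)`.
/// Returns the level that was applied, or `None` if both attempts failed.
/// Failures are logged and ignored; the function never panics.
fn increase_thread_priority_with<E: Debug>(
    mut set: impl FnMut(Level) -> Result<(), E>,
) -> Option<Level> {
    for level in [Level::Max, Level::Moderate(62)] {
        match set(level) {
            Ok(()) => return Some(level),
            // The PR logs at debug level; eprintln! is a stand-in here.
            Err(err) => eprintln!("failed to set thread priority {level:?}: {err:?}"),
        }
    }
    None
}
```

Passing the setter as a closure keeps the fallback order testable on machines where the real call would be rejected for lack of privileges.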

How to Test

  • cargo check passes
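  • Run block import and verify pipeline threads get elevated priority: ps -eLo pid,tid,cls,rtprio,ni,comm | grep block_executor (the -L flag lists individual threads; without it ps prints one line per process, so the thread names never appear)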
  • Benchmark block import throughput (expected: small improvement on loaded systems)
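
As a complement to the ps check, the nice value of each thread can also be read from inside the process on Linux. The sketch below assumes the proc(5) layout of `/proc/<pid>/task/<tid>/stat`, where nice is field 19; the function names are illustrative. Nice is only one indicator: a thread moved to a real-time scheduling policy would not show the change there.

```rust
/// Extracts the nice value (field 19 of a proc(5) `stat` line).
fn nice_from_stat(stat: &str) -> Option<i32> {
    // Field 2 (comm) is parenthesised and may contain spaces,
    // so split after the last ')'.
    let (_, rest) = stat.rsplit_once(')')?;
    // `rest` starts at field 3 (state), so field 19 sits at index 16.
    rest.split_whitespace().nth(16)?.parse().ok()
}

/// Lists (thread name, nice) for every thread of the current process.
/// Linux only; a thread that exits mid-scan surfaces as an I/O error.
fn thread_nice_values() -> std::io::Result<Vec<(String, i32)>> {
    let mut out = Vec::new();
    for entry in std::fs::read_dir("/proc/self/task")? {
        let dir = entry?.path();
        let name = std::fs::read_to_string(dir.join("comm"))?.trim().to_string();
        let stat = std::fs::read_to_string(dir.join("stat"))?;
        if let Some(nice) = nice_from_stat(&stat) {
            out.push((name, nice));
        }
    }
    Ok(out)
}
```

Note that Linux truncates thread names to 15 characters, so the three pipeline threads all appear as `block_executor_` in `comm`.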

ilitteri added 2 commits March 2, 2026 23:31
…hanges

The rebase onto main broke this branch because it had warmer-cancellation
code mixed in (cancelled AtomicBool, extra params to warm_block). Remove
those changes since they belong to perf/warmer-cancellation, and fix
tracing::debug! to use the already-imported debug! macro.
Copilot AI review requested due to automatic review settings March 3, 2026 21:25
@ilitteri ilitteri requested a review from a team as a code owner March 3, 2026 21:25
@github-actions github-actions Bot added the L1 (Ethereum client) and performance (Block execution throughput and performance in general) labels Mar 3, 2026
@github-actions

github-actions Bot commented Mar 3, 2026

🤖 Kimi Code Review

Review Summary

The PR introduces thread priority management for block processing threads. While the intent is reasonable (improving performance during sync), there are several issues that need addressing:

Critical Issues

  1. Unbounded expect() on line 233 - This will panic if the platform doesn't support the priority value range. Use unwrap_or_else with proper error handling.

  2. Silent failures - Both priority setting attempts only log debug messages on failure. This could lead to performance degradation without operators noticing. Consider warning-level logging.

  3. Cross-platform compatibility - ThreadPriority::Max may not be supported on all platforms. The fallback assumes a 0-100 range which isn't guaranteed.

Security Considerations

  • Denial of Service: Elevated thread priorities could starve other system processes. Consider making this configurable via CLI flags or config files.
  • Resource exhaustion: No validation of how many threads might have elevated priority simultaneously.

Suggested Changes

// blockchain.rs:225-240
fn set_block_processing_priority() -> Result<(), String> {
    use thread_priority::{ThreadPriority, ThreadPriorityValue};
    
    // Make priority configurable
    let priority = std::env::var("ETHREX_THREAD_PRIORITY")
        .ok()
        .and_then(|s| s.parse::<u8>().ok())
        .map(|p| ThreadPriority::Crossplatform(
            ThreadPriorityValue::try_from(p).unwrap_or(ThreadPriorityValue::MAX)
        ))
        .unwrap_or(ThreadPriority::Max);
    
    priority.set_for_current()
        .map_err(|e| format!("Failed to set thread priority: {}", e))
}

// Then use it with proper error handling:
if let Err(e) = set_block_processing_priority() {
    warn!("{}", e);
}

Additional Recommendations

  1. Add configuration: Expose thread priority as a configurable option rather than hardcoded values
  2. Platform detection: Check platform support before attempting priority changes
  3. Documentation: Add comments explaining why thread priority is being increased
  4. Metrics: Add metrics to track when priority changes succeed/fail

The current implementation is too aggressive and could cause issues in production environments.
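
One way the "make it configurable" recommendation could look, as a sketch only: the `ETHREX_THREAD_PRIORITY` variable comes from the suggestion above and is not an existing ethrex setting, and the 0..=99 range is assumed from the other reviews in this thread. Unlike the suggested snippet, this version rejects out-of-range input instead of silently substituting the maximum.

```rust
/// Parses an optional setting into a priority on the assumed 0..=99 scale.
/// `Ok(None)` means "not configured"; malformed input is an error.
fn parse_priority(raw: Option<&str>) -> Result<Option<u8>, String> {
    match raw.map(str::trim) {
        None | Some("") => Ok(None),
        Some(s) => match s.parse::<u8>() {
            Ok(p) if p <= 99 => Ok(Some(p)),
            Ok(p) => Err(format!("priority {p} is outside 0..=99")),
            Err(e) => Err(format!("invalid priority {s:?}: {e}")),
        },
    }
}

/// Reads the hypothetical `ETHREX_THREAD_PRIORITY` environment variable.
fn priority_from_env() -> Result<Option<u8>, String> {
    parse_priority(std::env::var("ETHREX_THREAD_PRIORITY").ok().as_deref())
}
```

Separating parsing from the environment lookup lets the validation be tested without mutating process-wide state.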


Automated review by Kimi (Moonshot AI) · custom prompt

@github-actions

github-actions Bot commented Mar 3, 2026

🤖 Codex Code Review

  1. Potential local DoS / scheduling starvation risk at crates/blockchain/blockchain.rs:228, crates/blockchain/blockchain.rs:431, crates/blockchain/blockchain.rs:452, crates/blockchain/blockchain.rs:483
    The PR sets ThreadPriority::Max on all three per-block worker threads (warmer, execution, merkleizer). During sync, this can run continuously on untrusted network input, potentially starving networking/consensus/IO threads and degrading node liveness on some hosts.
    Suggested change: make priority bump configurable/opt-in and avoid Max by default (use a moderate priority class), or move to a bounded worker pool where priority is set once and controlled.

  2. Avoid panic path in consensus-critical hot path at crates/blockchain/blockchain.rs:233
    ThreadPriorityValue::try_from(62u8).expect(...) can panic if crate/platform validation semantics ever differ. Panics in block-processing threads are undesirable.
    Suggested change: handle this fallibly (no expect) and log/skip priority bump on error.
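
A minimal sketch of the fallible pattern from point 2, written against a local stand-in type so it compiles on its own. `PriorityValue` is hypothetical and only mimics a validated 0..=99 value; `eprintln!` stands in for `debug!`.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for a validated priority value (0..=99).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct PriorityValue(u8);

impl TryFrom<u8> for PriorityValue {
    type Error = String;

    fn try_from(v: u8) -> Result<Self, Self::Error> {
        if v <= 99 {
            Ok(Self(v))
        } else {
            Err(format!("{v} is outside 0..=99"))
        }
    }
}

/// Builds the fallback value; logs and returns `None` instead of panicking.
fn fallback_priority(raw: u8) -> Option<PriorityValue> {
    match PriorityValue::try_from(raw) {
        Ok(value) => Some(value),
        Err(err) => {
            eprintln!("invalid fallback thread priority value: {err}");
            None
        }
    }
}
```

The caller can then skip the priority bump on `None`, which keeps the whole helper on a single log-and-continue path.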

No EVM opcode/gas/state-transition logic changed in this diff; concerns are operational robustness and security posture around thread scheduling.


Automated review by OpenAI Codex · custom prompt

@github-actions

github-actions Bot commented Mar 3, 2026

Lines of code report

Total lines added: 15
Total lines removed: 0
Total lines changed: 15

Detailed view
+----------------------------------------+-------+------+
| File                                   | Lines | Diff |
+----------------------------------------+-------+------+
| ethrex/crates/blockchain/blockchain.rs | 2185  | +15  |
+----------------------------------------+-------+------+

@greptile-apps

greptile-apps Bot commented Mar 3, 2026

Greptile Summary

This PR elevates OS thread priority for the three block execution pipeline threads (block_executor_warmer, block_executor_execution, block_executor_merkleizer) to reduce CPU scheduling latency relative to background work (networking, syncing, RocksDB compaction). It mirrors the approach taken in reth #22541.

Key changes:

  • Adds thread-priority = "3.0.0" as a workspace dependency and wires it into crates/blockchain.
  • Introduces a private increase_thread_priority() helper that first attempts ThreadPriority::Max and falls back to ThreadPriority::Crossplatform(62/100) on failure; all errors are logged at debug level and silently ignored.
  • Calls the helper at the very start of each of the three scoped pipeline thread closures.

One style note: The expect used to construct the fallback ThreadPriorityValue from the constant 62u8 is semantically inconsistent with the function's overall error-handling philosophy of silently ignoring all failures. Replacing it with a match that logs and returns would make the function uniformly non-panicking.

Confidence Score: 4/5

  • Safe to merge; the change is a non-critical best-effort optimisation and all failure paths are already handled gracefully.
  • The implementation is small, well-scoped, and follows an established pattern (reth). The only concern is a stylistic inconsistency where expect is used inside a function that otherwise silently absorbs all errors. Since 62u8 is a compile-time constant always within the valid range this will never panic in practice, but it is worth fixing for correctness-by-construction.
  • crates/blockchain/blockchain.rs — the expect in increase_thread_priority on line 233 should be replaced with graceful error handling consistent with the rest of the function.

Important Files Changed

| Filename | Overview |
|----------|----------|
| crates/blockchain/blockchain.rs | Adds increase_thread_priority() helper and calls it at the start of all three pipeline thread closures (warmer, execution, merkleizer). Logic is sound; minor style issue: expect on the fallback ThreadPriorityValue::try_from(62u8) is inconsistent with the function's design to silently ignore all failures. |
| Cargo.toml | Adds thread-priority = "3.0.0" to the workspace dependency manifest. Version is pinned; placement is appropriate. |
| crates/blockchain/Cargo.toml | Wires thread-priority.workspace = true into the blockchain crate's dependencies. No issues. |

Sequence Diagram

sequenceDiagram
    participant Caller as Blockchain::add_block
    participant Scope as thread::scope
    participant W as block_executor_warmer
    participant E as block_executor_execution
    participant M as block_executor_merkleizer
    participant OS as OS Scheduler

    Caller->>Scope: spawn scoped threads
    Scope->>W: spawn
    W->>OS: increase_thread_priority()<br/>(Max → fallback 62/100)
    OS-->>W: priority elevated (or silently ignored)
    W->>W: warm_block / warm_block_from_bal

    Scope->>E: spawn
    E->>OS: increase_thread_priority()<br/>(Max → fallback 62/100)
    OS-->>E: priority elevated (or silently ignored)
    E->>M: send merkleization requests (channel)
    E->>E: execute_block_pipeline + validations

    Scope->>M: spawn
    M->>OS: increase_thread_priority()<br/>(Max → fallback 62/100)
    OS-->>M: priority elevated (or silently ignored)
    M->>M: handle_merkleization / handle_merkleization_bal

    W-->>Scope: warmer_duration
    E-->>Scope: (execution_result, exec_end_instant)
    M-->>Scope: (account_updates_list, accumulated_updates, merkle_end_instant)
    Scope-->>Caller: (execution_result, merkleization_result, warmer_duration)
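
The shape in the diagram (three named scoped threads, with the executor feeding the merkleizer over a channel) can be sketched with the standard library alone. This is a toy model under stated assumptions: the bodies are placeholders, `run_pipeline` is an illustrative name, and the real closures would call increase_thread_priority() as their first statement.

```rust
use std::sync::mpsc;
use std::thread;

/// Toy pipeline: returns (items warmed, execution sum, requests merkleized).
fn run_pipeline(txs: &[u64]) -> std::io::Result<(usize, u64, usize)> {
    let (sender, receiver) = mpsc::channel::<u64>();
    thread::scope(|scope| {
        let warmer = thread::Builder::new()
            .name("block_executor_warmer".into())
            .spawn_scoped(scope, || txs.len())?; // placeholder for cache warming
        let executor = thread::Builder::new()
            .name("block_executor_execution".into())
            .spawn_scoped(scope, move || {
                let mut total = 0u64;
                for tx in txs {
                    total += *tx;
                    // Placeholder for a merkleization request.
                    let _ = sender.send(*tx);
                }
                // `sender` is dropped on return, which ends the merkleizer loop.
                total
            })?;
        let merkleizer = thread::Builder::new()
            .name("block_executor_merkleizer".into())
            .spawn_scoped(scope, move || receiver.iter().count())?;
        // Joining re-raises a worker panic in the caller.
        Ok((
            warmer.join().expect("warmer panicked"),
            executor.join().expect("executor panicked"),
            merkleizer.join().expect("merkleizer panicked"),
        ))
    })
}
```

Naming the threads through `thread::Builder` is what makes them identifiable in tools such as ps.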

Last reviewed commit: 3796e75

Comment on lines +232 to +237
        let fallback = ThreadPriority::Crossplatform(
            ThreadPriorityValue::try_from(62u8).expect("62 is within valid 0..100 range"),
        );
        if let Err(err) = fallback.set_for_current() {
            debug!(?err, "failed to set moderate thread priority");
        }

expect inconsistent with best-effort error-handling philosophy

increase_thread_priority is designed to silently absorb all failures (errors are logged at debug level and ignored), but the inner expect introduces a panic path. Although 62u8 is a compile-time constant that will always satisfy the 0..=99 range check, the use of expect is semantically inconsistent with the function's intent. If the fallback value were ever changed to an out-of-range constant, or if a future platform changes the valid range, this would panic in production code that is supposed to be non-critical.

Consider handling the try_from result the same way the rest of the function does:

Suggested change
-        let fallback = ThreadPriority::Crossplatform(
-            ThreadPriorityValue::try_from(62u8).expect("62 is within valid 0..100 range"),
-        );
-        if let Err(err) = fallback.set_for_current() {
-            debug!(?err, "failed to set moderate thread priority");
-        }
+        let fallback = match ThreadPriorityValue::try_from(62u8) {
+            Ok(val) => ThreadPriority::Crossplatform(val),
+            Err(err) => {
+                debug!(?err, "invalid fallback thread priority value");
+                return;
+            }
+        };
+        if let Err(err) = fallback.set_for_current() {
+            debug!(?err, "failed to set moderate thread priority");
+        }

@github-actions

github-actions Bot commented Mar 3, 2026

🤖 Claude Code Review

Here is the code review for PR #6300:


Review: perf(l1): set high thread priority on block execution pipeline threads

The approach is clean, minimal, and the graceful degradation design (try max → try moderate → silently ignore) is appropriate for a non-critical optimization. A few points worth flagging:


Correctness of ThreadPriorityValue::try_from(62u8)

crates/blockchain/blockchain.rs:233

The .expect("62 is within valid 0..100 range") is correct — ThreadPriorityValue accepts 0..=99, and 62 is always in range. Using .expect() on a compile-time constant is fine here and the comment justifies it clearly.


Privilege requirement will silently make this a no-op in most deployments

On Linux, both priority attempts require elevated privileges:

  • ThreadPriority::Max tries SCHED_FIFO/SCHED_RR at priority 99, which requires CAP_SYS_NICE or root.
  • ThreadPriority::Crossplatform(62) maps to approximately nice -5 (negative), which also requires CAP_SYS_NICE or an RLIMIT_NICE soft limit raised above its usual default of 0 (the lowest permitted nice value is 20 - rlim_cur).

On a typical production server running ethrex as an unprivileged user, both calls fail and the function is effectively a no-op. This is handled correctly (debug log + ignore), but it's worth noting in the PR description or a code comment that the optimization only takes effect when running with CAP_SYS_NICE (e.g., in Docker via --cap-add=SYS_NICE, or via systemd's AmbientCapabilities=CAP_SYS_NICE). Otherwise users enabling this expecting an improvement won't see one without knowing why.
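
The RLIMIT_NICE rule can be made concrete with a small model. This sketch encodes the getrlimit(2) formula (nice ceiling = 20 - rlim_cur) for a thread without CAP_SYS_NICE; it performs no system calls, `may_set_nice` is an illustrative name, and the nice -5 figure used in the example is the approximation quoted in this review.

```rust
/// Models whether an unprivileged Linux thread may move from nice `current`
/// to nice `target`, given the RLIMIT_NICE soft limit `rlimit_nice`.
fn may_set_nice(current: i32, target: i32, rlimit_nice: u64) -> bool {
    // Raising the nice value (lowering priority) is always permitted.
    if target >= current {
        return true;
    }
    // getrlimit(2): the lowest reachable nice value is 20 - rlim_cur,
    // with a useful limit range of 1..=40.
    let floor = 20 - rlimit_nice.min(40) as i32;
    target >= floor
}
```

For example, `may_set_nice(0, -5, 0)` is false under the default limit, while a limit of 25 is just enough to allow nice -5.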


Inconsistent coverage: worker threads inside the pipeline are not elevated

  • crates/blockchain/blockchain.rs:572-575: block_executor_merkleization_shard_worker_{i} (16 threads)
  • crates/blockchain/blockchain.rs:860-863: bal_storage_worker_{worker_id}
  • crates/blockchain/blockchain.rs:965-967: bal_state_shard_{index}

Thread priority is not inherited by child threads on Linux. These worker threads are spawned from within the pipeline threads that do call increase_thread_priority(), but they themselves run at normal priority. For example, the 16 merkleization shard workers do actual trie work on the critical path during block import. If the goal is to preferentially schedule the entire block execution pipeline, these threads should also call increase_thread_priority().

The three top-level pipeline threads (warmer, execution, merkleizer) are covered, but the sub-workers they spawn are not.
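
If the sub-workers do need an explicit call, as this review suggests, the pattern is to run the same hook first in every worker closure. A minimal sketch, assuming a hypothetical `spawn_workers` helper and an injected `on_start` hook in place of increase_thread_priority():

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Spawns `n` named scoped workers; each runs `on_start` before its work.
/// Returns how many workers ran the hook.
fn spawn_workers(n: usize, on_start: impl Fn() + Sync) -> std::io::Result<usize> {
    let started = AtomicUsize::new(0);
    thread::scope(|scope| {
        for i in 0..n {
            thread::Builder::new()
                .name(format!("shard_worker_{i}"))
                .spawn_scoped(scope, || {
                    on_start(); // e.g. increase_thread_priority()
                    started.fetch_add(1, Ordering::Relaxed);
                    // ... worker body would go here ...
                })?;
        }
        Ok::<(), std::io::Error>(())
    })?;
    // The scope has joined every worker by this point.
    Ok(started.load(Ordering::Relaxed))
}
```

Centralising the spawn in one helper means a new worker type cannot forget the hook.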


Dependency placement

Adding thread-priority to the workspace Cargo.toml and only pulling it into crates/blockchain/Cargo.toml is correct. No issues there.


Summary

| Area | Assessment |
|------|------------|
| Design | Good: best-effort with clean fallback |
| Correctness | Correct for the threads it covers |
| Gap | Worker sub-threads (shard_worker, bal_storage_worker, bal_state_shard) should also call increase_thread_priority() for consistent coverage |
| Usability | Consider documenting the CAP_SYS_NICE requirement so operators know when to expect this to take effect |

Automated review by Claude (Anthropic) · custom prompt

@ilitteri
Collaborator Author

ilitteri commented Mar 3, 2026

Recreating with updated description.

@ilitteri ilitteri closed this Mar 3, 2026
@github-project-automation github-project-automation Bot moved this to Done in ethrex_l1 Mar 3, 2026
@github-project-automation github-project-automation Bot moved this from Todo to Done in ethrex_performance Mar 3, 2026
Contributor

Copilot AI left a comment


Pull request overview

This PR aims to reduce block import latency under CPU contention by elevating OS thread priority for the L1 block execution pipeline’s scoped threads (warmer, executor, merkleizer).

Changes:

  • Adds the thread-priority crate as a workspace dependency and enables it in ethrex-blockchain.
  • Introduces an increase_thread_priority() helper that attempts max priority with a cross-platform fallback.
  • Calls the helper at the start of the warmer/execution/merkleizer scoped threads.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
|------|-------------|
| crates/blockchain/blockchain.rs | Adds the priority-bump helper and applies it to the three pipeline threads. |
| crates/blockchain/Cargo.toml | Adds thread-priority to the blockchain crate dependencies via workspace. |
| Cargo.toml | Adds thread-priority to [workspace.dependencies] with a pinned version. |


Comment thread Cargo.toml
indexmap = { version = "2.11.4" }
k256 = "0.13.4"
anyhow = "1.0.86"
thread-priority = "3.0.0"

Copilot AI Mar 3, 2026


thread-priority was added to workspace dependencies, but Cargo.lock was not updated (it currently has no thread-priority entries). CI runs make check-cargo-lock (uses cargo metadata --locked), so this will fail unless you regenerate and commit the updated lockfile(s) (e.g., via make update-cargo-lock).

Suggested change
thread-priority = "3.0.0"

Comment on lines +232 to +234
let fallback = ThreadPriority::Crossplatform(
ThreadPriorityValue::try_from(62u8).expect("62 is within valid 0..100 range"),
);

Copilot AI Mar 3, 2026


This helper is intended to be a non-critical optimization that logs and continues on failure, but it contains an expect(...) when building the fallback priority value. Please avoid panicking here; handle the try_from result explicitly (and log/return if it ever fails) so thread priority adjustment can never crash the pipeline threads.

Suggested change
-        let fallback = ThreadPriority::Crossplatform(
-            ThreadPriorityValue::try_from(62u8).expect("62 is within valid 0..100 range"),
-        );
+        let fallback_value = match ThreadPriorityValue::try_from(62u8) {
+            Ok(value) => value,
+            Err(err) => {
+                debug!(?err, "failed to build fallback thread priority value");
+                return;
+            }
+        };
+        let fallback = ThreadPriority::Crossplatform(fallback_value);


Labels

L1 (Ethereum client)
performance (Block execution throughput and performance in general)

Projects

Status: Done
Status: Done


2 participants