perf: release execution cache lock before waiting for state root validation#22138

Closed
yongkangc wants to merge 1 commit into main from yk/perf-save-cache-no-block

@yongkangc (Member)

Summary
Tracing with Grafana Tempo shows that save_cache is the #1 bottleneck in the new_payload (NP) pipeline: it blocks for ~95-131ms (up to 80% of NP time) while holding the ExecutionCache mutex. The root cause is that valid_block_rx.recv(), which waits for state root validation, is called inside update_with_guard, so the lock is held for the entire state root duration. This blocks the next block's prewarming from accessing the cache.

Split save_cache from a single locked phase into three phases:

  1. Phase 1 (locked, ~1ms): consume the SavedCache, run insert_state, and optimistically publish the warmed cache
  2. Phase 2 (unlocked, ~100ms+): wait for valid_block_rx without holding the lock — the next block's prewarming can start immediately
  3. Phase 3 (locked, only if invalid): re-acquire the lock and clear the cache only if it still belongs to this block (hash guard prevents clearing a newer valid cache from a subsequent block)
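
The three phases above can be sketched roughly as follows. This is a minimal, std-only illustration and not the PR's actual code: `ExecutionCache`, `SavedCache`, and `save_cache` here are simplified stand-ins that only borrow the names used in this description, `BlockHash` is a hypothetical alias, and the real `insert_state` step is reduced to storing the block hash.

```rust
use std::sync::mpsc::Receiver;
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a 32-byte block hash.
type BlockHash = [u8; 32];

/// Simplified stand-in for the saved cache: it only records which block it was built for.
#[derive(Clone, Debug, PartialEq)]
struct SavedCache {
    executed_block_hash: BlockHash,
}

/// Simplified stand-in for the shared execution cache slot.
#[derive(Clone, Default)]
struct ExecutionCache {
    inner: Arc<Mutex<Option<SavedCache>>>,
}

impl ExecutionCache {
    /// Returns the hash of the block the published cache belongs to, if any.
    fn current_hash(&self) -> Option<BlockHash> {
        self.inner.lock().unwrap().as_ref().map(|c| c.executed_block_hash)
    }
}

/// Returns `true` if the block was reported valid.
fn save_cache(cache: &ExecutionCache, hash: BlockHash, valid_block_rx: Receiver<()>) -> bool {
    // Phase 1 (locked, brief): optimistically publish the warmed cache.
    {
        let mut slot = cache.inner.lock().unwrap();
        *slot = Some(SavedCache { executed_block_hash: hash });
    } // The guard is dropped here, so the lock is released before waiting.

    // Phase 2 (unlocked): wait for validation. An `Err` means the sender was
    // dropped without signalling, which this sketch treats as "invalid".
    let valid = valid_block_rx.recv().is_ok();

    // Phase 3 (locked, only if invalid): clear the cache only if it still
    // belongs to this block, so a newer block's cache is left alone.
    if !valid {
        let mut slot = cache.inner.lock().unwrap();
        if slot.as_ref().map(|c| c.executed_block_hash) == Some(hash) {
            *slot = None;
        }
    }
    valid
}
```

Note that the sketch only shows the lock scoping; it does not address what other code may observe in the cache between Phase 1 and Phase 3.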

Changes

  • Restructured save_cache to release the ExecutionCache lock before blocking on valid_block_rx.recv()
  • Added hash-guarded rollback: on invalid blocks, only clears cache if executed_block_hash() == hash

Expected Impact
Reduces execution cache lock hold time from ~100ms+ to ~1ms per block, unblocking prewarming for subsequent blocks and improving overall NP throughput.

@yongkangc added the C-perf (a change motivated by improving speed, memory usage or disk footprint) and A-engine (related to the engine implementation) labels on Feb 12, 2026
@github-project-automation github-project-automation bot moved this to Backlog in Reth Tracker Feb 12, 2026
github-actions bot (Contributor) commented Feb 12, 2026

⚠️ Changelog not found.

A changelog entry is required before merging. We've generated a suggested changelog based on your changes:

Preview
---
reth-engine-tree: patch
---

Fixed cache lock contention by deferring state root validation to after lock release in payload prewarming, reducing blocking time for concurrent operations.

Add changelog to commit this to your branch.

…dation

Previously, save_cache held the ExecutionCache mutex for the entire
duration of state root validation (~95-131ms), blocking the next
block's prewarming from accessing the cache.

Split save_cache into three phases:
1. Phase 1 (locked, ~1ms): insert_state and optimistically publish
   the warmed cache
2. Phase 2 (unlocked): wait for valid_block_rx without holding the
   lock, allowing the next block's prewarming to proceed
3. Phase 3 (locked, only if invalid): re-acquire lock and clear cache
   only if it still belongs to this block

Amp-Thread-ID: https://ampcode.com/threads/T-019c5345-8a04-75be-bb4a-cf30b28fe68d
@yongkangc force-pushed the yk/perf-save-cache-no-block branch from 96947d7 to a6b34f2 on February 12, 2026 20:15
@mattsse (Collaborator) left a comment

I'm pretty sure I reviewed the same change somewhere else.

This change is problematic because it doesn't guard against re-use for alternative blocks. For example, say we just executed block A at N + 1: what happens when we get a different block A' with N as its parent?

Now the cache can already point to A, but we haven't adjusted the anchor yet.

@mattsse (Collaborator) left a comment

I don't think the assessment here is correct.

This change doesn't make it faster overall; it just moves the locking around, which is now problematic.

The measurements of

debug!(target: "engine::caching", parent_hash=?hash, elapsed=?elapsed, "Updated execution cache");

are flawed because the elapsed time includes both the cache update and the wait until the block was validated.
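
One way to tell the two costs apart is to time them independently rather than with a single elapsed value. A minimal sketch, assuming hypothetical `PhaseTimings` and `time_phases` helpers that are not part of this PR or of reth; the two closures stand in for the cache update and the wait on validation:

```rust
use std::time::{Duration, Instant};

/// Hypothetical per-phase timings, for illustration only.
#[derive(Debug, Clone, Copy)]
struct PhaseTimings {
    /// Time spent updating the cache.
    cache_update: Duration,
    /// Time spent waiting for the block to be validated.
    validation_wait: Duration,
}

impl PhaseTimings {
    /// Roughly what a single combined elapsed measurement would report.
    fn total(&self) -> Duration {
        self.cache_update + self.validation_wait
    }
}

/// Runs the two phases in order and measures each one separately.
fn time_phases(update_cache: impl FnOnce(), wait_for_validation: impl FnOnce()) -> PhaseTimings {
    let update_start = Instant::now();
    update_cache();
    let cache_update = update_start.elapsed();

    let wait_start = Instant::now();
    wait_for_validation();
    let validation_wait = wait_start.elapsed();

    PhaseTimings { cache_update, validation_wait }
}
```

With numbers split this way, a large total that is dominated by `validation_wait` would point at validation time rather than at the cache update itself.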

Comment on lines +241 to +245
// Wait for state root validation WITHOUT holding the cache lock.
// This is the key optimization: the original code held the lock across this
// blocking recv(), which blocked the next block's prewarming from accessing
// the cache for ~100ms+.
if valid_block_rx.recv().is_err() {
Collaborator

Isn't this unproblematic anyway? This fires once we have validated the block, and the remaining ops here are cheap.

Comment on lines +217 to +219
/// saved cache, inserts state, and publishes under a brief write lock. This avoids
/// the ~100ms+ lock hold that previously blocked concurrent readers during
/// `valid_block_rx.recv()`.
Collaborator

There are no concurrent readers here, because the main exec thread is busy validating the block.

@github-project-automation github-project-automation bot moved this from Backlog to In Progress in Reth Tracker Feb 14, 2026
@mattsse mattsse closed this Feb 14, 2026
@github-project-automation github-project-automation bot moved this from In Progress to Done in Reth Tracker Feb 14, 2026