Skip to content

fix(engine): defer changeset static file truncation during reorgs#23291

Closed
joshieDo wants to merge 8 commits into
mainfrom
joshie/defer-changeset-prune
Closed

fix(engine): defer changeset static file truncation during reorgs#23291
joshieDo wants to merge 8 commits into
mainfrom
joshie/defer-changeset-prune

Conversation

@joshieDo
Copy link
Copy Markdown
Collaborator

@joshieDo joshieDo commented Mar 30, 2026

Defers changeset static file truncation during RemoveBlocksAbove to the next SaveBlocks, preventing stale mmap panics when concurrent readers hold old memory-mapped handles.

Waits for all MDBX readers from the reorg era to drain (via oldest_reader_txnid) before applying the deferred truncation.


Follow-up: #23305 (pins RocksDB snapshot for cross-store read consistency, stacked on this PR)

During RemoveBlocksAbove, changeset static file prune strategies are now
deferred to the next SaveBlocks commit. This prevents truncating
memory-mapped files while concurrent readers (payload builders, RPC) may
still hold stale handles, which caused a panic in NippyJar from reading
garbage offsets on the truncated mmap.

Headers, transactions, and receipts are still truncated immediately.
Only changeset prunes are deferred since those are the segments read
concurrently during state root computation.

Amp-Thread-ID: https://ampcode.com/threads/T-019d3ffb-22ed-70d4-ad5c-23eeed6f0ad7
Co-authored-by: Amp <amp@ampcode.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Mar 30, 2026

⚠️ Changelog not found.

A changelog entry is required before merging. We've generated a suggested changelog based on your changes:

Preview
---
reth-engine-tree: patch
reth-db-api: minor
reth-db: patch
reth-libmdbx: minor
reth-provider: minor
---

Added deferred changeset static file truncation during reorgs to prevent truncating memory-mapped files while concurrent readers (payload builders, RPC) may still hold stale handles. Added `oldest_reader_txnid` and `last_txnid` to the `Database` trait and implemented them for MDBX, and added `take_changeset_prunes`/`requeue_changeset_prunes` to `StaticFileWriter` to support deferring and re-applying changeset prunes after stale readers drain.

Add changelog to commit this to your branch.

Comment on lines +157 to +161
debug_assert_eq!(
account, storage,
"account and storage changeset prunes must target the same block"
);
account.or(storage)
Copy link
Copy Markdown
Collaborator Author

@joshieDo joshieDo Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be the same block, unsure if a new type for both is necessary

@joshieDo joshieDo marked this pull request as ready for review March 30, 2026 19:42
@joshieDo joshieDo added C-bug An unexpected or incorrect behavior A-db Related to the database A-engine Related to the engine implementation labels Mar 30, 2026
joshieDo and others added 2 commits March 30, 2026 20:51
… changeset prune

Adds oldest_reader_txnid() and last_txnid() to the Database trait,
backed by MDBX's mi_latter_reader_txnid and mi_recent_txnid.

Before applying a deferred changeset prune in SaveBlocks, the
persistence thread now spins until all MDBX readers from the reorg
era have completed. This ensures no reader holds a stale mmap handle
when the changeset files are truncated.

Amp-Thread-ID: https://ampcode.com/threads/T-019d3ffb-22ed-70d4-ad5c-23eeed6f0ad7
Co-authored-by: Amp <amp@ampcode.com>
@joshieDo joshieDo requested a review from gakonst as a code owner March 31, 2026 12:08
@joshieDo joshieDo changed the title fix(engine): defer changeset static file truncation during reorgs fix(storage): pin RocksDB snapshot for cross-store read consistency Mar 31, 2026
@joshieDo joshieDo marked this pull request as draft March 31, 2026 12:09
@joshieDo joshieDo force-pushed the joshie/defer-changeset-prune branch 3 times, most recently from c6a5cab to d1ecfa7 Compare March 31, 2026 12:27
@joshieDo joshieDo changed the title fix(storage): pin RocksDB snapshot for cross-store read consistency fix(engine): defer changeset static file truncation during reorgs Mar 31, 2026
@joshieDo joshieDo marked this pull request as ready for review March 31, 2026 17:09
Copy link
Copy Markdown
Member

@Rjected Rjected left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mostly nits, based on my understanding of the problem this fix should work. I think we should add some docs explaining why we would not get any new readers by the time we call save_blocks and wait for any final readers to complete

Comment on lines 409 to 413
/// Last transaction ID
#[inline]
pub const fn last_txnid(&self) -> usize {
self.0.mi_recent_txnid as usize
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we document that this is the most recent txn id?

///
/// This only restores the queued prune-on-commit state. Safety still depends on the caller
/// having waited for stale MDBX readers to drain before the next commit executes the prune.
pub(crate) fn requeue_changeset_prunes(&self, last_block: BlockNumber) -> ProviderResult<()> {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we rename this from requeue to maybe complete or something that mentions that this finalizes / flushes the queued prunes?

/// commit.
///
/// This is used by the deferred reorg-prune flow in the persistence service. Callers must
/// only reapply the returned prune after stale MDBX readers from the pre-unwind era have
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// only reapply the returned prune after stale MDBX readers from the pre-unwind era have
/// only apply the returned prune after stale MDBX readers from the pre-unwind era have

/// concurrent readers may still hold handles. The prune is applied at the start of the next
/// `SaveBlocks`, after waiting for all MDBX readers from the reorg era to drain.
#[derive(Debug)]
struct DeferredChangesetPrune {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets please not add private types to top of file :)

Comment on lines +223 to +224
.oldest_reader_txnid()
.is_some_and(|oldest| oldest < prune_txn)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this perhaps too relaxed?

this would also include rpc related transactions?
and there's a risk that an ethgetproof or adjacent tx will be open for a while?
should we either introduce a counter, that after like 3 blocks or so we always do this?

@joshieDo joshieDo closed this Apr 2, 2026
@github-project-automation github-project-automation Bot moved this from Backlog to Done in Reth Tracker Apr 2, 2026
@emmajam emmajam deleted the joshie/defer-changeset-prune branch May 1, 2026 10:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-db Related to the database A-engine Related to the engine implementation C-bug An unexpected or incorrect behavior

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

3 participants