
fix(prune): count unique addresses instead of changesets in RocksDB pruner limiter #21694

Closed

gakonst wants to merge 2 commits into main from fix/reth-292-pruner-limiter-accounting

Conversation


@gakonst gakonst commented Feb 2, 2026

Summary

Fixes RETH-292: The RocksDB pruning path was incrementing the limiter for every changeset scanned from static files, not for each unique address that corresponds to actual RocksDB shard work.

Problem

The pruner limiter counted changesets scanned instead of RocksDB operations performed:

```rust
// Before: increments for every changeset
for result in walker {
    // ...
    limiter.increment_deleted_entries_count(); // Called ~7k times per block!
}
```

With ~7k changesets per block but far fewer unique addresses, the per-changeset count hit the limit quickly and the pruner stopped prematurely.
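The effect of the bug can be modeled in isolation. This is a minimal standalone sketch with a hypothetical stand-in for the limiter, not reth's actual pruner types:

```rust
/// Counts how many changesets are scanned before a per-changeset
/// limiter stops the run (the buggy accounting).
fn changesets_scanned_before_stop(total_changesets: usize, limit: usize) -> usize {
    let mut deleted = 0;
    for scanned in 0..total_changesets {
        if deleted >= limit {
            return scanned;
        }
        deleted += 1; // bug: one increment per changeset scanned
    }
    total_changesets
}

fn main() {
    // 10 blocks x 1_000 changesets, limit = 100: the run gives up
    // after 100 changesets, i.e. well inside the first block.
    let scanned = changesets_scanned_before_stop(10_000, 100);
    println!("stopped after {scanned} changesets");
    assert_eq!(scanned, 100);
}
```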

Solution

Use HashMap::entry() to only increment the limiter on first occurrence of each unique key:

```rust
// After: increments only for new unique addresses
match highest_deleted_accounts.entry(changeset.address) {
    Entry::Vacant(v) => {
        v.insert(block_number);
        limiter.increment_deleted_entries_count(); // Only for new addresses
    }
    Entry::Occupied(mut o) => {
        // Already counted; just track the highest pruned block
        if block_number > *o.get() {
            o.insert(block_number);
        }
    }
}
```
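The same entry-based accounting can be exercised end to end in a self-contained sketch (simplified stand-ins, not reth's actual pruner structs):

```rust
use std::collections::{hash_map::Entry, HashMap};

/// Counts limiter increments under the fixed accounting: one increment
/// per unique address, while tracking the highest pruned block per address.
fn count_unique_addresses(changesets: &[(u64 /* address */, u64 /* block */)]) -> usize {
    let mut deleted_entries = 0;
    let mut highest_deleted_accounts: HashMap<u64, u64> = HashMap::new();

    for &(address, block_number) in changesets {
        match highest_deleted_accounts.entry(address) {
            Entry::Vacant(v) => {
                v.insert(block_number);
                deleted_entries += 1; // only counted on first occurrence
            }
            Entry::Occupied(mut o) => {
                if block_number > *o.get() {
                    o.insert(block_number); // keep the highest block seen
                }
            }
        }
    }
    deleted_entries
}

fn main() {
    // 10_000 changesets cycling over 10 addresses across 10 blocks.
    let changesets: Vec<(u64, u64)> =
        (0..10_000u64).map(|i| (i % 10, i / 1_000)).collect();
    let unique = count_unique_addresses(&changesets);
    println!("limiter incremented {unique} times"); // 10, not 10_000
    assert_eq!(unique, 10);
}
```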

Test Results

| Scenario | Before (bug) | After (fix) |
| --- | --- | --- |
| 10k changesets, 10 unique addresses, limit=100 | Stops at <1 block | Processes all 10 blocks |
```
=== RETH-292 Fix Verified ===
Limit: 100
Changesets per block: 1000
Unique addresses (RocksDB shards): 10
Blocks processed: 9
Finished: true
```

FIX: Limiter now counts 10 unique addresses (shards), not 10000 changesets.

Changes

  • account_history.rs: Increment limiter only on new unique addresses
  • storage_history.rs: Increment limiter only on new unique (address, slot) pairs
  • Added regression test that verifies the fix
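The storage_history.rs change follows the same pattern with a composite key. A hypothetical sketch (simplified types, not reth's actual API) where the unique key is an (address, slot) pair rather than a bare address:

```rust
use std::collections::{hash_map::Entry, HashMap};

/// Counts limiter increments for storage history: one per unique
/// (address, slot) pair, tracking the highest pruned block per pair.
fn count_unique_storage_entries(
    changesets: &[((u64, u64) /* (address, slot) */, u64 /* block */)],
) -> usize {
    let mut deleted_entries = 0;
    let mut highest_deleted: HashMap<(u64, u64), u64> = HashMap::new();

    for &(key, block_number) in changesets {
        match highest_deleted.entry(key) {
            Entry::Vacant(v) => {
                v.insert(block_number);
                deleted_entries += 1; // one increment per unique (address, slot)
            }
            Entry::Occupied(mut o) => {
                if block_number > *o.get() {
                    o.insert(block_number);
                }
            }
        }
    }
    deleted_entries
}

fn main() {
    // Two addresses, slots touched repeatedly across several blocks:
    // only 3 unique (address, slot) pairs exist.
    let changesets = [
        ((1, 0), 0), ((1, 0), 1), ((1, 1), 0),
        ((2, 0), 2), ((2, 0), 3), ((1, 1), 4),
    ];
    println!("{}", count_unique_storage_entries(&changesets));
}
```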

All prune tests pass

test result: ok. 39 passed; 0 failed; 0 ignored

Adds a test that demonstrates the RocksDB pruning limiter issue where
the limiter increments per changeset scanned from static files, not
per RocksDB shard actually modified.

The test creates a scenario with:
- 10 blocks × 1000 changesets = 10,000 total changesets
- But only 10 unique addresses (high repetition)
- Limit set to 100

Expected behavior (after fix):
- 10 unique addresses = 10 RocksDB shard operations
- Should process all 10 blocks since 10 < 100 limit

Current behavior (bug):
- Stops after ~100 changesets scanned (< 1 block)
- Because limiter counts input scans, not output work

This test documents the current buggy behavior. After fixing the
limiter to count unique addresses instead of changesets, the test
assertion should be updated to verify all blocks are processed.
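The before/after scenario described above can be simulated side by side. The limiter model here is a hypothetical simplification of the pruner's per-block checkpointing, not the real test harness:

```rust
use std::collections::HashSet;

/// Returns how many full blocks are processed before the limiter stops.
/// `count_per_changeset` toggles between the buggy and fixed accounting.
fn blocks_processed(
    blocks: u64,
    changesets_per_block: u64,
    unique_addresses: u64,
    limit: usize,
    count_per_changeset: bool,
) -> u64 {
    let mut deleted = 0usize;
    let mut seen: HashSet<u64> = HashSet::new();
    for block in 0..blocks {
        for i in 0..changesets_per_block {
            if deleted >= limit {
                return block; // limiter stops mid-block: this block is incomplete
            }
            let address = i % unique_addresses;
            if count_per_changeset {
                deleted += 1; // buggy: increments per changeset scanned
            } else if seen.insert(address) {
                deleted += 1; // fixed: increments per unique address
            }
        }
    }
    blocks
}

fn main() {
    // Before the fix: the limit is hit inside the first block.
    println!("buggy: {} blocks", blocks_processed(10, 1_000, 10, 100, true));
    // After the fix: 10 unique addresses < 100, so all 10 blocks complete.
    println!("fixed: {} blocks", blocks_processed(10, 1_000, 10, 100, false));
}
```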

Amp-Thread-ID: https://ampcode.com/threads/T-019c1ca0-1067-7584-a10b-649ca5b1c5cb
Co-authored-by: Amp <amp@ampcode.com>
@gakonst gakonst added the C-bug (An unexpected or incorrect behavior) and A-db (Related to the database) labels Feb 2, 2026
@github-project-automation github-project-automation bot moved this to Backlog in Reth Tracker Feb 2, 2026
…runer limiter

Fixes RETH-292: The RocksDB pruning path was incrementing the limiter
for every changeset scanned from static files, not for each unique
address (account_history) or (address, slot) pair (storage_history)
that corresponds to actual RocksDB shard work.

This caused the pruner to stop prematurely when there was high
changeset repetition (e.g., popular contracts touched many times
per block).

Changes:
- account_history.rs: Use HashMap::entry() to only increment limiter
  on first occurrence of each address
- storage_history.rs: Use HashMap::entry() to only increment limiter
  on first occurrence of each (address, slot) pair
- Update regression test to verify the fix works

Before: 10k changesets with 10 unique addresses, limit=100 → stops at <1 block
After: Same scenario → processes all 10 blocks (10 < 100 limit)
Amp-Thread-ID: https://ampcode.com/threads/T-019c1ca0-1067-7584-a10b-649ca5b1c5cb
Co-authored-by: Amp <amp@ampcode.com>
@gakonst gakonst changed the title from "test(prune): add regression test for RETH-292 limiter accounting" to "fix(prune): count unique addresses instead of changesets in RocksDB pruner limiter" Feb 2, 2026

gakonst commented Feb 2, 2026

Closing: the underlying issue is mitigated by PR #19141, which sets `delete_limit: usize::MAX` by default. With an unlimited limit, the limiter accounting difference has no practical effect.

The real concern (throughput for large backlogs) is tracked in RETH-296.

@gakonst gakonst closed this Feb 2, 2026
@github-project-automation github-project-automation bot moved this from Backlog to Done in Reth Tracker Feb 2, 2026