Skip to content

feat(stages): add RocksDB support for IndexAccountHistoryStage#21165

Merged
yongkangc merged 44 commits intomainfrom
yk/rocksdb-index-account-history-v3
Jan 20, 2026
Merged

feat(stages): add RocksDB support for IndexAccountHistoryStage#21165
yongkangc merged 44 commits intomainfrom
yk/rocksdb-index-account-history-v3

Conversation

@yongkangc
Copy link
Member

@yongkangc yongkangc commented Jan 17, 2026

Add RocksDB support to the account history index stage using the EitherWriter abstraction

Changes

  • Add RocksDBProviderFactory trait bound to Stage impl
  • Use EitherWriter::new_accounts_history() to route writes to MDBX or RocksDB
  • Add inline helper functions for loading/flushing shards with proper u64::MAX handling
  • Add RocksDB-specific tests (execute, incremental sync, unwind checkpoint)

PR Stack

main
  ↑
#21165 (rocksdb-index-account-history) ◀ you are here
  ↑
#21175 (rocksdb-index-storage-history)

Closes #20593

@yongkangc yongkangc added the A-staged-sync Related to staged sync (pipelines and stages) label Jan 17, 2026
@github-project-automation github-project-automation bot moved this to Backlog in Reth Tracker Jan 17, 2026
@yongkangc yongkangc self-assigned this Jan 17, 2026
@yongkangc yongkangc marked this pull request as ready for review January 17, 2026 23:28
yongkangc added a commit that referenced this pull request Jan 17, 2026
This PR adds RocksDB support for the IndexStorageHistoryStage, following the
same pattern established in #21165 for IndexAccountHistoryStage.

Changes:
- Add RocksDBProviderFactory and StorageSettingsCache trait bounds to the Stage impl
- Use EitherWriter::new_storages_history with explicit #[cfg] blocks for RocksDB batch creation
- Add helper functions for loading/flushing storage history shards:
  - load_storage_history_indices_with_writer - iterate collector, merge with existing shards
  - get_last_storage_history_shard - read from RocksDB or MDBX based on settings
  - write_storage_history_shards_keep_last - write full shards, keep partial in memory
  - flush_storage_history_shards - write all remaining shards with u64::MAX for last
- Only clear MDBX table on first sync if not using RocksDB
- Add rocksdb_tests module with tests for:
  - execute_writes_to_rocksdb_when_enabled
  - unwind_deletes_from_rocksdb_when_enabled
  - execute_incremental_sync

Part of #20593 - Move secondary indices to RocksDB
@yongkangc yongkangc force-pushed the yk/rocksdb-index-account-history-v3 branch from 40a72a8 to 1dc3c78 Compare January 18, 2026 14:29
@yongkangc yongkangc force-pushed the yk/rocksdb-index-account-history-v3 branch 2 times, most recently from e2d6aea to a462de0 Compare January 18, 2026 16:55
Comment on lines -217 to -218
// We have reached the end of this subset of keys so
// we need to flush its last indice shard.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dont remove this comment

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done - restored.

Comment on lines -231 to -232
// If it's not the first sync, there might an existing shard already, so we need to
// merge it with the one coming from the collector
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dont remove this comment

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done - restored.

)?;
}

// There will be one remaining shard that needs to be flushed to DB.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dont remove this comment

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done - restored.

CURSOR: DbCursorRW<tables::AccountsHistory> + DbCursorRO<tables::AccountsHistory>,
{
/// Puts an account history entry.
pub fn put_account_history(
Copy link
Member Author

@yongkangc yongkangc Jan 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Renamed put_account_history to upsert_account_history to make the semantics clearer (it was already calling cursor.upsert internally). Also added a separate append_account_history for the first-sync case which is more efficient when we know keys are monotonically increasing.


// Compute exclusive end bound: `last_key || 0x00` is the smallest key > last_key
// under bytewise ordering (avoids 0xFF carry issues of increment-last-byte approach)
let end_key = Self::exclusive_end_after(last_key);
Copy link
Member Author

@yongkangc yongkangc Jan 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is necessary for first-sync performance: on first sync we clear the table and rebuild from scratch (same pattern as MDBX's tx.clear::<T>()). The implementation is defensive - returns early if table is empty, and uses the safe exclusive_end_after helper to compute the range bound.

/// Only flushes when we have more than one shard's worth of data, keeping the last
/// (possibly partial) shard for continued accumulation. This avoids writing a shard
/// that may need to be updated when more indices arrive.
fn flush_account_history_shards_partial<N, CURSOR>(
Copy link
Member Author

@yongkangc yongkangc Jan 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need partial flush and full flush functions?

  • flush_account_history_shards_partial: Called during iteration when accumulating indices. Only flushes complete shards, keeping the last (possibly partial) shard in memory for continued accumulation.

  • flush_account_history_shards: Called at key boundaries (when address changes) and at the end. Flushes all remaining shards, using u64::MAX as the highest block number for the final shard (the invariant that allows seek_exact(address, u64::MAX) to find the last shard during incremental sync).

///
/// Uses `Option<Address>` instead of `Address::default()` as the sentinel to avoid
/// incorrectly treating `Address::ZERO` as "no previous address".
pub(crate) fn load_account_history_via_writer<N, CURSOR>(
Copy link
Member Author

@yongkangc yongkangc Jan 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this function specialized for EitherWriter. The original load_history_indices is generic over the table type and works with raw cursors, while this one works specifically with EitherWriter which abstracts MDBX/RocksDB backends. The key difference is using Option<Address> as the sentinel (to handle Address::ZERO correctly) and calling EitherWriter methods directly instead of through cursor trait bounds.

@yongkangc yongkangc requested a review from joshieDo January 19, 2026 22:10
@yongkangc
Copy link
Member Author

@joshieDo just addressed your comments :)

@yongkangc yongkangc requested a review from joshieDo January 20, 2026 11:15
@yongkangc yongkangc added this pull request to the merge queue Jan 20, 2026
@yongkangc yongkangc removed this pull request from the merge queue due to a manual request Jan 20, 2026
@yongkangc yongkangc added this pull request to the merge queue Jan 20, 2026
github-merge-queue bot pushed a commit that referenced this pull request Jan 20, 2026
Co-authored-by: Georgios Konstantopoulos <me@gakonst.com>
Co-authored-by: joshieDo <93316087+joshieDo@users.noreply.github.com>
@yongkangc yongkangc removed this pull request from the merge queue due to a manual request Jan 20, 2026
@yongkangc yongkangc added this pull request to the merge queue Jan 20, 2026
Merged via the queue into main with commit bd144a4 Jan 20, 2026
44 checks passed
@yongkangc yongkangc deleted the yk/rocksdb-index-account-history-v3 branch January 20, 2026 14:32
@github-project-automation github-project-automation bot moved this from In Progress to Done in Reth Tracker Jan 20, 2026
yongkangc added a commit that referenced this pull request Jan 20, 2026
This implements RocksDB support for the IndexStorageHistoryStage following
the pattern established in #21165 for IndexAccountHistoryStage:

- Add RocksDBProviderFactory and StorageSettingsCache trait bounds to Stage impl
- Use EitherWriter::new_storages_history with explicit #[cfg] blocks for batch creation
- Add helper functions for loading/flushing storage history shards:
  - load_storage_history_via_writer - iterate collector, merge with existing shards
  - flush_storage_history_shards_partial - write full shards, keep partial in memory
  - flush_storage_history_shards - write all remaining shards with u64::MAX for last
- Add EitherWriter methods:
  - append_storage_history, upsert_storage_history, get_last_storage_history_shard
  - unwind_storage_history_shard - handles RocksDB shard unwinding
- Add RocksDBProvider::clear() and RocksDBBatch::get() methods
- Route unwind to RocksDB when storages_history_in_rocksdb is enabled
- Add rocksdb_tests module with tests for:
  - execute_writes_to_rocksdb_when_enabled
  - unwind_deletes_from_rocksdb_when_enabled
  - unwind_to_zero_keeps_block_zero
  - execute_incremental_sync

Part of #20593 - Move secondary indices to RocksDB
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-staged-sync Related to staged sync (pipelines and stages)

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Wire up RocksDB initialization to Stages

4 participants