Skip to content

feat(stages): add RocksDB support for account history index stage#21124

Closed
yongkangc wants to merge 23 commits intomainfrom
yk/rocksdb-index-account-history-v2
Closed

feat(stages): add RocksDB support for account history index stage#21124
yongkangc wants to merge 23 commits intomainfrom
yk/rocksdb-index-account-history-v2

Conversation

@yongkangc
Copy link
Member

@yongkangc yongkangc commented Jan 16, 2026

Add RocksDB support to the account history index stage using the EitherWriter abstraction, enabling dual MDBX/RocksDB writes.

Changes

  • Add collect_shards_for_unwind() helper function for generic shard collection from RocksDB
  • Add unwind_account_history_shards() and unwind_account_history_shards_rocksdb() to EitherWriter
  • Add load_accounts_history_indices() and related helpers to stage utils
  • Update IndexAccountHistoryStage::unwind() to use EitherWriter for RocksDB support
  • Fix unwind algorithm to walk backwards from u64::MAX shard instead of forward-seeking

Expected Impact
Enables the account history index stage to write to RocksDB when the --storage.rocksdb flag is set.


PR Stack

main
  ↑
#21124 (rocksdb-index-account-history) ◀ you are here
  ↑
#21125 (rocksdb-index-storage-history)

This wires RocksDB into the history lookup paths:

- Adds account_history_info and storage_history_info methods to EitherReader
- Updates HistoricalStateProviderRef to use EitherReader for lookups
- Adds RocksDBProviderFactory trait bounds to provider impls
- Uses the rank/select pattern for efficient binary search in shards
**Problem**
- EitherReader had unnecessary PhantomData markers
- RocksDB transaction setup was duplicated in historical.rs with cfg-gated blocks
- Addressed joshieDo's feedback about RocksDB logic leaking into historical provider

**Solution**
- Remove PhantomData from EitherReader enum variants (lifetime already captured by RocksDB reference)
- Add with_rocksdb_tx helper method to RocksDBProviderFactory trait
- Refactor historical.rs to use trait method instead of duplicated cfg-gated blocks

**Changes**
- Remove PhantomData from EitherReader enum and all constructors/match arms
- Add with_rocksdb_tx to RocksDBProviderFactory trait with default implementation
- Refactor account_history_lookup and storage_history_lookup to use with_rocksdb_tx helper
- Make RocksTxRefArg type alias public for trait method

**Expected Impact**
- Cleaner EitherReader API without unnecessary PhantomData
- RocksDB transaction setup encapsulated in trait method
- Reduced cfg-gated block duplication in historical.rs
- No behavioral changes, all existing tests pass (96/97)
When rocksdb feature is disabled, the RocksDB variant is compiled out,
leaving the lifetime 'a unused and causing E0392 error.

Add PhantomData<&'a ()> to StaticFile variant to ensure the lifetime
is always used regardless of feature flags.
…lication

- Add compute_history_rank() function for shared rank/select logic
- Simplify EitherReader::storage_history_info and account_history_info
- Simplify RocksTx::history_info by using the shared helper
- Fix clippy doc_markdown warning for RocksDB

Reduces duplicated rank/select code across 3 locations while preserving comments.
…ks_data

Add append_account_history_shard and append_storage_history_shard methods
to RocksDBBatch that properly handle shard boundaries. Update write_blocks_data
to use these methods instead of the naive approach that just wrote to u64::MAX
shard without checking existing shards.
…lookups

- Fix misleading comments in invariants.rs about sentinel shards
- Add doc comments and debug assertions for sorted indices requirement
- Add comprehensive shard split tests at NUM_OF_INDICES_IN_SHARD boundary
- Cache RocksDBProvider in HistoricalStateProviderRef using OnceCell
The caching added unnecessary complexity for negligible benefit:
- RocksDBProvider is Arc-based (cheap to clone)
- The real cost is transaction creation, which wasn't cached
- std::cell::OnceCell is not Sync, constraining auto-traits

Now uses trait's default with_rocksdb_tx implementation directly.
- Add tracking HashSets for touched account/storage history keys (debug only)
- Add early return for empty indices in append_*_history_shard methods
- Add debug_assert to catch duplicate appends for same key in one batch
- Document the one-append-per-key invariant in struct doc comment

The batch reads existing shards from committed DB state, not from pending
writes. Calling append twice for the same key would cause the second call
to overwrite the first, silently losing data.
This adds support for RocksDB storage backend in the account history
index stage, including:

- Helper macros for RocksDB provider/batch creation (make_rocksdb_provider,
  make_rocksdb_batch_arg, register_rocksdb_batch)
- EitherWriter methods for account history unwind operations
- Updated load_accounts_history_indices to use EitherWriter
- Updated unwind implementation to handle both MDBX and RocksDB
- Added iter_from and collect_shards_for_key methods to RocksDBBatch
- Updated RocksBatchArg type to use Option for flexibility

This is the first part of splitting the index history stages for RocksDB
support. Storage history support will follow in a subsequent PR.
@yongkangc yongkangc force-pushed the yk/rocksdb-index-account-history-v2 branch from 8ed9c05 to c96e4b3 Compare January 16, 2026 21:51
@yongkangc yongkangc self-assigned this Jan 16, 2026
- Remove dead code: shard_and_write function in utils.rs
- Revert unrelated Debug impl change in historical.rs
- Revert blank line change in rocksdb/mod.rs
- Add empty list guard in load_accounts_history_indices
- Keep required Some() wrapping for RocksBatchArg type change
Remove the Option wrapper from RocksBatchArg since it adds unnecessary
complexity. The batch is always required when RocksDB features are enabled,
so Option provides no benefit and requires callers to use Some() everywhere.
HashSet is simpler and faster - we only need deduplication, not ordering.
Remove touched_account_history and touched_storage_history HashSets that
were only used in debug builds. The invariant (one append per key per batch)
is enforced by the API design and callers already pre-aggregate indices.

Also reverts a minor comment change (real -> RocksDB) in either_writer.rs.
@yongkangc yongkangc force-pushed the yk/rocksdb-index-account-history-v2 branch from 40c1183 to 4c59195 Compare January 16, 2026 22:25
Add register_for_commit() method on EitherWriter that handles
registering the RocksDB batch for commit. This is cleaner than
the macro approach while still keeping the make_rocksdb_provider
and make_rocksdb_batch_arg macros (needed for lifetime reasons).
yongkangc added a commit that referenced this pull request Jan 17, 2026
This implements RocksDB support for the IndexAccountHistoryStage following
the TransactionLookupStage pattern:

- Add RocksDBProviderFactory trait bound to Stage impl
- Use explicit #[cfg(all(unix, feature = "rocksdb"))] blocks for RocksDB batch
  creation instead of macros
- Use EitherWriter::new_accounts_history to route writes to MDBX or RocksDB
  based on storage settings
- Add inline helper functions for loading/flushing shards with proper u64::MAX
  handling for last shard
- Add RocksDB-specific tests: execute_writes_to_rocksdb_when_enabled,
  unwind_deletes_from_rocksdb_when_enabled, execute_incremental_sync

Note: Full unwind support for RocksDB requires updates to the HistoryWriter
trait implementation, which is out of scope for this PR.

Closes #21124
@yongkangc
Copy link
Member Author

Superseded by #21165 which uses a cleaner inline approach following the TransactionLookupStage pattern instead of macros.

@yongkangc yongkangc closed this Jan 17, 2026
@github-project-automation github-project-automation bot moved this from Backlog to Done in Reth Tracker Jan 17, 2026
yongkangc added a commit that referenced this pull request Jan 18, 2026
This implements RocksDB support for the IndexAccountHistoryStage following
the TransactionLookupStage pattern:

- Add RocksDBProviderFactory trait bound to Stage impl
- Use explicit #[cfg(all(unix, feature = "rocksdb"))] blocks for RocksDB batch
  creation instead of macros
- Use EitherWriter::new_accounts_history to route writes to MDBX or RocksDB
  based on storage settings
- Add inline helper functions for loading/flushing shards with proper u64::MAX
  handling for last shard
- Add RocksDB-specific tests: execute_writes_to_rocksdb_when_enabled,
  unwind_deletes_from_rocksdb_when_enabled, execute_incremental_sync

Note: Full unwind support for RocksDB requires updates to the HistoryWriter
trait implementation, which is out of scope for this PR.

Closes #21124
yongkangc added a commit that referenced this pull request Jan 19, 2026
This implements RocksDB support for the IndexAccountHistoryStage following
the TransactionLookupStage pattern:

- Add RocksDBProviderFactory trait bound to Stage impl
- Use explicit #[cfg(all(unix, feature = "rocksdb"))] blocks for RocksDB batch
  creation instead of macros
- Use EitherWriter::new_accounts_history to route writes to MDBX or RocksDB
  based on storage settings
- Add inline helper functions for loading/flushing shards with proper u64::MAX
  handling for last shard
- Add RocksDB-specific tests: execute_writes_to_rocksdb_when_enabled,
  unwind_deletes_from_rocksdb_when_enabled, execute_incremental_sync

Note: Full unwind support for RocksDB requires updates to the HistoryWriter
trait implementation, which is out of scope for this PR.

Closes #21124
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-db Related to the database A-staged-sync Related to staged sync (pipelines and stages)

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

1 participant