
feat: add StaticFileSegment::AccountChangeSets #18882

Merged
shekhirin merged 2 commits into main from
dan/account-changeset-static-files
Jan 8, 2026

Conversation

@Rjected
Member

@Rjected Rjected commented Oct 6, 2025

ref #18846

Introduces a new flag --static-file.account-change-sets which controls whether or not we should write account changesets to static files or the DB.

Adds account changesets as a new static file segment, where each row is a change. This makes them different from "block-based" static files like headers, and "transaction-based" static files like transactions and receipts. Row ranges for each block are stored in the header for the static file segment.
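A minimal sketch of the row-based layout described above, assuming a simplified header that records, for each block, where its rows start and how many there are. `ChangesetOffset`, `changeset_offsets` and `num_changes` echo names quoted later in this review; the `SegmentHeader` shape, the `first_block` field and `rows_for_block` are hypothetical and not reth's actual types.

```rust
use std::convert::TryFrom;
use std::ops::Range;

/// Where one block's changes live inside the segment (illustrative layout).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ChangesetOffset {
    /// Index of the block's first row in the segment.
    pub offset: u64,
    /// Number of rows (account changes) recorded for the block.
    pub num_changes: u64,
}

/// Hypothetical, simplified header for a changeset-based segment.
#[derive(Debug, Default)]
pub struct SegmentHeader {
    /// First block covered by this segment file (assumed field).
    pub first_block: u64,
    /// One entry per block, in block order.
    pub changeset_offsets: Vec<ChangesetOffset>,
}

impl SegmentHeader {
    /// Returns the half-open row range holding `block`'s changes, if the block is covered.
    pub fn rows_for_block(&self, block: u64) -> Option<Range<u64>> {
        let index = block.checked_sub(self.first_block)?;
        let entry = self.changeset_offsets.get(usize::try_from(index).ok()?)?;
        Some(entry.offset..entry.offset + entry.num_changes)
    }
}
```

Unlike block-based or transaction-based segments, the row index cannot be derived from the block or transaction number alone, which is why the lookup goes through the header first.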

Backwards compat serialization / deserialization code is added because this adds a field to the static file header.
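One way to picture the backwards-compatibility concern is a decoder that treats the new field as optional when the bytes end early. This is only a sketch with a hand-rolled little-endian layout; the `Header` type, its fields and the byte format are assumptions for illustration and do not reflect the serializer this PR actually uses.

```rust
/// Hypothetical header: the trailing field only exists in the newer format.
#[derive(Debug, PartialEq, Eq)]
pub struct Header {
    pub block_start: u64,
    pub block_end: u64,
    /// `None` when the bytes were written by the older format.
    pub changeset_offsets: Option<Vec<u64>>,
}

/// Reads one little-endian u64 and advances `pos`, or returns `None` if bytes run out.
fn read_u64(bytes: &[u8], pos: &mut usize) -> Option<u64> {
    let end = pos.checked_add(8)?;
    let mut chunk = [0u8; 8];
    chunk.copy_from_slice(bytes.get(*pos..end)?);
    *pos = end;
    Some(u64::from_le_bytes(chunk))
}

pub fn encode(header: &Header) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&header.block_start.to_le_bytes());
    out.extend_from_slice(&header.block_end.to_le_bytes());
    // Old-format headers have nothing after the two fixed fields.
    if let Some(offsets) = &header.changeset_offsets {
        out.extend_from_slice(&(offsets.len() as u64).to_le_bytes());
        for offset in offsets {
            out.extend_from_slice(&offset.to_le_bytes());
        }
    }
    out
}

pub fn decode(bytes: &[u8]) -> Option<Header> {
    let mut pos = 0;
    let block_start = read_u64(bytes, &mut pos)?;
    let block_end = read_u64(bytes, &mut pos)?;
    // No bytes left means an old header, so the new field is absent.
    let changeset_offsets = if pos == bytes.len() {
        None
    } else {
        let len = read_u64(bytes, &mut pos)?;
        let mut offsets = Vec::new();
        for _ in 0..len {
            offsets.push(read_u64(bytes, &mut pos)?);
        }
        Some(offsets)
    };
    Some(Header { block_start, block_end, changeset_offsets })
}
```

Files written before the change keep decoding, and files written after it round-trip with the extra field.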

db stats on a node with this PR:

ubuntu@reth8:~/reth$ ./target/profiling/reth db --datadir /mnt/reth-mainnet/ stats
2025-11-13T15:41:39.695842Z  INFO Initialized tracing, debug log directory: /home/ubuntu/.cache/reth/logs/mainnet
2025-11-13T15:41:39.697931Z  INFO Opening storage db_path="/mnt/reth-mainnet/db" sf_path="/mnt/reth-mainnet/static_files"
2025-11-13T15:41:44.982848Z  INFO Verifying storage consistency.
| Segment           | Block Range  | Transaction Range | Shape (columns x rows) | Size      |
|-------------------|--------------|-------------------|------------------------|-----------|
| Headers           | 0..=23000000 | N/A               | 3 x 23000001           | 11.8 GiB  |
| Transactions      | 0..=23000000 | 0..=2907942767    | 1 x 2907942768         | 721.3 GiB |
| Receipts          | 0..=23000000 | 0..=2907942767    | 1 x 2907942768         | 332.5 GiB |
| AccountChangeSets | 0..=23000000 | N/A               | 1 x 3787160399         | 140.8 GiB |
| ----------------- | ------------ | ----------------- | ---------------------- | --------- |
| Total             |              |                   |                        | 1.2 TiB   |


| Table Name                 | # Entries  | Branch Pages | Leaf Pages | Overflow Pages | Total Size |
|----------------------------|------------|--------------|------------|----------------|------------|
| AccountChangeSets          | 0          | 0            | 0          | 0              | 0 B        |
| AccountsHistory            | 67493      | 19           | 1907       | 1100           | 11.8 MiB   |
| AccountsTrie               | 24075389   | 4126         | 996709     | 0              | 3.8 GiB    |
| AccountsTrieChangeSets     | 125727     | 2989         | 16809      | 0              | 77.3 MiB   |
| BlockBodyIndices           | 23000001   | 609          | 136118     | 0              | 534.1 MiB  |
| BlockOmmers                | 1209744    | 844          | 188675     | 0              | 740.3 MiB  |
| BlockWithdrawals           | 5965061    | 3806         | 852142     | 0              | 3.3 GiB    |
| Bytecodes                  | 1757397    | 2835         | 179017     | 3608635        | 14.5 GiB   |
| CanonicalHeaders           | 0          | 0            | 0          | 0              | 0 B        |
| ChainState                 | 0          | 0            | 0          | 0              | 0 B        |
| HashedAccounts             | 304987658  | 43165        | 4100529    | 0              | 15.8 GiB   |
| HashedStorages             | 1339912717 | 612376       | 18592372   | 0              | 73.3 GiB   |
| HeaderNumbers              | 23000001   | 11299        | 547070     | 0              | 2.1 GiB    |
| HeaderTerminalDifficulties | 0          | 0            | 0          | 0              | 0 B        |
| Headers                    | 0          | 0            | 0          | 0              | 0 B        |
| Metadata                   | 1          | 0            | 1          | 0              | 4 KiB      |
| PlainAccountState          | 304987658  | 50115        | 4632009    | 0              | 17.9 GiB   |
| PlainStorageState          | 1339912717 | 883587       | 28323800   | 0              | 111.4 GiB  |
| PruneCheckpoints           | 1          | 0            | 1          | 0              | 4 KiB      |
| Receipts                   | 0          | 0            | 0          | 0              | 0 B        |
| StageCheckpointProgresses  | 1          | 0            | 1          | 0              | 4 KiB      |
| StageCheckpoints           | 16         | 0            | 1          | 0              | 4 KiB      |
| StorageChangeSets          | 6882009001 | 4504624      | 118596017  | 0              | 469.6 GiB  |
| StoragesHistory            | 538338     | 243          | 13720      | 213            | 55.4 MiB   |
| StoragesTrie               | 117839052  | 595741       | 6016100    | 0              | 25.2 GiB   |
| StoragesTrieChangeSets     | 205298     | 7876         | 30288      | 0              | 149.1 MiB  |
| TransactionBlocks          | 21360530   | 613          | 136927     | 0              | 537.3 MiB  |
| TransactionHashNumbers     | 2907942768 | 1436716      | 70050725   | 0              | 272.7 GiB  |
| TransactionSenders         | 2907942768 | 121328       | 27177036   | 0              | 104.1 GiB  |
| Transactions               | 0          | 0            | 0          | 0              | 0 B        |
| VersionHistory             | 1          | 0            | 1          | 0              | 4 KiB      |
| -------------------------- | ---------- | ------------ | ---------- | -------------- | ---------- |
| Tables                     |            |              |            |                | 1.1 TiB    |
| Freelist                   | 20893      |              |            |                | 81.6 MiB   |

db stats on a regular node:

ubuntu@reth9:~/reth$ ./target/profiling/reth db --datadir /mnt/reth-mainnet/ stats
2025-11-13T15:42:14.343426Z  INFO Initialized tracing, debug log directory: /home/ubuntu/.cache/reth/logs/mainnet
2025-11-13T15:42:14.345517Z  INFO Opening storage db_path="/mnt/reth-mainnet/db" sf_path="/mnt/reth-mainnet/static_files"
2025-11-13T15:42:14.357218Z  INFO Verifying storage consistency.
| Segment      | Block Range  | Transaction Range | Shape (columns x rows) | Size      |
|--------------|--------------|-------------------|------------------------|-----------|
| Headers      | 0..=23000000 | N/A               | 3 x 23000001           | 11.8 GiB  |
| Transactions | 0..=23000000 | 0..=2907942767    | 1 x 2907942768         | 721.3 GiB |
| Receipts     | 0..=23000000 | 0..=2907942767    | 1 x 2907942768         | 332.5 GiB |
| ------------ | ------------ | ----------------- | ---------------------- | --------- |
| Total        |              |                   |                        | 1 TiB     |


| Table Name                 | # Entries  | Branch Pages | Leaf Pages | Overflow Pages | Total Size |
|----------------------------|------------|--------------|------------|----------------|------------|
| AccountChangeSets          | 3787160399 | 15867407     | 48833564   | 0              | 246.8 GiB  |
| AccountsHistory            | 67493      | 19           | 1907       | 1100           | 11.8 MiB   |
| AccountsTrie               | 24075389   | 4126         | 996709     | 0              | 3.8 GiB    |
| AccountsTrieChangeSets     | 125727     | 2989         | 16809      | 0              | 77.3 MiB   |
| BlockBodyIndices           | 23000001   | 609          | 136118     | 0              | 534.1 MiB  |
| BlockOmmers                | 1209744    | 844          | 188675     | 0              | 740.3 MiB  |
| BlockWithdrawals           | 5965061    | 3806         | 852142     | 0              | 3.3 GiB    |
| Bytecodes                  | 1757397    | 2835         | 179017     | 3608635        | 14.5 GiB   |
| CanonicalHeaders           | 0          | 0            | 0          | 0              | 0 B        |
| ChainState                 | 0          | 0            | 0          | 0              | 0 B        |
| HashedAccounts             | 304987658  | 43165        | 4100529    | 0              | 15.8 GiB   |
| HashedStorages             | 1339912717 | 612376       | 18592372   | 0              | 73.3 GiB   |
| HeaderNumbers              | 23000001   | 11299        | 547070     | 0              | 2.1 GiB    |
| HeaderTerminalDifficulties | 0          | 0            | 0          | 0              | 0 B        |
| Headers                    | 0          | 0            | 0          | 0              | 0 B        |
| Metadata                   | 1          | 0            | 1          | 0              | 4 KiB      |
| PlainAccountState          | 304987658  | 50115        | 4632009    | 0              | 17.9 GiB   |
| PlainStorageState          | 1339912717 | 883587       | 28323800   | 0              | 111.4 GiB  |
| PruneCheckpoints           | 1          | 0            | 1          | 0              | 4 KiB      |
| Receipts                   | 0          | 0            | 0          | 0              | 0 B        |
| StageCheckpointProgresses  | 1          | 0            | 1          | 0              | 4 KiB      |
| StageCheckpoints           | 16         | 0            | 1          | 0              | 4 KiB      |
| StorageChangeSets          | 6882009001 | 4504624      | 118596017  | 0              | 469.6 GiB  |
| StoragesHistory            | 538338     | 243          | 13720      | 213            | 55.4 MiB   |
| StoragesTrie               | 117839052  | 595741       | 6016100    | 0              | 25.2 GiB   |
| StoragesTrieChangeSets     | 205298     | 7876         | 30288      | 0              | 149.1 MiB  |
| TransactionBlocks          | 21360530   | 613          | 136927     | 0              | 537.3 MiB  |
| TransactionHashNumbers     | 2907942768 | 1436716      | 70050725   | 0              | 272.7 GiB  |
| TransactionSenders         | 2907942768 | 121328       | 27177036   | 0              | 104.1 GiB  |
| Transactions               | 0          | 0            | 0          | 0              | 0 B        |
| VersionHistory             | 1          | 0            | 1          | 0              | 4 KiB      |
| -------------------------- | ---------- | ------------ | ---------- | -------------- | ---------- |
| Tables                     |            |              |            |                | 1.3 TiB    |
| Freelist                   | 20893      |              |            |                | 81.6 MiB   |

This is a saving of about 43% for account changesets: 246.8 GiB in the database on a regular node versus 140.8 GiB in static files with this PR.
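The figure can be checked from the two sizes in the tables above (246.8 GiB for the `AccountChangeSets` table, 140.8 GiB for the static file segment). A tiny sketch, where `savings_fraction` is a hypothetical helper name:

```rust
/// Fraction of space saved when going from `before` to `after` (same unit for both).
pub fn savings_fraction(before: f64, after: f64) -> f64 {
    (before - after) / before
}
```

With the sizes above this is (246.8 - 140.8) / 246.8, which is roughly 0.43.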

@gakonst
Member

gakonst commented Oct 6, 2025

Super excited for this

@Rjected Rjected force-pushed the dan/account-changeset-static-files branch from 64d1a7b to c746862 Compare October 7, 2025 16:25
@codspeed-hq

codspeed-hq bot commented Oct 7, 2025

CodSpeed Performance Report

Merging this PR will not alter performance

Comparing dan/account-changeset-static-files (9f24f3a) with main (ef70879)

Summary

✅ 118 untouched benchmarks
⏩ 7 skipped benchmarks (see footnote 1)

Footnotes

  1. 7 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@Rjected Rjected force-pushed the dan/account-changeset-static-files branch 2 times, most recently from f4367fc to ca64041 Compare October 8, 2025 14:45
@Rjected Rjected force-pushed the dan/account-changeset-static-files branch from 6b3d318 to 7ac0911 Compare October 14, 2025 21:16
@jenpaff jenpaff linked an issue Oct 15, 2025 that may be closed by this pull request
@Rjected Rjected force-pushed the dan/account-changeset-static-files branch from 7ac0911 to 31f3362 Compare October 16, 2025 15:03
@joshieDo
Collaborator

we might want to try this one with lz4

// Transaction and Receipt already have the compression scheme used natively in its encoding.
// (zstd-dictionary)
if segment.is_headers() {
    jar = jar.with_lz4();
}
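The suggestion amounts to extending that per-segment choice to the new segment. A hedged sketch of the idea, using stand-in enums rather than reth's real types; it assumes account changeset rows, like headers, are not already compressed by their own encoding:

```rust
/// Stand-in for the static file segments discussed in this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Segment {
    Headers,
    Transactions,
    Receipts,
    AccountChangeSets,
}

/// Hypothetical file-level compression choice.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Compression {
    Lz4,
    Uncompressed,
}

pub fn compression_for(segment: Segment) -> Compression {
    match segment {
        // Assumption: these rows carry no native compression, so compress the file.
        Segment::Headers | Segment::AccountChangeSets => Compression::Lz4,
        // Per the quoted comment, these are already compressed in their encoding.
        Segment::Transactions | Segment::Receipts => Compression::Uncompressed,
    }
}
```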

@Rjected Rjected force-pushed the dan/account-changeset-static-files branch 6 times, most recently from 02cdb37 to 2c67469 Compare October 23, 2025 23:15
@Rjected Rjected force-pushed the dan/account-changeset-static-files branch from 2c67469 to b547cd2 Compare October 31, 2025 03:55
@Rjected Rjected force-pushed the dan/account-changeset-static-files branch from 4b3a75a to f2193a1 Compare November 3, 2025 21:03
@Rjected
Member Author

Rjected commented Nov 3, 2025

figured out why this is causing state root mismatches sometimes, we are still using raw txs to access changesets in AccountExtReader:

let mut changeset_cursor = self.tx.cursor_read::<tables::AccountChangeSets>()?;

.cursor_read::<tables::AccountChangeSets>()?

this causes problems for account hashing, particularly when you have already run the pipeline once
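A toy model of the failure mode, assuming a provider that can hold changesets either in the database or in static files. Everything here (`Provider`, its fields, both method names' bodies, the `Change` stand-in) is illustrative and not reth's API; the point is only that a raw database read sees nothing once writes go to static files.

```rust
use std::collections::BTreeMap;

type BlockNumber = u64;
/// Stand-in for an account change: (address byte, previous nonce).
type Change = (u8, u64);

/// Toy provider; both stores and the flag are illustrative.
pub struct Provider {
    /// Mirrors the idea behind `--static-file.account-change-sets`.
    pub changesets_in_static_files: bool,
    pub database: BTreeMap<BlockNumber, Vec<Change>>,
    pub static_files: BTreeMap<BlockNumber, Vec<Change>>,
}

impl Provider {
    /// Reads the database table directly, ignoring where data is actually written.
    pub fn raw_db_changeset(&self, block: BlockNumber) -> Vec<Change> {
        self.database.get(&block).cloned().unwrap_or_default()
    }

    /// Consults the storage setting first, then reads from the matching store.
    pub fn account_block_changeset(&self, block: BlockNumber) -> Vec<Change> {
        let store = if self.changesets_in_static_files {
            &self.static_files
        } else {
            &self.database
        };
        store.get(&block).cloned().unwrap_or_default()
    }
}
```

With the flag enabled, `raw_db_changeset` returns an empty list for a block whose changes exist only in static files, so anything derived from it (such as hashed account state) silently misses those changes.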

@Rjected Rjected force-pushed the dan/account-changeset-static-files branch from 994044e to af1bdd8 Compare November 4, 2025 15:52
.static_file_provider
.get_highest_static_file_block(StaticFileSegment::AccountChangeSets);

if let Some(highest) = highest_static_block {
Collaborator

@joshieDo joshieDo Nov 4, 2025

is it possible to move this logic into an impl AccountExtReader for StaticFileProvider, similarly to the others?

Member Author

yes, should be possible

Comment on lines +1458 to +1467
let block_changesets = state
    .block_ref()
    .execution_output
    .bundle
    .reverts
    .clone()
    .to_plain_state_reverts()
    .accounts
    .into_iter()
    .flatten()
Collaborator

nit: could this be a helper? feels like I've seen this duplicated before

Comment on lines +1796 to +1797
// Write account changes to static files
tracing::trace!("Writing account changes to static files");
Collaborator

can we add a small note on the linked snippet that changesets get initialized on insert_state -> write_state -> write_state_reverts

// Static file segments start empty, so we need to initialize the genesis block.
let static_file_provider = provider_rw.static_file_provider();
static_file_provider.latest_writer(StaticFileSegment::Receipts)?.increment_block(0)?;
static_file_provider.latest_writer(StaticFileSegment::Transactions)?.increment_block(0)?;

Collaborator

@joshieDo joshieDo Nov 4, 2025

maybe very edge case, but if the genesis has no state changes, i guess we'd never actually initialize the segment. (setting block range from None to Some(0..=0))

Collaborator

@joshieDo joshieDo Nov 4, 2025

i guess it can be generalized: if a block has no state changes (is it even possible?), we wouldn't increment the block either

nvm, we still iterate through the blocks even if they'd have no state changes

for (block_index, mut account_block_reverts) in reverts.accounts.into_iter().enumerate() {

Member Author

@Rjected Rjected Nov 20, 2025

Added a note, yeah we still add a changeset offset for the block even if there are no changes
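A small sketch of that behaviour, assuming offsets are built from a list of per-block change lists. `build_offsets` is a hypothetical helper, and `ChangesetOffset` is a simplified stand-in for the type quoted elsewhere in this review.

```rust
/// Simplified stand-in: where a block's rows start and how many there are.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ChangesetOffset {
    pub offset: u64,
    pub num_changes: u64,
}

/// Builds one offset entry per block from per-block change lists.
pub fn build_offsets<T>(blocks: &[Vec<T>]) -> Vec<ChangesetOffset> {
    let mut next_row = 0u64;
    let mut offsets = Vec::with_capacity(blocks.len());
    for changes in blocks {
        let num_changes = changes.len() as u64;
        // Recorded even when `num_changes` is 0, so the entry index still maps to the block.
        offsets.push(ChangesetOffset { offset: next_row, num_changes });
        next_row += num_changes;
    }
    offsets
}
```

Because every block gets an entry, a block without changes shows up as a zero-length row range instead of leaving a gap in the block range.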

@Rjected Rjected force-pushed the dan/account-changeset-static-files branch from f4d8f5f to 9332e17 Compare November 10, 2025 21:28
@Rjected
Member Author

Rjected commented Nov 10, 2025

Update, this no longer has state root mismatches. Testing running this on a very large range of blocks to compare size versus the table in mdbx.

TODOs required for this PR to be complete so far:

  • Introduce method is_changeset_based instead of is_block_based / is_transaction_based
  • Use StorageSettings to determine behavior when reading / writing
  • Look into test_segment_config_backwards
  • Remove traces added
  • Put increment_block in append_account_changeset
  • Fix IndexAccountHistory collector to walk incrementally rather than collect all changesets in range

/// Segment type
segment: StaticFileSegment,
/// List of offsets, for where each block's changeset starts.
changeset_offsets: Option<Vec<ChangesetOffset>>,
Member

cc @joshieDo this makes the segment header size dynamic, is it ok?

}

// Advance to next block if we exhausted the previous one
if !self.current_changesets.is_empty() {
Member

can we add a comment why this shouldn't be if self.current_changesets.is_empty()? I don't understand the negation here

Member Author

@Rjected Rjected Dec 2, 2025

if we do not return from the previous condition (meaning changesets.get(self.changeset_index) is None), but current_changesets is non-empty, it means we have iterated through all changesets in the current block and need to fetch a changeset for the next block. Will add a comment for this

Member Author

added comment
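For readers following the walker discussion, here is a simplified sketch of the general idea: yield rows from the currently loaded block, and load the next block only once the current one is exhausted. It is not the PR's implementation; `ChangesetWalker`, the `fetch_block` callback and the `u32` stand-in for a change are assumptions, and this variant sidesteps the negated check by loading whenever no row is left at the current index.

```rust
/// Walks account changes one block at a time instead of collecting a whole range.
pub struct ChangesetWalker<F> {
    fetch_block: F,
    next_block: Option<u64>,
    end_block: u64,
    loaded_block: u64,
    current_changesets: Vec<u32>,
    changeset_index: usize,
}

impl<F: FnMut(u64) -> Vec<u32>> ChangesetWalker<F> {
    /// Walks blocks `start_block..=end_block`, loading each via `fetch_block`.
    pub fn new(start_block: u64, end_block: u64, fetch_block: F) -> Self {
        Self {
            fetch_block,
            next_block: Some(start_block),
            end_block,
            loaded_block: start_block,
            current_changesets: Vec::new(),
            changeset_index: 0,
        }
    }
}

impl<F: FnMut(u64) -> Vec<u32>> Iterator for ChangesetWalker<F> {
    type Item = (u64, u32);

    fn next(&mut self) -> Option<Self::Item> {
        loop {
            // Rows left in the loaded block: hand out the next one.
            if let Some(&change) = self.current_changesets.get(self.changeset_index) {
                self.changeset_index += 1;
                return Some((self.loaded_block, change));
            }
            // Nothing left (or nothing loaded yet): move on to the next block, if any.
            let block = match self.next_block {
                Some(block) if block <= self.end_block => block,
                _ => return None,
            };
            self.current_changesets = (self.fetch_block)(block);
            self.changeset_index = 0;
            self.loaded_block = block;
            self.next_block = block.checked_add(1);
        }
    }
}
```

Only one block's changes are held in memory at a time, and blocks without changes are skipped by the loop rather than ending the iteration.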

let mut destroyed_accounts = HashSet::default();

// Get account changesets using the provider (handles static files + database)
let account_changesets = provider.account_changesets_range(*range.start()..*range.end() + 1)?;
Member

would be nice for account_changesets_range to accept an impl RangeBounds so we don't have to do this
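A sketch of what accepting `impl RangeBounds` could look like on the callee side, using only the standard library. `to_half_open` is a hypothetical helper name; how the real `account_changesets_range` would use it is not specified here.

```rust
use std::ops::{Bound, Range, RangeBounds};

/// Converts any block range into a half-open `start..end` range.
/// Returns `None` if an inclusive end or exclusive start would overflow `u64`.
pub fn to_half_open(range: impl RangeBounds<u64>) -> Option<Range<u64>> {
    let start = match range.start_bound() {
        Bound::Included(&start) => start,
        Bound::Excluded(&start) => start.checked_add(1)?,
        Bound::Unbounded => 0,
    };
    let end = match range.end_bound() {
        Bound::Included(&end) => end.checked_add(1)?,
        Bound::Excluded(&end) => end,
        // Note: an unbounded end is clamped, so block `u64::MAX` itself is excluded.
        Bound::Unbounded => u64::MAX,
    };
    Some(start..end)
}
```

A caller holding an inclusive range could then pass `range.clone()` directly instead of writing `*range.start()..*range.end() + 1`, and the `+ 1` overflow case is handled in one place.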

@github-project-automation github-project-automation bot moved this from Backlog to In Progress in Reth Tracker Dec 1, 2025
Collaborator

@mattsse mattsse left a comment

@joshieDo and @shekhirin already flagged a few things

reading and writing logic, especially bounds checks, are always very complex, so perhaps we could add a few more helpful docs here and there, when we read/write sets for example

Comment on lines +472 to +474
let db_args = reth_node_core::args::DatabaseArgs::default();
let db_args = db_args.database_args();
let db_env = reth_db::init_db(&db_path, db_args).unwrap();
Collaborator

I see, this way we're going through the cli args

Comment on lines +171 to +172
/// Number of changes in this changeset
num_changes: u64,
Collaborator

I'd assume that this is already respected

Comment on lines +1512 to +1513
.clone()
.to_plain_state_reverts()
Collaborator

uff, unsure how often we need this so might be fine

@Rjected Rjected force-pushed the dan/account-changeset-static-files branch 14 times, most recently from 29fc813 to e025fc0 Compare December 3, 2025 20:37
Comment on lines +132 to +133
// Make sure to set storage settings before anything reads / writes
factory.set_storage_settings_cache(storage_settings);
Collaborator

@joshieDo joshieDo Dec 15, 2025

think this should only be called when really initializing the genesis. ProviderFactory::new fetches and caches it from storage if it exists:

// Load storage settings from database at init time. Creates a temporary provider
// to read persisted settings, falling back to legacy defaults if none exist.
//
// Both factory and all providers it creates should share these cached settings.
let legacy_settings = StorageSettings::legacy();
let storage_settings = DatabaseProvider::<_, N>::new(
    db.tx()?,
    chain_spec.clone(),
    static_file_provider.clone(),
    Default::default(),
    Default::default(),
    Arc::new(RwLock::new(legacy_settings)),
    rocksdb_provider.clone(),
)
.storage_settings()?
.unwrap_or(legacy_settings);
Ok(Self {
    db,
    chain_spec,
    static_file_provider,
    prune_modes: PruneModes::default(),
    storage: Default::default(),
    storage_settings: Arc::new(RwLock::new(storage_settings)),
    rocksdb_provider,
})

This would replace its cache with whatever default we end up changing to.

maybe renaming this to genesis_storage_settings can avoid future confusion?

Collaborator

we always call this when we launch the node btw:

Member Author

we now only call this when we actually are writing the genesis, fn also renamed

@joshieDo
Collaborator

need a follow-up to handle the pruner:

let mut highest_deleted_accounts = FxHashMap::default();
let (pruned_changesets, done) =
    provider.tx_ref().prune_table_with_range::<tables::AccountChangeSets>(
        range,
        &mut limiter,
        |_| false,
        |(block_number, account)| {
            highest_deleted_accounts.insert(account.address, block_number);
            last_changeset_pruned_block = Some(block_number);
        },
    )?;

maybe, worth waiting for the rocksdb implementation of AccountHistory


impl StorageSettings {
    /// Creates a new `StorageSettings` with default values.
    pub const fn new() -> Self {
Collaborator

@joshieDo joshieDo Dec 15, 2025

iirc, fn new was intentionally left out, so that creating StorageSettings would be explicit about what is being created

Member Author

removed

Member

@yongkangc yongkangc left a comment

lgtm, don't have additional comments that haven't already been brought up

if EitherWriter::account_changesets_destination(provider).is_database() {
    // Old pruned nodes (including full node) do not store receipts as static
    // files.
    debug!(target: "reth::providers::static_file", ?segment, "Skipping account changesets consistency check: receipts stored in database");
Contributor

@meetrick meetrick Dec 31, 2025

Should this be "account changesets stored in database" instead of "receipts stored in database"?
No functional impact, but this would be clearer for logs and future debugging.
The comment above (lines 1094–1095) also mentions receipts, while this block handles AccountChangeSets.

Member

Member Author

fixed

Contributor

@meetrick meetrick left a comment

LGTM

Rjected and others added 2 commits January 7, 2026 16:45
chore: add changeset info to segment header

feat(provider): implement append_account_changeset

chore: formatting and more TODOs

chore: add stub for account_before_block

fix: use proper value for account changeset static files

feat: impl binary search for address in changeset

feat: finish writer fn, add tests

chore: add fallback to db when not found in static files

feat: add account_changesets_range to ChangesetReader

feat: add support for account changesets in IndexAccountHistory

feat: add support for account changesets in static file producer

chore: serialization changes

feat: add support for account changesets in db get

chore: docs and attempt to fix db get

fix: properly upgrade segment header serialize / deserialize

feat: add read-only segments

feat: add cli flag for enabling v2 static files

chore: remove Segment for AccountChangeSets

chore: add unreachable! for unreachable db get branch

chore: remove noisy traces

chore: make binary search more concise

feat: add support for account changeset static file pruning

feat: make HashedPostState::from_reverts work with static files

fix: don't write to db

chore: update book cli

chore: make clippy happy

chore: fix test compilation

chore: fix feat propagation

feat: add StaticFileRangeWalker skeleton

fix: fix inclusive ranges

feat: impl removal for changesets in exec unwind

chore: replace prefixsetloader with fn that uses provider

chore(trie): propagate errors from root methods

chore: make clippy happy

chore: no underscores

chore(static-file): call increment_block when appending account changesets

chore(merkle-changesets): add logs for changeset reverts

fix: return 0 for start

feat: add account_changeset_count to ChangeSetReader

fix: introduce walker to prevent OOM in IndexAccountHistory

fix: remove static file v2 read-only segment stuff

chore: remove traces

feat: add storage_settings branches

chore: backwards compat segment header writing

fix: only try to read offsets for offset segments

fix: fix test for account changeset format

chore: add lz4 and make database check better

fix: custom serializer for SegmentHeader

chore: update settings cli to have account changeset option

chore: integrate EitherWriter for account changesets

chore: rm outdated doc

chore: check storage settings for reading always

chore: add note on changeset static file initialization

chore: doc fixes

chore: make clippy happy

chore: update book cli

chore: use core instead of std

chore: fix doc links

fix: stop using stale storage settings

fix: fix incremental walker loop

chore: update snapshot test

wip: rangebounds


Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

AccountChangeSets static files segment

7 participants