feat: add StaticFileSegment::AccountChangeSets#18882
Conversation
|
Super excited for this |
64d1a7b to
c746862
Compare
CodSpeed Performance ReportMerging this PR will not alter performanceComparing Summary
Footnotes
|
f4367fc to
ca64041
Compare
6b3d318 to
7ac0911
Compare
7ac0911 to
31f3362
Compare
|
we might want to try this one with reth/crates/storage/provider/src/providers/static_file/writer.rs Lines 833 to 837 in f0c0b3d |
02cdb37 to
2c67469
Compare
2c67469 to
b547cd2
Compare
4b3a75a to
f2193a1
Compare
|
figured out why this is causing state root mismatches sometimes, we are still using raw txs to access changesets in this causes problems for account hashing, particularly when you have already run the pipeline once |
994044e to
af1bdd8
Compare
| .static_file_provider | ||
| .get_highest_static_file_block(StaticFileSegment::AccountChangeSets); | ||
|
|
||
| if let Some(highest) = highest_static_block { |
There was a problem hiding this comment.
is it possible to move this logic by impl AccountExtReader for StaticFileProvider similarly to the others ?
| let block_changesets = state | ||
| .block_ref() | ||
| .execution_output | ||
| .bundle | ||
| .reverts | ||
| .clone() | ||
| .to_plain_state_reverts() | ||
| .accounts | ||
| .into_iter() | ||
| .flatten() |
There was a problem hiding this comment.
nit: could this be a helper? feels like ive seen this before dupped
| // Write account changes to static files | ||
| tracing::trace!("Writing account changes to static files"); |
There was a problem hiding this comment.
can we add a small note on the linked snippet that changesets get initialized on insert_state -> write_state -> write_state_reverts
reth/crates/storage/db-common/src/init.rs
Lines 159 to 162 in 3ae73e6
There was a problem hiding this comment.
maybe very edge case, but if the genesis has no state changes, i guess we'd never actually initialize the segment. (setting block range from None to Some(0..=0))
There was a problem hiding this comment.
i guess it can be generalized that if a block has no state changes (is it even possible?), we wouldnt increment the block either
nvm, we still iterate through the blocks even if they'd have no state changes
There was a problem hiding this comment.
Added a note, yeah we still add a changeset offset for the block even if there are no changes
f4d8f5f to
9332e17
Compare
|
Update, this no longer has state root mismatches. Testing running this on a very large range of blocks to compare size versus the table in mdbx. TODOs required for this PR to be complete so far:
|
| /// Segment type | ||
| segment: StaticFileSegment, | ||
| /// List of offsets, for where each block's changeset starts. | ||
| changeset_offsets: Option<Vec<ChangesetOffset>>, |
There was a problem hiding this comment.
cc @joshieDo this makes the segment header size dynamic, is it ok?
| } | ||
|
|
||
| // Advance to next block if we exhausted the previous one | ||
| if !self.current_changesets.is_empty() { |
There was a problem hiding this comment.
can we add a comment why this shouldn't be if self.current_changests.is_empty()? I don't understand the negation here
There was a problem hiding this comment.
if we do not return from the previous condition (meaning changesets.get(self.changeset_index) is None), but current_changesets is non-empty, it means we have iterated through all changesets in the current block and need to fetch a changeset for the next block. Will add a comment for this
| let mut destroyed_accounts = HashSet::default(); | ||
|
|
||
| // Get account changesets using the provider (handles static files + database) | ||
| let account_changesets = provider.account_changesets_range(*range.start()..*range.end() + 1)?; |
There was a problem hiding this comment.
would be nice for account_changesets_range to accept an impl RangeBounds so we don't have to do this
mattsse
left a comment
There was a problem hiding this comment.
@joshieDo and @shekhirin already flagged a few things
reading and writing logic, especially bounds checks, are always very complex, so perhaps we could add a few more helpful docs here and there, when we read/write sets for example
| let db_args = reth_node_core::args::DatabaseArgs::default(); | ||
| let db_args = db_args.database_args(); | ||
| let db_env = reth_db::init_db(&db_path, db_args).unwrap(); |
There was a problem hiding this comment.
I see, this way we're going through the cli args
| /// Number of changes in this changeset | ||
| num_changes: u64, |
There was a problem hiding this comment.
I'd assume that this is already respected
| .clone() | ||
| .to_plain_state_reverts() |
There was a problem hiding this comment.
uff, unsure how often we need this so might be fine
29fc813 to
e025fc0
Compare
crates/storage/db-common/src/init.rs
Outdated
| // Make sure to set storage settings before anything reads / writes | ||
| factory.set_storage_settings_cache(storage_settings); |
There was a problem hiding this comment.
think this should only be called when really initializing the genesis. ProviderFactory::new fetches and caches it from storage if it exists:
reth/crates/storage/provider/src/providers/database/mod.rs
Lines 94 to 119 in 9489667
maybe renaming this to genesis_storage_settings can remove future confusions?
There was a problem hiding this comment.
we always call this when we launch the node btw:
reth/crates/node/builder/src/launch/engine.rs
Line 125 in b5e7a69
There was a problem hiding this comment.
we now only call this when we actually are writing the genesis, fn also renamed
|
need a follow-up to handle the pruner: reth/crates/prune/prune/src/segments/user/account_history.rs Lines 80 to 90 in b5e7a69 maybe, worth waiting for the rocksdb implementation of |
|
|
||
| impl StorageSettings { | ||
| /// Creates a new `StorageSettings` with default values. | ||
| pub const fn new() -> Self { |
There was a problem hiding this comment.
iirc, fn new was intentionally left out, so StorageSettings create would be explicit on whats creating
yongkangc
left a comment
There was a problem hiding this comment.
lgtm, dont have additional comments here that hasnt been brought up
| if EitherWriter::account_changesets_destination(provider).is_database() { | ||
| // Old pruned nodes (including full node) do not store receipts as static | ||
| // files. | ||
| debug!(target: "reth::providers::static_file", ?segment, "Skipping account changesets consistency check: receipts stored in database"); |
There was a problem hiding this comment.
Should this beaccount changesets stored in database instead of receipts?
No functional impact, but this would be clearer for logs and future debugging.
The comment above (lines 1094–1095) also mentions receipts, while this block handles AccountChangeSets.
chore: add changeset info to segment header feat(provider): implement append_account_changeset chore: formatting and more TODOs chore: add stub for account_before_block fix: use proper value for account changeset static files feat: impl binary search for address in changeset feat: finish writer fn, add tests chore: add fallback to db when not found in static files feat: add account_changesets_range to ChangesetReader feat: add support for account changesets in IndexAccountHistory feat: add support for account changesets in static file producer chore: serialization changes feat: add support for account changesets in db get chore: docs and attempt to fix db get fix: properly upgrade segment header serialize / deserialize feat: add read-only segments feat: add cli flag for enabling v2 static files chore: remove Segment for AccountChangeSets chore: add unreachable! for unreachable db get branch chore: remove noisy traces chore: make binary search more concise feat: add support for account changeset static file pruning feat: make HashedPostState::from_reverts work with static files fix: don't write to db chore: update book cli chore: make clippy happy chore: fix test compilation chore: fix feat propagation feat: add StaticFileRangeWalker skeleton fix: fix inclusive ranges feat: impl removal for changesets in exec unwind chore: replace prefixsetloader with fn that uses provider chore(trie): propagate errors from root methods chore: make clippy happy chore: no underscores chore(static-file): call increment_block when appending account changesets chore(merkle-changesets): add logs for changeset reverts fix: return 0 for start feat: add account_changeset_count to ChangeSetReader fix: introduce walker to prevent OOM in IndexAccountHistory fix: remove static file v2 read-only segment stuff chore: remove traces feat: add storage_settings branches chore: backwards compat segment header writing fix: only try to read offsets for offset segments fix: fix test for account changeset format chore: add lz4 and make database check better fix: custom serializer for SegmentHeader chore: update settings cli to have account changeset option chore: integrate EitherWriter for account changesets chore: rm outdated doc chore: check storage settings for reading always chore: add note on changeset static file initialization chore: doc fixes chore: make clippy happy chore: update book cli chore: use core instead of std chore: fix doc links fix: stop using stale storage settings fix: fix incremental walker loop chore: update snapshot test wip: rangebounds
ref #18846
Introduces a new flag
--static-file.account-change-setswhich controls whether or not we should write account changesets to static files or the DB.Adds account changesets as a new static file segment, where each row is a change. This makes them different from "block-based" static files like headers, and "transaction-based" static files like transactions and receipts. Row ranges for each block are stored in the header for the static file segment.
Backwards compat serialization / deserialization code is added because this adds a field to the static file header.
db statson a node with this PR:db statson a regular node:This is about 40% savings, going from 246G on a regular node, to 140G with this PR