
feat: implement table range checksums for reth db checksum #7623

Merged: Rjected merged 9 commits into paradigmxyz:main from AbnerZheng:issue-7561 on May 22, 2024

Conversation

AbnerZheng (Contributor) commented Apr 13, 2024

Closes #7561.

Please check whether this is heading in the right direction.

cargo run --bin reth db checksum HashedAccounts --datadir ~/.local/share/reth/holesky  --start 0x005e54f1867fd030f90673b8b625ac8f0656e44a88cfc0b3af3e3f3c3d486960 --end 0x03089e01be9eb2af5ff5fa1c5983c6c6fb78dd734658d1f8f11d4f8d27a23fd5

And here is the result:

2024-04-13T17:16:42.693809Z  WARN This command should be run without the node running!
2024-04-13T17:16:42.695030Z  INFO <range>:: 0x005e54f1867fd030f90673b8b625ac8f0656e44a88cfc0b3af3e3f3c3d486960..=0x03089e01be9eb2af5ff5fa1c5983c6c6fb78dd734658d1f8f11d4f8d27a23fd5
2024-04-13T17:16:42.695765Z  INFO Hashed 0 entries.
2024-04-13T17:16:42.695850Z  INFO Hashed 4 entries.
2024-04-13T17:16:42.695895Z  INFO Checksum for table `HashedAccounts`: 0xf0b3ac90be2afa66 (elapsed: 301.167µs)
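
To make the idea of a range checksum concrete, here is a minimal sketch and not reth's actual implementation. It hashes every key/value pair whose key falls inside an inclusive range and reports the first and last key seen. The in-memory `BTreeMap` table, the names `RangeChecksum` and `range_checksum`, and the use of the standard library's `DefaultHasher` are all assumptions made for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::Hasher;

/// Result of hashing a key range: checksum, entry count, first and last key seen.
#[derive(Debug, PartialEq)]
pub struct RangeChecksum {
    pub checksum: u64,
    pub entries: usize,
    pub start_key: Option<Vec<u8>>,
    pub end_key: Option<Vec<u8>>,
}

/// Hashes every (key, value) pair with `start <= key <= end`, in key order.
/// A `None` bound is open-ended. Hypothetical helper over an in-memory table.
pub fn range_checksum(
    table: &BTreeMap<Vec<u8>, Vec<u8>>,
    start: Option<&[u8]>,
    end: Option<&[u8]>,
) -> RangeChecksum {
    // Stand-in hasher (assumption); any deterministic 64-bit hasher would do.
    let mut hasher = DefaultHasher::new();
    let mut out = RangeChecksum { checksum: 0, entries: 0, start_key: None, end_key: None };

    for (key, value) in table.iter() {
        // Skip keys below the lower bound.
        if start.map_or(false, |s| key.as_slice() < s) {
            continue;
        }
        // Keys are sorted, so we can stop once we pass the upper bound.
        if end.map_or(false, |e| key.as_slice() > e) {
            break;
        }
        hasher.write(key);
        hasher.write(value);
        if out.start_key.is_none() {
            out.start_key = Some(key.clone());
        }
        out.end_key = Some(key.clone());
        out.entries += 1;
    }

    out.checksum = hasher.finish();
    out
}
```

Two tables that hold the same entries inside the range produce the same checksum, while a single changed value inside the range changes it. Entries outside the range have no effect.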

Rjected (Member) left a comment

nice! I think this is in the right direction, although an eventual goal would be to use this in reth db stats --checksum. So I think it would be useful to document the units for each range, per table. For example, if you were to try to compare two nodes' HashedAccounts tables, how would you determine ranges? The tables that are indexed by block are slightly easier, because you can just go by some number of blocks, but that should still be documented
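
For the block-indexed tables mentioned above, "going by some number of blocks" could look like the sketch below. It splits a block interval into consecutive inclusive chunks that two nodes could checksum one by one. The function name `block_chunks` is hypothetical, and how each bound would be encoded as a key on the command line is left open.

```rust
use std::ops::RangeInclusive;

/// Splits `first..=last` into consecutive inclusive chunks of at most `step` blocks.
/// Hypothetical helper; returns an empty list when `first > last`.
pub fn block_chunks(first: u64, last: u64, step: u64) -> Vec<RangeInclusive<u64>> {
    assert!(step > 0, "step must be non-zero");
    let mut chunks = Vec::new();
    let mut start = first;
    while start <= last {
        // Clamp the chunk end to `last`; saturating_add guards against overflow.
        let end = start.saturating_add(step - 1).min(last);
        chunks.push(start..=end);
        if end == u64::MAX {
            break;
        }
        start = end + 1;
    }
    chunks
}
```

Each chunk's start and end would serve as the range bounds for a table keyed by `BlockNumber`, such as `CanonicalHeaders`.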

Rjected (Member) commented Apr 24, 2024

hey @AbnerZheng did my comment make sense / are you stuck on anything?

AbnerZheng (Contributor, Author) commented:

> hey @AbnerZheng did my comment make sense / are you stuck on anything?

It makes sense, but I am not familiar with these tables. I am trying to sync a node on my server so that I can inspect them.

Rjected (Member) commented Apr 29, 2024

> It makes sense, but I am not familiar with these tables. I am trying to sync a node on my server so that I can inspect them.

btw it should be possible to test this with a testnet node, for example holesky, which is much smaller! lmk if you're still blocked or don't understand something - the table definitions are here:

tables! {
    /// Stores the header hashes belonging to the canonical chain.
    table CanonicalHeaders<Key = BlockNumber, Value = HeaderHash>;

    /// Stores the total difficulty from a block header.
    table HeaderTerminalDifficulties<Key = BlockNumber, Value = CompactU256>;

    /// Stores the block number corresponding to a header.
    table HeaderNumbers<Key = BlockHash, Value = BlockNumber>;

    /// Stores header bodies.
    table Headers<Key = BlockNumber, Value = Header>;

    /// Stores block indices that contains indexes of transaction and the count of them.
    ///
    /// More information about stored indices can be found in the [`StoredBlockBodyIndices`] struct.
    table BlockBodyIndices<Key = BlockNumber, Value = StoredBlockBodyIndices>;

    /// Stores the uncles/ommers of the block.
    table BlockOmmers<Key = BlockNumber, Value = StoredBlockOmmers>;

    /// Stores the block withdrawals.
    table BlockWithdrawals<Key = BlockNumber, Value = StoredBlockWithdrawals>;

    /// Canonical only Stores the transaction body for canonical transactions.
    table Transactions<Key = TxNumber, Value = TransactionSignedNoHash>;

    /// Stores the mapping of the transaction hash to the transaction number.
    table TransactionHashNumbers<Key = TxHash, Value = TxNumber>;

    /// Stores the mapping of transaction number to the blocks number.
    ///
    /// The key is the highest transaction ID in the block.
    table TransactionBlocks<Key = TxNumber, Value = BlockNumber>;

    /// Canonical only Stores transaction receipts.
    table Receipts<Key = TxNumber, Value = Receipt>;

    /// Stores all smart contract bytecodes.
    /// There will be multiple accounts that have same bytecode
    /// So we would need to introduce reference counter.
    /// This will be small optimization on state.
    table Bytecodes<Key = B256, Value = Bytecode>;

    /// Stores the current state of an [`Account`].
    table PlainAccountState<Key = Address, Value = Account>;

    /// Stores the current value of a storage key.
    table PlainStorageState<Key = Address, Value = StorageEntry, SubKey = B256>;

    /// Stores pointers to block changeset with changes for each account key.
    ///
    /// Last shard key of the storage will contain `u64::MAX` `BlockNumber`,
    /// this would allows us small optimization on db access when change is in plain state.
    ///
    /// Imagine having shards as:
    /// * `Address | 100`
    /// * `Address | u64::MAX`
    ///
    /// What we need to find is number that is one greater than N. Db `seek` function allows us to fetch
    /// the shard that equal or more than asked. For example:
    /// * For N=50 we would get first shard.
    /// * for N=150 we would get second shard.
    /// * If max block number is 200 and we ask for N=250 we would fetch last shard and
    ///   know that needed entry is in `AccountPlainState`.
    /// * If there were no shard we would get `None` entry or entry of different storage key.
    ///
    /// Code example can be found in `reth_provider::HistoricalStateProviderRef`
    table AccountsHistory<Key = ShardedKey<Address>, Value = BlockNumberList>;

    /// Stores pointers to block number changeset with changes for each storage key.
    ///
    /// Last shard key of the storage will contain `u64::MAX` `BlockNumber`,
    /// this would allows us small optimization on db access when change is in plain state.
    ///
    /// Imagine having shards as:
    /// * `Address | StorageKey | 100`
    /// * `Address | StorageKey | u64::MAX`
    ///
    /// What we need to find is number that is one greater than N. Db `seek` function allows us to fetch
    /// the shard that equal or more than asked. For example:
    /// * For N=50 we would get first shard.
    /// * for N=150 we would get second shard.
    /// * If max block number is 200 and we ask for N=250 we would fetch last shard and
    ///   know that needed entry is in `StoragePlainState`.
    /// * If there were no shard we would get `None` entry or entry of different storage key.
    ///
    /// Code example can be found in `reth_provider::HistoricalStateProviderRef`
    table StoragesHistory<Key = StorageShardedKey, Value = BlockNumberList>;

    /// Stores the state of an account before a certain transaction changed it.
    /// Change on state can be: account is created, selfdestructed, touched while empty
    /// or changed balance,nonce.
    table AccountChangeSets<Key = BlockNumber, Value = AccountBeforeTx, SubKey = Address>;

    /// Stores the state of a storage key before a certain transaction changed it.
    /// If [`StorageEntry::value`] is zero, this means storage was not existing
    /// and needs to be removed.
    table StorageChangeSets<Key = BlockNumberAddress, Value = StorageEntry, SubKey = B256>;

    /// Stores the current state of an [`Account`] indexed with `keccak256Address`
    /// This table is in preparation for merkelization and calculation of state root.
    /// We are saving whole account data as it is needed for partial update when
    /// part of storage is changed. Benefit for merkelization is that hashed addresses are sorted.
    table HashedAccounts<Key = B256, Value = Account>;

    /// Stores the current storage values indexed with `keccak256Address` and
    /// hash of storage key `keccak256key`.
    /// This table is in preparation for merkelization and calculation of state root.
    /// Benefit for merklization is that hashed addresses/keys are sorted.
    table HashedStorages<Key = B256, Value = StorageEntry, SubKey = B256>;

    /// Stores the current state's Merkle Patricia Tree.
    table AccountsTrie<Key = StoredNibbles, Value = StoredBranchNode>;

    /// From HashedAddress => NibblesSubKey => Intermediate value
    table StoragesTrie<Key = B256, Value = StorageTrieEntry, SubKey = StoredNibblesSubKey>;

    /// Stores the transaction sender for each canonical transaction.
    /// It is needed to speed up execution stage and allows fetching signer without doing
    /// transaction signed recovery
    table TransactionSenders<Key = TxNumber, Value = Address>;

    /// Stores the highest synced block number and stage-specific checkpoint of each stage.
    table StageCheckpoints<Key = StageId, Value = StageCheckpoint>;

    /// Stores arbitrary data to keep track of a stage first-sync progress.
    table StageCheckpointProgresses<Key = StageId, Value = Vec<u8>>;

    /// Stores the highest pruned block number and prune mode of each prune segment.
    table PruneCheckpoints<Key = PruneSegment, Value = PruneCheckpoint>;

    /// Stores the history of client versions that have accessed the database with write privileges by unix timestamp in seconds.
    table VersionHistory<Key = u64, Value = ClientVersion>;
}
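
The shard lookup described in the `AccountsHistory` doc comment can be modeled with an ordered map. This is a sketch only: the `(Address, u64)` tuple stands in for `ShardedKey<Address>`, `BTreeMap::range` stands in for the database `seek`, and `seek_shard` is a hypothetical name rather than anything in reth.

```rust
use std::collections::BTreeMap;

type Address = [u8; 20];
/// (address, highest_block_number): an illustrative stand-in for `ShardedKey<Address>`.
type ShardKey = (Address, u64);

/// Returns the first shard of `address` whose highest block number is >= `block`.
/// Yields `None` when the seek lands on another address or past the end of the map.
pub fn seek_shard<'a>(
    history: &'a BTreeMap<ShardKey, Vec<u64>>,
    address: Address,
    block: u64,
) -> Option<(&'a ShardKey, &'a Vec<u64>)> {
    history
        // Tuples order by address first, then by block number, like the sharded key.
        .range((address, block)..)
        .next()
        // Reject a hit that belongs to a different address.
        .filter(|(key, _)| key.0 == address)
}
```

With shards `Address | 100` and `Address | u64::MAX`, seeking N=50 returns the first shard and N=150 returns the second, as in the doc comment. The `u64::MAX` sentinel is also why the `AccountsHistory` keys printed by the checksum command show `highest_block_number` as 18446744073709551615.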

@AbnerZheng AbnerZheng marked this pull request as ready for review April 30, 2024 15:18
@AbnerZheng AbnerZheng requested a review from onbjerg as a code owner April 30, 2024 15:18
AbnerZheng (Contributor, Author) commented Apr 30, 2024

@Rjected I have added another argument, limit, and the command now prints the start-key and end-key when computing the checksum.

I imagine the tool would be used like this:

  1. Get the table name by running reth db stats if you don't know the exact name.
  2. Run checksum with the limit argument specified, but without setting start-key or end-key. For example:
reth db checksum AccountsHistory --datadir ~/.local/share/reth/holesky --limit 100

The result would be like:

2024-04-30T15:27:36.360619Z  WARN This command should be run without the node running!
2024-04-30T15:27:36.360754Z  INFO Hashed 0 entries.
2024-04-30T15:27:36.360870Z  INFO Hashed 100 entries.
2024-04-30T15:27:36.360908Z  INFO start-key: {"key":"0x0000000000000000000000000000000000000000","highest_block_number":30161}
2024-04-30T15:27:36.360919Z  INFO end-key: {"key":"0x000000000000000000000000000000000000005e","highest_block_number":18446744073709551615}
2024-04-30T15:27:36.360934Z  INFO Checksum for table `AccountsHistory`: 0x98f8199844a34072 (elapsed: 178.629µs)
  3. Compare the start-key, end-key, and checksum with those from the other node. If they are the same, run the command again with the previous end-key as the new start-key, for example:
reth db checksum AccountsHistory --datadir ~/.local/share/reth/holesky --limit 100 --start-key '{"key":"0x000000000000000000000000000000000000005e","highest_block_number":18446744073709551615}'

We can get:

2024-04-30T15:32:01.614445Z  WARN This command should be run without the node running!
2024-04-30T15:32:01.614565Z  INFO start={"key":"0x000000000000000000000000000000000000005e","highest_block_number":18446744073709551615} 
 end= 
2024-04-30T15:32:01.614625Z  INFO Hashed 0 entries.
2024-04-30T15:32:01.614779Z  INFO Hashed 100 entries.
2024-04-30T15:32:01.614813Z  INFO start-key: {"key":"0x000000000000000000000000000000000000005e","highest_block_number":18446744073709551615}
2024-04-30T15:32:01.614820Z  INFO end-key: {"key":"0x00000000000000000000000000000000000000c1","highest_block_number":18446744073709551615}
2024-04-30T15:32:01.614833Z  INFO Checksum for table `AccountsHistory`: 0xadb85f752caba2fe (elapsed: 207.885µs)
  4. Repeat steps 2 and 3 to find the corrupt range, or use a binary search strategy if you prefer.

So instead of documenting the units, users can see what the keys look like by running the command with limit set and without start-key or end-key.
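
The walk described in the steps above can be sketched as a loop over two in-memory tables. This is an illustration under assumptions, not the tool itself: `chunk_checksum` and `first_mismatch` are hypothetical names, the tables are `BTreeMap`s, and `DefaultHasher` stands in for whatever hasher the command uses. As in the logs above, the previous end-key is reused as an inclusive start-key, so consecutive chunks overlap by one entry.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::Hasher;

type Table = BTreeMap<Vec<u8>, Vec<u8>>;

/// Hashes up to `limit` entries starting at `start` (inclusive).
/// Returns (checksum, first key, last key), or `None` if nothing was hashed.
fn chunk_checksum(
    table: &Table,
    start: Option<&Vec<u8>>,
    limit: usize,
) -> Option<(u64, Vec<u8>, Vec<u8>)> {
    let mut hasher = DefaultHasher::new(); // stand-in hasher (assumption)
    let mut first = None;
    let mut last = None;
    let entries = table
        .iter()
        .skip_while(|(k, _)| start.map_or(false, |s| *k < s))
        .take(limit);
    for (k, v) in entries {
        hasher.write(k);
        hasher.write(v);
        first.get_or_insert_with(|| k.clone());
        last = Some(k.clone());
    }
    Some((hasher.finish(), first?, last?))
}

/// Walks both tables chunk by chunk and returns the (start-key, end-key) of the
/// first chunk whose checksum or key bounds differ, or `None` if all chunks match.
pub fn first_mismatch(a: &Table, b: &Table, limit: usize) -> Option<(Vec<u8>, Vec<u8>)> {
    assert!(limit >= 2, "start-key is inclusive, so limit must be >= 2 to make progress");
    let mut start: Option<Vec<u8>> = None;
    loop {
        let ca = chunk_checksum(a, start.as_ref(), limit);
        let cb = chunk_checksum(b, start.as_ref(), limit);
        match (ca, cb) {
            (None, None) => return None,
            (Some(x), Some(y)) if x == y => {
                // The chunk held only the start key itself: end of the table.
                if start.as_ref() == Some(&x.2) {
                    return None;
                }
                // Reuse the end-key as the next (inclusive) start-key.
                start = Some(x.2);
            }
            (x, y) => {
                // Report the bounds seen on the first table, else on the second.
                let (_, s, e) = x.or(y)?;
                return Some((s, e));
            }
        }
    }
}
```

Once a mismatching chunk is found, the same comparison can be rerun with a smaller limit inside that chunk's bounds to narrow it down. The linear walk could be swapped for a binary search over the key space.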

@AbnerZheng AbnerZheng requested a review from Rjected May 5, 2024 04:41
@emhane emhane added the C-enhancement (New feature or request) and A-cli (Related to the reth CLI) labels May 16, 2024
mattsse (Collaborator) commented May 17, 2024

@Rjected bump

Rjected (Member) commented May 20, 2024

this sounds great! taking a look at the changes

Rjected (Member) left a comment

some nits, otherwise this looks pretty good!

@@ -41,10 +65,34 @@ impl<DB: Database> ChecksumViewer<'_, DB> {
let tx = provider.tx_ref();

Rjected (Member):

Let's add a small info! message informing the user that the checksum is starting, and with which parameters (start, end, limit)

AbnerZheng (Contributor, Author):

Done.
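
A minimal sketch of the kind of start message requested in this thread, using only the standard library. The function name `start_message` and its signature are hypothetical. In the real command the text would presumably go through the `info!` logging macro rather than being returned as a `String`.

```rust
/// Builds a start-of-run message listing the parameters in use.
/// Hypothetical helper; the optional bounds are shown with `Debug` formatting.
pub fn start_message(start: Option<&str>, end: Option<&str>, limit: Option<usize>) -> String {
    format!("Start computing checksum, start={start:?}, end={end:?}, limit={limit:?}")
}
```

For example, `start_message(None, None, Some(142))` returns `Start computing checksum, start=None, end=None, limit=Some(142)`.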

AbnerZheng and others added 4 commits May 21, 2024 11:41
Co-authored-by: Dan Cline <6798349+Rjected@users.noreply.github.com>
Co-authored-by: Dan Cline <6798349+Rjected@users.noreply.github.com>
@AbnerZheng AbnerZheng requested a review from Rjected May 21, 2024 05:04
mattsse (Collaborator) commented May 22, 2024

bump @Rjected

@Rjected Rjected mentioned this pull request May 22, 2024
Rjected (Member) left a comment

nice! this looks good to me, just tried it out:

dan@Dans-MacBook-Pro-4 ~/p/reth (issue-7561)> cargo run -- db checksum --chain holesky --end-key 0x000000000000c57cf0a1f923d44527e703f1ad70 PlainStorageState
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.34s
     Running `target/debug/reth db checksum --chain holesky --end-key 0x000000000000c57cf0a1f923d44527e703f1ad70 PlainStorageState`
2024-05-22T21:45:25.181274Z  WARN This command should be run without the node running!
2024-05-22T21:45:25.181992Z  INFO Start computing checksum, start=None, end=Some("\"0x000000000000c57cf0a1f923d44527e703f1ad70\""), limit=None
2024-05-22T21:45:25.183467Z  INFO Hashed 0 entries.
2024-05-22T21:45:25.183965Z  INFO Hashed 142 entries.
2024-05-22T21:45:25.184330Z  INFO start-key: "0x00000000000000447e69651d841bd8d104bed493"
2024-05-22T21:45:25.184366Z  INFO end-key: "0x000000000000c57cf0a1f923d44527e703f1ad70"
2024-05-22T21:45:25.184396Z  INFO Checksum for table `PlainStorageState`: 0x5b24949084bcf957 (elapsed: 1.119167ms)
dan@Dans-MacBook-Pro-4 ~/p/reth (issue-7561)> cargo run -- db checksum --chain holesky --limit 142 PlainStorageState
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.34s
     Running `target/debug/reth db checksum --chain holesky --limit 142 PlainStorageState`
2024-05-22T21:45:57.432259Z  WARN This command should be run without the node running!
2024-05-22T21:45:57.432980Z  INFO Start computing checksum, start=None, end=None, limit=Some(142)
2024-05-22T21:45:57.434404Z  INFO Hashed 0 entries.
2024-05-22T21:45:57.434888Z  INFO Hashed 142 entries.
2024-05-22T21:45:57.435563Z  INFO start-key: "0x00000000000000447e69651d841bd8d104bed493"
2024-05-22T21:45:57.435606Z  INFO end-key: "0x000000000000c57cf0a1f923d44527e703f1ad70"
2024-05-22T21:45:57.435636Z  INFO Checksum for table `PlainStorageState`: 0x5b24949084bcf957 (elapsed: 1.416042ms)

@Rjected Rjected added this pull request to the merge queue May 22, 2024
Merged via the queue into paradigmxyz:main with commit 3eddaf3 May 22, 2024
