feat: index account/storage history #978

Merged
merged 10 commits into from
Jan 26, 2023

Conversation

rakita
Collaborator

@rakita rakita commented Jan 23, 2023

ref: #815

Base automatically changed from rakita/selfdestruct_changeset to main January 23, 2023 13:04
@rakita rakita force-pushed the rakita/history_indices branch from 7fd6b99 to c1931ba Compare January 23, 2023 13:06
@codecov-commenter

codecov-commenter commented Jan 23, 2023

Codecov Report

Merging #978 (9fa194d) into main (2397c54) will increase coverage by 0.52%.
The diff coverage is 98.14%.


@@            Coverage Diff             @@
##             main     #978      +/-   ##
==========================================
+ Coverage   74.36%   74.88%   +0.52%     
==========================================
  Files         309      312       +3     
  Lines       33312    34064     +752     
==========================================
+ Hits        24773    25510     +737     
- Misses       8539     8554      +15     
Flag Coverage Δ
unit-tests 74.88% <98.14%> (+0.52%) ⬆️


Impacted Files Coverage Δ
crates/stages/src/test_utils/test_db.rs 73.02% <ø> (ø)
crates/storage/db/src/tables/mod.rs 0.00% <ø> (ø)
crates/storage/db/src/tables/models/mod.rs 85.71% <ø> (ø)
...rates/storage/provider/src/providers/historical.rs 0.00% <0.00%> (ø)
crates/stages/src/stages/index_account_history.rs 98.24% <98.24%> (ø)
crates/stages/src/stages/index_storage_history.rs 98.40% <98.40%> (ø)
crates/storage/db/src/implementation/mdbx/mod.rs 98.89% <100.00%> (+0.03%) ⬆️
crates/storage/db/src/tables/models/sharded_key.rs 100.00% <100.00%> (ø)
...torage/db/src/tables/models/storage_sharded_key.rs 100.00% <100.00%> (ø)
crates/stages/src/stages/sender_recovery.rs 92.59% <0.00%> (-0.53%) ⬇️
... and 3 more


@onbjerg onbjerg added C-enhancement New feature or request A-staged-sync Related to staged sync (pipelines and stages) labels Jan 23, 2023
crates/stages/src/db.rs Outdated
Comment on lines +55 to +56
let to_block =
    std::cmp::min(stage_progress + self.commit_threshold, previous_stage_progress);
Member

This can be done with the exec_or_return macro (see other stages).

Collaborator Author

I didn't want to use the macro because it hides this simple intention, and as the NOTE says, it would be better to bundle by transition id than by block number, so I did it like this.
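For readers following the thread, the batching expression under discussion can be sketched in isolation as below. The helper name `next_to_block` is hypothetical and not part of the PR; it also uses `saturating_add` where the reviewed snippet uses a plain `+`, which is an assumption made here only to keep the sketch free of overflow panics.

```rust
/// Hypothetical helper (not from the PR): the last block this stage run
/// should index, given how far the stage has progressed, how many blocks
/// it may process before committing, and how far the previous stage got.
fn next_to_block(stage_progress: u64, commit_threshold: u64, previous_stage_progress: u64) -> u64 {
    // Advance by at most `commit_threshold` blocks, but never past the
    // progress of the previous stage.
    std::cmp::min(stage_progress.saturating_add(commit_threshold), previous_stage_progress)
}
```

With a threshold of 50, a stage at block 100 would stop at block 150, while a stage at block 980 would be clamped to the previous stage's progress of 1000.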

crates/stages/src/stages/index_account_history.rs Outdated
@rakita rakita force-pushed the rakita/history_indices branch from 71658c3 to aaf8318 Compare January 24, 2023 13:26
@rakita
Collaborator Author

rakita commented Jan 24, 2023

I made an additional change to the indices and set the key of the last shard to (Address, u64::MAX). It is a small change, but it allows faster fetching of the shard that contains our transition id.

Just using seek_exact now fetches the TransitionList that we need, while previously we would need to use cursor prev to get to the latest shard and check if it contains our transition id.
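The lookup described above can be illustrated with an in-memory ordered map standing in for the database table. This is only a sketch under assumptions: `Address`, `ShardKey`, `find_shard` and `latest_shard` are hypothetical names, a `BTreeMap` replaces the real cursor API, and each non-final shard is assumed to be keyed by the highest transition id it contains.

```rust
use std::collections::BTreeMap;

// Stand-ins for the real types (assumptions for this sketch).
type Address = [u8; 20];
type ShardKey = (Address, u64);

/// Find the shard that would contain `transition_id` for `address`:
/// the first shard whose key is >= (address, transition_id).
/// Because the last shard is keyed with u64::MAX, any id beyond the
/// closed shards lands in the last shard without stepping backwards.
fn find_shard(
    index: &BTreeMap<ShardKey, Vec<u64>>,
    address: Address,
    transition_id: u64,
) -> Option<&Vec<u64>> {
    index
        .range((address, transition_id)..)
        .next()
        // The seek may run past the end of this address into the next one.
        .filter(|(key, _)| key.0 == address)
        .map(|(_, list)| list)
}

/// The latest shard is an exact-key lookup, mirroring the `seek_exact` idea.
fn latest_shard(index: &BTreeMap<ShardKey, Vec<u64>>, address: Address) -> Option<&Vec<u64>> {
    index.get(&(address, u64::MAX))
}
```

The design point is that a fixed sentinel key turns "find the newest shard" into a single exact lookup instead of a seek followed by a step back.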

@rakita rakita requested review from joshieDo and rkrasiuk January 26, 2023 11:04
Comment on lines 103 to 110
// Insert last list with u64::MAX
if let Some(last_list) = last_chunk {
    tx.put::<tables::StorageHistory>(
        StorageShardedKey::new(address, storage_key, u64::MAX),
        TransitionList::new(last_list)
            .expect("Indices are presorted and not empty"),
    )?;
}
Member

Why is the last list inserted with u64::MAX? Can we add a more detailed description for struct IndexStorageHistoryStage explaining what it actually stores and how (with special cases like u64::MAX)?
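As background to the question, the write path can be sketched roughly as follows. This is not the stage's actual code: the shard length of 3, the `insert_shards` name and the in-memory `BTreeMap` are assumptions for illustration, and the sketch ignores merging with a last shard that already exists.

```rust
use std::collections::BTreeMap;

// Stand-in for the real address type (assumption for this sketch).
type Address = [u8; 20];

/// Hypothetical shard length; the real value is not stated in this thread.
const SHARD_LEN: usize = 3;

/// Split presorted transition ids into fixed-size shards. Every full,
/// closed shard is keyed by its highest id; the final shard is keyed
/// with u64::MAX so it can always be found with one exact-key lookup.
fn insert_shards(
    index: &mut BTreeMap<(Address, u64), Vec<u64>>,
    address: Address,
    sorted_ids: &[u64],
) {
    let mut chunks = sorted_ids.chunks(SHARD_LEN).peekable();
    while let Some(chunk) = chunks.next() {
        let key = if chunks.peek().is_some() {
            // Not the last chunk: key by the highest id in the shard.
            *chunk.last().expect("chunks are never empty")
        } else {
            // Last chunk: sentinel key.
            u64::MAX
        };
        index.insert((address, key), chunk.to_vec());
    }
}
```

For ids 1 through 7 this produces shards keyed 3, 6 and u64::MAX, holding [1, 2, 3], [4, 5, 6] and [7] respectively.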

Collaborator Author

crates/stages/src/stages/index_account_history.rs Outdated
Comment on lines 119 to 148
tx.cursor_read::<tables::AccountChangeSet>()?
    .walk(from_transition_rev)?
    .take_while(|res| res.as_ref().map(|(k, _)| *k < to_transition_rev).unwrap_or_default())
    .collect::<Result<Vec<_>, _>>()?
    .into_iter()
    // reverse so we can get lowest transition id where we need to unwind account.
    .rev()
    // fold all account and get last transition index
    .fold(BTreeMap::new(), |mut accounts: BTreeMap<Address, u64>, (index, account)| {
        // we just need address and lowest transition id.
        accounts.insert(account.address, index);
        accounts
    })
    .into_iter()
    // try to unwind the index
    .try_for_each(|(address, rem_index)| -> Result<(), StageError> {
        let shard_part =
            unwind_account_history_shards::<DB>(&mut cursor, address, rem_index)?;

        // check last shard_part, if present, items needs to be reinserted.
        if !shard_part.is_empty() {
            // there are items in list
            tx.put::<tables::AccountHistory>(
                ShardedKey::new(address, u64::MAX),
                TransitionList::new(shard_part)
                    .expect("There is at least one element in list and it is sorted."),
            )?;
        }
        Ok(())
    })?;
Member

Can we flatten this (maybe I already commented on this)? It's quite difficult to read. Same in IndexStorageHistoryStage.

Member

OK, I already left a comment like this. What I mean by flatten is:

// Collect all account changesets.
let account_changesets = tx.cursor_read::<tables::AccountChangeSet>()?
     // ... walk, take_while
     ;

// Create a map with lowest transition ids
let mut account_map = BTreeMap::default();
for entry in account_changesets {
    let (index, account) = entry?;
    account_map.insert(account.address, index); // maybe we need some logic to determine the lowest
}

for (address, rem_index) in account_map {
    // ... 
}
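The "logic to determine the lowest" mentioned in the snippet above could look roughly like this. It is a sketch, not the code that was merged: `lowest_transition_per_address` is a hypothetical name, and the changeset is simplified to a slice of `(transition_id, address)` pairs instead of database entries.

```rust
use std::collections::BTreeMap;

// Stand-in for the real address type (assumption for this sketch).
type Address = [u8; 20];

/// For every address in the changesets, keep only the lowest transition id.
/// That id is the point from which the address's history index must be unwound.
fn lowest_transition_per_address(changesets: &[(u64, Address)]) -> BTreeMap<Address, u64> {
    let mut lowest: BTreeMap<Address, u64> = BTreeMap::new();
    for &(transition_id, address) in changesets {
        // Insert on first sight, otherwise keep the smaller id. This does not
        // depend on iteration order, so no `.rev()` is needed.
        let entry = lowest.entry(address).or_insert(transition_id);
        *entry = (*entry).min(transition_id);
    }
    lowest
}
```

Compared with reversing the iterator and relying on a plain `insert` to overwrite, taking the minimum explicitly keeps the result correct even if the input is not sorted.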

Collaborator Author

@rakita rakita Jan 26, 2023

What do you mean by flatten here: collect it and use for _ in _? I didn't see your comment when writing this.

Will flatten it.

Collaborator Author

It is flattened :). To be honest, I still prefer iter chains, but this is nice too.

@rakita rakita requested a review from rkrasiuk January 26, 2023 14:04
Member

@rkrasiuk rkrasiuk left a comment

looking good, thanks for cleaning up!

@rakita rakita merged commit 6dcced0 into main Jan 26, 2023
@rakita rakita deleted the rakita/history_indices branch January 26, 2023 16:03
onbjerg added a commit that referenced this pull request Jan 27, 2023
These were introduced in #978
literallymarvellous pushed a commit to literallymarvellous/reth that referenced this pull request Feb 5, 2023
literallymarvellous pushed a commit to literallymarvellous/reth that referenced this pull request Feb 5, 2023
literallymarvellous pushed a commit to literallymarvellous/reth that referenced this pull request Feb 6, 2023
Labels
A-staged-sync Related to staged sync (pipelines and stages) C-enhancement New feature or request

5 participants