Shuffle order of scanning account storages in calculate_accounts_lt_hash_at_startup #7226
impl<'a> IntoIterator for AccountStoragesOrderer<'a> {
    type Item = &'a AccountStorageEntry;
    type IntoIter = Box<dyn Iterator<Item = &'a AccountStorageEntry> + 'a>;
clippy wanted me to implement the IntoIterator trait, but then the compiler complained that I can't use impl Iterator as the associated type value, because it's unstable :(
Is that why we need this to be a Boxed dyn?
We could call the method something other than into_iter() and then clippy should be OK.
For example one of these:
pub fn my_iter(&self) -> impl ExactSizeIterator<Item = &'a AccountStorageEntry> + use<'a, '_> {
    self.indices.iter().map(|i| self.storages[*i].as_ref())
}
pub fn my_into_iter(self) -> impl ExactSizeIterator<Item = &'a AccountStorageEntry> + use<'a> {
    self.indices.into_iter().map(|i| self.storages[i].as_ref())
}
And then we can remove the impl IntoIterator until impl trait is allowed in associated types here.
(or fn into_seq_iter(), etc...)
yap, without impl_trait_in_assoc_type I need to name the iterator type explicitly. Other approaches:
- create a struct that encapsulates the map operation and implements the Iterator trait, such that IntoIterator can just use the struct's type here
- work around clippy's insistence that into_iter() be an actual IntoIterator implementation, e.g. use a non-standard name like make_iter
- a deeper refactor such that we don't use map to create the iterator; then we can probably name the type being used as the iterator implementation
A similar concern applies to IntoParallelIterator, but there is no clippy check for it.
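For illustration, the first approach (a named struct that replaces the map closure) could look roughly like the sketch below. This is not the code in this PR: OrderedStoragesIter is a hypothetical name, AccountStorageEntry is reduced to a stand-in with a single id field, and the storages are assumed to be held as a slice of Arc values.

```rust
use std::sync::Arc;

// Stand-in for the real AccountStorageEntry (assumption: only an id field).
pub struct AccountStorageEntry {
    pub id: u32,
}

// Simplified orderer: borrowed storages plus the order in which to visit them.
pub struct AccountStoragesOrderer<'a> {
    storages: &'a [Arc<AccountStorageEntry>],
    indices: Vec<usize>,
}

// Hypothetical named iterator type; it takes the place of the unnameable
// `Map<vec::IntoIter<usize>, {closure}>` type, so no Box<dyn ...> is needed.
pub struct OrderedStoragesIter<'a> {
    storages: &'a [Arc<AccountStorageEntry>],
    indices: std::vec::IntoIter<usize>,
}

impl<'a> Iterator for OrderedStoragesIter<'a> {
    type Item = &'a AccountStorageEntry;

    fn next(&mut self) -> Option<Self::Item> {
        // Copy the shared slice reference out so the item borrows for 'a,
        // not for the duration of this `&mut self` call.
        let storages: &'a [Arc<AccountStorageEntry>] = self.storages;
        let i = self.indices.next()?;
        Some(storages[i].as_ref())
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        // Exact, because it is delegated to vec::IntoIter.
        self.indices.size_hint()
    }
}

impl ExactSizeIterator for OrderedStoragesIter<'_> {}

impl<'a> IntoIterator for AccountStoragesOrderer<'a> {
    type Item = &'a AccountStorageEntry;
    type IntoIter = OrderedStoragesIter<'a>;

    fn into_iter(self) -> Self::IntoIter {
        OrderedStoragesIter {
            storages: self.storages,
            indices: self.indices.into_iter(),
        }
    }
}
```

With a nameable IntoIter type, clippy's into_iter() lint is satisfied by a real IntoIterator impl and the iterator is statically dispatched, at the cost of a little boilerplate.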
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## master #7226 +/- ##
========================================
Coverage 83.1% 83.2%
========================================
Files 850 850
Lines 369949 369978 +29
========================================
+ Hits 307739 307849 +110
+ Misses 62210 62129 -81
Problem
We scan storages using rayon's work-splitting heuristic, which treats each collection element as roughly similar in cost / processing time. However, account storages have vastly differing sizes, and their default order actually has clusters of large vs. small storages.
This causes work to be split such that one unit batch apparently has mostly large storages and may run for >100s longer than the other threads, which sit idle without being able to pick up more work.
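To make the imbalance concrete, here is a rough, self-contained sketch. It is a crude model, not rayon's actual splitting logic and not the code in this PR: work is cut into equal-length contiguous chunks, and the heaviest chunk is compared for the default order versus a shuffled order. The function names and the xorshift-based shuffle (standing in for a proper RNG crate) are assumptions made for illustration.

```rust
/// Total size of the heaviest chunk when `order` is cut into `parts`
/// contiguous, equal-length chunks (a crude model of static work splitting).
fn max_chunk_load(sizes: &[u64], order: &[usize], parts: usize) -> u64 {
    // Guard against a zero chunk length, which `chunks` would reject.
    let chunk_len = order.len().div_ceil(parts.max(1)).max(1);
    order
        .chunks(chunk_len)
        .map(|chunk| chunk.iter().map(|&i| sizes[i]).sum::<u64>())
        .max()
        .unwrap_or(0)
}

/// Fisher-Yates shuffle of 0..len driven by a tiny xorshift generator.
/// Assumption: `seed` is non-zero; real code would more likely use an RNG crate.
fn shuffled_indices(len: usize, mut seed: u64) -> Vec<usize> {
    let mut indices: Vec<usize> = (0..len).collect();
    for i in (1..len).rev() {
        seed ^= seed << 13;
        seed ^= seed >> 7;
        seed ^= seed << 17;
        let j = (seed % (i as u64 + 1)) as usize;
        indices.swap(i, j);
    }
    indices
}
```

For example, with 16 large storages clustered at the front of 64 and 4 chunks, max_chunk_load over the default order puts all 16 large storages into the first chunk, while an order from shuffled_indices tends to spread them across chunks.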
Summary of Changes
account_storage.rs module as more generic AccountStoragesOrderer util wrapper
At current snapshot (slot 356410350) unpacking and verification:
vs baseline (master for the same data):