v2.1: Refactors get_snapshot_storages() (backport of #3760)#3785
brooksprumo merged 1 commit into v2.1
Conversation
Blocked until the version bump PR (#3782) is merged.

@Mergifyio rebase

❌ Base branch update has failed

@Mergifyio rebase

✅ Branch has been successfully rebased
@Mergifyio rebase

✅ Branch has been successfully rebased
Blocked until version bump PR is merged: #3819

@Mergifyio rebase

(cherry picked from commit 8c7ae80)

✅ Branch has been successfully rebased
jeffwashington left a comment:

LGTM. I'm not a strong advocate for the backport, but I'm OK with it. I don't anticipate any problems with backporting.
HaoranYi left a comment:

Seems like little risk, but a potentially good performance improvement. LGTM.
```rust
    &self,
    predicate: impl Fn(&Slot, &AccountStorageEntry) -> bool,
) -> Box<[(Slot, Arc<AccountStorageEntry>)]> {
    assert!(self.no_shrink_in_progress());
```
This assert exists in the old version as well. In the old version we do this:

agave/accounts-db/src/accounts_db.rs, lines 8276 to 8284 (at d07fc9b)

and that `.iter()` also has the assert:

agave/accounts-db/src/account_storage.rs, lines 140 to 144 (at d07fc9b)
```rust
    .into_vec()
    .into_iter()
    .unzip();
```
This `.into_vec()` won't allocate; it'll steal the allocation from the Box, so we're good there.

For the `unzip`, we can refactor that as well, and I'd love to do so, but I didn't want to do it in a backport. So we keep the same return types and keep the PR small. (Note that the original also unzips, so we're not adding an unzip here in the backport. We also know the backport is faster.)
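To illustrate the point with a standalone demo (this is not the PR's code, just a sketch of the standard-library behavior): `Box<[T]>::into_vec` takes over the Box's existing buffer rather than allocating, which we can observe by comparing pointers, while `unzip` necessarily allocates two fresh Vecs:

```rust
fn main() {
    // into_vec() reuses the Box's allocation: same buffer, no copy.
    let boxed: Box<[u64]> = vec![1, 2, 3].into_boxed_slice();
    let ptr = boxed.as_ptr();
    let v = boxed.into_vec();
    assert_eq!(v.as_ptr(), ptr); // same underlying buffer
    assert_eq!(v.capacity(), v.len()); // and no over-allocation

    // unzip(), by contrast, must allocate two new Vecs.
    let pairs: Box<[(u64, char)]> = vec![(1, 'a'), (2, 'b')].into_boxed_slice();
    let (nums, chars): (Vec<u64>, Vec<char>) = pairs.into_vec().into_iter().unzip();
    assert_eq!(nums, vec![1, 2]);
    assert_eq!(chars, vec!['a', 'b']);
}
```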
@t-nelson I tried to address the code that I remember you calling out. Were there other areas of concern that I missed?

I'll plan to merge tomorrow unless there are new comments/concerns.
Problem

`AccountsDb::get_snapshot_storages()` is due for a refactor. We can speed it up for the common case, and also simplify it.

Since #3737, we now call `get_snapshot_storages()` every time we `clean`. The observation here is that we'll only need about 100 storages (on average), yet the current impl of `get_snapshot_storages()` `Arc::clone`s all the storages and then filters out the unneeded ones. We can change this to only `Arc::clone` the useful ones instead.

Additionally, the filter step is done in parallel. When we only have 100 storages, the parallel execution does not help. In fact, with a chunk size of 5000, we end up getting zero benefit, yet still pay the cost of running in the thread pool.
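As an illustration of the core idea (a minimal sketch, not the PR's actual code; `StorageEntry` and both function names here are made up), the old shape clones every `Arc` and then filters, while the new shape filters first and clones only the survivors:

```rust
use std::sync::Arc;

type Slot = u64;

// Stand-in for AccountStorageEntry; the real type lives in accounts-db.
struct StorageEntry;

// Old shape (simplified): Arc::clone every storage, then filter.
fn clone_all_then_filter(
    storages: &[(Slot, Arc<StorageEntry>)],
    predicate: impl Fn(&Slot, &StorageEntry) -> bool,
) -> Vec<(Slot, Arc<StorageEntry>)> {
    let all: Vec<_> = storages
        .iter()
        .map(|(slot, entry)| (*slot, Arc::clone(entry))) // clones everything
        .collect();
    all.into_iter()
        .filter(|(slot, entry)| predicate(slot, entry.as_ref()))
        .collect()
}

// New shape (simplified): filter first, so only the kept entries get cloned.
fn filter_then_clone(
    storages: &[(Slot, Arc<StorageEntry>)],
    predicate: impl Fn(&Slot, &StorageEntry) -> bool,
) -> Vec<(Slot, Arc<StorageEntry>)> {
    storages
        .iter()
        .filter(|(slot, entry)| predicate(slot, entry.as_ref()))
        .map(|(slot, entry)| (*slot, Arc::clone(entry)))
        .collect()
}

fn main() {
    let storages: Vec<(Slot, Arc<StorageEntry>)> =
        (0..10).map(|slot| (slot, Arc::new(StorageEntry))).collect();

    // Keep only slots >= 8, mimicking the "~100 out of many" common case.
    let kept = filter_then_clone(&storages, |slot, _| *slot >= 8);
    assert_eq!(kept.len(), 2);

    // Kept entries were cloned (refcount 2); skipped ones never were.
    assert_eq!(Arc::strong_count(&storages[9].1), 2);
    assert_eq!(Arc::strong_count(&storages[0].1), 1);

    // The old shape produces the same result, but clones all 10 Arcs first.
    let old = clone_all_then_filter(&storages, |slot, _| *slot >= 8);
    assert_eq!(old.len(), kept.len());
}
```

With ~100 useful storages out of many, the new shape does ~100 refcount bumps instead of one per storage, and needs no thread pool at all.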
Summary of Changes
Results
I ran this against mnb and saw good results.
Since `get_snapshot_storages()` is called in two places, I wanted to look at the perf results in both.

clean

Here, we only need ~100 storages each time, so not `Arc::clone`ing and not using the thread pool really helps. The PR ends up running consistently, and meaningfully, faster:
Full snapshots do need all the storages, so we will end up `Arc::clone`ing almost all of them. And it's possible the thread pool does help here. For the most part, runtimes are pretty similar. The PR does have a worse worst-case filter time.
Overall, I think the common case of `clean` makes this change clearly a win. There is maybe a slight slowdown for full snapshots, but that code is both infrequent and in the background, so I don't think it matters much. Additionally, by not using a thread pool, we may reduce resource usage for the system as a whole.

Justification to Backport
As per the Problem section, we now call `get_snapshot_storages()` during `clean`. Until the skipping-rewrites feature is enabled, that means we'll be doing this extra work. The feature is in v2.1, so mnb on v2.0 and v2.1 will be paying the extra cost. Thus, improving the performance of `get_snapshot_storages()` benefits the validator as a whole.

In theory we could also backport this to v2.0, but I'm not currently advocating for that yet.
This is an automatic backport of pull request #3760 done by [Mergify](https://mergify.com).