v3.0: runtime: Avoid locking during stake vote rewards calculation (backport of #6900) #7725
Closed
mergify[bot] wants to merge 1 commit into v3.0
`calculate_stake_vote_rewards` was storing accumulated rewards per vote account in a `DashMap`, which was then used in a parallel iterator over all stake delegations.

There are over 1,000,000 stake delegations and around 1,000 validators. Each thread processes one stake delegation and tries to acquire the lock on the `DashMap` shard corresponding to a validator. Because the number of validators is disproportionately small and each has thousands of delegations, this approach results in high contention, with some threads spending most of their time waiting for a lock.

The time spent on these calculations was ~208.47ms:

```
redeem_rewards_us=208475i
```

Fix that by:

* Removing the `DashMap` and instead using `fold` and `reduce` operations to build a regular `HashMap`.
* Pre-allocating the `stake_rewards` vector and passing `&mut [MaybeUninit<PartitionedStakeReward>]` to the thread pool.
* Pulling the optimization of `StakeHistory::get` in `solana-stake-interface`: solana-program/stake#81

The time spent on reward calculations goes down to ~48.78ms:

```
redeem_rewards_us=48781i
```

(cherry picked from commit e752ae6)

# Conflicts:
#	Cargo.toml
#	programs/sbf/Cargo.toml
Cherry-pick of e752ae6 has failed. To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally
Problem
`calculate_stake_vote_rewards` was storing accumulated rewards per vote account in a `DashMap`, which was then used in a parallel iterator over all stake delegations.

There are over 1,000,000 stake delegations and around 1,000 validators. Each thread processed one stake delegation and tried to acquire a lock on the `DashMap` shard corresponding to a validator. Because the number of validators is disproportionately small and each has thousands of delegations, this approach resulted in high contention, with some threads spending most of their time waiting for a lock.

The time spent on these calculations was ~232.21ms.

Threads spent 65% of their time waiting for locks.
Summary of Changes
Fix that by:
* Removing the `DashMap` and instead using `fold` and `reduce` operations to build a regular `HashMap`.
* Pre-allocating the `stake_rewards` vector and passing `&mut [MaybeUninit<PartitionedStakeReward>]` to the thread pool.
* Pulling the optimization of `StakeHistory::get` in `solana-stake-interface` (interface: Optimize the `StakeHistory::get` function, solana-program/stake#81).

The time spent on reward calculations goes down to ~48.78ms.
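For readers unfamiliar with the fold/reduce pattern, here is a minimal sketch of the idea, not the code from this PR. It uses scoped threads from the standard library rather than the thread pool's parallel iterator, and `Delegation`, `accumulate_vote_rewards`, and the `u32` vote account key are hypothetical stand-ins. Each worker folds its chunk into a private `HashMap`, and the per-worker maps are then reduced into one, so no lock is taken while accumulating.

```rust
use std::collections::HashMap;
use std::thread;

/// Hypothetical stand-in for a stake delegation: the vote account it
/// delegates to and the vote reward it contributes.
struct Delegation {
    vote_account: u32,
    vote_reward: u64,
}

/// Sums rewards per vote account without any shared, locked map.
fn accumulate_vote_rewards(delegations: &[Delegation], num_threads: usize) -> HashMap<u32, u64> {
    let n = num_threads.max(1);
    // Round up so that all delegations are covered; never use a zero chunk size.
    let chunk_size = ((delegations.len() + n - 1) / n).max(1);

    thread::scope(|scope| {
        // "fold": every worker builds its own private map over one chunk.
        let handles: Vec<_> = delegations
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || {
                    let mut local: HashMap<u32, u64> = HashMap::new();
                    for d in chunk {
                        *local.entry(d.vote_account).or_default() += d.vote_reward;
                    }
                    local
                })
            })
            .collect();

        // "reduce": merge the per-worker maps on the calling thread.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker thread panicked"))
            .fold(HashMap::new(), |mut acc, local| {
                for (vote_account, reward) in local {
                    *acc.entry(vote_account).or_default() += reward;
                }
                acc
            })
    })
}

fn main() {
    // 1,000 delegations spread evenly over 10 vote accounts, 1 unit each.
    let delegations: Vec<Delegation> = (0..1_000u32)
        .map(|i| Delegation { vote_account: i % 10, vote_reward: 1 })
        .collect();
    let totals = accumulate_vote_rewards(&delegations, 4);
    assert_eq!(totals.len(), 10);
    assert_eq!(totals[&0], 100);
}
```

The merge step touches only as many entries as there are distinct vote accounts per worker, which is small compared with the number of delegations.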
Threads spend most of their time doing actual calculations.
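The pre-allocation change can be illustrated in a similar way. The sketch below is an assumption-laden approximation rather than the PR's implementation: `Reward` stands in for `PartitionedStakeReward`, the reward formula is made up, and scoped standard-library threads replace the thread pool. The vector is allocated once, its uninitialized capacity is handed out as disjoint `&mut [MaybeUninit<Reward>]` chunks, and each worker writes its results directly into its own chunk.

```rust
use std::mem::MaybeUninit;
use std::thread;

/// Hypothetical stand-in for `PartitionedStakeReward`.
#[derive(Debug, PartialEq)]
struct Reward {
    stake_index: usize,
    lamports: u64,
}

/// Computes one reward per stake, writing results straight into a
/// pre-allocated vector from several threads.
fn compute_rewards(stakes: &[u64], num_threads: usize) -> Vec<Reward> {
    let len = stakes.len();
    let mut rewards: Vec<Reward> = Vec::with_capacity(len);
    let n = num_threads.max(1);
    let chunk_size = ((len + n - 1) / n).max(1);

    {
        // Uninitialized backing storage; capacity may exceed `len`, so trim it.
        let slots: &mut [MaybeUninit<Reward>] = &mut rewards.spare_capacity_mut()[..len];

        thread::scope(|scope| {
            let chunks = slots.chunks_mut(chunk_size).zip(stakes.chunks(chunk_size));
            for (chunk_idx, (out, input)) in chunks.enumerate() {
                scope.spawn(move || {
                    let base = chunk_idx * chunk_size;
                    for (offset, (slot, stake)) in out.iter_mut().zip(input).enumerate() {
                        // Made-up reward formula, for illustration only.
                        slot.write(Reward {
                            stake_index: base + offset,
                            lamports: *stake / 100,
                        });
                    }
                });
            }
        });
        // `thread::scope` returns only after every spawned worker has finished.
    }

    // SAFETY: the output and input chunks have identical lengths, so each of
    // the first `len` slots was written exactly once before the scope ended.
    unsafe { rewards.set_len(len) };
    rewards
}

fn main() {
    let rewards = compute_rewards(&[100, 250, 1_000], 2);
    assert_eq!(rewards.len(), 3);
    assert_eq!(rewards[1], Reward { stake_index: 1, lamports: 2 });
}
```

Because the chunks never overlap, the workers need no synchronization beyond the final join, and no intermediate per-thread vectors have to be concatenated afterwards.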
Fixes #6899
This is an automatic backport of pull request #6900 done by [Mergify](https://mergify.com).