Skip to content

Optimize update_index to avoid vec copies#7624

Merged
HaoranYi merged 1 commit intoanza-xyz:masterfrom
HaoranYi:refactor_reclamin
Sep 22, 2025
Merged

Optimize update_index to avoid vec copies#7624
HaoranYi merged 1 commit intoanza-xyz:masterfrom
HaoranYi:refactor_reclamin

Conversation

@HaoranYi
Copy link
Copy Markdown

@HaoranYi HaoranYi commented Aug 20, 2025

Problem

This PR optimizes update_index function to avoid vector copies.

Measurement on mainnet shows more than 50% cut of update_index_time when mark obsolete account is enabled.

#7624 (comment)

Summary of Changes

  1. Modified update_index Return Type
  • Changed return type from ReclaimsSlotList to Vec<ReclaimsSlotList>
  • Removed .flatten() operation in parallel processing to avoid vector copies
  • Updated caller in _store_accounts_frozen to use reclaims.iter().flatten() for iterating on the Vec<Vec<..>> directly.

Fixes #

@HaoranYi HaoranYi marked this pull request as draft August 20, 2025 15:33
@HaoranYi HaoranYi requested a review from Copilot August 20, 2025 15:33
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Refactors the handle_reclaims function to improve performance by avoiding unnecessary vector copying operations and simplifying conditional logic around empty reclaims.

  • Simplified the handle_reclaims function signature by removing the Option wrapper
  • Added early returns for empty reclaims to avoid unnecessary processing
  • Updated return type from flattened SlotList to Vec<SlotList> to preserve chunking structure

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment thread accounts-db/src/accounts_db.rs
@HaoranYi HaoranYi changed the title refactor handle_relcaim to avoid vec copies Refactor handle_reclaims to Avoid Vec Copies Aug 20, 2025
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Aug 20, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.9%. Comparing base (d081c19) to head (2221fed).
⚠️ Report is 2 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff            @@
##           master    #7624    +/-   ##
========================================
  Coverage    82.9%    82.9%            
========================================
  Files         823      823            
  Lines      360972   360978     +6     
========================================
+ Hits       299404   299512   +108     
+ Misses      61568    61466   -102     
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@HaoranYi HaoranYi changed the title Refactor handle_reclaims to Avoid Vec Copies Refactor handle_reclaims to Avoid Vec Copies and Option wrap/unwrap Aug 20, 2025
@HaoranYi HaoranYi force-pushed the refactor_reclamin branch 2 times, most recently from 015f484 to 19af0e4 Compare August 21, 2025 20:11
This was referenced Aug 22, 2025
@HaoranYi HaoranYi changed the title Refactor handle_reclaims to Avoid Vec Copies and Option wrap/unwrap Refactor/optimize handle_reclaims to avoid vec copies and option wrap/unwrap Sep 19, 2025
@HaoranYi HaoranYi marked this pull request as ready for review September 19, 2025 14:32
Copy link
Copy Markdown

@brooksprumo brooksprumo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like removing the option wrappers on the reclaims iterators!

Comment thread accounts-db/src/accounts_db.rs
@brooksprumo
Copy link
Copy Markdown

Also, can the PR be split? Seems like one can be removing the Option wrapper on the iter, and the other can be about how we return results from update_index().

@HaoranYi
Copy link
Copy Markdown
Author

Also, can the PR be split? Seems like one can be removing the Option wrapper on the iter, and the other can be about how we return results from update_index().

yeah. #8113 is the the split pr for removing option wrapper on iter.

Once #8113 is merged. I will rebase this one for just the update_index() change.

In the meantime, I will mark this one as draft.

@HaoranYi HaoranYi marked this pull request as draft September 19, 2025 17:53
@HaoranYi HaoranYi changed the title Refactor/optimize handle_reclaims to avoid vec copies and option wrap/unwrap Optimize handle_reclaims to avoid vec copies and option wrap/unwrap Sep 19, 2025
@HaoranYi HaoranYi changed the title Optimize handle_reclaims to avoid vec copies and option wrap/unwrap Optimize update_index to avoid vec copies Sep 19, 2025
@HaoranYi HaoranYi marked this pull request as ready for review September 19, 2025 20:38
@HaoranYi
Copy link
Copy Markdown
Author

Now this PR is ready to review again. Thanks!

@brooksprumo
Copy link
Copy Markdown

How was this determined to be an actual problem? Is there a perf profile or metric?

Do we end up with a lot of reclaims usually? I'm wondering if we actually end up needing to grow the vec and reallocate often.

@HaoranYi
Copy link
Copy Markdown
Author

image

@HaoranYi
Copy link
Copy Markdown
Author

image

@HaoranYi
Copy link
Copy Markdown
Author

HaoranYi commented Sep 19, 2025

Before we have obsolete accounts, there are no reclaims as the code snippet shows - "IgnoreReclaims" is passed to update_index.
But After we enable obsolete accounts, there is going to be more reclaims, which will exercise the reclaim code path, which this PR is optimizing.

@HaoranYi
Copy link
Copy Markdown
Author

HaoranYi commented Sep 19, 2025

The above chat shows the stats for `accounts_db-flush_accounts_cache.handle_reclaims_elapsed_us'.
Red: is with mark_obsolete_account enabled. Blue is not.
We can see that we are spending around 100ms on handle_reclaims when make obsolete is enabled

@HaoranYi
Copy link
Copy Markdown
Author

HaoranYi commented Sep 19, 2025

How was this determined to be an actual problem? Is there a perf profile or metric?

Not a problem now. But will be a problem when we enable mark_obsolete_account.

I notice this when I review one of the mark_obsolete_accounts PR.

Do we end up with a lot of reclaims usually? I'm wondering if we actually end up needing to grow the vec and reallocate often.

Yes, when we enable mark_obsolete_accounts. And we will spend allocations to collect those reclaims in the original code path before this PR.

@HaoranYi
Copy link
Copy Markdown
Author

#8119

^To make the measurement of reclaims more intuitive, this above follow-up PR added a stat to track the number of reclaims.

@roryharr
Copy link
Copy Markdown

roryharr commented Sep 19, 2025

image

Does that have the restart in it around 400? or is that just workload randomness.
Otherwise I would like to see measurements!

Comment thread accounts-db/src/accounts_db.rs
@HaoranYi
Copy link
Copy Markdown
Author

image

@HaoranYi
Copy link
Copy Markdown
Author

^ Above is the update_index_us after this PR.
Blue is with this optimization. Red is not.
It cut down the update_index time by more than half.

@HaoranYi
Copy link
Copy Markdown
Author

image

Here is the number of reclaims. We avoid allocating and copying 20k-40k reclaims vecs.

Copy link
Copy Markdown

@brooksprumo brooksprumo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

Wow, way more reclaims than I would've thought. Nice find.

@HaoranYi
Copy link
Copy Markdown
Author

:shipit:

Wow, way more reclaims than I would've thought. Nice find.

yup!

@HaoranYi HaoranYi merged commit 7c6a2d8 into anza-xyz:master Sep 22, 2025
43 checks passed
@HaoranYi HaoranYi deleted the refactor_reclamin branch September 22, 2025 14:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants