Skip to content

v2.3: Adds Offset to callback on scan_accounts() family (backport of #6744)#6755

Merged
brooksprumo merged 2 commits intov2.3from
mergify/bp/v2.3/pr-6744
Jul 9, 2025
Merged

v2.3: Adds Offset to callback on scan_accounts() family (backport of #6744)#6755
brooksprumo merged 2 commits intov2.3from
mergify/bp/v2.3/pr-6744

Conversation

@mergify
Copy link
Copy Markdown

@mergify mergify Bot commented Jun 26, 2025

Problem

When building the accounts index, if secondary indexes are on, we scan each account storage file twice. This is bad, performance wise. The second scan is needed to get each account's data, which secondary indexes need.

In theory we should be able to switch on if secondary indexes are enabled or not, and then decide how to scan the storage. E.g. if secondary indexes are on, scan once with account data. If secondary indexes are off, scan once without account data.

The problem this PR is solving is that there isn't a scan function that returns both account data and all the info the index needs. IOW, when secondary indexes are on, there's not a scan function available for us to use.

Summary of Changes

We observe that the only missing piece of information while scanning with account data is the offset of the account within the file. We also observe that the implementations of the scan functions do already have the offset information! So, we can bubble that information up, which will allow the next PR to use if for speeding up building the accounts index.

Tl;dr: Add an Offset field to the scan_accounts() family of functions.

Justification to Backport

This PR is a prerequisite if we want to backport #6745. If we don't backport 6745, then we do not need to backport this PR.


This is an automatic backport of pull request #6744 done by [Mergify](https://mergify.com).

(cherry picked from commit b979967)

# Conflicts:
#	accounts-db/src/tiered_storage/hot.rs
#	accounts-db/src/tiered_storage/readable.rs
@mergify mergify Bot added the conflicts label Jun 26, 2025
@mergify mergify Bot requested a review from a team as a code owner June 26, 2025 18:38
@mergify
Copy link
Copy Markdown
Author

mergify Bot commented Jun 26, 2025

Cherry-pick of b979967 has failed:

On branch mergify/bp/v2.3/pr-6744
Your branch is up to date with 'origin/v2.3'.

You are currently cherry-picking commit b9799676f.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	modified:   accounts-db/src/accounts_db.rs
	modified:   accounts-db/src/accounts_db/scan_account_storage.rs
	modified:   accounts-db/src/accounts_db/tests.rs
	modified:   accounts-db/src/accounts_file.rs
	modified:   accounts-db/src/ancient_append_vecs.rs
	modified:   accounts-db/src/append_vec.rs
	modified:   accounts-db/src/tiered_storage.rs
	modified:   runtime/src/bank/accounts_lt_hash.rs

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   accounts-db/src/tiered_storage/hot.rs
	both modified:   accounts-db/src/tiered_storage/readable.rs

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

@brooksprumo brooksprumo requested review from HaoranYi and roryharr June 26, 2025 19:07
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Jun 26, 2025

Codecov Report

❌ Patch coverage is 90.24390% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.8%. Comparing base (6ea01d8) to head (d0c937c).
⚠️ Report is 59 commits behind head on v2.3.

Additional details and impacted files
@@            Coverage Diff            @@
##             v2.3    #6755     +/-   ##
=========================================
- Coverage    82.8%    82.8%   -0.1%     
=========================================
  Files         850      850             
  Lines      379572   379579      +7     
=========================================
- Hits       314484   314464     -20     
- Misses      65088    65115     +27     
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown

@HaoranYi HaoranYi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm.

@t-nelson
Copy link
Copy Markdown

really seems like we're doing this backward and should have refactored the scan functions properly rather than the kludge half-step in the wrong direction

@brooksprumo
Copy link
Copy Markdown

really seems like we're doing this backward and should have refactored the scan functions properly rather than the kludge half-step in the wrong direction

What part is going in the wrong direction? Or iow, what should the scan functions be refactored to look like?

@t-nelson
Copy link
Copy Markdown

What part is going in the wrong direction? Or iow, what should the scan functions be refactored to look like?

exposing the one new thing needed. #[allow(clippy::too_many_arguments)] coded

@brooksprumo
Copy link
Copy Markdown

exposing the one new thing needed. #[allow(clippy::too_many_arguments)] coded

We need the offset during index generation. The offset is already exposed in the scan_index() function, so it's not a new thing entirely. If we want to do single scan of the account storage files then we need to have both the offset and the account. Currently there are no scan functions that do both. Is there an alternative? I'm happy to do that instead.

@t-nelson
Copy link
Copy Markdown

t-nelson commented Jul 9, 2025

ok, but i'm nixing the next field that needs hoisted into the api

@brooksprumo brooksprumo merged commit 9963221 into v2.3 Jul 9, 2025
46 checks passed
@brooksprumo brooksprumo deleted the mergify/bp/v2.3/pr-6744 branch July 9, 2025 16:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants