Skip to content

Removes index's root check in get_snapshot_storages()#3784

Merged
brooksprumo merged 1 commit intoanza-xyz:masterfrom
brooksprumo:get-snapshot-storages
Dec 2, 2024
Merged

Removes index's root check in get_snapshot_storages()#3784
brooksprumo merged 1 commit intoanza-xyz:masterfrom
brooksprumo:get-snapshot-storages

Conversation

@brooksprumo
Copy link
Copy Markdown

@brooksprumo brooksprumo commented Nov 25, 2024

Problem

get_snapshot_storages() needlessly filters the requested slots if they are roots.

  1. Maybe the caller actually needs all the slots, regardless if they are rooted yet?
  2. All the callers that only operate on roots already enforce this, and only submit roots as their requested_slots.

Summary of Changes

Remove the accounts index checks in get_snapshot_storages().

More Information

There are four places where we call get_snapshots_storages(). All of them already only ask for rooted slots.

  1. clean_accounts()

Since #3737, we now call get_snapshot_storages() in clean_accounts(). We guarantee that the slots clean works on are rooted, so we don't need to do redundant checks in get_snapshot_storages(); all the storages clean` asks for will be rooted.

  1. snapshots -- normal validator operation

AccountsBackgroundServices periodically takes snapshots. ABS only works on rooted slots. So whenever it calls get_snapshot_storages(), it'll only ask for rooted slots. Same as (1).

  1. snapshots -- ledger-tool

When creating a snapshot with ledger-tool, by definition the snapshot slot becomes a rooted slot. Therefore the call to get_snapshot_storages() is inherently also all roots.

  1. snapshots -- startup verification

At startup, we get the storages and perform verification of all the accounts. Since snapshots by definition contain only roots (see (3)), that means calling get_snapshot_storages() here can only get roots.

In all the scenarios we already can/do only ask for rooted slots. So we don't need redundant checks inside get_snapshot_storages().

@brooksprumo brooksprumo self-assigned this Nov 25, 2024
@brooksprumo brooksprumo marked this pull request as ready for review November 26, 2024 00:12
Copy link
Copy Markdown

@HaoranYi HaoranYi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is correct.

To be cautious, how about make it an assert and let it running for a while?

When we are comfortable, we can then remove it.

@brooksprumo
Copy link
Copy Markdown
Author

To be cautious, how about make it an assert and let it running for a while?

We could do that, but we also can inspect all the callers here and guarantee that we only ask for roots. (Which should be the info in the PR description.)

Separately, I feel the caller of get_snapshot_storages() should be responsible for asking for the correct slots (rooted or not), as opposed to the get_snapshots_storages() fn itself.

I'm not sure in which case(s) the assert would help. Wdyt?

@HaoranYi
Copy link
Copy Markdown

Basically, we are talking about invariant-based programming.

https://en.wikipedia.org/wiki/Invariant-based_programming

Here we assume that all the slot should be root. Without assert, we may silent execute the code but won't be able find out the problem immediately. Maybe, we will see that snapshot is bad or clean is not fully working later on. That will make debugging hard.

With assert, we will be sure to cache the breaking off the invariant immediately and save us the debugging down the road.

@HaoranYi
Copy link
Copy Markdown

HaoranYi commented Nov 26, 2024

If you worries about the performance impact, we could probably use debug_assert too.

@brooksprumo
Copy link
Copy Markdown
Author

Here we assume that all the slot should be root. Without assert, we may silent execute the code but won't be able find out the problem immediately. Maybe, we will see that snapshot is bad or clean is not fully working later on. That will make debugging hard.

With assert, we will be sure to cache the breaking off the invariant immediately and save us the debugging down the road.

Yes, I agree with you. I am not against the idea of asserting we only have roots. My point is more that the caller should be the one asserting the requested slots are roots, not inside get_snapshot_storages().

We can put asserts for all the callers, but we're already guaranteed the slots are all roots already. So it would be a redundant assert. That's more the point I'm intending to communicate.

@HaoranYi
Copy link
Copy Markdown

I agree for existing callers. I am thinking about the future when we add the call this fn, we may forget about this ...

Copy link
Copy Markdown

@HaoranYi HaoranYi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Copy Markdown

@jeffwashington jeffwashington left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@brooksprumo brooksprumo merged commit 1258701 into anza-xyz:master Dec 2, 2024
@brooksprumo brooksprumo deleted the get-snapshot-storages branch December 2, 2024 23:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants