Skip to content
This repository was archived by the owner on Jan 22, 2025. It is now read-only.

decouple geyser's write_version from append vec on snapshot load#29623

Merged
jeffwashington merged 1 commit intosolana-labs:masterfrom
jeffwashington:jj57
Jan 11, 2023
Merged

decouple geyser's write_version from append vec on snapshot load#29623
jeffwashington merged 1 commit intosolana-labs:masterfrom
jeffwashington:jj57

Conversation

@jeffwashington
Copy link
Copy Markdown
Contributor

@jeffwashington jeffwashington commented Jan 10, 2023

Problem

we want to stop persisting write_version_obsolete in append vecs. Geyser uses this field specifically

Summary of Changes

Always pass 0 for write_version_obsolete to geyser. This means geyser doesn't care what write_version_obsolete exists in the append vec.

Fixes #

lijunwangs
lijunwangs previously approved these changes Jan 10, 2023
Copy link
Copy Markdown
Contributor

@lijunwangs lijunwangs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

@mergify mergify Bot dismissed lijunwangs’s stale review January 10, 2023 19:52

Pull request has been modified.

@jeffwashington jeffwashington merged commit 0ee9993 into solana-labs:master Jan 11, 2023
@ckamm
Copy link
Copy Markdown
Contributor

ckamm commented Jan 26, 2023

@jeffwashington This means that geyser is guaranteed to be notified about account writes to an account in a slot in the order that they happened?

I think that already was the case before, but write_version left open the option that maybe geyser would be notified about writes out-of-order somehow.

@jeffwashington
Copy link
Copy Markdown
Contributor Author

@ckamm, there was never a guarantee that geyser would be notified about account writes in the order they happened. The prior code put all accounts in a HashMap and then iterated the HashMap values to call geyser. So, that became in order of HashMap, which is essentially random and non-deterministic.

write_version only matters if there are two accounts with the same pubkey in the same slot. In that case, the higher write_version indicates the more recent account write.

If you are referring to writes to an individual account in the order that it happened, then kind of. The accounts were already de-duped by the HashMap so only the account write that was most recent is contained in the HashMap. As of 1.14, it is no longer possible to disable the write cache. So as of this code in 1.15(?), there will never be more than 1 append vec created per slot. And, the validator will never write the same pubkey twice in the same slot. So, all this code related to write_version becomes moot for this code. This code is only used when loading from a snapshot. Once mnb is on 1.14, then we know that the rules apply that there can only be 1 append vec per slot, and that each pubkey in an append vec will only be written once. But, even if it were written multiple times, the most recent one in the same append vec supersedes the prior one.

There is a lot going on in these words. Feel free to dig deeper with questions.

@ckamm
Copy link
Copy Markdown
Contributor

ckamm commented Jan 26, 2023

@jeffwashington Thanks for the in-depth answer! Yes, I meant only the ordering of writes to a single account. Seeing only up to one write per account per slot in geyser should make things a little easier!

(It sounds this has already been the case for a while - but I do remember seeing multiple when I last looked in detail, many versions back. Joining geyser event streams from different RPC nodes was a major pain because of rpc-local write versions back then - looks like it'll be easier now)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants