Stateless witness prefetcher changes #29519
Conversation
Changing the comment is nice, but the code doesn't reflect it :P
The code should check whether sf.tasks is nil and, if not, keep looping until it becomes so; otherwise we run the risk of receiving a last task and immediately shutting down, with the close being executed first (remember, select branch evaluation is non-deterministic if multiple channels are ready).
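The race described here can be made concrete. Below is a minimal, hypothetical sketch, not the actual geth code (`loop`, the channel names, and the task type are all illustrative): since select picks a ready branch at random, the close branch must re-check for pending tasks before returning, or a task that arrives together with the close signal can be silently dropped.

```go
package main

import "fmt"

// loop drains every task before honoring the close signal. Select picks
// among ready branches pseudo-randomly, so when both a task and the close
// are pending, the close branch may win; it must therefore drain any
// remaining tasks before shutting down.
func loop(tasks <-chan int, closed <-chan struct{}, done chan<- []int) {
	var seen []int
	for {
		select {
		case t := <-tasks:
			seen = append(seen, t)
		case <-closed:
			// Drain tasks that raced with the close signal.
			for {
				select {
				case t := <-tasks:
					seen = append(seen, t)
				default:
					done <- seen
					return
				}
			}
		}
	}
}

func main() {
	tasks := make(chan int, 2)
	closed := make(chan struct{})
	done := make(chan []int)
	go loop(tasks, closed, done)

	tasks <- 1
	tasks <- 2
	close(closed) // close and tasks may now be ready simultaneously
	fmt.Println("drained tasks:", len(<-done))
}
```

Without the inner drain loop, the run above would nondeterministically lose tasks depending on which select branch fires first.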
These are IMO not good changes. They make things harder to reason about, and close becomes this magic thing that nukes the prefetcher offline; but I'm not sure that's the intended case, since close is also the thing that waits for the data to be finished. So we need to figure out what close does: kill it, or wait on it.
I don't really see the point of this change. It makes peek useless after close, but close is the thing that waits for all the data to be loaded, so it's kind of ... weird.
But it can be called multiple times. Close might also not be the best name, since we're waiting for it to finish, but it should AFAIK not kill the thing.
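A minimal sketch of what "callable multiple times, waits rather than kills" could look like, assuming the shape discussed here (the `waiter` type and its fields are hypothetical, not geth's actual prefetcher): `sync.Once` makes repeat calls safe, and every caller blocks until the background worker has finished.

```go
package main

import (
	"fmt"
	"sync"
)

// waiter is an illustrative stand-in for the prefetcher: close signals
// termination exactly once and then waits for the worker to finish, so
// calling it repeatedly is harmless.
type waiter struct {
	once sync.Once
	done chan struct{}
	wg   sync.WaitGroup
}

func newWaiter() *waiter {
	w := &waiter{done: make(chan struct{})}
	w.wg.Add(1)
	go func() {
		defer w.wg.Done()
		<-w.done // stand-in for prefetching until told to stop
	}()
	return w
}

// close is idempotent: the channel is closed exactly once, and every
// caller (first or repeat) blocks until the worker has exited.
func (w *waiter) close() {
	w.once.Do(func() { close(w.done) })
	w.wg.Wait()
}

func main() {
	w := newWaiter()
	w.close()
	w.close() // safe: second call only waits, no double-close panic
	fmt.Println("closed twice without panic")
}
```

A bare `close(w.done)` called twice would panic; the `sync.Once` wrapper is what makes the repeat calls legal.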
I'm unsure about this code path here with the rewrite. Do we want to allow retrieval from a live prefetcher? If yes, why? Perhaps for tx boundaries? We should really document it somewhere why - if - it's needed. It's a very specific use case.
When we call updateTrie on a state object, we attempt to source the trie from the prefetcher. So copying from a live prefetcher is used here to preserve that functionality.
karalabe left a comment
I think one issue that the PR does not address, but must, is what the new lifecycle of the prefetcher is. Previously it was just something we threw data at, and then at some point we aborted it, pulled every useful piece of data it pre-loaded, and built our stuff on top.
The new logic seems to push it towards a witness where we wait for all data to be loaded before pulling and operating on it. But the code doesn't seem to reflect that, many paths instead becoming duds after a close.
Either this PR is only half the code that actually uses the prefetcher as is, or something's kind of borked. Either way, we must define what the intended behavior is, document it, and make sure the prefetcher adheres to it.
I'm kind of wondering whether close is needed; rather, we should have a wait method which perhaps just ensures everything is loaded. Whether we're between txs or at block end, waiting for prefetching to finish makes sense. I guess close might be needed to nuke out the loop goroutine, but we should still have a wait then before peeking at stuff.

Ah, I guess the "implicit" behavioral thing this PR is aiming for is that the prefetcher is not thread safe, so by the time we call peek, any scheduled data is already prefetched. I don't think that's the case; at the very least it's a dangerous data race to assume that events fired on 2 different channels will arrive in the exact order one expects. If this is the intended behavior, I'd rather make it ever so slightly more explicit than hoping for a good order of events.
As I see it, the prefetcher needs a couple of phases.
Perhaps we need something more elaborate than this, but whatever we need, we would be well served by first jotting down the description in human language before doing some lock/mutex/channel-based implementation of "something".
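To make the "jot it down first" point concrete, here is a strawman of explicit lifecycle phases. Every name below is invented for illustration; this is not geth's API nor a proposed implementation, just the kind of state machine the description could pin down before any channel-based code is written.

```go
package main

import (
	"errors"
	"fmt"
)

// phase models a possible prefetcher lifecycle: schedule while live,
// seal and wait for all data, then terminate the loop goroutine.
type phase int

const (
	scheduling phase = iota // callers may still throw tasks at the prefetcher
	waiting                 // sealed: no new tasks, data fully loaded
	terminated              // goroutines gone; data may only be peeked at
)

var errSealed = errors.New("prefetcher sealed")

type prefetcher struct {
	p phase
}

// schedule only succeeds while the prefetcher is live.
func (f *prefetcher) schedule() error {
	if f.p != scheduling {
		return errSealed
	}
	// real code would enqueue a trie/storage fetch task here
	return nil
}

// wait seals the prefetcher; real code would block here until every
// scheduled item has been loaded. Peeking is legal from this point on.
func (f *prefetcher) wait() {
	if f.p == scheduling {
		f.p = waiting
	}
}

// terminate tears down the loop goroutine; wait should have run first
// so the gathered data is complete.
func (f *prefetcher) terminate() {
	f.p = terminated
}

func main() {
	f := new(prefetcher)
	fmt.Println(f.schedule()) // still live: no error
	f.wait()
	fmt.Println(f.schedule()) // sealed: rejected
	f.terminate()
}
```

The point of spelling the phases out is that "what does close do" stops being a question: wait and terminate are separate transitions with separate guarantees.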
As I understand it, the difference between the old and new prefetcher is (or should be) as follows:

It's actually meant to gather witnesses for read values. In the stateless witness builder PR, I gather write witnesses from committing the tries. But IIRC, earlier on the call today you mentioned not tying the retrieval of write witnesses to the commit operation, which would change the assumptions from my original code.
// if a prefetcher is available. This path is used if snapshots are unavailable,
// since that requires reading the trie *during* execution, when the prefetchers
// cannot yet return data.
func (s *stateObject) getTrie(skipPrefetcher bool) (Trie, error) {
FWIW, skipPrefetcher is kind of an ugly hack; I just wanted to avoid the lack-of-snapshot case poking into the prefetcher. Open to cleaner suggestions.
if s.data.Root == types.EmptyRootHash || s.db.prefetcher == nil {
	return nil, nil
}
// Attempt to retrieve the trie from the pretecher
typo pretecher => prefetcher
…ereum#29519)

* core/state: trie prefetcher change: calling trie() doesn't stop the associated subfetcher

  Co-authored-by: Martin HS <martin@swende.se>
  Co-authored-by: Péter Szilágyi <peterke@gmail.com>

* core/state: improve prefetcher
* core/state: restore async prefetcher stask scheduling
* core/state: finish prefetching async and process storage updates async
* core/state: don't use the prefetcher for missing snapshot items
* core/state: remove update concurrency for Verkle tries
* core/state: add some termination checks to prefetcher async shutdowns
* core/state: differentiate db tries and prefetched tries
* core/state: teh teh teh

---------

Co-authored-by: Jared Wasinger <j-wasinger@hotmail.com>
Co-authored-by: Martin HS <martin@swende.se>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
@@ -17,6 +17,7 @@
package state

import (

Supersedes #29035 because the OP didn't permit modifications from maintainers...