Stateless witness prefetcher changes #29519
Conversation
Changing the comment is nice, but the code doesn't reflect it :P
The code should check whether sf.tasks is nil and, if not, keep looping until it becomes so; otherwise we run the risk of receiving a last task and immediately shutting down, with the close being executed first (remember, select branch evaluation is non-deterministic if multiple channels are ready).
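The race described here can be made concrete. Below is a minimal, hypothetical sketch, not the actual geth code (`loop`, the channel names, and the task type are all illustrative): since select picks a ready branch at random, the close branch must re-check for pending tasks before returning, or a task that arrives together with the close signal can be silently dropped.

```go
package main

import "fmt"

// loop drains every task before honoring the close signal. Select picks
// among ready branches pseudo-randomly, so when both a task and the close
// are pending, the close branch may win; it must therefore drain any
// remaining tasks before shutting down.
func loop(tasks <-chan int, closed <-chan struct{}, done chan<- []int) {
	var seen []int
	for {
		select {
		case t := <-tasks:
			seen = append(seen, t)
		case <-closed:
			// Drain tasks that raced with the close signal.
			for {
				select {
				case t := <-tasks:
					seen = append(seen, t)
				default:
					done <- seen
					return
				}
			}
		}
	}
}

func main() {
	tasks := make(chan int, 2)
	closed := make(chan struct{})
	done := make(chan []int)
	go loop(tasks, closed, done)

	tasks <- 1
	tasks <- 2
	close(closed) // close and tasks may now be ready simultaneously
	fmt.Println("drained tasks:", len(<-done))
}
```

Without the inner drain loop, the run above would nondeterministically lose tasks depending on which select branch fires first.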
These are IMO not good changes. They make things harder to reason about, and close becomes this magic thing that nukes the prefetcher offline; but I'm not sure that's the intended case, since close is also the thing that waits for the data to be finished. So we need to figure out what close does: kill it, or wait on it.
I don't really see the point of this change. It makes peek useless after close, but close is the thing that waits for all the data to be loaded, so it's kind of ... weird.
But it can be called multiple times. Close might also not be the best name, since we're waiting for it to finish, but it should AFAIK not kill the thing.
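A minimal sketch of what "callable multiple times, waits rather than kills" could look like, assuming the shape discussed here (the `waiter` type and its fields are hypothetical, not geth's actual prefetcher): `sync.Once` makes repeat calls safe, and every caller blocks until the background worker has finished.

```go
package main

import (
	"fmt"
	"sync"
)

// waiter is an illustrative stand-in for the prefetcher: close signals
// termination exactly once and then waits for the worker to finish, so
// calling it repeatedly is harmless.
type waiter struct {
	once sync.Once
	done chan struct{}
	wg   sync.WaitGroup
}

func newWaiter() *waiter {
	w := &waiter{done: make(chan struct{})}
	w.wg.Add(1)
	go func() {
		defer w.wg.Done()
		<-w.done // stand-in for prefetching until told to stop
	}()
	return w
}

// close is idempotent: the channel is closed exactly once, and every
// caller (first or repeat) blocks until the worker has exited.
func (w *waiter) close() {
	w.once.Do(func() { close(w.done) })
	w.wg.Wait()
}

func main() {
	w := newWaiter()
	w.close()
	w.close() // safe: second call only waits, no double-close panic
	fmt.Println("closed twice without panic")
}
```

A bare `close(w.done)` called twice would panic; the `sync.Once` wrapper is what makes the repeat calls legal.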
I'm unsure about this code path here with the rewrite. Do we want to allow retrieval from a live prefetcher? If yes, why? Perhaps for tx boundaries? We should really document it somewhere why - if - it's needed. It's a very specific use case.
When we call updateTrie on a state object, we attempt to source the trie from the prefetcher. So copying from a live prefetcher is used here to preserve that functionality.
karalabe left a comment
I think one issue that the PR does not address, but must, is what the new lifecycle of the prefetcher is. Previously it was just something we threw data at, and then at some point we aborted it, pulled every useful piece of data it pre-loaded, and built our stuff on top.
The new logic seems to push it towards a witness where we wait for all data to be loaded before pulling and operating on it. But the code doesn't seem to reflect that, many paths instead becoming duds after a close.
Either this PR is only half the code that actually uses the prefetcher as is, or something's kind of borked. Either way, we must define what the intended behavior is, document it, and make sure the prefetcher adheres to it.
I'm kind of wondering whether close is needed; rather, we should have a wait method which perhaps just ensures everything is loaded. Whether we're between txs or at block end, waiting for prefetching to finish makes sense. I guess close might be needed to nuke out the loop goroutine, but we should still have a wait then before peeking at stuff.

Ah, I guess the "implicit" behavioral thing this PR is aiming for is that the prefetcher is not thread safe, so by the time we call peek, any scheduled data is already prefetched. I don't think that's the case; at the very least it's a dangerous data race to assume that events fired on 2 different channels will arrive in the exact order one expects. If this is the intended behavior, I'd rather make it ever so slightly more explicit than hoping for a good order of events.
As I see it, the prefetcher needs a couple of phases.
Perhaps we need something more elaborate than this, but whatever we need, we would be well served by first jotting down the description in human language before doing some lock/mutex/channel-based implementation of "something".
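To make the "jot it down first" point concrete, here is a strawman of explicit lifecycle phases. Every name below is invented for illustration; this is not geth's API nor a proposed implementation, just the kind of state machine the description could pin down before any channel-based code is written.

```go
package main

import (
	"errors"
	"fmt"
)

// phase models a possible prefetcher lifecycle: schedule while live,
// seal and wait for all data, then terminate the loop goroutine.
type phase int

const (
	scheduling phase = iota // callers may still throw tasks at the prefetcher
	waiting                 // sealed: no new tasks, data fully loaded
	terminated              // goroutines gone; data may only be peeked at
)

var errSealed = errors.New("prefetcher sealed")

type prefetcher struct {
	p phase
}

// schedule only succeeds while the prefetcher is live.
func (f *prefetcher) schedule() error {
	if f.p != scheduling {
		return errSealed
	}
	// real code would enqueue a trie/storage fetch task here
	return nil
}

// wait seals the prefetcher; real code would block here until every
// scheduled item has been loaded. Peeking is legal from this point on.
func (f *prefetcher) wait() {
	if f.p == scheduling {
		f.p = waiting
	}
}

// terminate tears down the loop goroutine; wait should have run first
// so the gathered data is complete.
func (f *prefetcher) terminate() {
	f.p = terminated
}

func main() {
	f := new(prefetcher)
	fmt.Println(f.schedule()) // still live: no error
	f.wait()
	fmt.Println(f.schedule()) // sealed: rejected
	f.terminate()
}
```

The point of spelling the phases out is that "what does close do" stops being a question: wait and terminate are separate transitions with separate guarantees.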
As I understand it, the difference between the old and new prefetcher is (or should be) as follows:

It's actually meant to gather witnesses for read values. In the stateless witness builder PR, I gather write witnesses from committing the tries. But IIRC, earlier on the call today you mentioned not tying the retrieval of write witnesses to the commit operation, which would change the assumptions from my original code.
// if a prefetcher is available. This path is used if snapshots are unavailable,
// since that requires reading the trie *during* execution, when the prefetchers
// cannot yet return data.
func (s *stateObject) getTrie(skipPrefetcher bool) (Trie, error) {
FWIW, skipPrefetcher is kind of an ugly hack; I just wanted to avoid the lack-of-snapshot case poking into the prefetcher. Open to cleaner suggestions.
if s.data.Root == types.EmptyRootHash || s.db.prefetcher == nil {
	return nil, nil
}
// Attempt to retrieve the trie from the pretecher
typo pretecher => prefetcher
…ereum#29519)

* core/state: trie prefetcher change: calling trie() doesn't stop the associated subfetcher

  Co-authored-by: Martin HS <martin@swende.se>
  Co-authored-by: Péter Szilágyi <peterke@gmail.com>

* core/state: improve prefetcher
* core/state: restore async prefetcher stask scheduling
* core/state: finish prefetching async and process storage updates async
* core/state: don't use the prefetcher for missing snapshot items
* core/state: remove update concurrency for Verkle tries
* core/state: add some termination checks to prefetcher async shutdowns
* core/state: differentiate db tries and prefetched tries
* core/state: teh teh teh

---------

Co-authored-by: Jared Wasinger <j-wasinger@hotmail.com>
Co-authored-by: Martin HS <martin@swende.se>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
@@ -17,6 +17,7 @@
package state

import (

Supersedes #29035 because the OP didn't permit modifications from maintainers...