Dynamic state snapshots #326
Closed
DarianShawn wants to merge 178 commits into dev
The snapshot is only half done without the syncing protocol.
Description
Mostly derived from geth PR 20152:
This PR creates a secondary data structure for storing the Dogechain state, called a snapshot. This snapshot is special as it dynamically follows the chain. It stores a <hash> -> <account> mapping for the account trie and an <account-hash><slot-hash> -> <slot-value> mapping for the storage tries. The layout permits fast iteration over the accounts and storage, which will be used for a new sync algorithm (not done yet).
The snapshot can be built fully online, during the live operation of a Dogechain node. This is harder than it seems because rebuilding the snapshot for mainnet takes days, by which time in-memory garbage collection has long since deleted the state needed for a single capture. So we will have to provide the first canonical, initialized snapshot to keep the later steps simple.
The benefit of the snapshot is that it acts as an acceleration structure for state accesses:
- Instead of O(log N) disk reads (+ leveldb overhead) to access an account / storage slot, the snapshot can provide direct, O(1) access time. This should be a small improvement in block processing and a huge improvement in eth_call evaluations.
- Iteration over accounts and storage has O(1) complexity per entry + sequential disk access, which should enable remote nodes to retrieve state data significantly more cheaply than before (the sort order is the state trie leaf order, so responses can directly be assembled into tries too).
The downside of the snapshot is that the raw account and storage data is essentially duplicated. In the case of mainnet, this means an extra 8-12GB of SSD space used (an estimate; not yet measured).
Changes include
Testing
Manual tests
Backward compatibility
multical contract transactions, too.
It works as expected, and block execution of the newer version is a little faster than the target version.
Snapshot generation
Everything works as expected, and each generation method produces a snapshot of almost the same size. The fastest method is "using a block recovery file".
Snapshot regeneration
Regeneration works fine and resumes after a restart. It takes only minutes, compared with the first initialization.
Documentation update
Will update the CLI documentation once the version is bumped.