-
Notifications
You must be signed in to change notification settings - Fork 840
perf: Share proposal between hash and commit #4697
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
6b31642 to
9ab772f
Compare
201573a to
e7a87c3
Compare
e7a87c3 to
1395fce
Compare
e7ee927 to
51e8ad4
Compare
1395fce to
de5a6aa
Compare
de5a6aa to
9961f98
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR refactors the Firewood database integration to improve proposal management and enable a ~30% performance improvement for bootstrapping nodes. The key optimization is sharing proposals between hashing and commit operations, eliminating redundant proposal creation. Additionally, the refactoring includes better recovery handling through persistent storage of block hashes and heights.
Key changes:
- Proposals are now created during the
Hash()operation and reused duringCommit(), rather than being created and dropped during hashing - Recovery information (committed block hashes and heights) is now persisted to disk
- Configuration structure and naming conventions have been improved for clarity
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| graft/evm/firewood/triedb.go | Major refactoring to track proposals created during hashing for later commit, added recovery state persistence, renamed config fields |
| graft/evm/firewood/account_trie.go | Updated to use new proposal creation API, improved documentation |
| graft/evm/firewood/storage_trie.go | Enhanced documentation for storage trie behavior |
| graft/evm/firewood/recovery.go | New file implementing recovery functions for persisting/reading committed block hashes and heights |
| graft/evm/firewood/hash_test.go | Updated to use new config API |
| graft/subnet-evm/core/genesis.go | Added Firewood-specific handling for empty genesis blocks, added helper function |
| graft/subnet-evm/core/genesis_test.go | Updated to use new config API |
| graft/subnet-evm/core/blockchain.go | Updated config field names to match new API |
| graft/subnet-evm/tests/state_test_util.go | Updated to use new config API |
| graft/coreth/core/genesis.go | Added Firewood-specific handling for empty genesis blocks, added helper function |
| graft/coreth/core/genesis_test.go | Updated to use new config API |
| graft/coreth/core/blockchain.go | Updated config field names to match new API |
| graft/coreth/tests/state_test_util.go | Updated to use new config API |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few comments - you definitely know this code better than I do
| if _, err := statedb.Commit(0, false, stateconf.WithTrieDBUpdateOpts(triedbOpt)); err != nil { | ||
| panic(fmt.Sprintf("unable to commit genesis block to statedb: %v", err)) | ||
| } | ||
| if root == types.EmptyRootHash && isFirewood(triedb) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain the empty genesis case a little more? This isn't the case for hashdb or pathdb and as I understand it, this is because they only track state roots, while firewood tracks state roots and block hashes.
Sure this is clear for others, just curious
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's definitely not obvious, it took me a bit to figure out exactly what was wrong. We need to know the block hash, otherwise I have to do some janky stuff like "how do I find the parent of an empty block?". There's still corresponding block hashes, so it's kind of necessary to tell it for tracking.
Honestly, handling an empty genesis block is really strange anyway. If there's nothing in the genesis block, then all blocks afterward will always be empty, which is super weird. The only reason I need this is at all is to make it pass the tests, which isn't a very good reason
There's two options here:
- Add a comment clarifying why it's necessary at all
- Remove this, and remove all tests that make an empty genesis block, since that will never happen in production
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah thanks for the explanation! If this isn't something that will ever happen in production, I don't think we should modify our prod code to get some tests to pass, so I'd vote for option 2? Up to your discretion
| @@ -0,0 +1,66 @@ | |||
| // Copyright (C) 2019-2025, Ava Labs, Inc. All rights reserved. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to test this code? units tests for things like
- multiple children with same state root
- other edge cases like this
- concurrent operations
could be useful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is possible, I have started working on some tests, but really what we rely on for testing all of this is the blockchain tests. They do cover all of those things you are suggesting, but it's not great that we rely on the integration tests to ensure it works. Makes tracking down errors pretty tricky.
One thing I talked with @michaelkaplan13 about is that since most of this code will not be used in production, it might be better to only spend time guaranteeing the post-SAE behavior (no proposal tree). Any thoughts on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No that makes a lot of sense -- if this isn't going to be used in prodw, it would be a time waster. I do agree that yeah without unit tests it makes it harder to track down errors. Maybe just have some sort of tracker to make sure that we do guarantee the post SAE behavior?
Why this should be merged
This is about a 30% performance improvement for bootstrapping nodes.
How this works
See ava-labs/libevm#240 for comments from an initial review. All proposals created at hash time are stored in a map, and some shutdown recovery is added for block hashes and heights.
How this was tested
UT and re-execution of state.
Need to be documented in RELEASES.md?
No