trie: simplify StackTrie implementation #23950
I need to lookup why we implemented binary marshalling, but if we somewhere do encode this to disk, then this is a breaking change.
The benchmarks don't seem affected -- so sure, simplicity is better.
Here I'm not sure. Previously this worked as a copy? (the st.key is always empty after taking an object from the pool).
Yes, so previously the incoming key was always copied, but the append operation made it so that we could sometimes use an existing buffer, so no alloc was needed. With this change, leaves will reference the same backing slice as the caller's key.
We need to investigate whether that can be a problem.
> (the st.key is always empty after taking an object from the pool).
That's because in Reset, which is called when returning things to the pool, we do
st.key = st.key[:0]
That means we just truncate it without releasing the backing array, so the underlying array is reused whenever we later do append(st.key, ...).
A safer alternative to what you're doing would simply be to change it back into
st.key = append(st.key, key...)
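To make the difference concrete, here is a minimal sketch of the two ways of storing a key. The `node` type and its method names are hypothetical stand-ins for this illustration, not the actual StackTrie code; only the `st.key[:0]` truncation and the `append` pattern come from the discussion above.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a StackTrie node; only the key matters here.
type node struct {
	key []byte
}

// reset truncates the key but keeps its backing array for later reuse.
func (n *node) reset() {
	n.key = n.key[:0]
}

// setKeyAlias stores the caller's slice directly, sharing its backing array.
func (n *node) setKeyAlias(key []byte) {
	n.key = key
}

// setKeyCopy copies the caller's bytes into the node's own buffer. After
// reset, append reuses the existing backing array when capacity allows,
// so the copy usually costs no allocation.
func (n *node) setKeyCopy(key []byte) {
	n.key = append(n.key, key...)
}

func main() {
	input := []byte{1, 2, 3}

	aliased := &node{}
	aliased.setKeyAlias(input)

	// Simulate a node taken from the pool with a previously used buffer.
	copied := &node{key: make([]byte, 0, 8)}
	copied.reset()
	copied.setKeyCopy(input)

	// The caller mutates its slice after handing it over.
	input[0] = 9

	fmt.Println(aliased.key) // [9 2 3]: the aliased node sees the mutation
	fmt.Println(copied.key)  // [1 2 3]: the copied node does not
}
```

The aliasing variant is only safe if no caller ever touches the key again, which is exactly the guarantee being questioned here.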
I'm running the stacktrie fuzzer on it now, let's see if it finds anything
I now see these were for both leaf and ext node (the only ones which have keys). I can restore these. Are they only for memory management optimization unless someone would change the content underneath?
> Are they only for memory management optimization unless someone would change the content underneath?
Not sure I understand the question. But basically, you assume that the caller will not modify the input key ever again. Which may be true, but I'm not sure we have any guarantee.
The previous code always copied the key, but did so in a way which for the most part never caused any allocs to happen. I think the previous way to do it is better.
I think it's OK to hold the byte slice directly. We do have one place which modifies the underlying slice, https://github.com/ethereum/go-ethereum/blob/master/trie/stacktrie.go#L437, but it's only for the leaf node.
Triage: we should run a full-sync and a snap-sync to verify it.
I think this method would become clearer by writing it like this:
func (st *StackTrie) getDiffIndex(key []byte) int {
	for idx, k := range st.key {
		if k != key[idx] {
			return idx
		}
	}
	return len(st.key)
}
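The suggested loop can be tried in isolation with a free function. This is only a sketch: `diffIndex` is a hypothetical name, and the length guard is an addition for this example, since the suggestion above assumes the search key is at least as long as the node's key.

```go
package main

import "fmt"

// diffIndex returns the index of the first byte at which nodeKey and key
// differ, or len(nodeKey) if nodeKey is a prefix of key. The idx >= len(key)
// guard is an assumption added here to avoid an out-of-range access when
// key is shorter than nodeKey.
func diffIndex(nodeKey, key []byte) int {
	for idx, k := range nodeKey {
		if idx >= len(key) || k != key[idx] {
			return idx
		}
	}
	return len(nodeKey)
}

func main() {
	fmt.Println(diffIndex([]byte{1, 2, 3}, []byte{1, 2, 4, 5})) // 2: first mismatch at index 2
	fmt.Println(diffIndex([]byte{1, 2}, []byte{1, 2, 4, 5}))    // 2: node key is a full prefix
	fmt.Println(diffIndex([]byte{1, 2, 3}, []byte{1}))          // 1: search key ran out first
}
```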
Full sync started on |
snap-sync done, full-sync at 5.9M as of now
Please also remove the KeyOffset field here https://github.com/ethereum/go-ethereum/pull/23950/files#diff-450c04b40c4e2f8c12ed10930cd0a8b8a6c25b77b1f3469e7205b43b5fad058aL132
Thanks.
This mistake might have hidden a binary incompatibility issue: we could properly read old binaries by ignoring the KeyOffset byte.
It also looks like UnmarshalBinary() is currently incompatible with MarshalBinary(). Why is this not detected by any test?
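A round-trip check is the kind of test that would catch such a mismatch. The sketch below uses a hypothetical `record` type with made-up fields and gob encoding chosen for the example; it does not reproduce the real StackTrie layout, it only shows the shape of the check.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// record is a hypothetical, simplified stand-in for a serialisable trie node.
type record struct {
	NodeType uint8
	Key      []byte
	Val      []byte
}

// MarshalBinary encodes the fields through an anonymous struct so that gob
// does not call back into MarshalBinary recursively.
func (r *record) MarshalBinary() ([]byte, error) {
	var buf bytes.Buffer
	err := gob.NewEncoder(&buf).Encode(struct {
		NodeType uint8
		Key, Val []byte
	}{r.NodeType, r.Key, r.Val})
	return buf.Bytes(), err
}

// UnmarshalBinary must list the same fields as MarshalBinary; a field present
// on only one side is the kind of drift a round-trip test exposes.
func (r *record) UnmarshalBinary(data []byte) error {
	var dec struct {
		NodeType uint8
		Key, Val []byte
	}
	if err := gob.NewDecoder(bytes.NewReader(data)).Decode(&dec); err != nil {
		return err
	}
	r.NodeType, r.Key, r.Val = dec.NodeType, dec.Key, dec.Val
	return nil
}

// roundTrip encodes a record and decodes it into a fresh value.
func roundTrip(in *record) (*record, error) {
	blob, err := in.MarshalBinary()
	if err != nil {
		return nil, err
	}
	out := new(record)
	if err := out.UnmarshalBinary(blob); err != nil {
		return nil, err
	}
	return out, nil
}

// equal compares two records field by field.
func equal(a, b *record) bool {
	return a.NodeType == b.NodeType && bytes.Equal(a.Key, b.Key) && bytes.Equal(a.Val, b.Val)
}

func main() {
	in := &record{NodeType: 1, Key: []byte{1, 2}, Val: []byte{3}}
	out, err := roundTrip(in)
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Println("round trip preserved the record:", equal(in, out))
}
```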
gballet
left a comment
So the first version of StackTrie was doing exactly that, with a limited number of copies. This led to a whole host of problems in which modifying one key also modified another key that wasn't supposed to change. Now that we do a lot more fuzzing, this is likely no longer a big risk; however, I would urge caution before merging this.
I don't see problem with reverting back to |
That would alleviate our concerns. IMO: Please do |
Trim the search key from the head as it's being pushed deeper into the trie. Previously the search key was never modified, but each node kept information on how to slice and compare it in keyOffset. Now keyOffset is not needed, as this information is carried by the slice of the search key itself. This way keyOffset can be removed and key manipulation simplified.
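The two approaches described in this commit message can be compared side by side. This is a simplified sketch with hypothetical helper names and a made-up two-node path; it ignores branch nodes and everything else in the real implementation.

```go
package main

import (
	"bytes"
	"fmt"
)

// matchWithOffset mimics the old approach: the full search key is passed
// unchanged and each node remembers where its part of the key starts.
func matchWithOffset(fullKey []byte, keyOffset int, nodeKey []byte) bool {
	return bytes.HasPrefix(fullKey[keyOffset:], nodeKey)
}

// matchTrimmed mimics the new approach: the caller has already cut off the
// part of the key consumed by the ancestors, so no offset is needed.
func matchTrimmed(remainingKey, nodeKey []byte) bool {
	return bytes.HasPrefix(remainingKey, nodeKey)
}

func main() {
	fullKey := []byte{0xa, 0xb, 0xc, 0xd}

	// Hypothetical path: an extension node consuming {0xa, 0xb},
	// followed by a leaf holding {0xc, 0xd}.
	extKey := []byte{0xa, 0xb}
	leafKey := []byte{0xc, 0xd}

	// Old style: every node needs its own offset into the full key.
	fmt.Println(matchWithOffset(fullKey, 0, extKey), matchWithOffset(fullKey, len(extKey), leafKey))

	// New style: trim the key from the head while descending.
	remaining := fullKey
	okExt := matchTrimmed(remaining, extKey)
	remaining = remaining[len(extKey):]
	okLeaf := matchTrimmed(remaining, leafKey)
	fmt.Println(okExt, okLeaf)
}
```

Trimming by reslicing does not copy any bytes, so the trimmed key still shares its backing array with the original search key.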
BTW, I ported this Trie implementation (without hashing implementation) to C++. This helped me a lot, thanks.