-
Notifications
You must be signed in to change notification settings - Fork 180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[EN Performance] Optimize Ledger.Get() by making Forest.Read() use ~5x fewer allocs/op and run ~20% faster #2477
[EN Performance] Optimize Ledger.Get() by making Forest.Read() use ~5x fewer allocs/op and run ~20% faster #2477
Conversation
Change Forest.Read to return []ledger.Value without deep copying payload keys. This avoids 4 heap allocation per key. This change doesn't affect the caller (Ledger.Get) because it discards the payload keys. name old time/op new time/op delta TrieRead-4 524µs ± 1% 420µs ± 1% -19.77% (p=0.000 n=10+10) name old alloc/op new alloc/op delta TrieRead-4 190kB ± 0% 95kB ± 0% -50.04% (p=0.000 n=10+10) name old allocs/op new allocs/op delta TrieRead-4 1.52k ± 0% 0.32k ± 0% -79.17% (p=0.000 n=10+10)
Changed Forest.ReadSingleValue to return ledger.Value without deep copying payload keys. This avoid 4 heap allocation per key. This change doesn't affect the caller (Ledger.GetSingleValue) because it discards the payload key.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like a great optimization 👏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice improvement 👏🏼
I didn't know we have been returning the full Payload
all this time but what we needed was just Value
!
I was thinking that we could maybe push the same optimization further:
func (f *Forest) Read
callsfunc (mt *MTrie) UnsafeRead
which returns[]*Payload
, it then takes only theValue
part. I was wondering if we could makeUnsafeRead
return[]Value
instead. We won't be able to tracktotalPayloadSize
anymore, we would be able to tracktotalValueSize
instead (is that a problem?).UnsafeRead
seems to handle the case oflen(paths)==1
separately to avoid the recursive overhead. Do you think there is still advantage in handling that edge case insideRead
?
Happy to hear your thoughts about these suggestions, I might have missed something :)
@tarakby Thanks for taking a look and great questions!
The main optimization is to avoid payload key deep copy which happens in It probably won't give us much speedup or reduce allocs if we make
If we only return I don't know if it would be a problem to replace
Great question! Yes because they optimize different edge cases. In In |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @fxamacker for the detailed reply!
UnsafeRead() doesn't deep copy returned payloads, so there isn't extra allocs even if it returns []Value.
Ah I missed this point! I agree there isn't big advantage of returning values over payload pointers then.
In Forest.Read(), len(paths) == 1 optimizes read when there is only one path to batch read for entire mtrie. It avoids the overhead of deduplicating paths and reconstructing values in order after read
I agree it's still worth it to skip the deduplication and ordering 👌🏼
Changes
Change
Forest.Read()
andForest.ReadSingleValue()
to return[]ledger.Value
without deep copying payload keys. This avoids 4 heap allocation per key.This change doesn't affect
Ledger.Get()
(the caller) because it discards the payload keys.Closes #2475
Benchmark Comparisons (includes PR #2473 + this PR)
Bench comparison of master
Ledger.Get()
vsLedger.GetSingleValue()
reading a single value.Bench comparison of master
Ledger.Get()
vs newLedger.Get()
reading 100 values.Benchmarks used Go 1.17 on linux_amd64 (Haswell).