[EN Performance] Reduce memory used for ledger.Payload by 32+ GB, eliminate 1+ billion allocs/op, speedup various ops #2930

Merged on Aug 10, 2022 (14 commits)

Member

@fxamacker fxamacker commented Aug 4, 2022

This PR replaces the decoded payload key with an encoded key buffer, because the decoded payload key is only used for migration and reports. This PR also makes ledger.Payload immutable.
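
As a rough illustration of the idea (not the code in this PR), the sketch below stores only an encoded key buffer in the payload and decodes it on demand. The wire format, `encodeKey`, `decodeKey`, and `NewPayload` are all assumptions made up for this example; flow-go's real encoding lives in its ledger package and differs.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// KeyPart and Key mimic the shape of a decoded ledger key.
type KeyPart struct {
	Type  uint16
	Value []byte
}

type Key struct {
	KeyParts []KeyPart
}

// encodeKey flattens a key into one buffer using a hypothetical format:
// for each part, a 2-byte type, a 4-byte length, then the value bytes.
func encodeKey(k Key) []byte {
	size := 0
	for _, p := range k.KeyParts {
		size += 6 + len(p.Value)
	}
	buf := make([]byte, size)
	off := 0
	for _, p := range k.KeyParts {
		binary.BigEndian.PutUint16(buf[off:], p.Type)
		binary.BigEndian.PutUint32(buf[off+2:], uint32(len(p.Value)))
		off += 6
		off += copy(buf[off:], p.Value)
	}
	return buf
}

// decodeKey reverses encodeKey. It allocates one slice per key part,
// which is the cost the payload avoids by staying encoded.
func decodeKey(buf []byte) (Key, error) {
	var k Key
	for len(buf) > 0 {
		if len(buf) < 6 {
			return Key{}, errors.New("truncated key part header")
		}
		typ := binary.BigEndian.Uint16(buf)
		n := binary.BigEndian.Uint32(buf[2:])
		buf = buf[6:]
		if uint64(len(buf)) < uint64(n) {
			return Key{}, errors.New("truncated key part value")
		}
		val := make([]byte, n)
		copy(val, buf[:n])
		k.KeyParts = append(k.KeyParts, KeyPart{Type: typ, Value: val})
		buf = buf[n:]
	}
	return k, nil
}

// Payload keeps a single encoded key buffer instead of a decoded Key.
// Its fields are unexported and there are no setters.
type Payload struct {
	encKey []byte
	value  []byte
}

func NewPayload(k Key, v []byte) *Payload {
	return &Payload{encKey: encodeKey(k), value: v}
}

// Key decodes the key only when a caller asks for it.
func (p *Payload) Key() (Key, error) {
	if p == nil {
		return Key{}, nil
	}
	return decodeKey(p.encKey)
}

func main() {
	k := Key{KeyParts: []KeyPart{
		{Type: 0, Value: []byte("owner")},
		{Type: 2, Value: []byte("key")},
	}}
	p := NewPayload(k, []byte("v"))
	decoded, err := p.Key()
	fmt.Println(len(p.encKey), len(decoded.KeyParts), err)
}
```

In this sketch a payload at rest holds one key buffer rather than a slice of parts plus one slice per part; the decode cost is paid only by callers that need the structured key.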

Closes #2569
Closes #2248 (together with changes in PR #2560)
Updates #1744

Goals

  • (main goal) reduce memory required by EN (operational RAM).
  • reduce number of allocations on the heap.
  • have zero negative performance tradeoffs.

Impact

  • operational RAM should be reduced by dozens of GB (very roughly 25-50GB initially and more as data grows)
  • eliminate 1+ billion heap allocations for mtrie in memory (about 1+ billion allocs when mtrie is created, plus additional savings during activities that update mtrie)
  • as positive side effects, speed up:
    • ledger update (see TrieUpdate benchstats)
    • EN startup
    • checkpoint (de)serialization
    • TrieProof (de)serialization, etc.

Example positive side-effect beyond operational RAM reduction

Benchstats are only for ledger update. Other improvements are not benchmarked yet.

name          old time/op    new time/op    delta
TrieUpdate-4     439ms ± 2%     409ms ± 1%   -6.94%  (p=0.000 n=18+20)

name          old alloc/op   new alloc/op   delta
TrieUpdate-4    73.5MB ± 0%    34.1MB ± 0%  -53.60%  (p=0.000 n=20+20)

name          old allocs/op  new allocs/op  delta
TrieUpdate-4      187k ± 0%      147k ± 0%  -21.44%  (p=0.000 n=20+20)

Caveats

  • Updating benchmark results or adding missing benchmarks can be done at a later date.
  • Custom functions for CBOR and JSON serialization were added because immutable fields are no longer exported, but the memory reduction and speedups in other ops outweigh this.
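
To show why unexported fields force custom serialization, here is a minimal JSON-only sketch using Go's standard `encoding/json`. The `serializablePayload` mirror type and the `roundTrip` helper are hypothetical names for this example, not necessarily what the PR uses; a CBOR version would follow the same pattern through whatever interfaces the chosen CBOR library provides.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Payload has only unexported fields, so encoding/json would otherwise
// emit "{}" for it and restore nothing on decode.
type Payload struct {
	encKey []byte
	value  []byte
}

// serializablePayload is a hypothetical exported mirror used only on the wire.
type serializablePayload struct {
	Key   []byte
	Value []byte
}

// MarshalJSON copies the private fields into the exported mirror.
func (p Payload) MarshalJSON() ([]byte, error) {
	return json.Marshal(serializablePayload{Key: p.encKey, Value: p.value})
}

// UnmarshalJSON decodes into the mirror, then fills the private fields.
func (p *Payload) UnmarshalJSON(b []byte) error {
	if p == nil {
		return errors.New("UnmarshalJSON on nil Payload")
	}
	var sp serializablePayload
	if err := json.Unmarshal(b, &sp); err != nil {
		return err
	}
	p.encKey = sp.Key
	p.value = sp.Value
	return nil
}

// roundTrip marshals a payload and unmarshals it into a fresh value.
func roundTrip(in Payload) (Payload, error) {
	b, err := json.Marshal(in)
	if err != nil {
		return Payload{}, err
	}
	var out Payload
	err = json.Unmarshal(b, &out)
	return out, err
}

func main() {
	in := Payload{encKey: []byte{1, 2}, value: []byte("v")}
	out, err := roundTrip(in)
	fmt.Println(string(out.value), len(out.encKey), err)
}
```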

Reviewers

This PR is fairly simple because the most important changes are in the small commit e35bd94.

The large number of lines changed by the other commits is to:

  • commit b251754 - make ledger.Payload immutable
  • commit 74ee21d - eliminate circular dependencies (moving ledger/common/encoding/encoding.go to ledger/trie_encoder.go, splitting common/utils/testutils.go into 2 packages)

Moved common/encoding/encoding.go from encoding package to ledger
package.

Split common/utils/testutils.go into two files.  Common utility
functions are in common/utils/utils.go.  Common test utility functions
such as creating fixtures are in common/testutils/testutils.go.

This is needed for an upcoming commit that needs to use encoded key
in Payload (which requires encoding functions).

ledger.Payload is an immutable value object with key and value fields.

This commit is needed by an upcoming commit that will use the encoded
payload key in the key field.

Reduce data held in RAM by dozens of GB, reduce allocs,
and improve speed of: ledger update, checkpoint serialization,
and rebuilding Mtrie at startup by:
- keeping mtrie leaf node's payload key encoded in memory
- using encoded key directly while serializing checkpoint/WAL/TrieProof
- using encoded key directly while deserializing checkpoint/WAL/TrieProof
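
The point of the three items above is that an already-encoded key can be written out and read back as opaque bytes. The sketch below illustrates that under an assumed length-prefixed layout; `appendPayload`, `readPayload`, and `readChunk` are hypothetical helpers, and the actual checkpoint/WAL/TrieProof formats are not reproduced here.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Payload holds the key in encoded form, as in the commit message above.
type Payload struct {
	encKey []byte
	value  []byte
}

// appendPayload writes the stored key bytes as-is, with no decode/re-encode
// step, followed by the value. Each chunk has a 4-byte length prefix.
func appendPayload(buf []byte, p *Payload) []byte {
	var hdr [4]byte
	binary.BigEndian.PutUint32(hdr[:], uint32(len(p.encKey)))
	buf = append(buf, hdr[:]...)
	buf = append(buf, p.encKey...)
	binary.BigEndian.PutUint32(hdr[:], uint32(len(p.value)))
	buf = append(buf, hdr[:]...)
	return append(buf, p.value...)
}

// readChunk reads one length-prefixed chunk and returns a copy of it,
// so the result does not keep the whole input buffer alive.
func readChunk(buf []byte) (chunk, rest []byte, err error) {
	if len(buf) < 4 {
		return nil, nil, errors.New("truncated length prefix")
	}
	n := binary.BigEndian.Uint32(buf)
	buf = buf[4:]
	if uint64(len(buf)) < uint64(n) {
		return nil, nil, errors.New("truncated chunk")
	}
	chunk = make([]byte, n)
	copy(chunk, buf[:n])
	return chunk, buf[n:], nil
}

// readPayload keeps the key bytes encoded; no key parts are allocated.
func readPayload(buf []byte) (*Payload, []byte, error) {
	encKey, rest, err := readChunk(buf)
	if err != nil {
		return nil, nil, err
	}
	value, rest, err := readChunk(rest)
	if err != nil {
		return nil, nil, err
	}
	return &Payload{encKey: encKey, value: value}, rest, nil
}

func main() {
	p := &Payload{encKey: []byte{0, 1, 2}, value: []byte("value")}
	buf := appendPayload(nil, p)
	q, rest, err := readPayload(buf)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(buf), len(q.encKey), string(q.value), len(rest))
}
```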

Benchmark is only for TrieUpdate.  Other improvements are not
benchmarked yet.

name          old time/op    new time/op    delta
TrieUpdate-4     439ms ± 2%     409ms ± 1%   -6.94%  (p=0.000 n=18+20)

name          old alloc/op   new alloc/op   delta
TrieUpdate-4    73.5MB ± 0%    34.1MB ± 0%  -53.60%  (p=0.000 n=20+20)

name          old allocs/op  new allocs/op  delta
TrieUpdate-4      187k ± 0%      147k ± 0%  -21.44%  (p=0.000 n=20+20)
@fxamacker fxamacker added Performance Execution Cadence Execution Team labels Aug 4, 2022
@fxamacker fxamacker requested a review from ramtinms as a code owner August 4, 2022 17:26
@fxamacker fxamacker self-assigned this Aug 4, 2022
@fxamacker fxamacker changed the title from "[Execution Node] Reduce memory used for ledger.Payload by dozens of GB and eliminate 1+ billion allocs/op" to "[Execution Node] Reduce memory used for ledger.Payload by dozens of GB, eliminate 1+ billion allocs/op, speedup various ops" Aug 4, 2022
Comment on lines +77 to +78
owner := k.KeyParts[0].Value
key := k.KeyParts[2].Value
Contributor

Is there some inherent structure to KeyParts? Maybe we can convert it to either a Struct or an Array to remove even more dynamic allocations?

Member Author

@fxamacker fxamacker Aug 4, 2022

@SaveTheRbtz

Is there some inherent structure to KeyParts? Maybe we can convert it to either a Struct or an Array to remove even more dynamic allocations?

You're right, the structure of ledger.Key can be optimized to reduce the number of allocs, but it probably won't be worth the effort once I limit the use of ledger.Key to just migration and reporting.

The ledger key is designed to be flexible, so it can contain a variable number of KeyParts. But we plan to limit uses of the ledger key to migration and reports so that we can reduce the number of heap allocs.

This PR eliminates ledger.Key from mtrie leaf nodes' payload.

My next PR related to ledger.Key will stop it from being created outside of migration and reporting, which would reduce the number of heap allocs caused by ledger.Key.
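
For readers following the struct-versus-slice question, the sketch below contrasts the two shapes. `FixedKey`, its field `Middle`, and `toFixed` are hypothetical and assume every key has exactly three parts, which the flexible design does not guarantee; only the indices 0 (owner) and 2 (key) come from the snippet under review.

```go
package main

import (
	"errors"
	"fmt"
)

// KeyPart and Key mirror the flexible shape: a key is a slice that can
// hold any number of parts, so the parts need their own backing array.
type KeyPart struct {
	Type  uint16
	Value []byte
}

type Key struct {
	KeyParts []KeyPart
}

// FixedKey is a hypothetical fixed-layout alternative. The three slice
// headers live inline in the struct, so no backing array for parts is needed.
// Middle is a placeholder name for part 1.
type FixedKey struct {
	Owner  []byte
	Middle []byte
	Key    []byte
}

// toFixed converts a flexible key to the fixed layout and rejects any key
// that does not have exactly three parts, instead of indexing blindly.
func toFixed(k Key) (FixedKey, error) {
	if len(k.KeyParts) != 3 {
		return FixedKey{}, errors.New("expected exactly 3 key parts")
	}
	return FixedKey{
		Owner:  k.KeyParts[0].Value,
		Middle: k.KeyParts[1].Value,
		Key:    k.KeyParts[2].Value,
	}, nil
}

func main() {
	k := Key{KeyParts: []KeyPart{
		{Type: 0, Value: []byte("owner")},
		{Type: 1, Value: nil},
		{Type: 2, Value: []byte("key")},
	}}
	fk, err := toFixed(k)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("owner=%s key=%s\n", fk.Owner, fk.Key)
}
```

The trade-off is the one described in the reply: the fixed layout saves an allocation per key but gives up the variable number of parts.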

EDIT: added a one-sentence summary to the beginning of my reply and highlighted variable names.

ledger.Payload's fields are unexported, so custom JSON and CBOR
encoding/decoding are needed.

codecov-commenter commented Aug 4, 2022

Codecov Report

Merging #2930 (209f3ea) into master (845304c) will decrease coverage by 2.42%.
The diff coverage is 59.34%.

@@            Coverage Diff             @@
##           master    #2930      +/-   ##
==========================================
- Coverage   57.16%   54.73%   -2.43%     
==========================================
  Files         693      714      +21     
  Lines       63112    66184    +3072     
==========================================
+ Hits        36076    36227     +151     
- Misses      24078    26959    +2881     
- Partials     2958     2998      +40     
Flag Coverage Δ
unittests 54.73% <59.34%> (-2.43%) ⬇️

Impacted Files Coverage Δ
cmd/util/ledger/migrations/accounts.go 0.00% <0.00%> (ø)
cmd/util/ledger/migrations/prune_migration.go 0.00% <0.00%> (ø)
cmd/util/ledger/reporters/atree_reporter.go 0.00% <0.00%> (ø)
ledger/common/pathfinder/pathfinder.go 21.42% <0.00%> (-1.08%) ⬇️
utils/unittest/logging.go 0.00% <0.00%> (ø)
...d/util/ledger/migrations/storage_fees_migration.go 8.16% <10.00%> (-0.73%) ⬇️
...md/util/ledger/reporters/fungible_token_tracker.go 90.09% <40.00%> (-2.44%) ⬇️
...ledger/migrations/storage_used_update_migration.go 65.33% <42.85%> (-3.24%) ⬇️
...util/ledger/migrations/account_status_migration.go 83.14% <46.15%> (-2.74%) ⬇️
ledger/ledger.go 30.16% <47.05%> (+13.61%) ⬆️
... and 47 more

Comment on lines -1041 to -1042
func TestNow(t *testing.T) {

Contributor

Do we know the original reason for this test?

Member Author

Yes, TestNow was a throwaway test that wasn't meant to be pushed to the repo. I used TestNow to verify my understanding locally and forgot to remove it in a prior PR.

Comment on lines +322 to +330
// Value returns payload value.
// CAUTION: do not modify returned value because it shares underlying data with payload value.
func (p *Payload) Value() Value {
	if p == nil {
		return Value{}
	}
	return p.value
}
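
The CAUTION comment is about slice aliasing. The sketch below reproduces the accessor with minimal stand-in types and adds a hypothetical `ValueCopy` helper (not part of the PR) for callers that do need to modify the bytes.

```go
package main

import "fmt"

// Value and Payload are minimal stand-ins for the ledger types.
type Value []byte

type Payload struct {
	value Value
}

// Value returns the payload value without copying, so the returned slice
// shares its backing array with the payload.
func (p *Payload) Value() Value {
	if p == nil {
		return Value{}
	}
	return p.value
}

// ValueCopy is a hypothetical helper that returns an independent copy,
// trading one allocation for safety when the caller wants to mutate.
func (p *Payload) ValueCopy() Value {
	if p == nil {
		return Value{}
	}
	out := make(Value, len(p.value))
	copy(out, p.value)
	return out
}

func main() {
	p := &Payload{value: Value("abc")}

	c := p.ValueCopy()
	c[0] = 'x' // safe: only the copy changes
	fmt.Println(string(p.Value()))

	v := p.Value()
	v[0] = 'x' // unsafe: this writes through to the payload's own data
	fmt.Println(string(p.Value()))
}
```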

Contributor

This is great; it can be a good start for future PRs that remove deep copies.

Contributor

@ramtinms ramtinms left a comment

Looks good to me, nice work.

@fxamacker fxamacker changed the title from "[Execution Node] Reduce memory used for ledger.Payload by dozens of GB, eliminate 1+ billion allocs/op, speedup various ops" to "[EN Performance] Reduce memory used for ledger.Payload by dozens of GB, eliminate 1+ billion allocs/op, speedup various ops" Aug 9, 2022
@fxamacker fxamacker requested a review from zhangchiqing August 9, 2022 23:43
Member

@zhangchiqing zhangchiqing left a comment

Looks good. Great work! Thanks.

@fxamacker fxamacker changed the title from "[EN Performance] Reduce memory used for ledger.Payload by dozens of GB, eliminate 1+ billion allocs/op, speedup various ops" to "[EN Performance] Reduce memory used for ledger.Payload by 32+ GB, eliminate 1+ billion allocs/op, speedup various ops" Aug 10, 2022