Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[EN Performance] Optimize MTrie reading single path by adding Ledger.GetSingleValue() to boost speed and reduce memory use #2473

Merged
merged 11 commits into from
May 25, 2022

Conversation

fxamacker
Copy link
Member

@fxamacker fxamacker commented May 24, 2022

Changes

When reading a payload from a single path, avoid needlessly path partitioning and recursive calls. Also, optimize MTrie batch read.

  • Added Ledger.GetSingleValue() to improve time/op, alloc/op, and allocs/op
  • Modified delta.View.readFunc to use Ledger.GetSingleValue()
  • Optimized existing MTrie batch read for single path

Closes #1876

Benchmark comparisions

Current Ledger.Get() vs Ledger.GetSingleValue()

Benchstat results:
name                           old time/op    new time/op    delta
LedgerGetOneValue/batch_get-4    6.54µs ± 0%    5.24µs ± 0%  -19.92%
(p=0.000 n=9+10)

name                           old alloc/op   new alloc/op   delta
LedgerGetOneValue/batch_get-4    1.74kB ± 0%    1.62kB ± 0%   -6.85%
(p=0.000 n=10+10)

name                           old allocs/op  new allocs/op  delta
LedgerGetOneValue/batch_get-4      21.0 ± 0%      15.0 ± 0%  -28.57%
(p=0.000 n=10+10)

Optimized new Ledger.Get() vs Ledger.GetSingleValue()

Benchstat results:
name                           old time/op    new time/op    delta
LedgerGetOneValue/batch_get-4    5.70µs ± 0%    5.23µs ± 0%   -8.24%
(p=0.000 n=9+10)

name                           old alloc/op   new alloc/op   delta
LedgerGetOneValue/batch_get-4    1.69kB ± 0%    1.62kB ± 0%   -3.79%
(p=0.000 n=10+10)

name                           old allocs/op  new allocs/op  delta
LedgerGetOneValue/batch_get-4      18.0 ± 0%      15.0 ± 0%  -16.67%
(p=0.000 n=10+10)

- Added Ledger.GetSingleValue() to improve speed, alloc/op, and
allocs/op
- Modified delta.View.readFunc to use Ledger.GetSingleValue()
- Optimized existing mtrie batch read for single path

Bench comparision between current Ledger.Get() vs
Ledger.GetSingleValue()

Benchstat results:
name                           old time/op    new time/op    delta
LedgerGetOneValue/batch_get-4    6.54µs ± 0%    5.24µs ± 0%  -19.92%
(p=0.000 n=9+10)

name                           old alloc/op   new alloc/op   delta
LedgerGetOneValue/batch_get-4    1.74kB ± 0%    1.62kB ± 0%   -6.85%
(p=0.000 n=10+10)

name                           old allocs/op  new allocs/op  delta
LedgerGetOneValue/batch_get-4      21.0 ± 0%      15.0 ± 0%  -28.57%
(p=0.000 n=10+10)

Bench comparision between optimized new Ledger.Get() vs
Ledger.GetSingleValue()

Benchstat results:
name                           old time/op    new time/op    delta
LedgerGetOneValue/batch_get-4    5.70µs ± 0%    5.23µs ± 0%   -8.24%
(p=0.000 n=9+10)

name                           old alloc/op   new alloc/op   delta
LedgerGetOneValue/batch_get-4    1.69kB ± 0%    1.62kB ± 0%   -3.79%
(p=0.000 n=10+10)

name                           old allocs/op  new allocs/op  delta
LedgerGetOneValue/batch_get-4      18.0 ± 0%      15.0 ± 0%  -16.67%
(p=0.000 n=10+10)
@fxamacker fxamacker added enhancement New feature or request Performance labels May 24, 2022
@fxamacker fxamacker self-assigned this May 24, 2022
@fxamacker fxamacker changed the title Add Ledger.GetSingleValue to boost speed and reduce memory use Optimize MTrie reading single path by adding Ledger.GetSingleValue() to boost speed and reduce memory use May 24, 2022
@codecov-commenter
Copy link

codecov-commenter commented May 24, 2022

Codecov Report

Merging #2473 (cb8a1a3) into master (661c8de) will decrease coverage by 0.01%.
The diff coverage is 64.23%.

@@            Coverage Diff             @@
##           master    #2473      +/-   ##
==========================================
- Coverage   56.94%   56.93%   -0.02%     
==========================================
  Files         653      651       -2     
  Lines       60019    59979      -40     
==========================================
- Hits        34180    34149      -31     
+ Misses      22913    22897      -16     
- Partials     2926     2933       +7     
Flag Coverage Δ
unittests 56.93% <64.23%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
engine/access/rpc/backend/backend_scripts.go 7.76% <0.00%> (-0.66%) ⬇️
ledger/ledger.go 15.17% <0.00%> (-0.66%) ⬇️
ledger/trie.go 10.50% <ø> (ø)
network/p2p/compressed/mockStream.go 27.77% <0.00%> (-2.53%) ⬇️
network/p2p/middleware.go 0.00% <0.00%> (ø)
fvm/state/state_holder.go 33.33% <10.00%> (-5.38%) ⬇️
ledger/partial/ledger.go 53.57% <66.66%> (+2.18%) ⬆️
ledger/complete/ledger.go 60.77% <68.42%> (+0.46%) ⬆️
ledger/complete/mtrie/forest.go 63.52% <75.00%> (+0.56%) ⬆️
ledger/complete/mtrie/trie/trie.go 64.31% <78.26%> (+0.76%) ⬆️
... and 22 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 74635a0...cb8a1a3. Read the comment docs.

Copy link
Contributor

@ramtinms ramtinms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work, excited for this one

@ramtinms
Copy link
Contributor

we might actually add a single empty line to the FVM code to trigger the FVM benchmarking.

fxamacker added 3 commits May 24, 2022 18:20
Change Forest.Read to return []ledger.Value without deep copying
payload keys.  This avoids 4 heap allocation per key.

This change doesn't affect the caller (Ledger.Get) because it
discards the payload keys.

name        old time/op    new time/op    delta
TrieRead-4     524µs ± 1%     420µs ± 1%  -19.77%  (p=0.000 n=10+10)

name        old alloc/op   new alloc/op   delta
TrieRead-4     190kB ± 0%      95kB ± 0%  -50.04%  (p=0.000 n=10+10)

name        old allocs/op  new allocs/op  delta
TrieRead-4     1.52k ± 0%     0.32k ± 0%  -79.17%  (p=0.000 n=10+10)
Changed Forest.ReadSingleValue to return ledger.Value without deep
copying payload keys.  This avoid 4 heap allocation per key.

This change doesn't affect the caller (Ledger.GetSingleValue) because
it discards the payload key.
Copy link
Contributor

@tarakby tarakby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking the time to write complete test 👏🏼

Comment on lines 208 to 216
for !head.IsLeaf() {
depth := ledger.NodeMaxHeight - head.Height() // distance to the tree root
bit := bitutils.ReadBit(pathBytes, depth)
if bit == 0 {
head = head.LeftChild()
} else {
head = head.RightChild()
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe sliiightly faster :)

Suggested change
for !head.IsLeaf() {
depth := ledger.NodeMaxHeight - head.Height() // distance to the tree root
bit := bitutils.ReadBit(pathBytes, depth)
if bit == 0 {
head = head.LeftChild()
} else {
head = head.RightChild()
}
}
depth := ledger.NodeMaxHeight - head.Height() // distance to the tree root
for !head.IsLeaf() {
bit := bitutils.ReadBit(pathBytes, depth)
if bit == 0 {
head = head.LeftChild()
} else {
head = head.RightChild()
}
depth++
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tarakby Nice catch! 👍 Done in c8b2bc5.
This improves speed by 0.49%. I also added check for head == nil to avoid introducing a panic with empty mtrie.

fxamacker and others added 3 commits May 25, 2022 11:03
This optimization improves speed by 0.49% on Haswell CPU.

Thanks for suggestion @tarakby.

Co-authored-by: Tarak Ben Youssef <[email protected]>
…llocs

Optimize Ledger.Get() by making Forest.Read() use ~5x fewer allocs/op and run ~20% faster
@fxamacker fxamacker merged commit 8a964e8 into master May 25, 2022
@fxamacker fxamacker deleted the fxamacker/optimize-reading-single-register branch May 25, 2022 23:53
@fxamacker fxamacker added the Execution Cadence Execution Team label Jul 14, 2022
@fxamacker fxamacker changed the title Optimize MTrie reading single path by adding Ledger.GetSingleValue() to boost speed and reduce memory use [EN Performance] Optimize MTrie reading single path by adding Ledger.GetSingleValue() to boost speed and reduce memory use Aug 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request Execution Cadence Execution Team Performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Execution state] Optimize MTrie reading a single path
4 participants