frame-omni-bencher: enable jemalloc-allocator #11069
Fix a huge benchmark regression for storage-heavy extrinsics by enabling jemalloc-allocator for benchmarking; jemalloc was made optional in PR #10590.
/cmd prdoc --audience runtime-dev --bump patch
Command "prdoc --audience runtime-dev --bump patch" has failed ❌! See logs here
/cmd prdoc --audience runtime_dev --bump patch
…time_dev --bump patch'
@sigurpol but then we could revert your old fix?
yeah let me try that, good point |
cumulus-primitives-proof-size-hostfunction = { workspace = true, default-features = true }
frame-benchmarking-cli = { workspace = true }
sc-cli = { workspace = true, default-features = true }
sc-client-db = { features = ["jemalloc-allocator"], workspace = true }
I assume we can also just import jemalloc here?
And for sure some comment should be added.
or polkadot-jemalloc-shim like polkadot-omni-node or tikv-jemallocator .... not sure what's the best tool for the job here 😄
I went for the polkadot-jemalloc-shim similar to polkadot-omni-node and others - @bkchr let me know if it's ok
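A shim crate of this kind is typically just a feature-gated `#[global_allocator]`. The following is a minimal sketch of that idea, not the actual `polkadot-jemalloc-shim` source: the static name `ALLOC`, the helper `allocator_name`, and the assumption that the `jemalloc-allocator` feature pulls in `tikv-jemallocator` are all illustrative.

```rust
// Hypothetical sketch of a jemalloc shim crate; NOT the real polkadot-jemalloc-shim code.
// Assumption: a Cargo feature named `jemalloc-allocator` enables an optional
// dependency on `tikv-jemallocator`.

// Only compiled when the feature is on; otherwise the binary keeps the system allocator.
#[cfg(feature = "jemalloc-allocator")]
#[global_allocator]
static ALLOC: tikv_jemallocator::Jemalloc = tikv_jemallocator::Jemalloc;

/// Illustrative helper: reports which allocator this build selected.
pub fn allocator_name() -> &'static str {
    if cfg!(feature = "jemalloc-allocator") {
        "jemalloc"
    } else {
        "system"
    }
}

fn main() {
    println!("global allocator: {}", allocator_name());
}
```

Built as a standalone file without the feature, the gated static is compiled out and the helper reports the system allocator. A binary that depends on such a shim picks up jemalloc only when the feature is enabled in its dependency declaration.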
Created backport PR for
Please cherry-pick the changes locally and resolve any conflicts.
git fetch origin backport-11069-to-stable2512
git worktree add --checkout .worktree/backport-11069-to-stable2512 backport-11069-to-stable2512
cd .worktree/backport-11069-to-stable2512
git reset --hard HEAD^
git cherry-pick -x bc42349097da2ad8e551e1dde174d3fc79fe8c5b
git push --force-with-lease
Successfully created backport PR for
Fix huge benchmark regression for storage-heavy extrinsics by enabling jemalloc-allocator via polkadot-jemalloc-shim for omni-bencher; jemalloc was made optional in PR #10590.

This closes paritytech/trie#230.

Thanks @alexggh and @cheme for the help 🙇

Tested against `runtime / main` and [2.1.0](polkadot-fellows/runtimes#1065) as described [here](paritytech/trie#230 (comment)).

For the `usual` extrinsic `force_apply_min_commission`, which does massive storage allocation/deallocation during benchmark setup and then just 2 reads and 1 write in the benchmarked extrinsic itself, the time goes down from ms to µs.

The regression was introduced by #10590 `sc-client-db: Make jemalloc optional`.

```bash
runtimes git:(sigurpol-release-2_0_6) /home/paolo/github/polkadot-sdk/target/release/frame-omni-bencher v1 benchmark pallet --runtime ./target/release/wbuild/asset-hub-polkadot-runtime/asset_hub_polkadot_runtime.compact.compressed.wasm --pallet pallet_staking_async --extrinsic "force_apply_min_commission" --steps 2 --repeat 1
2026-02-13T15:06:30.145367Z INFO frame::benchmark::pallet: Initialized runtime log filter to 'INFO'
2026-02-13T15:06:31.784936Z INFO pallet_collator_selection::pallet: assembling new collators for new session 0 at #0
2026-02-13T15:06:31.784966Z INFO pallet_collator_selection::pallet: assembling new collators for new session 1 at #0
2026-02-13T15:08:29.701636Z INFO frame::benchmark::pallet: [ 0 % ] Starting benchmark: pallet_staking_async::force_apply_min_commission
2026-02-13T15:08:35.130403Z INFO frame::benchmark::pallet: [ 0 % ] Running benchmark: pallet_staking_async::force_apply_min_commission (overtime)
Pallet: "pallet_staking_async", Extrinsic: "force_apply_min_commission", Lowest values: [], Highest values: [], Steps: 2, Repeat: 1
Raw Storage Info
========
Storage: `Staking::MinCommission` (r:1 w:0)
Proof: `Staking::MinCommission` (`max_values`: Some(1), `max_size`: Some(4), added: 499, mode: `MaxEncodedLen`)
Storage: `Staking::Validators` (r:1 w:1)
Proof: `Staking::Validators` (`max_values`: None, `max_size`: Some(45), added: 2520, mode: `MaxEncodedLen`)

Median Slopes Analysis
========
-- Extrinsic Time --

Model:
Time ~= 50.31 µs

Reads = 2
Writes = 1
Recorded proof Size = 564

Min Squares Analysis
========
-- Extrinsic Time --

Model:
Time ~= 50.31 µs

Reads = 2
Writes = 1
Recorded proof Size = 564
```

---------

Co-authored-by: cmd[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Bastian Köcher <git@kchr.de>
(cherry picked from commit bc42349)
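To get a rough feel for how sensitive allocation-heavy setup code is to the choice of allocator, one can time a simple allocate-and-free loop and rerun it under different global allocators. The sketch below is only an illustration: the function `churn` and its batch and buffer sizes are arbitrary assumptions, and it does not reproduce what the benchmark setup of `force_apply_min_commission` actually does.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Allocates `rounds` batches of `batch` buffers of `size` bytes each, dropping
/// every batch before the next one starts. Returns the total number of bytes
/// allocated and the elapsed wall-clock time.
fn churn(rounds: usize, batch: usize, size: usize) -> (usize, Duration) {
    let start = Instant::now();
    let mut total = 0usize;
    for _ in 0..rounds {
        // Many small heap allocations, loosely mimicking storage-heavy setup.
        let bufs: Vec<Vec<u8>> = (0..batch).map(|_| vec![0u8; size]).collect();
        total += bufs.iter().map(|b| b.len()).sum::<usize>();
        // Keep the optimizer from removing the allocations.
        black_box(&bufs);
        // `bufs` is dropped here, freeing every buffer in the batch.
    }
    (total, start.elapsed())
}

fn main() {
    let (bytes, elapsed) = churn(100, 10_000, 64);
    println!("churned {bytes} bytes in {elapsed:?}");
}
```

Absolute timings from a loop like this depend on the machine and build profile, so only a comparison between two builds of the same code on the same host is meaningful.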
Backport #11069 into `stable2603` from sigurpol.

See the [documentation](https://github.com/paritytech/polkadot-sdk/blob/master/docs/BACKPORT.md) on how to use this bot.

<!--
# To be used by other automation, do not modify:
original-pr-number: #${pull_number}
-->

Co-authored-by: Paolo La Camera <paolo@parity.io>
Co-authored-by: cmd[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Bastian Köcher <git@kchr.de>
Backport #11069 into `stable2512` from sigurpol.

See the [documentation](https://github.com/paritytech/polkadot-sdk/blob/master/docs/BACKPORT.md) on how to use this bot.

<!--
# To be used by other automation, do not modify:
original-pr-number: #${pull_number}
-->

---------

Co-authored-by: Paolo La Camera <paolo@parity.io>
Force the linker to keep the `polkadot_jemalloc_shim` crate and its `#[global_allocator]` in all binaries that depend on it. Without this, the linker might drop the crate, since it looks like a dependency with no referenced symbols.

The issue happens only with a subset of Rust version and linker combinations: e.g. on CI with Ubuntu 24.04, rust 1.88.0 + gcc/ld strips the jemalloc crate from the binary, while rust 1.92.0 does not, and rust 1.88.0 + clang/mold also works fine.

One way to reproduce the issue on my local Ubuntu machine, using `frame-omni-bencher` as reference (the current latest version v0.17.2 has the shim with the jemalloc-allocator feature as a dependency, coming from PR #11069):

1. Building with rust 1.92.0 / gcc + ld => the linker doesn't strip the jemalloc allocator from the binary:
```bash
nm frame-omni-bencher | grep -i jemalloc
000000000149bf60 t _GLOBAL__sub_I_jemalloc_nodump_allocator.cc
0000000000eacae0 t jemalloc_constructor
0000000000eacd50 t _rjem_je_jemalloc_postfork_child
0000000000eacc70 t _rjem_je_jemalloc_postfork_parent
0000000000eacaf0 t _rjem_je_jemalloc_prefork
000000000262bb08 b _ZN7rocksdbL18jemalloc_type_infoB5cxx11E
```
2. Building with rust 1.88.0 / gcc + ld => the linker strips it away:
```bash
000000000027f8c0 t _GLOBAL__sub_I_jemalloc_nodump_allocator.cc
00000000023a3598 b _ZN7rocksdbL18jemalloc_type_infoB5cxx11E
```
3. Building with rust 1.88.0 / clang + mold => the linker keeps it (same as 1.)

Since CI currently relies on Ubuntu 24.04 gcc/ld + rust 1.88.0, we take the conservative approach here and force the linker not to drop the crate, via the `extern crate` change in all impacted binaries. The next step, outside this PR, is to bump the Rust version.

---------

Co-authored-by: cmd[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Bastian Köcher <git@kchr.de>
The same commit message as above, repeated on the cherry-picked copies (cherry picked from commit d774715).
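Besides inspecting symbols with `nm`, whether a `#[global_allocator]` actually ended up in a binary can also be observed at runtime. The sketch below is a self-contained stand-in and not the shim itself: `Counting`, `ALLOCATIONS`, and `allocations` are hypothetical names, and the allocator wraps the system allocator instead of jemalloc so that the example needs no external crates. The commented-out `extern crate` line shows one plausible form of the force-link change; its exact spelling in the impacted binaries is an assumption.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::hint::black_box;
use std::sync::atomic::{AtomicUsize, Ordering};

// In a real binary the allocator lives in a separate crate, and a line such as
//     extern crate polkadot_jemalloc_shim;
// (assumed form) gives the linker a reason to keep that crate.

/// Hypothetical stand-in allocator: forwards to the system allocator and counts calls.
struct Counting;

static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for Counting {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Registers `Counting` as the allocator for every heap allocation in this binary.
#[global_allocator]
static GLOBAL: Counting = Counting;

/// Number of allocation calls seen so far.
fn allocations() -> usize {
    ALLOCATIONS.load(Ordering::Relaxed)
}

fn main() {
    let before = allocations();
    let buf = black_box(vec![0u8; 4096]);
    let after = allocations();
    println!("allocations during vec creation: {}", after - before);
    drop(buf);
}
```

If the counter never moved, the custom allocator would not be the one in use, which is the runtime counterpart of the missing `_rjem_*` symbols in the stripped build.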
Fix huge benchmark regression for storage-heavy extrinsics by enabling jemalloc-allocator via polkadot-jemalloc-shim for omni-bencher; jemalloc was made optional in PR #10590.
This closes paritytech/trie#230.
Thanks @alexggh and @cheme for the help 🙇
Tested against `runtime / main` and 2.1.0 as described here.
For the `usual` extrinsic `force_apply_min_commission`, which does massive storage allocation/deallocation during benchmark setup and then just 2 reads and 1 write in the benchmarked extrinsic itself, the time goes down from ms to µs.
The regression was introduced by #10590 `sc-client-db: Make jemalloc optional`.