Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

/cmd followups #5533

Merged
merged 18 commits into from
Sep 5, 2024
Merged

/cmd followups #5533

merged 18 commits into from
Sep 5, 2024

Conversation

mordamax
Copy link
Contributor

@mordamax mordamax commented Aug 30, 2024

Closes: #5545

Tip: review this one with Whitespace hidden
image

@github-actions github-actions bot deleted a comment from mordamax Aug 30, 2024
@github-actions github-actions bot deleted a comment from mordamax Aug 30, 2024
@mordamax mordamax marked this pull request as ready for review September 4, 2024 12:37
@mordamax mordamax requested review from a team as code owners September 4, 2024 12:37
@github-actions github-actions bot deleted a comment from mordamax Sep 4, 2024
@github-actions github-actions bot deleted a comment from mordamax Sep 4, 2024
@github-actions github-actions bot deleted a comment from mordamax Sep 4, 2024
@github-actions github-actions bot deleted a comment from mordamax Sep 4, 2024
@github-actions github-actions bot deleted a comment from mordamax Sep 4, 2024
@github-actions github-actions bot deleted a comment from mordamax Sep 4, 2024
@github-actions github-actions bot deleted a comment from mordamax Sep 4, 2024
@mordamax mordamax changed the title [DNM] testing /cmd followups Sep 4, 2024
@github-actions github-actions bot requested a review from a team as a code owner September 4, 2024 13:45
@mordamax
Copy link
Contributor Author

mordamax commented Sep 4, 2024

/cmd bench --runtime asset-hub-westend --pallet pallet_balances --continue-on-fail --clean

@github-actions github-actions bot deleted a comment from mordamax Sep 4, 2024
Copy link
Contributor

github-actions bot commented Sep 4, 2024

Command "bench --runtime asset-hub-westend --pallet pallet_balances --continue-on-fail --clean" has started 🚀 See logs here

Copy link
Contributor

github-actions bot commented Sep 4, 2024

Command "bench --runtime asset-hub-westend --pallet pallet_balances --continue-on-fail --clean" has finished ✅ See logs here

Subweight results:
File Extrinsic Old New Change [%]
cumulus/pallets/collator-selection/src/weights.rs leave_intent - - ERROR
cumulus/pallets/collator-selection/src/weights.rs new_session - - ERROR
cumulus/pallets/collator-selection/src/weights.rs register_as_candidate - - ERROR
cumulus/pallets/collator-selection/src/weights.rs set_invulnerables - - ERROR
cumulus/pallets/collator-selection/src/weights.rs take_candidate_slot - - ERROR
cumulus/pallets/collator-selection/src/weights.rs update_bond - - ERROR
cumulus/parachains/runtimes/assets/asset-hub-rococo/src/weights/pallet_collator_selection.rs take_candidate_slot - - ERROR
cumulus/parachains/runtimes/assets/asset-hub-rococo/src/weights/pallet_collator_selection.rs update_bond - - ERROR
cumulus/parachains/runtimes/assets/asset-hub-westend/src/weights/pallet_collator_selection.rs take_candidate_slot - - ERROR
cumulus/parachains/runtimes/assets/asset-hub-westend/src/weights/pallet_collator_selection.rs update_bond - - ERROR
cumulus/parachains/runtimes/bridge-hubs/bridge-hub-rococo/src/weights/pallet_collator_selection.rs take_candidate_slot - - ERROR
cumulus/parachains/runtimes/bridge-hubs/bridge-hub-rococo/src/weights/pallet_collator_selection.rs update_bond - - ERROR
cumulus/parachains/runtimes/bridge-hubs/bridge-hub-westend/src/weights/pallet_collator_selection.rs take_candidate_slot - - ERROR
cumulus/parachains/runtimes/bridge-hubs/bridge-hub-westend/src/weights/pallet_collator_selection.rs update_bond - - ERROR
cumulus/parachains/runtimes/collectives/collectives-westend/src/weights/pallet_collator_selection.rs take_candidate_slot - - ERROR
cumulus/parachains/runtimes/collectives/collectives-westend/src/weights/pallet_collator_selection.rs update_bond - - ERROR
cumulus/parachains/runtimes/collectives/collectives-westend/src/weights/pallet_core_fellowship_ambassador_core.rs promote_fast - - ERROR
cumulus/parachains/runtimes/collectives/collectives-westend/src/weights/pallet_core_fellowship_fellowship_core.rs promote_fast - - ERROR
cumulus/parachains/runtimes/collectives/collectives-westend/src/weights/pallet_preimage.rs ensure_updated - - ERROR
cumulus/parachains/runtimes/people/people-rococo/src/weights/pallet_collator_selection.rs take_candidate_slot - - ERROR
cumulus/parachains/runtimes/people/people-rococo/src/weights/pallet_collator_selection.rs update_bond - - ERROR
cumulus/parachains/runtimes/people/people-westend/src/weights/pallet_collator_selection.rs take_candidate_slot - - ERROR
cumulus/parachains/runtimes/people/people-westend/src/weights/pallet_collator_selection.rs update_bond - - ERROR
polkadot/runtime/rococo/src/weights/pallet_preimage.rs ensure_updated - - ERROR
polkadot/runtime/rococo/src/weights/pallet_vesting.rs not_unlocking_merge_schedules - - ERROR
polkadot/runtime/rococo/src/weights/polkadot_runtime_parachains_inclusion.rs enact_candidate - - ERROR
polkadot/runtime/westend/src/weights/pallet_preimage.rs ensure_updated - - ERROR
polkadot/runtime/westend/src/weights/polkadot_runtime_parachains_inclusion.rs enact_candidate - - ERROR
substrate/frame/election-provider-support/src/weights.rs phragmen - - ERROR
substrate/frame/election-provider-support/src/weights.rs phragmms - - ERROR
cumulus/parachains/runtimes/assets/asset-hub-westend/src/weights/pallet_balances.rs burn_keep_alive 19.48us 32.52us +66.96
cumulus/parachains/runtimes/assets/asset-hub-westend/src/weights/pallet_balances.rs burn_allow_death 28.98us 47.89us +65.27
substrate/frame/balances/src/weights.rs burn_allow_death 30.97us 47.92us +54.73
substrate/frame/balances/src/weights.rs burn_keep_alive 20.71us 31.92us +54.11
substrate/frame/balances/src/weights.rs force_adjust_total_issuance 6.59us 8.73us +32.35
cumulus/parachains/runtimes/assets/asset-hub-westend/src/weights/pallet_balances.rs transfer_all 171.29us 202.17us +18.03
cumulus/parachains/runtimes/assets/asset-hub-westend/src/weights/pallet_balances.rs transfer_allow_death 171.76us 201.97us +17.59
substrate/frame/balances/src/weights.rs transfer_all 172.38us 201.82us +17.07
substrate/frame/balances/src/weights.rs transfer_allow_death 173.36us 202.29us +16.69
cumulus/parachains/runtimes/assets/asset-hub-westend/src/weights/pallet_balances.rs transfer_keep_alive 161.49us 186.84us +15.69
substrate/frame/balances/src/weights.rs transfer_keep_alive 163.16us 186.29us +14.18
cumulus/parachains/runtimes/assets/asset-hub-westend/src/weights/pallet_balances.rs force_adjust_total_issuance 30.73us 34.01us +10.67
cumulus/parachains/runtimes/assets/asset-hub-westend/src/weights/pallet_balances.rs force_transfer 297.98us 328.73us +10.32
cumulus/parachains/runtimes/assets/asset-hub-westend/src/weights/pallet_balances.rs force_set_balance_killing 142.73us 156.26us +9.48
substrate/frame/balances/src/weights.rs force_transfer 299.94us 328.18us +9.42
substrate/frame/balances/src/weights.rs force_set_balance_killing 144.93us 156.29us +7.84
cumulus/parachains/runtimes/assets/asset-hub-westend/src/weights/pallet_balances.rs force_unreserve 140.41us 150.26us +7.02
substrate/frame/balances/src/weights.rs force_unreserve 142.23us 150.03us +5.48
substrate/frame/balances/src/weights.rs upgrade_accounts 138.99ms 146.59ms +5.47
cumulus/parachains/runtimes/assets/asset-hub-westend/src/weights/pallet_balances.rs force_set_balance_creating 137.67us 144.86us +5.22
cumulus/parachains/runtimes/assets/asset-hub-westend/src/weights/pallet_balances.rs upgrade_accounts 138.57ms 145.81ms +5.22

Copy link
Contributor

@muharem muharem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new to this code, so please take this approval with that in mind.

@muharem
Copy link
Contributor

muharem commented Sep 5, 2024

maybe worth having @ggwpez eyes on this

@mordamax mordamax enabled auto-merge September 5, 2024 16:01
@mordamax mordamax added this pull request to the merge queue Sep 5, 2024
Merged via the queue into master with commit 8d81f1e Sep 5, 2024
198 of 203 checks passed
@mordamax mordamax deleted the mak-test-cmd branch September 5, 2024 17:04
@s0me0ne-unkn0wn
Copy link
Contributor

@mordamax

How did it happen that this PR changed benchmarked weights for the balances pallet so much? Is this an expected outcome?

From what I see in the output generated by the benchmark, I'm not even sure it was run on the reference hardware.

@mordamax
Copy link
Contributor Author

@s0me0ne-unkn0wn

1.as for reference or not - it was generated using a new github runner (vs gitlab runner) - parity-weights - check its specs here https://github.com/paritytech/ci_cd/wiki/GitHub
2. last update was on May 8, so not sure what exactly might have impacted in between of this vs that

And to test old vs new I think we can run the same pallet somewhere via old and new command bot

/cmd bench --runtime westend --pallet pallet_balances
or
bot bench polkadot-pallet --pallet=pallet_balances

Let's see here #6095

@mordamax
Copy link
Contributor Author

@s0me0ne-unkn0wn you're right, looks like there's a consistent and significant difference, thanks for flagging!

frame-omni-bencher + github runner (parity-weights) -> cbb16b4
benchmark + gitlab runner (weights) -> 2eeba65

@alvicsam could you please check and share more details what could lead to that consistent difference between the 2 runners?
@ggwpez do you know if the switch to frame-omni-bencher could potentially be part of a reason?

@alvicsam
Copy link
Contributor

The hardware is the same: n2-standard-8 with option threads_per_core=1 and local ssd. The runners environment differs (VM + docker in case of gitlab and k8s + docker-in-docker in case of github) but IMO it would be nice to test github runner with old benchmark to check the result first.

@ggwpez
Copy link
Member

ggwpez commented Oct 17, 2024

We could also add run polkadot benchmark machine at the beginning of the jobs to check the performance.

@mordamax
Copy link
Contributor Author

mordamax commented Oct 22, 2024

I tested GH runner - frame-omni-bencer (logs vs GH runner old cli (polkadot benchmark --chain.... (logs
) they both give the relevantly same output, while if I run it in the gitlab - it gives a lot better results, so it seems like the environment problem and not a CLI or frame-omni-bencher specificly

I added and run benchmark machine to both environments and I see no significant difference

GitHub Selfhosted runner VM -> Logs
image

GitLab runner VM -> Pipeline Logs
image

So we chatted about that with @alvicsam and we think that it might worth just recalculating the new baselines with new VMs

wdyt @ggwpez @s0me0ne-unkn0wn ?

@s0me0ne-unkn0wn
Copy link
Contributor

we think that it might worth just recalculating the new baselines with new VMs

While I don't have better ideas myself, I'm still not sure it's the best solution. The first reason is the sTPS benchmark. Having 40% more reference time per transaction means we pack 40% fewer transactions into a block and thus we have 40% less TPS. Technically it doesn't make much difference, but from the marketing perspective 40% fall of TPS value of the whole network out of the blue is something that can make a lot of people unhappy.

The second reason is our attitude towards the end users, namely, validator operators. Like, we guarantee that someone having the given hardware spec can run a validator, and that's why we measure reference times with reference hardware. Increasing reference times by 40% means we either were undershooting before, which is terrible, or are overshooting now, which is not great as well. Our end-user-facing metrics should be as consistent as possible.

Last but not least, I'm just curious what the hell is going on there ;) I understand that different VMs having the same amount of resources will behave differently from the performance perspective, but a 40% difference is well over the threshold of something that can be explained solely by VM architecture differences, given that they both have very similar machine benchmarks.

That said, I feel like more justification is needed to re-run benchmarks and increase all the weights.

CC @eskimor

@ggwpez
Copy link
Member

ggwpez commented Oct 22, 2024

What runtime are you using for sTPS testing @s0me0ne-unkn0wn? The Polkadot and Kusama runtimes have their own weights in the runtimes repo. Any weights we do here are for testnets only. Nobody should use them in prod, but generate their own anyway.

But I think if this Docker-in-docker-in-kubernets is 40% slower than an actual server, then we should not use it. Nobody would reasonably run a validator like this either, so its not really "reference hardware" anymore. Seems like the machine benchmarks dont really show this.
Even though it not being a recommendation, it should still "somewhat resemble" the correct hardware.

@s0me0ne-unkn0wn
Copy link
Contributor

What runtime are you using for sTPS testing

We've always used Rococo runtime for that and consciously used the Substrate weights without re-weighting. Substrate weights are the reference. I mean, if your runtime has u128 balances and AccountId32 accounts, etc., there should be no chance of getting very different measurements for your runtime IIUC.

Besides that, while relay chain runtimes, indeed, have their own weights, parachains often do not. And even if they had, it wouldn't help that much in measuring parachain throughput to have a hundred different weights for different parachains. So, for parachain throughput measurements, we also use Substrate weights as a reference.

Nobody would reasonably run a validator like this either, so its not really "reference hardware" anymore.

Exactly! That's the point. Reference should be, well, a reference, that is, something we don't change easily.

@ggwpez
Copy link
Member

ggwpez commented Oct 23, 2024

We've always used Rococo runtime for that and consciously used the Substrate weights without re-weighting. Substrate weights are the reference. I mean, if your runtime has u128 balances and AccountId32 accounts, etc., there should be no chance of getting very different measurements for your runtime IIUC.

If you look only at the balances pallet, then I guess, yes. Generally though, no. The weight of each pallet depends on the runtime config. The callbacks that are configured and the general pallet config can cause hugely different weights for production cases.

The weights in the runtimes repo are more reliable, if you want to do some testing. Either way, for getting the maximal number of TPS, the weights are not really important. You should just increase the block weight limit until the relay block time increases or the PVF times out.
The weights are always an over-estimate, so if you want maximal numbers then you need to counter-correct for that.

@s0me0ne-unkn0wn
Copy link
Contributor

The weights are always an over-estimate, so if you want maximal numbers then you need to counter-correct for that.

I understand that, but we don't want maximum numbers but fair numbers. Like, if someone asks, "If I start a parachain on Polkadot, what TPS will I get?" and we answer, "You'll get 5550 balance transfers per block given that you're not using weird data types in your setup", that's fair. Yes, we could maximize even that (like, parameterize the balance transfer benchmark and account for the shortest execution path where we transfer from a pre-funded to a pre-existing account and thus get a lot more TPS in the sTPS test), but that wouldn't be that fair anymore. So, I believe we're just trying to get the right balance point between being precise and being fair.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
R0-silent Changes should not be mentioned in any release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

/cmd bot followups
7 participants