update elastic scaling guide #6739

alindima · 2024-12-03T09:44:36Z

Resolves #5050

Updates the elastic scaling guide, taking into consideration:

- The completed implementation of RFC-103, which enables an untrusted collator set for elastic scaling. This adds the instructions needed to configure the collator so that it can leverage this implementation.
- General updates for parts that became out of date.

This PR should not be merged until:

1. The `CandidateReceiptV2` node feature bit is enabled on all networks.
2. The functionality hidden behind the `experimental-ump-signals` feature of the parachain-system pallet is turned on by default (which can only be done after 1).

TODO:

- Update after "elastic scaling: rework core selector handling" (#6939) is merged.
- Add info about possibly needing to modify the authoring duration when using more than 3 cores.

docs/sdk/src/guides/enable_elastic_scaling_mvp.rs

prdoc/pr_6739.prdoc

docs/sdk/src/guides/enable_elastic_scaling_mvp.rs

eskimor · 2024-12-13T18:44:45Z

docs/sdk/src/guides/enable_elastic_scaling_mvp.rs

+//!
+//! - The `DefaultCoreSelector` implements a round-robin selection on the cores that can be
+//! occupied by the parachain at the very next relay parent. This is the equivalent to what all
+//! parachains on production networks have been using so far.


Hmm. Shall we rename this as part of this PR? It seems like `LookaheadCoreSelector` should be the "default", as we expect any new parachain to use asynchronous backing.

eskimor · 2024-12-13T18:54:00Z

docs/sdk/src/guides/enable_elastic_scaling_mvp.rs

+//! <div class="warning">If you configure a velocity which is different from the number of assigned
+//! cores, the measured velocity in practice will be the minimum of these two. However, be mindful
+//! that if the velocity is higher than the number of assigned cores, it's possible that
+//! <a href="https://github.com/paritytech/polkadot-sdk/issues/6667"> only a subset of the collator set will be authoring blocks.</a></div>


The question is why we need to configure a velocity at all; it seems redundant.

Once the slot-based collator can produce multiple blocks per slot, we should also add that we recommend slot durations of at least 6s, preferably even 12s (better censorship resistance).
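
For illustration, the "minimum of these two" rule from the quoted warning can be sketched in Rust as follows. The function name `effective_velocity` is hypothetical and not part of polkadot-sdk; it only restates the rule in the diff above.

```rust
/// Hypothetical helper (not a polkadot-sdk API): the velocity observed in
/// practice is the smaller of the configured velocity and the number of
/// cores assigned to the parachain, as stated in the quoted warning.
fn effective_velocity(configured_velocity: u32, assigned_cores: u32) -> u32 {
    configured_velocity.min(assigned_cores)
}
```

For example, a configured velocity of 3 with 2 assigned cores gives an effective velocity of 2, which is also the case where the warning notes that only a subset of the collator set may end up authoring blocks.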

alindima · 2025-02-13T12:52:39Z

docs/sdk/src/guides/enable_elastic_scaling.rs

+//!    `overseer_handle` and `relay_chain_slot_duration` params passed to `start_consensus` and pass
+//!    in the `slot_based_handle`.
+//!
+//! ### Phase 2 - Configure core selection policy in the parachain runtime


Phase 2 assumes the `CandidateReceiptV2` feature bit is enabled.
This phase will change after the feature bit is enabled on all networks and a form of #6939 is merged.

alindima · 2025-02-13T12:54:19Z

templates/parachain/node/src/service.rs

@@ -15,7 +15,9 @@ use polkadot_sdk::*;
 use cumulus_client_cli::CollatorOptions;
 use cumulus_client_collator::service::CollatorService;
 #[docify::export(lookahead_collator)]
-use cumulus_client_consensus_aura::collators::lookahead::{self as aura, Params as AuraParams};
+use cumulus_client_consensus_aura::collators::slot_based::{


Changes in this file will be rolled back before merge. For now, they showcase what a parachain team using the template would need to do on the node side to use elastic scaling.

docs/sdk/src/guides/enable_elastic_scaling.rs

sandreim · 2025-02-13T16:08:01Z

docs/sdk/src/guides/enable_elastic_scaling.rs

+//!
+//! ### Phase 3 - Configure maximum scaling factor in the runtime
+//!
+//! First of all, you need to decide the upper limit to how many parachain blocks you need to


Actually, the thinking is the other way around: what is the minimum target block time? There is then no need to configure any other parameters manually, as you can compute them from this value.

You can also make all the calculations based on the velocity, which is what I describe here.

I can see what is described here, but I want a better DX.

As you've noticed recently, people didn't ask "How many parachain blocks can I produce per relay chain block?"; instead they ask "How can I get 500ms blocks?", because that is what their end users care about. The velocity of the parachain is largely an implementation detail.

With that being said, we can then remove all of the details about velocity, along with the concern that they need to compute all sorts of other constants.
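
A minimal sketch of the calculation this thread has in mind, starting from a target block time rather than a velocity. It assumes a 6-second relay chain slot (as in the guide text) and one parachain block per core per relay chain block; the constant and function names are hypothetical and not part of polkadot-sdk.

```rust
/// Relay chain slot duration in milliseconds (assumed to be 6 seconds).
const RELAY_CHAIN_SLOT_DURATION_MS: u64 = 6_000;

/// Hypothetical helper: number of parachain blocks that must be produced per
/// relay chain block to reach `target_block_time_ms`. Under the assumption of
/// one block per core, this is also the number of cores required.
///
/// Returns `None` if the target is zero or longer than the relay chain slot.
fn blocks_per_relay_block(target_block_time_ms: u64) -> Option<u64> {
    if target_block_time_ms == 0 || target_block_time_ms > RELAY_CHAIN_SLOT_DURATION_MS {
        return None;
    }
    // Round up so that blocks are produced at least as often as requested.
    Some(RELAY_CHAIN_SLOT_DURATION_MS.div_ceil(target_block_time_ms))
}
```

With these assumptions, the "500ms blocks" question maps to `blocks_per_relay_block(500)`, which yields 12, while a 2-second target yields 3.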

sandreim · 2025-02-13T16:10:23Z

docs/sdk/src/guides/enable_elastic_scaling.rs

+//!
+//! ## Current constraints
+//!
+//! Elastic scaling is still considered experimental software, so stability is not guaranteed.


After launching on Polkadot this is not true.

True, will update when that is the case

Let's remove it at this point, since we are really close :D

docs/sdk/src/guides/enable_elastic_scaling.rs

sandreim · 2025-02-13T16:23:49Z

docs/sdk/src/guides/enable_elastic_scaling.rs

+//!    duration of 2 seconds per block.** Using the current implementation with multiple collators
+//!    adds additional latency to the block production pipeline. Assuming block execution takes
+//!    about the same as authorship, the additional overhead is equal the duration of the authorship
+//!    plus the block announcement. Each collator must first import the previous block before
+//!    authoring a new one, so it is clear that the highest throughput can be achieved using a
+//!    single collator. Experiments show that the peak performance using more than one collator
+//!    (measured up to 10 collators) is utilising 2 cores with authorship time of 1.3 seconds per
+//!    block, which leaves 400ms for networking overhead. This would allow for 2.6 seconds of
+//!    execution, compared to the 2 seconds async backing enabled.
+//!    The development required for enabling maximum compute throughput for multiple collators is tracked by
+//!    [this issue](https://github.com/paritytech/polkadot-sdk/issues/5190).


I think we can do much better in terms of structure here, compared with a large blob of text that is not easy to read and does not bring the important information into focus.

I rewrote this section. Let me know how it looks.

docs/sdk/src/guides/enable_elastic_scaling.rs

sandreim · 2025-02-14T09:44:47Z

docs/sdk/src/guides/enable_elastic_scaling.rs

+//!    this should obviously only be used for testing purposes, due to the clear lack of decentralisation
+//!    and resilience. Experiments show that the peak compute throughput using more than one collator
+//!    (measured up to 10 collators) is utilising 2 cores with authorship time of 1.3 seconds per block,
+//!    which leaves 400ms for networking overhead. This would allow for 2.6 seconds of execution, compared


Let's add the formula as a function of latency to compute the max usable execution time.
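
As a rough illustration of such a formula, the sketch below uses the relation that appears in a later revision of the guide quoted in this review, `2 * authorship duration + network overheads <= slot time`. It assumes a 6-second relay chain slot split evenly across cores; all names are hypothetical and not part of polkadot-sdk.

```rust
/// Relay chain slot duration in milliseconds (assumed to be 6 seconds).
const RELAY_CHAIN_SLOT_DURATION_MS: u64 = 6_000;

/// Largest authorship duration (ms) that satisfies
/// `2 * authorship + network_overhead <= slot_time`, where
/// `slot_time = relay slot / cores`. Returns `None` when `cores` is zero or
/// the network overhead alone exceeds the slot time.
fn max_authorship_ms(cores: u64, network_overhead_ms: u64) -> Option<u64> {
    if cores == 0 {
        return None;
    }
    let slot_time_ms = RELAY_CHAIN_SLOT_DURATION_MS / cores;
    slot_time_ms
        .checked_sub(network_overhead_ms)
        .map(|budget_ms| budget_ms / 2)
}

/// Total usable execution time (ms) per relay chain block: one block of
/// `max_authorship_ms` on each core.
fn max_execution_ms(cores: u64, network_overhead_ms: u64) -> Option<u64> {
    max_authorship_ms(cores, network_overhead_ms).map(|authorship_ms| authorship_ms * cores)
}
```

With 2 cores and 400ms of networking overhead, this gives 1300ms of authorship per block and 2600ms of execution per relay chain block, which matches the 1.3 and 2.6 second figures in the quoted text.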

docs/sdk/src/guides/enable_elastic_scaling.rs

sandreim · 2025-02-14T10:13:18Z

docs/sdk/src/guides/enable_elastic_scaling.rs

+//!
+//! ### Phase 3 - Configure maximum scaling factor in the runtime
+//!
+//! First of all, you need to decide the upper limit to how many parachain blocks you need to


I can see what is described here, but I want a better DX.

As you've noticed recently, people didn't ask "How many parachain blocks can I produce per relay chain block?"; instead they ask "How can I get 500ms blocks?", because that is what their end users care about. The velocity of the parachain is largely an implementation detail.

With that being said, we can then remove all of the details about velocity, along with the concern that they need to compute all sorts of other constants.

docs/sdk/src/guides/enable_elastic_scaling.rs

skunert

Overall looking pretty good!

skunert · 2025-03-25T14:20:07Z

docs/sdk/src/guides/enable_elastic_scaling.rs

+//! # Enable elastic scaling for a parachain
+//!
+//! <div class="warning">This guide assumes full familiarity with Asynchronous Backing and its
+//! terminology, as defined in <a href="https://wiki.polkadot.network/docs/maintain-guides-async-backing">the Polkadot Wiki</a>.


This link is now broken.

skunert · 2025-03-25T14:21:56Z

docs/sdk/src/guides/enable_elastic_scaling.rs

+//!
+//! ## Current constraints
+//!
+//! Elastic scaling is still considered experimental software, so stability is not guaranteed.


Let's remove it at this point, since we are really close :D

skunert · 2025-03-25T14:22:59Z

docs/sdk/src/guides/enable_elastic_scaling.rs

+//!    the relay chain. Therefore, assuming the full 2 seconds are used, a parachain can only
+//!    utilise at most 3 cores in a relay chain slot of 6 seconds. If the full execution time is not
+//!    being used or if all collators are able to author blocks faster than the reference hardware,
+//!    higher core counts can be achieved.


Suggested change

-//! higher core counts can be achieved.
+//! higher core counts can be utilized.

skunert · 2025-03-25T14:25:48Z

docs/sdk/src/guides/enable_elastic_scaling.rs

+//!    2 seconds building the block and announces it. The next collator fetches and executes it, wasting
+//!    2 seconds plus the block fetching duration out of its 2 second slot. Therefore, the next collator
+//!    cannot build a subsequent block in due time and ends up authoring a fork, which defeats the purpose
+//!    of elastic scaling. The highest throughput can therefore be achieved with a single collator but


Suggested change

-//! of elastic scaling. The highest throughput can therefore be achieved with a single collator but
+//! of elastic scaling. The highest throughput can therefore be achieved with a single collator, but

skunert · 2025-03-25T14:30:15Z

docs/sdk/src/guides/enable_elastic_scaling.rs

+//!    of elastic scaling. The highest throughput can therefore be achieved with a single collator but
+//!    this should obviously only be used for testing purposes, due to the clear lack of decentralisation
+//!    and resilience. In other words, to fully utilise the cores, the following formula needs to be
+//!    satisfied: `2 * authorship duration + network overheads <= slot time`. For example, you can use


From a user's perspective, I think this paragraph is a bit dense. What do you think about making it a bit shorter, stating only that we need some import time between blocks? We could maybe have the full details at the end. Not insisting, it's just an idea.

skunert · 2025-03-25T14:36:37Z

docs/sdk/src/guides/enable_elastic_scaling.rs

+//!
+//! - Ensure Asynchronous Backing (6-second blocks) has been enabled on the parachain using
+//!   [`crate::guides::async_backing_guide`].
+//! - Ensure the `AsyncBackingParams.max_candidate_depth` value is configured to a value that is at


We need to remove this.

skunert · 2025-03-25T14:37:37Z

docs/sdk/src/guides/enable_elastic_scaling.rs

+//!   least double the maximum targeted parachain velocity. For example, if the parachain will build
+//!   at most 3 candidates per relay chain block, the `max_candidate_depth` should be at least 6.
+//! - Ensure enough coretime is assigned to the parachain.
+//! - Ensure the `CandidateReceiptV2` node feature is enabled on the relay chain configuration (node


This sounds a bit technical; can we remove it? All relay chains should support it at this point.

skunert · 2025-03-25T19:30:32Z

docs/sdk/src/guides/enable_elastic_scaling.rs

+//!
+//! <div class="warning">Phase 1 is NOT needed if using the <code>polkadot-parachain</code> or
+//! <code>polkadot-omni-node</code> binary, or <code>polkadot-omni-node-lib</code> built from the
+//! latest polkadot-sdk release! Simply pass the <code>--experimental-use-slot-based</code>


Suggested change

-//! latest polkadot-sdk release! Simply pass the <code>--experimental-use-slot-based</code>
+//! latest polkadot-sdk release! Simply pass the <code>--authoring slot-based</code>

skunert · 2025-03-25T19:32:19Z

docs/sdk/src/guides/enable_elastic_scaling.rs

+//! ```ignore
+//! type ParachainBlockImport = TParachainBlockImport<
+//! 	    Block,
+//! 	    SlotBasedBlockImport<Block, Arc<ParachainClient>, ParachainClient>,


I think we don't need `SlotBasedBlockImport` anymore if we only support slots of 6s or more. wdyt @bkchr?

alindima added 2 commits December 2, 2024 17:11

update elastic scaling guide

57dfd25

add explanation about SelectCore types provided

e289086

alindima added the T11-documentation This PR/Issue is related to documentation. label Dec 3, 2024

add prdoc

4233601

alindima marked this pull request as draft December 3, 2024 09:59

sandreim mentioned this pull request Dec 9, 2024

Elastic scaling: launch checklist #5051

Open

10 tasks

alindima requested a review from sandreim December 10, 2024 09:38

sandreim reviewed Dec 13, 2024

eskimor reviewed Dec 13, 2024

alindima mentioned this pull request Dec 18, 2024

elastic scaling: rework core selector handling #6939

Open

alindima added 3 commits February 13, 2025 12:04

Merge remote-tracking branch 'origin/master' into alindima/update-ela…

47a93ac

…stic-scaling-guide

resolve block import handle

aaff968

update

68a9d61

alindima commented Feb 13, 2025

alindima added 2 commits February 13, 2025 15:03

add info about authoring_duration

b94e855

prdoc title update

2d71a28

sandreim reviewed Feb 13, 2025

alindima added 2 commits February 14, 2025 11:03

update

41bc7f2

adding some examples

4703c3f

sandreim reviewed Feb 14, 2025

alindima added 3 commits February 14, 2025 14:42

feedback

b037294

add formula

ccf21ff

rephrase

49f7f47

skunert reviewed Mar 25, 2025

alindima mentioned this pull request Apr 16, 2025

Document Elastic Scaling polkadot-developers/polkadot-docs#564

Open
