Fix on-different-forks metrics during initialization#1468
Conversation
serban300
left a comment
There was a problem hiding this comment.
Looks good ! I just left a couple of nits which I'd be happy to implement if they make sense.
| Some(head) => (*head.number(), head.hash()), | ||
| None => (Default::default(), Default::default()), | ||
| } | ||
| >::best_parachain_head(RIALTO_PARACHAIN_ID.into())?; |
There was a problem hiding this comment.
Nit: I would use bp_rialto_parachain::RIALTO_PARACHAIN_ID directly and remove the use statement above.
| let header = BridgeRialtoGrandpa::best_finalized(); | ||
| (header.number, header.hash()) | ||
| fn best_finalized() -> Option<(bp_rialto::BlockNumber, bp_rialto::Hash)> { | ||
| BridgeRialtoGrandpa::best_finalized().map(|header| (header.number, header.hash())) |
There was a problem hiding this comment.
Nit: I think we could use the new HeaderIdProvider::id() in these implementations. Could we also return an Option<HeaderId> here ? Or are we limited to basic types like tuples for the values returned by the API methods ?
There was a problem hiding this comment.
Yeah - I think it could be done. Thank you!
| RelayState::RelayingRelayHeader(_) => unreachable!("processed by previous match; qed"), | ||
| RelayState::RelayingParaHeader(para_header_id) => { | ||
| if data.para_header_at_target < para_header_id.0 { | ||
| if para_header_at_target_or_zero < para_header_id.0 { |
There was a problem hiding this comment.
2 small Nits not related to this PR:
- I would return
statehere instead ofRelayState::RelayingParaHeader(para_header_id). - I think I would use 2
ifstatements here instead of 2matches.
| let (required_para_header, para_header_at_target) = match data.para_header_at_target { | ||
| Some(para_header_at_target) => (data.required_para_header, para_header_at_target), | ||
| None => (para_header_at_source.0, Zero::zero()), | ||
| }; |
There was a problem hiding this comment.
Would it make sense to have something like the following ?
let required_para_header = max(data.required_para_header, para_header_at_source.0);
let para_header_at_target = para_header_at_target_or_zero;
Since we want to sync the latest header if I understand correctly.
There was a problem hiding this comment.
IIUC max(data.required_para_header, para_header_at_source.0) isn't fully correct. So we are choosing what we need to sync using following algorithm:
- if there are no parachain headers at the target (i.e.
data.para_header_at_targetisNone) - then we want to sync current best header at source (para_header_at_source.0); - otherwise, we shall never sync past the an artificial limit (
data.required_para_header).
There was a problem hiding this comment.
Thanks ! I think this part makes sense now. I have one more question. Sorry for insisting on this and sorry if it doesn't make much sense, but I'm trying to understand this logic (and the parachain heads syncing logic in general) better since I got to it here. One thing that stands out to me is that we do a lot of parachain validations:
// this switch is responsible for processing `RelayingParaHeader` state
let para_header_at_target_or_zero = data.para_header_at_target.unwrap_or_else(Zero::zero);
match state {
RelayState::Idle => (),
RelayState::RelayingRelayHeader(_) => unreachable!("processed by previous match; qed"),
RelayState::RelayingParaHeader(para_header_id) => {
if para_header_at_target_or_zero < para_header_id.0 {
// parachain header hasn't yet been relayed
return RelayState::RelayingParaHeader(para_header_id)
}
},
}
// if we haven't read para head from the source, we can't yet do anyhting
let para_header_at_source = match data.para_header_at_source {
Some(ref para_header_at_source) => para_header_at_source.clone(),
None => return RelayState::Idle,
};
...
And at the end we do:
// we need relay chain header first
if data.relay_header_at_target < data.relay_header_at_source {
return RelayState::RelayingRelayHeader(data.relay_header_at_source)
}
And I was wondering if this order of operations is absolutely required. Do we need to know for example that para_header_at_source is Some, before returning RelayState::RelayingRelayHeader ? Or could we rearrange this logic a bit, starting with something like:
// We need relay chain header first
if data.relay_header_at_target < data.relay_header_at_source {
return RelayState::RelayingRelayHeader(data.relay_header_at_source);
}
// this switch is responsible for processing `RelayingRelayHeader` state
match state {
RelayState::Idle | RelayState::RelayingParaHeader(_) => (),
RelayState::RelayingRelayHeader(relay_header_number) => {
if data.relay_header_at_target < relay_header_number {
// required relay header hasn't yet been relayed
return RelayState::RelayingRelayHeader(relay_header_number)
}
// we may switch to `RelayingParaHeader` if parachain head is available
if let Some(para_header_at_relay_header_at_target) =
data.para_header_at_relay_header_at_target.clone()
{
state = RelayState::RelayingParaHeader(para_header_at_relay_header_at_target);
} else {
// otherwise, we'd need to restart (this may happen only if parachain has been
// deregistered)
state = RelayState::Idle;
}
},
}
// this switch is responsible for processing `RelayingParaHeader` state
let para_header_at_target_or_zero = data.para_header_at_target.unwrap_or_else(Zero::zero);
match state {
RelayState::Idle => (),
RelayState::RelayingRelayHeader(_) => unreachable!("processed by previous match; qed"),
RelayState::RelayingParaHeader(para_header_id) => {
if para_header_at_target_or_zero < para_header_id.0 {
// parachain header hasn't yet been relayed
return RelayState::RelayingParaHeader(para_header_id)
}
},
}
...
This would create a separation between the relay headers logic and para headers logic, and personally I would have found it easier to follow.
There was a problem hiding this comment.
Please keep asking questions - that's absolutely fine!
If we put this part
// We need relay chain header first
if data.relay_header_at_target < data.relay_header_at_source {
return RelayState::RelayingRelayHeader(data.relay_header_at_source);
}at the beginning of the code, then we'll end up always syncing relay headers and parachain headers won't ever be synced. Just because (as a rule) it'll take more time for us to craft + send + mine transaction at the target chain vs just generating new block of the source chain.
There's also a big story behind this on-demand relay - it is called on-demand for a reason :) So we don't need to sync all source headers to the target chain. That's because then we'll:
- lose a ton of tokens - e.g. cost of import-Kusama-header-to-Polkadot iirc was somewhere between 1 and 2 DOTs. So if we'll sync all Kusama headers to Polkadot, we'll be losing
~24*60*60/6*2=28800DOTs per day; - we'll be generating unneeded congestion at the target network - i.e. we'll be generating quite large (in both size and weight terms) transactions at almost every block;
- we simply don't need it.
So initially relayers architecture was layered - similar to how our pallet are also layered (GRANDPA at the bottom, then parachains, then messaging on app level). And we have 3 separate relayer subcommands - one that relays GRANDPA proofs and headers (substrate-relay relay-headers), one that relays parachain headers (substrate-relay relay-parachains) and the messages relay (relay-messages). They all have their own task and e.g. messages relay will stall if other relays are not working (simply because it needs relay and parachain headers to be able to prove messages).
But running standalone headers relay (substrate-relay relay-headers) would lead us to the problem that I've described above. And the solution was to introduce so called complex relay (substrate-relay relay-headers-and-messages), which is a messages relay with "paused" background headers (and parachains) relays. So when messages relay is seeing that it needs header-100 to deliver message-1 from source to the target chain, it "awakes" the background headers relay and asks it to relay header-100. Once it is relayed, headers relay starts sleeping again and messages relay does its job.
There are also other caveats there - e.g. we need to awake background relay sometimes and relay mandatory headers, but overall - when complex relay is running (which is as I said: messages relay + on-demand headers and parachains relay), we are trying to minimize number of transactions that we're submitting.
There was a problem hiding this comment.
Oh, ok, this is super useful both for this review and for understanding more details about the project. Thanks ! I was thinking that I might be missing something.
|
|
||
| # last time when we have been asking for conversion rate update | ||
| LAST_CONVERSION_RATE_UPDATE_TIME=0 | ||
| # current conversion rate |
There was a problem hiding this comment.
All these scripts look very similar. I wonder if we could move the common logic in a function that's accessible to all. Just saying, but probably it's not worth doing now.
There was a problem hiding this comment.
Yeah - it has became complicated with that conversion rate override stuff, but we'll probably would need to remove it soon (it isn't usable in the XCM infra). But overall idea looks good (if there's enough code that may be shared)
|
The comments have been addressed as part of #1525 . Marking this PR as reviewed. |
Bumps [clap](https://github.com/clap-rs/clap) from 3.2.13 to 3.2.15. - [Release notes](https://github.com/clap-rs/clap/releases) - [Changelog](https://github.com/clap-rs/clap/blob/v3.2.15/CHANGELOG.md) - [Commits](clap-rs/clap@v3.2.13...v3.2.15) --- updated-dependencies: - dependency-name: clap dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* fix on-different-forks metrics during initialization * "initialize" parachain finality pallet in on-demand parachains relay * decrease converstion rate requests count * more error logging * fix compilation * clippy
* fix on-different-forks metrics during initialization * "initialize" parachain finality pallet in on-demand parachains relay * decrease converstion rate requests count * more error logging * fix compilation * clippy
This PR fixes point (3) from #1462 (comment)
Currently It breaks RialtoParachain<>Millau relay, since it thinks that with-RialtoParachain finality pallet at Millau. Needs some debugging => draft
Update below:
There's another important commit in this PR, which "initializes" parachains finality pallet when on-demand parachains relay is being used (that's what we have now). Without it, messages are never delivered, because message relay keep getting "BridgePalletsIsNotInitialized" error.
During (nontrivial) debugging, I've seen couple of other issues locally. One is #2477 and another one is stall similar to the #1463. The latter one may be fixed by adding more accounts, but it is rare (not as many stalls as in #1463) and stall recovery works well here. So I'm leaving it as is && will fix properly if it'll cause excess alerts.
This PR breaks existing runtime APIs -
best_finalizednow returnsOption<(BlockNumber, BlockHash)>instead of(BlockNumber, BlockHash).