-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Consensus Observer] Add state sync fallback mode. #14955
Conversation
⏱️ 9h 59m total CI duration on this PR
|
518de81
to
8efa226
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
8efa226
to
e16b121
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
e16b121
to
69b869f
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
7175d7b
to
b0b8985
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
b297670
to
a503349
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
a503349
to
400121d
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
@@ -837,8 +949,8 @@ impl ConsensusObserver { | |||
Some(network_message) = consensus_observer_message_receiver.next() => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we reject messages during fallback mode to avoid accumulating too many data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah! We'll continue to pop them off the channel, and then drop them. This is because we unsubscribe from all peers before we fallback, and so when we do message sender verification, it will fail, e.g., here:
// Verify the message is from the peers we've subscribed to |
also the performance looks worse than without CO? |
Hmm... the graphs look clean though 🤔 Let me rebase. Maybe a noisy run? |
the graph looks clean, but the e2e latency looks exactly the same as CO off, I was expecting we see a clear win from it, but it may be related to how the test is setup |
Aah, I see. If you look at the high priority traffic the latency is around 1.8 seconds, but without CO, it often seems closer to 2.7? I assume the rest of the traffic is bottlenecked somewhere else? |
8e79a21
to
d4452a4
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
@zekun000, yeah, after the rebase the results are the same. It shaves off around ~1 second from the high-priority traffic in this setup. The rest must be bottlenecked elsewhere. 🤔 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how did you test the fallback mode?
Aah, take a look at the land blocking run on this PR: #14960. I hacked the code so that CO only processes commit notifications periodically. So, when commits are not processed, the node falls back to this type of syncing. Seems to work okay 😄 |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
✅ Forge suite
|
✅ Forge suite
|
✅ Forge suite
|
Description
This PR adds a "state sync fallback" mode to consensus observer (CO). If CO does not make progress for some time (e.g., no increase in synced version for 30 seconds), it will fallback to state sync for a while (e.g., 10 minutes). After which, CO will regain control and attempt to make progress.
The PR offers the following commits:
sync_to()
tosync_to_target()
Testing Plan
New and existing test infrastructure, including manual verification, e.g., see #14960.