Skip to content

introduce cancel to streamer#8025

Merged
KirillLykov merged 8 commits intoanza-xyz:masterfrom
KirillLykov:klykov/introduce-cancel-to-streamer
Sep 23, 2025
Merged

introduce cancel to streamer#8025
KirillLykov merged 8 commits intoanza-xyz:masterfrom
KirillLykov:klykov/introduce-cancel-to-streamer

Conversation

@KirillLykov
Copy link
Copy Markdown

@KirillLykov KirillLykov commented Sep 11, 2025

Problem

When the exit signal is set to true, the streamer's tasks are not exit because not all of them track exit.

Summary of Changes

Use CancellationToken instead of exit for nonblocking::streamer. This is the only way to gracefully close connection tasks.
It introduces spawn_server_with_cancel and deprecates spawn_server which uses exit. So all the tests has been changed to reflect this change.

@KirillLykov KirillLykov requested a review from a team September 11, 2025 20:07
Comment thread vortexor/src/main.rs
@@ -83,6 +84,7 @@ pub fn main() {
let tpu_forward_address = args.tpu_forward_address;
let max_streams_per_ms = args.max_streams_per_ms;
let exit = Arc::new(AtomicBool::new(false));
Copy link
Copy Markdown
Author

@KirillLykov KirillLykov Sep 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't found where exit is set to true. So also don't cancel this token. @lijunwangs am I right that exit is not really used in Vortexor main (only in tests) so I don't need to cancel cancellation token as well? Just to be sure that I haven't missed this.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The exit is for testing only. The production vortexor does not have a mechanism to exit gracefully on signals or events currently.

Comment thread streamer/src/quic.rs
staked_nodes: Arc<RwLock<StakedNodes>>,
quic_server_params: QuicServerParams,
) -> Result<SpawnServerResult, QuicServerError> {
let cancel = CancellationToken::new();
Copy link
Copy Markdown
Author

@KirillLykov KirillLykov Sep 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is workaround to preserve the method with the old signature. It is not used anywhere, because we propagate token from the validator.rs since has been created here before.

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Sep 11, 2025

Codecov Report

❌ Patch coverage is 73.60406% with 52 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.0%. Comparing base (d0d5666) to head (d6151ea).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff            @@
##           master    #8025     +/-   ##
=========================================
- Coverage    83.0%    83.0%   -0.1%     
=========================================
  Files         826      826             
  Lines      362199   362298     +99     
=========================================
+ Hits       300739   300816     +77     
- Misses      61460    61482     +22     
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Comment thread streamer/src/nonblocking/quic.rs Outdated
cancel.cancel();
break;
}
sleep(Duration::from_millis(10)).await;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably too frequent for such a low frequency event. I would at least 10x it or maybe longer. Or make it configurable for test.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, 100-200ms is reasonable here.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Increased this in the last commit

@KirillLykov KirillLykov changed the title Klykov/introduce cancel to streamer introduce cancel to streamer Sep 12, 2025
@KirillLykov KirillLykov force-pushed the klykov/introduce-cancel-to-streamer branch from 02aeb4b to 5505123 Compare September 15, 2025 14:00
@KirillLykov
Copy link
Copy Markdown
Author

@alexpyattaev this PR has been rebased. I will create a new PR with runtimes creation separately because it happens to big quite big by itself.

// handle of the streamer doesn't wait for the child task to finish, so
// it is not deterministic if the tasks handling connections exit before
// the assertion below or after.
sleep(Duration::from_millis(100)).await;
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is addressed in follow up #8066


// Force connection close by exceeding max_connections_per_peer
let _pruning_connection = make_client_endpoint(&server_address, None).await;
// Exit server
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice that we can actually exit the thing now =)

alexpyattaev
alexpyattaev previously approved these changes Sep 19, 2025
Copy link
Copy Markdown

@alexpyattaev alexpyattaev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this LGTM but wait for @lijunwangs to approve the change to vortexor.

Comment thread streamer/src/nonblocking/quic.rs Outdated
.fetch_add(1, Ordering::Relaxed);
}
stats.total_connections.fetch_sub(1, Ordering::Relaxed);
debug!("done with stream from {remote_addr}");
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--> "Done with connection from ..."

@KirillLykov
Copy link
Copy Markdown
Author

@lijunwangs addressed your comments and rebased with staked/unataked metric change, could you approve if looks good to you?

Copy link
Copy Markdown

@lijunwangs lijunwangs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@KirillLykov KirillLykov merged commit a0e531e into anza-xyz:master Sep 23, 2025
54 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants