Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor connection check to be async #7390

Draft
wants to merge 9 commits into
base: main
Choose a base branch
from
Draft

Refactor connection check to be async #7390

wants to merge 9 commits into from

Conversation

dlon
Copy link
Member

@dlon dlon commented Dec 20, 2024

This rewrites the pinger/connectivity checker, and some other pieces of talpid-wireguard to be async.

One benefit is that tests run faster, since we can mock time now. Running the tests for talpid-wireguard take about 1 ms instead of 20 seconds.

I only intend on making it less bad than before. There is still room for more refactoring. A lot of mixing of async/sync in places surrounding the Tunnel trait mixes sync and async, still.


This change is Reviewable

Copy link
Contributor

@MarkusPettersson98 MarkusPettersson98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 16 of 16 files at r1, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @dlon)


talpid-wireguard/src/connectivity/monitor.rs line 66 at r1 (raw file):

            // Sleep for a while
            tokio::time::sleep(iter_delay).await;

Should we tokio::time::interval here as well?

Code quote:

            // Sleep for a while
            tokio::time::sleep(iter_delay).await;

talpid-wireguard/src/connectivity/pinger/icmp.rs line 109 at r1 (raw file):

            }

            tokio::time::sleep(Duration::from_secs(1)).await;

Should tokio::time::interval be used here?

Code quote:

            tokio::time::sleep(Duration::from_secs(1)).await;

talpid-wireguard/src/connectivity/pinger/icmp.rs line 155 at r1 (raw file):

    // socket ops will do anything worse than fail if this assumption fails, as far as I can tell.
    UdpSocket::from_std(unsafe { std::net::UdpSocket::from_raw_fd(sock.into_raw_fd()) })
}

We could get rid of the explicit use of unsafe here

#[cfg(unix)]
fn into_tokio_socket(sock: Socket) -> io::Result<UdpSocket> {
    // socket2::Socket -> std::net::UdpSocket -> tokio::net::UdpSocket
    // This looks sketchy, as we're treating a raw socket as a DGRAM socket. But none of the
    // socket ops will do anything worse than fail if this assumption fails, as far as I can tell.
    let std_socket = std::net::UdpSocket::from(sock);
    let tokio_socket = UdpSocket::try_from(std_socket)?;
    Ok(tokio_socket)
}

Code quote:

#[cfg(unix)]
fn into_tokio_socket(sock: Socket) -> io::Result<UdpSocket> {
    use std::os::fd::{FromRawFd, IntoRawFd};

    debug_assert_eq!(sock.r#type().unwrap(), Type::RAW);

    // SAFETY: This looks sketchy, as we're treating a raw socket as a DGRAM socket. But none of the
    // socket ops will do anything worse than fail if this assumption fails, as far as I can tell.
    UdpSocket::from_std(unsafe { std::net::UdpSocket::from_raw_fd(sock.into_raw_fd()) })
}

talpid-wireguard/src/lib.rs line 222 at r1 (raw file):

            cancel_receiver,
        )
        .map_err(Error::ConnectivityMonitorError)?;

If a CancelToken is needed to invoke Check::new, why can't Check::new return a CancelToken? Much like the previous API 😊

Code quote:

        let (cancel_token, cancel_receiver) = connectivity::CancelToken::new();
        let mut connectivity_monitor = connectivity::Check::new(
            gateway,
            #[cfg(any(target_os = "macos", target_os = "linux"))]
            iface_name.clone(),
            args.retry_attempt,
            cancel_receiver,
        )
        .map_err(Error::ConnectivityMonitorError)?;

talpid-wireguard/src/ephemeral.rs line 309 at r1 (raw file):

    let lock = tunnel.lock().await;
    let tunnel = lock.as_ref().expect("The tunnel was dropped unexpectedly");

What do you think about accepting a TunnelType value here instead, and push the panic up to the caller instead?

Code quote:

#[cfg(force_wireguard_handshake)]
async fn establish_tunnel_connection(
    tunnel: Arc<AsyncMutex<Option<TunnelType>>>,
    connectivity: &mut connectivity::Check,
) -> Result<(), CloseMsg> {
    use talpid_types::ErrorExt;

    let lock = tunnel.lock().await;
    let tunnel = lock.as_ref().expect("The tunnel was dropped unexpectedly");

talpid-wireguard/src/connectivity/check.rs line 118 at r1 (raw file):

    #[cfg(test)]
    /// Create a new [Check] with a custom initial state. To use the [Cancellable] strategy,
    /// see [Check::with_cancellation].

The Cancellable thingy doesn't exist anymore 😊

Code quote:

    /// Create a new [Check] with a custom initial state. To use the [Cancellable] strategy,
    /// see [Check::with_cancellation].

talpid-wireguard/src/connectivity/check.rs line 190 at r1 (raw file):

                tokio::time::sleep(Duration::from_millis(20)).await;
            }
        };

Instead of sleeping, we could use tokio::time::interval instead. Read the documentation of interval to understand the difference between sleep and interval.

        // Begin polling tunnel traffic stats periodically
        let poll_check = async {
            let mut interval = time::interval(Duration::from_millis(20));
            let ping = || Self::check_connectivity_interval(
                    &mut self.conn_state,
                    &mut self.ping_state,
                    Instant::now(),
                    check_timeout,
                    tunnel_handle,
                );
            while !ping().await? { interval.tick().await; }
            return Ok(true);
            }

Code quote:

        // Begin polling tunnel traffic stats periodically
        let poll_check = async {
            loop {
                if Self::check_connectivity_interval(
                    &mut self.conn_state,
                    &mut self.ping_state,
                    Instant::now(),
                    check_timeout,
                    tunnel_handle,
                )
                .await?
                {
                    return Ok(true);
                }
                tokio::time::sleep(Duration::from_millis(20)).await;
            }
        };

talpid-wireguard/src/connectivity/check.rs line 261 at r1 (raw file):

    ///
    /// NOTE: will panic if called from within a tokio runtime.
    async fn get_stats(tunnel_handle: &TunnelType) -> Result<Option<StatsMap>, TunnelError> {

The NOTE is no longer true, right?

Code quote:

    /// NOTE: will panic if called from within a tokio runtime.
    async fn get_stats(tunnel_handle: &TunnelType) -> Result<Option<StatsMap>, TunnelError> {

Copy link
Contributor

@MarkusPettersson98 MarkusPettersson98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @dlon)


talpid-wireguard/src/connectivity/pinger/icmp.rs line 155 at r1 (raw file):

Previously, MarkusPettersson98 (Markus Pettersson) wrote…

We could get rid of the explicit use of unsafe here

#[cfg(unix)]
fn into_tokio_socket(sock: Socket) -> io::Result<UdpSocket> {
    // socket2::Socket -> std::net::UdpSocket -> tokio::net::UdpSocket
    // This looks sketchy, as we're treating a raw socket as a DGRAM socket. But none of the
    // socket ops will do anything worse than fail if this assumption fails, as far as I can tell.
    let std_socket = std::net::UdpSocket::from(sock);
    let tokio_socket = UdpSocket::try_from(std_socket)?;
    Ok(tokio_socket)
}

Also, remember to call UdpSocket::set_nonblocking on the std socket!

@dlon dlon force-pushed the async-conncheck branch from b13dffa to 8ce7eac Compare January 3, 2025 14:19
Copy link
Member Author

@dlon dlon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 10 of 16 files reviewed, 4 unresolved discussions (waiting on @MarkusPettersson98)


talpid-wireguard/src/ephemeral.rs line 309 at r1 (raw file):

Previously, MarkusPettersson98 (Markus Pettersson) wrote…

What do you think about accepting a TunnelType value here instead, and push the panic up to the caller instead?

This has been removed now.


talpid-wireguard/src/lib.rs line 222 at r1 (raw file):

Previously, MarkusPettersson98 (Markus Pettersson) wrote…

If a CancelToken is needed to invoke Check::new, why can't Check::new return a CancelToken? Much like the previous API 😊

The reason is that cancel_receiver (terrible name) is also cloned on Android. So Check::new would have to return both CancelToken and CancelReceiver.


talpid-wireguard/src/connectivity/check.rs line 118 at r1 (raw file):

Previously, MarkusPettersson98 (Markus Pettersson) wrote…

The Cancellable thingy doesn't exist anymore 😊

Done.


talpid-wireguard/src/connectivity/check.rs line 190 at r1 (raw file):

Previously, MarkusPettersson98 (Markus Pettersson) wrote…

Instead of sleeping, we could use tokio::time::interval instead. Read the documentation of interval to understand the difference between sleep and interval.

        // Begin polling tunnel traffic stats periodically
        let poll_check = async {
            let mut interval = time::interval(Duration::from_millis(20));
            let ping = || Self::check_connectivity_interval(
                    &mut self.conn_state,
                    &mut self.ping_state,
                    Instant::now(),
                    check_timeout,
                    tunnel_handle,
                );
            while !ping().await? { interval.tick().await; }
            return Ok(true);
            }

I'm skeptical of the default missed tick behavior (bursts). I think using sleep(ivl).await is simpler than

let mut interval = interval(ivl);
interval.set_missed_tick_behavior(MissedTickBehavior::Delay);
interval.tick().await;

(or interval.reset())

What do you think?


talpid-wireguard/src/connectivity/check.rs line 261 at r1 (raw file):

Previously, MarkusPettersson98 (Markus Pettersson) wrote…

The NOTE is no longer true, right?

Done.


talpid-wireguard/src/connectivity/monitor.rs line 66 at r1 (raw file):

Previously, MarkusPettersson98 (Markus Pettersson) wrote…

Should we tokio::time::interval here as well?

Done.


talpid-wireguard/src/connectivity/pinger/icmp.rs line 109 at r1 (raw file):

Previously, MarkusPettersson98 (Markus Pettersson) wrote…

Should tokio::time::interval be used here?

See above.


talpid-wireguard/src/connectivity/pinger/icmp.rs line 155 at r1 (raw file):

Previously, MarkusPettersson98 (Markus Pettersson) wrote…

Also, remember to call UdpSocket::set_nonblocking on the std socket!

Done. Nice catch! 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants