Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wasi-sockets: Fix shutdown bugs #9225

Merged
merged 8 commits into from
Sep 12, 2024

Conversation

badeend
Copy link
Contributor

@badeend badeend commented Sep 11, 2024

  • Fix Linux-specific peculiarity where recv continues to work even after shutdown(sock, SHUT_RD) has been called.
  • Calling shutdown after data has been accepted by write, but before it has been fully flushed, caused the in-flight data to be lost. Now, the native shutdown syscall is deferred until the background write finishes.

To facilitate the parent/child resource communication, I've placed the TCP stream implementations behind an Arc<Mutex<..>>. WASI is single threaded, so in practice these Mutexes should never be contended. It does make async functions like ready a bit awkward, though.

Tangentially related:

  • Renamed:
    • abort_wait -> cancel
    • LastWrite -> WriteState
    • Done -> Ready
    • Waiting -> Writing

LastWrite -> WriteState,
Done -> Ready,
Waiting -> Writing
And rename `abort_wait` to `cancel`
…ter `shutdown(sock, SHUT_RD)` has been called.
Calling `shutdown` _after_ data has been accepted by `write`, but _before_ it has been fully flushed, caused the in-flight data to be lost.
@badeend badeend requested a review from a team as a code owner September 11, 2024 13:51
@badeend badeend requested review from fitzgen and removed request for a team September 11, 2024 13:51
@github-actions github-actions bot added the wasi Issues pertaining to WASI label Sep 11, 2024
@alexcrichton
Copy link
Member

Thanks for this! Would it be possible to use tokio::sync::Mutex and try_lock to simplify some of the poll_* functions? That should enable holding a lock across .await while still being able to return errors on contention which may make this a bit easier in a few places. (I think we use tokio::sync::Mutex for stdio related things too)

@badeend
Copy link
Contributor Author

badeend commented Sep 11, 2024

Certainly. Just pushed the changes.

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you'll indulge me a bit more, in thinking more about this I'm second-guessing the coupling where TcpState owns the reader/writer streams instead of just the socket. That to me seems possible indicative of future changes being required to wasi:sockets which wouldn't require such coupling, but we can't do that now so that's less the issue.

In this situation though, if I'm understanding this right, the only reason for this split is so an active write isn't blocked out by accident when the tcp socket is shut down without going through the writer stream, right? If that's the case do you think it would be worthwhile to change the current storage of Arc<TcpStream> to just that plus a Arc<Mutex<SomeWriterState>>? That way it's clear that there's only one shared piece of state and it's just the internals of the writer.

My hope is that it would be more obvious that the intention is to not hand out multiple readers/writers via the Arc<Mutex<...>> and also minimize the scope/breadth of the mutex in question.

If you think the refactoring isn't worth it though then I think this is ok as-is.

#[async_trait::async_trait]
impl Subscribe for TcpWriteStream {
async fn ready(&mut self) {
self.0.lock().await.ready().await
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this use the try_lock_* helpers instead of lock() to consistently trap on contention? That might help prevent covering up bugs by accident where some contention is handled in some places but not others.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Subscribe::ready has no return type, so I can't return a trap

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm well actually as I look into looks like stdio is already inconsistent, so let's leave this as-is.

@badeend
Copy link
Contributor Author

badeend commented Sep 11, 2024

Apologies. Maybe its getting too late for me, but I don't understand what you're saying. Could you maybe clarify with a simple code snippet?

@alexcrichton
Copy link
Member

I'm wondering if we can change the previous Connected(Arc<tokio::net::TcpStream>), to Connected(Arc<tokio::net::TcpStream>, Arc<Mutex<WriteState>>), instead of having reader/writer streams. There's be no change for a read-level shutdown but for a write-level shutdown it would call shutdown on WriteState which would handle "if there's a pending write wait for that first" and if there's nothing it would immediately do the shutdown.

@badeend
Copy link
Contributor Author

badeend commented Sep 12, 2024

no change for a read-level shutdown

shutdown needs access to the reader to fix the first issue:

Linux-specific peculiarity where recv continues to work even after shutdown(sock, SHUT_RD) has been called.


A completely different way to go around this all is to revert the Mutex entirely and let the shutdown binding go through the resource table to mutate the child resources directly. E.g.

fn shutdown(&mut self, this: Resource<tcp::TcpSocket>, shutdown_type: ShutdownType) -> SocketResult<()> {
    let table = self.table();

    if let ShutdownType::Receive | ShutdownType::Both = shutdown_type {
        let input = table.iter_children(&this)?.find_map(|c| c.downcast_mut::<InputStream>());
        input.shutdown();
    }

    // etc.
}

but this would require some kind of upcasting feature to get from InputStream back to a TcpReadStream (in this case). I didn't want to pollute the HostInput/OutputStream traits to fix a TCP-specific problem. If you're fine with it, I can go this direction too if you want.

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok, nah if this is needed for the read bits anyway this is fine to land. Let's go ahead and land as-is and can always consider refactorings later if necessary, but I don't think it's urgent or really required.

@alexcrichton alexcrichton added this pull request to the merge queue Sep 12, 2024
Merged via the queue into bytecodealliance:main with commit 696d19f Sep 12, 2024
37 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wasi Issues pertaining to WASI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants