Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cancellation of QuicStream.ReadAsync/WriteAsync should either abort the stream or allow subsequent reads/writes #55485

Closed
geoffkizer opened this issue Jul 12, 2021 · 21 comments
Labels
area-System.Net.Quic enhancement Product code improvement that does NOT require public API changes/additions
Milestone

Comments

@geoffkizer
Copy link
Contributor

Currently, if a ReadAsync or WriteAsync operation is cancelled, we set the read or write state to Aborted, but we don't actually abort the stream on the wire. This means the peer could get hung waiting for their read or write to complete, assuming that the stream is still valid in the appropriate direction. That's bad.

When cancellation aborts a stream direction, we should also abort to MsQuic and ensure the peer is notified.

@geoffkizer geoffkizer added this to the 6.0.0 milestone Jul 12, 2021
@ghost
Copy link

ghost commented Jul 12, 2021

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details

Currently, if a ReadAsync or WriteAsync operation is cancelled, we set the read or write state to Aborted, but we don't actually abort the stream on the wire. This means the peer could get hung waiting for their read or write to complete, assuming that the stream is still valid in the appropriate direction. That's bad.

When cancellation aborts a stream direction, we should also abort to MsQuic and ensure the peer is notified.

Author: geoffkizer
Assignees: -
Labels:

area-System.Net.Quic

Milestone: 6.0.0

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Jul 12, 2021
@wfurt
Copy link
Member

wfurt commented Jul 12, 2021

If we agreed cancellation makes the stream unusable I agree we should cleanup the stream. It would get done eventually on Dispose but there is probably more reason to wait.

@geoffkizer
Copy link
Contributor Author

If we agreed cancellation makes the stream unusable I agree we should cleanup the stream.

Cancellation sets the Read/Send State to Aborted, which makes it unusable.

@ManickaP ManickaP removed the untriaged New issue has not been triaged by the area owner label Jul 12, 2021
@CarnaViire
Copy link
Member

Note that for HTTP/3 we need to pass a correct error code for aborting (Http3ErrorCode.RequestCancelled). If we choose to do it that way, we need to decide how we would pass it to MsQuicStream before cancellation fires

@ManickaP
Copy link
Member

Shouldn't we than rather handle the cancellation on HTTP/3 side and manually abort the stream with proper error code? If that's possible.

@CarnaViire
Copy link
Member

Shouldn't we than rather handle the cancellation on HTTP/3 side and manually abort the stream with proper error code? If that's possible.

That is what we do now

catch (OperationCanceledException ex)
when (ex.CancellationToken == requestCancellationSource.Token) // It is possible for user's Content code to throw an unexpected OperationCanceledException.
{
// We're either observing GOAWAY, or the cancellationToken parameter has been canceled.
if (cancellationToken.IsCancellationRequested)
{
_stream.AbortWrite((long)Http3ErrorCode.RequestCancelled);
throw new OperationCanceledException(ex.Message, ex, cancellationToken);
}
else
{
Debug.Assert(_goawayCancellationToken.IsCancellationRequested == true);
throw new HttpRequestException(SR.net_http_request_aborted, ex, RequestRetryType.RetryOnConnectionFailure);
}
}

case OperationCanceledException oce when oce.CancellationToken == cancellationToken:
_stream.AbortWrite((long)Http3ErrorCode.RequestCancelled);
ExceptionDispatchInfo.Throw(ex); // Rethrow.
return; // Never reached.

(it should be AbortRead in the second block, I fix that in my PR)

@ManickaP
Copy link
Member

@CarnaViire is this fixed with your today's PR? Probably not the most elegant way, but functionally is it fixed?

@karelz
Copy link
Member

karelz commented Jul 15, 2021

Triage: This is handled well in HTTP/3. We have to solve it in QUIC, but ok post-6.0.

@karelz karelz modified the milestones: 6.0.0, 7.0.0 Jul 15, 2021
@karelz karelz added the enhancement Product code improvement that does NOT require public API changes/additions label Nov 2, 2021
@karelz
Copy link
Member

karelz commented Nov 2, 2021

Triage: We should investigate what the behavior of Sockets and SslStream is and what we want to do here.
We might want to avoid undefined behavior.
We should decide if Read and Write can behave differently.

@geoffkizer
Copy link
Contributor Author

I chatted with @stephentoub offline about this a bit. We came to the conclusion that the best course here is to just do what NetworkStream (i.e. Socket) and SslStream do, where the behavior of the stream is undefined after an operation has been cancelled, but subsequent read/write operations don't need to be explicitly failed, nor does the stream itself need to be explicitly aborted.

This should be the simplest approach for us to implement, and matches the current behavior of similar streams like NetworkStream and SslStream. If we ever decide to be more strict here, we would want to do it in a holistic way for all of these stream types.

Practically speaking, this means we should not set the stream to Aborted when a read or write is cancelled.

@CarnaViire
Copy link
Member

the behavior of the stream is undefined after an operation has been cancelled, but subsequent read/write operations don't need to be explicitly failed, nor does the stream itself need to be explicitly aborted

In order for that to work, we need to have an ability to cancel an ongoing operation in msquic. I am not sure it exists at the moment. Does it? @nibanks @thhous-msft
Otherwise, if we issue a big send, "cancel" it and then issue a second small send, it will have to wait until the first big send is finished, making cancellation pointless.

As a side note, I am not sure it is the simplest approach to implement... As it would require careful changes in the state machine and adapting msquic cancellation changes. Whereas issuing abort will only need an API to set up a default Abort error code and a one-liner call. But having a parity between QuicStream and NetworkStream is a big argument.

@CarnaViire CarnaViire changed the title Cancellation of QuicStream.ReadAsync/WriteAsync should also abort the stream Cancellation of QuicStream.ReadAsync/WriteAsync should either abort the stream or allow subsequent reads/writes Feb 16, 2022
@nibanks
Copy link

nibanks commented Feb 16, 2022

Otherwise, if we issue a big send, "cancel" it and then issue a second small send, it will have to wait until the first big send is finished, making cancellation pointless.

You're dealing with a reliable stream of data. You cannot cancel a send and then keep using it. The send could have been at any state in the middle and there is no way to recover and keep using it after you cancel. So you don't cancel sends, you cancel (abort) the whole send direction of the stream. This is no different than with TCP. Once you give TCP a send your only option is killing the connection if you want to cancel the send.

@CarnaViire
Copy link
Member

I'll let @geoffkizer or @stephentoub reply why NetworkStream and Sockets are usable after cancellation (even though it is mentioned to be an undefined state)

@wfurt
Copy link
Member

wfurt commented Feb 16, 2022

Unlike QUIC or SSL stream there is no state or framing and operations can be fully independent. I think we make no quarantine about the state and that makes in impossible to reconvert in practice IMHO. For example, while the Socket may be OK the unknown state still makes it unusable for HTTP on top of that.

As far as the notification: Theres is nothing for UDP but when somebody Close/Dispose TCP Socket or NetworkStream there will be closure sequence and the peer will be notified. I forget if we do that now in closing for Quic. But if we do it may be sufficient. We should just avoid cases when we abort silently without any notification to peer.

@alexrp
Copy link
Contributor

alexrp commented Feb 16, 2022

do what NetworkStream (i.e. Socket) and SslStream do, where the behavior of the stream is undefined after an operation has been cancelled

By the way, is this even documented anywhere?

A problem I've had with cancellation support in .NET for a long time now is that it seems nearly impossible to find any documentation on what the guarantees (or lack thereof) around cancellation are. If I want to know whether cancellation is properly plumbed through to native interrupts, or whether an object can still be used after cancellation, I almost always have to either make an educated guess or read all of the relevant source code.

@CarnaViire
Copy link
Member

Theres is nothing for UDP but when somebody Close/Dispose TCP Socket or NetworkStream there will be closure sequence and the peer will be notified.

With Dispose on QuicStream we will issue a proper Abort for Read, if not done before (but we will use a hardcoded error code 0xffffffff)

if (abortRead)
{
try
{
StartShutdown(QUIC_STREAM_SHUTDOWN_FLAGS.ABORT_RECEIVE, 0xffffffff);
} catch (ObjectDisposedException) { };
}

The behavior in question is what happens after CancellationToken fires in ReadAsync or WriteAsync. Currently the respective direction will enter an Aborted state on S.N.Quic level, throwing on subsequent operations, but msquic stream will not be touched. It will wait until an explicit Abort(Read|Write) call, or Dispose.


P.S.:
I believe we actually have a bug if Write was aborted, because there we won't call Shutdown if state=Aborted

if (_state.SendState < SendState.Aborted)
{
callShutdown = true;
}

@wfurt
Copy link
Member

wfurt commented Feb 16, 2022

By the way, is this even documented anywhere?

A problem I've had with cancellation support in .NET for a long time now is that it seems nearly impossible to find any documentation on what the guarantees (or lack thereof) around cancellation are. If I want to know whether cancellation is properly plumbed through to native interrupts, or whether an object can still be used after cancellation, I almost always have to either make an educated guess or read all of the relevant source code.

We should improve the doc for sure. Cancellation is best effort and it is really difficult to get predictable outcome. Also for example SslStream defers the IO to provided Stream. So even if SslStream may pass cancellation through it still depends on underlying stream. And also the observed behavior may differ among supported OSes. Putting details to documentation is going to be impractical IMHO. Also in some cases we made improvements but the documentation is generally same for all the versions.

From all above, I feel it may be better just make the Stream unusable to get consistent and predictable behavior.

@geoffkizer
Copy link
Contributor Author

I'll let @geoffkizer or @stephentoub reply why NetworkStream and Sockets are usable after cancellation (even though it is mentioned to be an undefined state)

The short answer is that they are usable because we never did the work to make them unusable.

It's also not entirely obvious what action to take in this case, so that the stream is no longer "usable". We could actually close the socket handle; we could shutdown, either only in the direction in which cancellation happened or possibly in both cases; we could simply have a flag (or two flags, one for read and one for write) on Socket itself that indicates that cancellation happened and then check this flag on subsequent reads/writes; or possibly something else. Each of these has tradeoffs (e.g. do we want to abort both read and write, or just the side that cancelled?) and it's not obvious what the right thing here is. (By the way, calling shutdown in the direction in which cancellation happened seems like the best option to me.)

This also gets more complicated when you consider stuff like SslStream or the various HTTP request/response streams that are layered on top of other streams.

Instead, we simply say "don't do that" and let the user decide what action to take after cancellation occurs. In practice, after any exception, you can't know the state the stream/socket is in (for example, what data has actually been sent/received or not), and so you can't take any meaningful action aside from close anyway.

I do think that having "undefined behavior" here is not ideal, and it would be better to throw after cancellation. It just doesn't seem like it really adds that much value in practice, and thus figuring out the issues associated with it implementing it is not a priority.

The observation above is that the same logic applies to QuicStream. That is, it's not obvious what action to take here (especially since we need an error code to properly abort the stream), and the value of taking any action here seems low since the user should promptly abort/close the stream themselves anyway.

@geoffkizer
Copy link
Contributor Author

Otherwise, if we issue a big send, "cancel" it and then issue a second small send, it will have to wait until the first big send is finished, making cancellation pointless.

Note this is exactly how NetworkStream behaves today -- if you cancel a send, then try to send again, it will potentially block because the send buffer is full.

@nibanks
Copy link

nibanks commented Feb 16, 2022

The observation above is that the same logic applies to QuicStream. That is, it's not obvious what action to take here (especially since we need an error code to properly abort the stream), and the value of taking any action here seems low since the user should promptly abort/close the stream themselves anyway.

IMO for the QUIC protocol it is clear:

Send and receive directions of streams are independent. You should not abort one because of an operation on another was canceled. It is entirely a supported scenario for you to abort the rest of a receive if you figure out you can't handle the rest, but continue sending a response in the opposite direction.

Additionally, if you do abort an operation for one direction of a stream, the stream is now closed in that direction. No ambiguity about it. But since you aborted it mid way, you have no guarantees of what exactly the peer saw. All future operations should immediately fail. The stream is unusable now.

The biggest special case that QUIC has over everything else (including the .NET stream APIs) is that canceling has an associated error code that must be supplied and has meaning to that particular protocol on top of QUIC. We've had many discussions on this in the past, but it is still super important to keep in mind.

@ManickaP
Copy link
Member

This is covered by #68902 and #69675 where we introduce DefaultStreamErrorCode for exactly this purpose. Closing as covered by those issues.

@ghost ghost locked as resolved and limited conversation to collaborators Jul 14, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Net.Quic enhancement Product code improvement that does NOT require public API changes/additions
Projects
None yet
Development

No branches or pull requests

7 participants