Add guidance on handling of PTO #467
draft-ietf-quic-multipath.md
An implementation should follow the mechanism specified in {{QUIC-RECOVERY}}
for detecting packet loss on each individual path.
When an endpoint transmits a significant number of packets on a specific path,
and the path turned into a blackhole while acknowledgements can not be received from the path,
I think the part after "into a blackhole" is a bit strange. Either ACKs sent on this path can't be received or, equally relevant, they are never sent because no forward packets on this path reach the endpoint. Thus I suggest:
When an endpoint transmits a significant number of packets on a specific path,
and the path turned into a blackhole, so that either no ACK is sent because no packets are received, or no ACKs sent on this path arrive at the packet sender, then the packet sender's probe timeout (PTO) will trigger following {{QUIC-RECOVERY}}. However, no packets will be declared lost until the packet sender receives an ACK for this path. To utilise the advantages of the multipath extension, when endpoints detect
that one of the paths has turned into a blackhole, endpoints could choose to
retransmit on other available paths if the congestion control window allows.
I think "black hole" is jargon. How about:
An implementation should follow the mechanism specified in {{QUIC-RECOVERY}}
for detecting packet loss on each individual path. A special case happens when
the PTO timer expires. According to {{QUIC-RECOVERY}}, no packet will be declared
lost until either the packet sender receives a new ACK for this path, or the path itself is finally declared
broken. This cautious process minimizes the risk of spurious retransmissions,
but it may cause significant delivery delay for the frames contained in these "lost packets".
Endpoints could take advantage of the multipath extension, and retransmit the content
of the delayed packets on other available paths if the congestion control window on these
paths allows.
This proposal works for me.
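For illustration, the behaviour described in the proposed text could be sketched roughly as follows. This is a minimal sketch, not taken from the draft or from any implementation: the `Path` and `SentPacket` types, the `on_pto_expired` function, and the policy of reserving window space by packet size are all assumptions made for the example. The point it shows is that frames are copied to another path when the PTO fires, while the original packets stay unacknowledged and are not declared lost.

```python
from dataclasses import dataclass, field


@dataclass
class SentPacket:
    number: int
    frames: list              # retransmittable frames carried by the packet
    size: int                 # bytes counted against the congestion window
    rescheduled: bool = False


@dataclass
class Path:
    path_id: int
    cwnd: int                 # congestion window in bytes
    bytes_in_flight: int = 0
    unacked: dict = field(default_factory=dict)     # packet number -> SentPacket
    send_queue: list = field(default_factory=list)  # frames waiting to be sent

    def available_window(self) -> int:
        return max(0, self.cwnd - self.bytes_in_flight)


def on_pto_expired(stalled: Path, others: list) -> int:
    """Hypothetical hook: copy frames of unacked packets to other paths.

    Packets remain in `stalled.unacked`; they are NOT declared lost here.
    Returns the number of packets whose frames were rescheduled.
    """
    moved = 0
    for pkt in sorted(stalled.unacked.values(), key=lambda p: p.number):
        if pkt.rescheduled:
            continue
        # Pick the first other path whose congestion window has room.
        target = next(
            (p for p in others if p.available_window() >= pkt.size), None
        )
        if target is None:
            break  # stop at the first packet that does not fit anywhere
        target.send_queue.extend(pkt.frames)
        # Simplification: reserve window space as if the frames were sent.
        target.bytes_in_flight += pkt.size
        pkt.rescheduled = True
        moved += 1
    return moved
```

How soon to call such a hook (after one PTO, or later) is left open here, since that is a deployment-specific tradeoff.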
Thanks @huitema, your proposed text looks good. I would suggest not only saying "retransmit on another path" but "retransmit on another path earlier, even before the originally used path is abandoned" or something similar. Do we even want to recommend or hint when to retransmit? Also, do we need to add some normative language in the connection closure section saying that all packets need to be declared lost after a path is abandoned?
We already have this text in Path Close:
"...However, knowledge of the connection identifiers received from the peer and of the state of the number space associated to the path SHOULD be retained while packets from the peer might still be in transit, i.e., for a delay of 3 PTO after the PATH_ABANDON frame has been received from the peer, both to avoid generating spurious stateless packets as specified in {{spurious-stateless-reset}} and to be able to acknowledge the last packets received from the peer as specified in {{ack-after-abandon}}. After receiving or sending a PATH_ABANDON frame, the endpoints SHOULD promptly send PATH_ACK frames to acknowledge all packets received on the path and not yet acknowledged, as specified in {{ack-after-abandon}}. When an endpoint finally deletes all resource associated with the path, the packets sent over the path and not yet acknowledged MUST be considered lost."
I think that's pretty clear, no need to add more details on when packets are considered lost.
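As a rough illustration of the quoted Path Close text, the retention and final loss declaration could look like the sketch below. The `AbandonedPath` type, its fields, and `maybe_delete_path` are hypothetical names chosen for this example; only the 3 PTO delay and the rule that unacknowledged packets are considered lost on deletion come from the quoted text.

```python
from dataclasses import dataclass, field


@dataclass
class AbandonedPath:
    path_id: int
    pto: float            # current PTO estimate for the path, in seconds
    abandon_time: float   # time at which PATH_ABANDON was received
    unacked: set = field(default_factory=set)  # packet numbers not yet acked

    def retention_deadline(self) -> float:
        # State is retained for 3 PTO after PATH_ABANDON is received.
        return self.abandon_time + 3 * self.pto


def maybe_delete_path(path: AbandonedPath, now: float):
    """Hypothetical helper: return packet numbers to declare lost once the
    retention period has passed, or None while state is still retained."""
    if now < path.retention_deadline():
        return None
    # Deleting the path resources: everything still unacked is lost.
    lost = sorted(path.unacked)
    path.unacked.clear()
    return lost
```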
Ah, thanks, I missed that last MUST. So that is pretty clear and already addressed. No change needed. Then only my other question remains: do we want to give clearer advice on when to retransmit? Like after one PTO?
I don't think we should give more guidance, mostly because we don't really know. Once we have lots of deployment experience, maybe. But for now, just trust implementers. They cannot merely translate a spec into code; they have to think, and I really expect that different implementers will need to make different tradeoffs based on their deployments.
Oops! Wrong window. I was working on a PR on a different project and just merged this one. Sorry.
There is a "revert" button if needed. However, I guess this was more or less ready to merge anyway. I would just have had a few more editorial things, but I guess I can also create another (editorial) PR.
For this sentence I think we should say somehow that you can retransmit earlier, before the packet is declared lost; otherwise the term "retransmit" might be confusing: OLD NEW
Also, I'm not sure we need the "if the congestion control window on these paths allows" part, but I guess it doesn't hurt.
And then I would propose to replace "the path itself is finally declared broken" with "the path is explicitly abandoned". However, also note that this is exactly the issue: as we don't have/require an idle timeout, you can basically keep a broken path open forever (potentially hoping it will come back one day) and then never declare those packets as lost. I think we need to mention that explicitly somehow.
To fix issue #457.
We could add some implementation guidance, but how aggressive it should be depends on the implementation (according to the discussion at IETF 121).