Commit 0ec986e
tcp: fix incorrect undo caused by DSACK of TLP retransmit
Loss recovery undo_retrans bookkeeping had a long-standing bug where a
DSACK from a spurious TLP retransmit packet could cause an erroneous
undo of a fast recovery or RTO recovery that repaired a single
really-lost packet (in a sequence range outside that of the TLP
retransmit). Basically, because the loss recovery state machine didn't
account for the fact that it sent a TLP retransmit, the DSACK for the
TLP retransmit could erroneously and implicitly be interpreted as
corresponding to the normal fast recovery or RTO recovery retransmit
that plugged a real hole, thus resulting in an improper undo.
For example, consider the following buggy scenario where there is a
real packet loss but the congestion control response is improperly
undone because of this bug:
+ send packets P1, P2, P3, P4
+ P1 is really lost
+ send TLP retransmit of P4
+ receive SACK for original P2, P3, P4
+ enter fast recovery, fast-retransmit P1, increment undo_retrans to 1
+ receive DSACK for TLP P4, decrement undo_retrans to 0, undo (bug!)
+ receive cumulative ACK for P1-P4 (fast retransmit plugged real hole)
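
The steps above can be replayed with a small user-space model. This is only a sketch under stated assumptions, not kernel code: `struct undo_model`, `on_retransmit()`, `on_dsack()`, and `replay_scenario()` are hypothetical names invented for illustration, and the model tracks nothing but the counter and the undo decision.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of the sender's undo bookkeeping. */
struct undo_model {
	int undo_retrans;	/* retransmits whose DSACK is still awaited */
	bool undone;		/* set once the recovery has been undone */
};

/* A normal (fast or RTO) retransmit is sent. */
static void on_retransmit(struct undo_model *m)
{
	m->undo_retrans++;
}

/* A DSACK arrives; when the counter reaches 0 the recovery is undone. */
static void on_dsack(struct undo_model *m)
{
	assert(m->undo_retrans > 0);
	if (--m->undo_retrans == 0)
		m->undone = true;
}

/*
 * Replay the scenario: P1..P4 sent, P1 lost, TLP retransmit of P4 sent,
 * SACK for P2..P4 received. If tlp_counted is false, the in-flight TLP
 * retransmit is ignored at recovery entry (the buggy behavior); if true,
 * it is counted (the behavior the fix aims for).
 * Returns true if the recovery was undone.
 */
bool replay_scenario(bool tlp_counted)
{
	struct undo_model m = { .undo_retrans = 0, .undone = false };

	if (tlp_counted)
		m.undo_retrans++;	/* account for the TLP retransmit */
	on_retransmit(&m);		/* fast-retransmit P1 */
	on_dsack(&m);			/* DSACK for TLP P4 */
	return m.undone;
}
```

In this model, `replay_scenario(false)` ends with the recovery undone even though P1 was really lost, while `replay_scenario(true)` leaves one DSACK outstanding and does not undo.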
The fix: when we initialize undo machinery in tcp_init_undo(), if
there is a TLP retransmit in flight, then increment tp->undo_retrans
so that we make sure that we receive a DSACK corresponding to the TLP
retransmit, as well as DSACKs for all later normal retransmits, before
triggering a loss recovery undo. Note that we also have to move the
line that clears tp->tlp_high_seq for RTO recovery, so that upon RTO
we remember the tp->tlp_high_seq value until tcp_init_undo() and clear
it only afterward.
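
A minimal sketch of the logic described above, written as a user-space model rather than the actual patch. The field names `tlp_high_seq`, `tlp_retrans`, and `undo_retrans` come from this commit message; `struct tp_model`, the `retrans_out` field, and the function names are assumptions made for illustration, and the real tcp_init_undo() may differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the few socket fields discussed here. */
struct tp_model {
	uint32_t tlp_high_seq;	/* nonzero while a TLP episode is tracked */
	bool tlp_retrans;	/* true if the TLP probe was a retransmit */
	int retrans_out;	/* assumed: normal retransmits in flight */
	int undo_retrans;	/* DSACKs required before an undo */
};

/* Count normal retransmits, plus one for a TLP retransmit in flight. */
void init_undo_model(struct tp_model *tp)
{
	assert(tp->retrans_out >= 0);
	tp->undo_retrans = tp->retrans_out;
	if (tp->tlp_high_seq && tp->tlp_retrans)
		tp->undo_retrans++;
}

/*
 * RTO path in the order the fix describes: initialize the undo state
 * while tlp_high_seq is still remembered, and clear it only afterward.
 */
void enter_loss_model(struct tp_model *tp)
{
	init_undo_model(tp);
	tp->tlp_high_seq = 0;
}

/* The pre-fix order, for contrast: the TLP is forgotten before init. */
void enter_loss_model_old_order(struct tp_model *tp)
{
	tp->tlp_high_seq = 0;
	init_undo_model(tp);
}
```

With a TLP retransmit in flight and no other retransmits outstanding, `enter_loss_model()` leaves `undo_retrans` at 1, whereas the old ordering leaves it at 0, which is why the clearing line has to move.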
Also note that the bug dates back to the original 2013 TLP
implementation, commit 6ba8a3b ("tcp: Tail loss probe (TLP)").
However, this patch will only compile and work correctly with kernels
that have tp->tlp_retrans, which was added only in v5.8 in 2020 in
commit 76be93f ("tcp: allow at most one TLP probe per flight").
So we associate this fix with that later commit.
Fixes: 76be93f ("tcp: allow at most one TLP probe per flight")
Signed-off-by: Neal Cardwell <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Cc: Yuchung Cheng <[email protected]>
Cc: Kevin Yang <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
1 parent 842c361, commit 0ec986e
2 files changed, 10 insertions(+), 3 deletions(-)
[Diff body not preserved; only line numbers survive. First file: hunk at original lines 2129-2136 (9 added, 1 removed) and hunk at original lines 2209-2214 (1 added). Second file: hunk at original lines 536-543 (2 removed).]