forked from deepin-community/kernel
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge branch 'vsock-updates-for-so_rcvlowat-handling'
Arseniy Krasnov says: ==================== vsock: updates for SO_RCVLOWAT handling This patchset includes some updates for SO_RCVLOWAT: 1) af_vsock: During my experiments with zerocopy receive, i found, that in some cases, poll() implementation violates POSIX: when socket has non- default SO_RCVLOWAT(e.g. not 1), poll() will always set POLLIN and POLLRDNORM bits in 'revents' even number of bytes available to read on socket is smaller than SO_RCVLOWAT value. In this case,user sees POLLIN flag and then tries to read data(for example using 'read()' call), but read call will be blocked, because SO_RCVLOWAT logic is supported in dequeue loop in af_vsock.c. But the same time, POSIX requires that: "POLLIN Data other than high-priority data may be read without blocking. POLLRDNORM Normal data may be read without blocking." See https://www.open-std.org/jtc1/sc22/open/n4217.pdf, page 293. So, we have, that poll() syscall returns POLLIN, but read call will be blocked. Also in man page socket(7) i found that: "Since Linux 2.6.28, select(2), poll(2), and epoll(7) indicate a socket as readable only if at least SO_RCVLOWAT bytes are available." I checked TCP callback for poll()(net/ipv4/tcp.c, tcp_poll()), it uses SO_RCVLOWAT value to set POLLIN bit, also i've tested TCP with this case for TCP socket, it works as POSIX required. I've added some fixes to af_vsock.c and virtio_transport_common.c, test is also implemented. 2) virtio/vsock: It adds some optimization to wake ups, when new data arrived. Now, SO_RCVLOWAT is considered before wake up sleepers who wait new data. There is no sense, to kick waiter, when number of available bytes in socket's queue < SO_RCVLOWAT, because if we wake up reader in this case, it will wait for SO_RCVLOWAT data anyway during dequeue, or in poll() case, POLLIN/POLLRDNORM bits won't be set, so such exit from poll() will be "spurious". This logic is also used in TCP sockets. 3) vmci/vsock: Same as 2), but i'm not sure about this changes. Will be very good, to get comments from someone who knows this code. 4) Hyper-V: As Dexuan Cui mentioned, for Hyper-V transport it is difficult to support SO_RCVLOWAT, so he suggested to disable this feature for Hyper-V. ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
- Loading branch information
Showing
7 changed files
with
162 additions
and
17 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters