Fix: closing system fd is thread unsafe#16289
Fix: closing system fd is thread unsafe#16289straight-shoota merged 3 commits intocrystal-lang:masterfrom
Conversation
a0cb837 to
c309e3c
Compare
2c17971 to
ef5d08d
Compare
This patch implements a reference counted lock to protect IO objects that depend on a reusable system fd (IO::FileDescriptor, File and Socket) to protect them against thread safety issues around close: - Thread 1 wants to read from fd 123; - The OS preempts Thread 1; - Thread 2 closes fd 123; - Thread 2 opens something else and the OS reuses fd 123; - The OS resumes Thread 1; - Thread 1 reads from the reused fd 123!!! The issue arises for any operation that would mutate the fd: write, fchown, ftruncate, setsockopt, ... as they risk affecting a reused fd instead of the expected one.
Only operations that can affect the file descriptor are counted, for example read or write, truncating a file or changing file permissions. Mere queries with no side effects go through normally because at worst they will fail (they would have anyway).
ef5d08d to
3150772
Compare
|
Is there a particular reason why we're rolling this out only on Unix targets instead of globally? |
|
Because the issue is on UNIX. I can move it out of |
|
It seems useful to share the same implementation across platforms. Even if it's not strictly necessary on Windows, it's easier to maintain if we only have to worry one mechanism. |
|
Close doesn't create a contention point. The problem is concurrency to the same stdio, file or socket, because we must atomically increment. Many fibers frantically writing to STDOUT will see an impact. The next step to have a single reader and a single writer (#16209) could be useful on Windows to replace the custom thread communication to read async from the console: when we could merely detach the current thread (#15871) yet make sure only one thread is blocked —which we could use on UNIX to replace the TTY hack (#16353). |
That probably produces a big jumble anyway, so it doesn't seem like a very relevant use case. |
|
If you're careful to buffer your message and to fit within The tracing feature heavily relies on this. In practice you don't need to write so frantically as printing every |
|
Anyway: I'll move |
|
I started moving The explicit relationship between the lock and the fd, for example That looks bad and feels brittle. |
|
I'd prefer to duplicate the behavior in Crystal::System for Windows to protect the handle, and that could come as a follow up. |
|
There are already a number of indirect reference where the locked block delegates to the event loop that I'm concerned about. The complexity of delegation is already quite high between the public API, system implementations and event loop. This is totally not a stopper, though. Maybe we figure out something later (probably not, though 🤷). |
|
Tried again, and from the point of view of "protecting the system_ methods" it feels better. I hit a blocker though: we must implement It's easy for Socket, but
|
Would that make it simpler for IOCP as well? |
|
Yes, this is what I meant. |
This patch implements a reference counted lock to protect IO objects that depend on a reusable system fd (`IO::FileDescriptor`, `File` and `Socket`) to protect them against thread safety issues around close: - Thread 1 wants to read from fd 123; - The OS preempts Thread 1; - Thread 2 closes fd 123; - Thread 2 opens something else and the OS reuses fd 123; - The OS resumes Thread 1; - Thread 1 reads from the reused fd 123!!! The same issue arises for any operation that would mutate the fd: `write`, `fchown`, `ftruncate`, `setsockopt`, ... as they risk affecting a reused fd.
This patch extends the fdlock to **serialize reads and writes** by extending the reference counted lock with a read lock and a write lock, so taking a reference and locking acts as a single operation instead of two (1. acquire/release the lock; 2. take/return a reference). This avoids a race condition in the polling event loops: - Fiber 1 then Fiber 2 try to read from `fd`; - Since `fd` isn't ready, both fibers start waiting; - When `fd` becomes ready then Fiber 1 is resumed; - Fiber 1 doesn't read everything and _returns_; - Since events are edge-triggered, Fiber 2 won't be resumed!!! With the read lock, fiber 2 will wait on the lock then be resumed by fiber 1 when it returns. A concrete example is multiple fibers waiting to accept on a socket where fiber 1 would keep handling connections, while fiber 2 sits idle. The other benefit is that it can help to simplify the evloops that will now only deal with a single reader + single writer per `IO` and is required for the io_uring evloop (the MT version requires it). **NOTE**: While this patch only serializes reads/writes on UNIX at the `Crystal::System`, which is where the bugs are, we will move it into stdlib for all targets in a follow-up. See #16289 (comment)
|
This pull request has been mentioned on Crystal Forum. There might be relevant details there: https://forum.crystal-lang.org/t/there-is-a-way-to-optimize-this-program/6947/64 |
This patch implements a reference counted lock to protect IO objects that depend on a reusable system fd (
IO::FileDescriptor,FileandSocket) to protect them against thread safety issues around close:The same issue arises for any operation that would mutate the fd:
write,fchown,ftruncate,setsockopt, ... as they risk affecting a reused fd.NOTE: The lock is currently implemented on the UNIX target only, but we might want to use it on every target. Go uses its fdMutex on every targets.
Extracted from #16209 (follow-up with single reader/writer)
Depends on #16288 (EventLoop#shutdown)
Closes #16127
Obsoletes #16128