
Do not hang in poll if event loop is destroyed #17

Open · wants to merge 1 commit into master

Conversation

@stepancheg commented Sep 6, 2016

Partial fix for #12.

Edit: updated the patch.

The Loop destructor marks all loop handles dead, so any further poll requests fail (they currently panic, but should probably return an error).

The Loop destructor also unparks all tasks waiting for I/O on it. This has no effect on tasks bound to the loop itself, but tasks running on different executors call poll right afterwards, so they immediately fail instead of hanging.
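The drop-time signalling described above can be sketched with standard-library primitives alone. `EventLoop`, `LoopHandle`, and the error message are hypothetical stand-ins for tokio-core's real types; only the mark-handles-dead-on-drop idea comes from the patch (which additionally unparks parked tasks):

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

// Hypothetical stand-in for tokio-core's Loop.
struct EventLoop {
    dead: Arc<AtomicBool>,
}

// Hypothetical stand-in for a handle bound to that loop.
struct LoopHandle {
    dead: Arc<AtomicBool>,
}

impl EventLoop {
    fn new() -> EventLoop {
        EventLoop { dead: Arc::new(AtomicBool::new(false)) }
    }

    fn handle(&self) -> LoopHandle {
        LoopHandle { dead: self.dead.clone() }
    }
}

impl Drop for EventLoop {
    fn drop(&mut self) {
        // Mark every outstanding handle dead so later poll requests
        // fail fast instead of blocking forever.
        self.dead.store(true, Ordering::SeqCst);
    }
}

impl LoopHandle {
    fn poll(&self) -> Result<(), &'static str> {
        if self.dead.load(Ordering::SeqCst) {
            Err("event loop gone")
        } else {
            Ok(())
        }
    }
}

fn main() {
    let lp = EventLoop::new();
    let handle = lp.handle();
    assert!(handle.poll().is_ok());
    drop(lp);
    assert!(handle.poll().is_err());
}
```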

@stepancheg (Author)

Rebased against master.

@alexcrichton (Contributor)

Thanks for the PR @stepancheg! It seems reasonable to have all connections and such return an error as soon as the associated Core is gone. I think, though, that we don't want to do this through poll returning an error, but rather through the I/O methods themselves. That is, poll would simply return Ready(()) and then the next I/O operation would fail.

@stepancheg (Author)

> That is poll would simply return Ready(()) and then the next I/O operation would fail.

Are you sure this is going to work? If poll returned Ready, then the next I/O would return WouldBlock (because even if the core is destroyed, the file descriptors are not), and the I/O implementation would try to re-register itself in the poller.

@alexcrichton (Contributor)

Ah yeah I figure it'd look like:

  • When doing I/O you first always call poll, and that'd return Ready
  • The actual I/O would return WouldBlock
  • Seeing WouldBlock, you'd register interest to block the current task
  • This operation would fail, and that'd propagate outwards.
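Put together as a runnable sketch, the four steps above would look roughly like this. `FakeIo` and its methods are hypothetical stand-ins for PollEvented and the real mio calls, and a plain String stands in for io::Error:

```rust
// Hypothetical stand-in for an I/O object whose event loop is gone.
struct FakeIo {
    loop_gone: bool,
}

impl FakeIo {
    // Step 1: with the core gone, poll reports Ready unconditionally.
    fn poll_read_ready(&self) -> bool {
        true
    }

    // Step 2: the fd itself is still open, so the raw read says WouldBlock.
    fn raw_read(&self) -> Result<usize, ()> {
        Err(()) // WouldBlock
    }

    // Steps 3-4: registering interest fails once the loop is gone.
    fn schedule_read(&self) -> Result<(), String> {
        if self.loop_gone {
            Err("event loop gone".to_string())
        } else {
            Ok(())
        }
    }

    fn read(&self) -> Result<usize, String> {
        if !self.poll_read_ready() {
            return Err("would block".to_string());
        }
        match self.raw_read() {
            Ok(n) => Ok(n),
            Err(()) => {
                // The registration error propagates outward here.
                self.schedule_read()?;
                Err("would block".to_string())
            }
        }
    }
}

fn main() {
    let io = FakeIo { loop_gone: true };
    assert_eq!(io.read(), Err("event loop gone".to_string()));
}
```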

@stepancheg (Author)

> I think though we don't want to do this through poll returning an error
> ...
> Seeing WouldBlock, you'd register interest to block the current task
> This operation would fail, and that'd propagate outwards.

Which operation exactly should fail? Interest is registered inside poll, e.g. PollEvented::poll_read. So if schedule_read should fail, then poll_read should fail too.

@alexcrichton (Contributor)

Yes, what I'm thinking is that the error in the poll_read implementation (where it calls schedule_read) should be ignored. It should just say the object is ready, and the next I/O operation will fail with the same error.

@stepancheg (Author)

I don't understand. That next "I/O" operation won't fail.

Have a look at the implementation of Stream for Receiver:

```rust
impl<T> Stream for Receiver<T> {
    fn poll(&mut self) -> Poll<Option<T>, io::Error> {
        // falls through here if poll_read returns Ready after the Loop is dropped
        if let Async::NotReady = self.rx.poll_read() {
            return Ok(Async::NotReady)
        }
        // try_recv would return TryRecvError::Empty
        // because it is a non-blocking queue
        match self.rx.get_ref().try_recv() {
            Ok(t) => Ok(Async::Ready(Some(t))),
            // so we fall here
            Err(TryRecvError::Empty) => {
                // need_read also won't return an error
                self.rx.need_read();
                // so we return Ok(NotReady), not Err,
                // and the caller will hang
                Ok(Async::NotReady)
            }
            Err(TryRecvError::Disconnected) => Ok(Async::Ready(None)),
        }
    }
}
```

If we are talking about TcpStream, it is the same:

```rust
impl<E: Read> Read for PollEvented<E> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // poll_read returns Ready here
        if let Async::NotReady = self.poll_read() {
            return Err(mio::would_block())
        }
        // the inner `read` returns WouldBlock
        let r = self.get_mut().read(buf);
        if is_wouldblock(&r) {
            // `need_read` does not return an error
            self.need_read();
        }
        // so we return WouldBlock
        return r
    }
}
```

and in the callers of read we convert WouldBlock into NotReady with the try_nb! macro. Still no I/O error.
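A self-contained sketch of the conversion being described — the Async enum and poll_step below are simplified stand-ins for the futures types, and the macro mirrors the shape of try_nb!: WouldBlock is swallowed into Ok(NotReady), so it never surfaces as an I/O error to the caller.

```rust
use std::io;

// Simplified stand-in for futures' Async.
enum Async<T> {
    Ready(T),
    NotReady,
}

// try_nb!-style conversion: WouldBlock becomes Ok(NotReady),
// any other error propagates as Err.
macro_rules! try_nb {
    ($e:expr) => {
        match $e {
            Ok(t) => t,
            Err(ref e) if e.kind() == io::ErrorKind::WouldBlock => {
                return Ok(Async::NotReady)
            }
            Err(e) => return Err(e),
        }
    };
}

// Hypothetical poll step driving a read result through try_nb!.
fn poll_step(r: io::Result<usize>) -> io::Result<Async<usize>> {
    let n = try_nb!(r);
    Ok(Async::Ready(n))
}

fn main() {
    // WouldBlock is swallowed into NotReady...
    assert!(matches!(
        poll_step(Err(io::Error::from(io::ErrorKind::WouldBlock))),
        Ok(Async::NotReady)
    ));
    // ...while a real error still propagates.
    assert!(poll_step(Err(io::Error::from(io::ErrorKind::Other))).is_err());
}
```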

@alexcrichton (Contributor)

Ah, so to clarify: poll_read wouldn't return the error, but need_read would. Does that make sense?

@stepancheg (Author)

> Does that make sense?

Yes, it does. I don't fully understand the motivation behind this design decision (and it looks fragile to me), but it seems to work anyway.

I've updated the patch to return Ready when the loop is dropped.
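Under that design, the read path from the PollEvented snippet above would presumably change along these lines: need_read now returns io::Result<()>, so a dropped loop surfaces as an error via `?` instead of silently parking the task. `Evented` and the error text here are hypothetical stand-ins:

```rust
use std::io;

// Hypothetical stand-in for PollEvented after the patch.
struct Evented {
    loop_gone: bool,
}

impl Evented {
    // need_read now returns io::Result<()>: it fails once the
    // connected event loop has been dropped.
    fn need_read(&self) -> io::Result<()> {
        if self.loop_gone {
            Err(io::Error::new(io::ErrorKind::Other, "event loop gone"))
        } else {
            Ok(())
        }
    }

    fn read(&self) -> io::Result<usize> {
        // Pretend the raw read already said WouldBlock, so we try to
        // re-register interest; with the loop gone this now fails
        // instead of leaving the caller to hang.
        self.need_read()?;
        Err(io::Error::from(io::ErrorKind::WouldBlock))
    }
}

fn main() {
    let ev = Evented { loop_gone: true };
    let err = ev.read().unwrap_err();
    assert_eq!(err.kind(), io::ErrorKind::Other);
}
```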

```diff
-    rx: Receiver<Message>,
+    // `rx` is an `Option` here only because it needs to be dropped
+    // explicitly in `drop` before other things.
+    rx: Option<Receiver<Message>>,
```
Review comment (Contributor):

Instead of using an Option here, perhaps a method could be used instead? We end up using this as a custom type anyway.

```diff
@@ -548,15 +566,15 @@ impl Remote {
     ///
     /// Note that while the closure, `F`, requires the `Send` bound as it might
     /// cross threads, the future `R` does not.
-    pub fn spawn<F, R>(&self, f: F)
+    pub fn spawn<F, R>(&self, f: F) -> io::Result<()>
```
Review comment (Contributor):

Can you add documentation here for why io::Result<()> is returned? (e.g. what an error means)

```diff
@@ -107,7 +114,7 @@ impl<E> PollEvented<E> {
     /// The flag indicating that this stream is readable is unset and the
     /// current task is scheduled to receive a notification when the stream is
     /// then again readable.
-    pub fn need_read(&self) {
+    pub fn need_read(&self) -> io::Result<()> {
```
Review comment (Contributor):

Like above, can you add documentation here for why these functions return an error?

@stepancheg (Author)

Added several comments explaining why the functions return errors, and replaced Option<Receiver> with Receiver plus a library-private close_receiver function.

@alexcrichton added this to the 0.2 release milestone · Sep 27, 2016
@alexcrichton (Contributor) commented Sep 27, 2016

Ok, this seems reasonable to me. However, I'm going to hold off on merging it until we get closer to the 0.2 release (due to the breaking changes). Does that sound ok?

If the channel is dropped, the receiver may still return EOF, and if the channel is alive, the receiver produces an error.
@stepancheg (Author)

Rebased against master.

@jmlMetaswitch
I'd be keen to see this in a release soon, if possible. It sounds like v0.2 is waiting for the futures crate to stabilize, so could this go into the next v0.1 release? Any idea when that might be, please?

@alexcrichton (Contributor)

Certainly, yeah, we could land this sooner! Right now it's a breaking change so we can't merge it as-is, but it should be possible to add new versions of these functions and deprecate the old ones!

Would y'all be willing to work on a patch that does that?

@jmlMetaswitch
@stepancheg - This is your fix. (Thank you!) Are you happy to fix it up for release?

@stepancheg (Author)

Cannot do it right now, very busy with my new job.

alexcrichton added a commit to alexcrichton/tokio that referenced this pull request Dec 5, 2017
This commit is targeted at solving tokio-rs/tokio-core#12 and incorporates the
solution from tokio-rs/tokio-core#17. Namely the `need_read` and `need_write`
functions on `PollEvented` now return an error when the connected reactor has
gone away and the task cannot be blocked. This will typically naturally
translate to errors being returned by various connected I/O objects and should
help tear down the world in a clean-ish fashion.