
Do not hang in poll if event loop is destroyed #17

Open · wants to merge 1 commit into master

Conversation

@stepancheg commented Sep 6, 2016

Partial fix for #12.

Edit: updated the patch.

The Loop destructor marks all loop handles dead, so any further poll requests fail (they currently panic, but should probably return an error).

The Loop destructor also unparks all tasks waiting for I/O on it. This has no effect on tasks bound to the loop itself, but tasks running on different executors call poll right afterwards, so they immediately fail instead of hanging.
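The drop-time signalling described above can be sketched with standard-library primitives alone. `EventLoop`, `LoopHandle`, and the error message are hypothetical stand-ins for tokio-core's real types; only the mark-handles-dead-on-drop idea comes from the patch (which additionally unparks parked tasks):

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

// Hypothetical stand-in for tokio-core's Loop.
struct EventLoop {
    dead: Arc<AtomicBool>,
}

// Hypothetical stand-in for a handle bound to that loop.
struct LoopHandle {
    dead: Arc<AtomicBool>,
}

impl EventLoop {
    fn new() -> EventLoop {
        EventLoop { dead: Arc::new(AtomicBool::new(false)) }
    }

    fn handle(&self) -> LoopHandle {
        LoopHandle { dead: self.dead.clone() }
    }
}

impl Drop for EventLoop {
    fn drop(&mut self) {
        // Mark every outstanding handle dead so later poll requests
        // fail fast instead of blocking forever.
        self.dead.store(true, Ordering::SeqCst);
    }
}

impl LoopHandle {
    fn poll(&self) -> Result<(), &'static str> {
        if self.dead.load(Ordering::SeqCst) {
            Err("event loop gone")
        } else {
            Ok(())
        }
    }
}

fn main() {
    let lp = EventLoop::new();
    let handle = lp.handle();
    assert!(handle.poll().is_ok());
    drop(lp);
    assert!(handle.poll().is_err());
}
```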

@stepancheg (Author)

Rebased against master.

@alexcrichton (Contributor)

Thanks for the PR @stepancheg! It seems reasonable to have all connections and such return an error as soon as the associated Core is gone. I think, though, that we don't want to do this through poll returning an error, but rather through the I/O methods themselves. That is, poll would simply return Ready(()) and then the next I/O operation would fail.

@stepancheg (Author)

> That is poll would simply return Ready(()) and then the next I/O operation would fail.

Are you sure this is going to work? If poll returned Ready, then the next I/O would return WouldBlock (because even if the core is destroyed, the file descriptors are not), and the I/O implementation would try to re-register itself in the poller.

@alexcrichton (Contributor)

Ah yeah I figure it'd look like:

  • When doing I/O you first always call poll, and that'd return Ready
  • The actual I/O would return WouldBlock
  • Seeing WouldBlock, you'd register interest to block the current task
  • This operation would fail, and that'd propagate outwards.
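Put together as a runnable sketch, the four steps above would look roughly like this. `FakeIo` and its methods are hypothetical stand-ins for PollEvented and the real mio calls, and a plain String stands in for io::Error:

```rust
// Hypothetical stand-in for an I/O object whose event loop is gone.
struct FakeIo {
    loop_gone: bool,
}

impl FakeIo {
    // Step 1: with the core gone, poll reports Ready unconditionally.
    fn poll_read_ready(&self) -> bool {
        true
    }

    // Step 2: the fd itself is still open, so the raw read says WouldBlock.
    fn raw_read(&self) -> Result<usize, ()> {
        Err(()) // WouldBlock
    }

    // Steps 3-4: registering interest fails once the loop is gone.
    fn schedule_read(&self) -> Result<(), String> {
        if self.loop_gone {
            Err("event loop gone".to_string())
        } else {
            Ok(())
        }
    }

    fn read(&self) -> Result<usize, String> {
        if !self.poll_read_ready() {
            return Err("would block".to_string());
        }
        match self.raw_read() {
            Ok(n) => Ok(n),
            Err(()) => {
                // The registration error propagates outward here.
                self.schedule_read()?;
                Err("would block".to_string())
            }
        }
    }
}

fn main() {
    let io = FakeIo { loop_gone: true };
    assert_eq!(io.read(), Err("event loop gone".to_string()));
}
```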

@stepancheg (Author)

> I think though we don't want to do this through poll returning an error
> ...
> Seeing WouldBlock, you'd register interest to block the current task
> This operation would fail, and that'd propagate outwards.

Which operation exactly should fail? Interest is registered inside poll, e.g. PollEvented::poll_read. So if schedule_read should fail, then poll_read should fail too.

@alexcrichton (Contributor)

Yes, what I'm thinking is that the error in the poll_read implementation (where it calls schedule_read) should be ignored. It should just say the object is ready, and the next I/O operation will fail with the same error.

@stepancheg (Author)

I don't understand. That next "I/O" operation won't fail.

Have a look at the implementation of Stream for Receiver:

```rust
impl<T> Stream for Receiver<T> {
    fn poll(&mut self) -> Poll<Option<T>, io::Error> {
        // falls through here if poll_read returns Ready after the Loop is dropped
        if let Async::NotReady = self.rx.poll_read() {
            return Ok(Async::NotReady)
        }
        // try_recv would return TryRecvError::Empty
        // because it is a non-blocking queue
        match self.rx.get_ref().try_recv() {
            Ok(t) => Ok(Async::Ready(Some(t))),
            // so we fall here
            Err(TryRecvError::Empty) => {
                // need_read also won't return an error
                self.rx.need_read();
                // so we return Ok(NotReady), not Err,
                // and the caller will hang
                Ok(Async::NotReady)
            }
            Err(TryRecvError::Disconnected) => Ok(Async::Ready(None)),
        }
    }
}
```

If we are talking about TcpStream, it is the same:

```rust
impl<E: Read> Read for PollEvented<E> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // poll_read returns Ready here
        if let Async::NotReady = self.poll_read() {
            return Err(mio::would_block())
        }
        // the inner `read` returns WouldBlock
        let r = self.get_mut().read(buf);
        if is_wouldblock(&r) {
            // `need_read` does not return an error
            self.need_read();
        }
        // so we return WouldBlock
        return r
    }
}
```

and in the callers of read we convert WouldBlock into NotReady with the try_nb! macro. Still no I/O error.
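A self-contained sketch of the conversion being described — the Async enum and poll_step below are simplified stand-ins for the futures types, and the macro mirrors the shape of try_nb!: WouldBlock is swallowed into Ok(NotReady), so it never surfaces as an I/O error to the caller.

```rust
use std::io;

// Simplified stand-in for futures' Async.
enum Async<T> {
    Ready(T),
    NotReady,
}

// try_nb!-style conversion: WouldBlock becomes Ok(NotReady),
// any other error propagates as Err.
macro_rules! try_nb {
    ($e:expr) => {
        match $e {
            Ok(t) => t,
            Err(ref e) if e.kind() == io::ErrorKind::WouldBlock => {
                return Ok(Async::NotReady)
            }
            Err(e) => return Err(e),
        }
    };
}

// Hypothetical poll step driving a read result through try_nb!.
fn poll_step(r: io::Result<usize>) -> io::Result<Async<usize>> {
    let n = try_nb!(r);
    Ok(Async::Ready(n))
}

fn main() {
    // WouldBlock is swallowed into NotReady...
    assert!(matches!(
        poll_step(Err(io::Error::from(io::ErrorKind::WouldBlock))),
        Ok(Async::NotReady)
    ));
    // ...while a real error still propagates.
    assert!(poll_step(Err(io::Error::from(io::ErrorKind::Other))).is_err());
}
```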

@alexcrichton (Contributor)

Ah, so to clarify: poll_read wouldn't return the error, but need_read would. Does that make sense?

@stepancheg (Author)

> Does that make sense?

Yes, it does. I don't fully understand the motivation behind this design decision (and it looks fragile to me), but it seems to work anyway.

I've updated the patch to return Ready when the loop is dropped.
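Under that design, the read path from the PollEvented snippet above would presumably change along these lines: need_read now returns io::Result<()>, so a dropped loop surfaces as an error via `?` instead of silently parking the task. `Evented` and the error text here are hypothetical stand-ins:

```rust
use std::io;

// Hypothetical stand-in for PollEvented after the patch.
struct Evented {
    loop_gone: bool,
}

impl Evented {
    // need_read now returns io::Result<()>: it fails once the
    // connected event loop has been dropped.
    fn need_read(&self) -> io::Result<()> {
        if self.loop_gone {
            Err(io::Error::new(io::ErrorKind::Other, "event loop gone"))
        } else {
            Ok(())
        }
    }

    fn read(&self) -> io::Result<usize> {
        // Pretend the raw read already said WouldBlock, so we try to
        // re-register interest; with the loop gone this now fails
        // instead of leaving the caller to hang.
        self.need_read()?;
        Err(io::Error::from(io::ErrorKind::WouldBlock))
    }
}

fn main() {
    let ev = Evented { loop_gone: true };
    let err = ev.read().unwrap_err();
    assert_eq!(err.kind(), io::ErrorKind::Other);
}
```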

```diff
-    rx: Receiver<Message>,
+    // `rx` is an `Option` here only because it needs to be dropped
+    // explicitly in `drop` before other things.
+    rx: Option<Receiver<Message>>,
```
Review comment (Contributor):

Instead of using an Option here, perhaps a method could be used instead? We end up using this as a custom type anyway.

```diff
@@ -548,15 +566,15 @@ impl Remote {
     ///
     /// Note that while the closure, `F`, requires the `Send` bound as it might
     /// cross threads, the future `R` does not.
-    pub fn spawn<F, R>(&self, f: F)
+    pub fn spawn<F, R>(&self, f: F) -> io::Result<()>
```
Review comment (Contributor):

Can you add documentation here for why io::Result<()> is returned? (e.g. what an error means)

```diff
@@ -107,7 +114,7 @@ impl<E> PollEvented<E> {
     /// The flag indicating that this stream is readable is unset and the
     /// current task is scheduled to receive a notification when the stream is
     /// then again readable.
-    pub fn need_read(&self) {
+    pub fn need_read(&self) -> io::Result<()> {
```
Review comment (Contributor):

Like above, can you add documentation here for why these functions return an error?

@stepancheg (Author)

Added several comments explaining why the functions return errors, and replaced Option<Receiver> with Receiver plus a library-private close_receiver function.

@alexcrichton added this to the 0.2 release milestone · Sep 27, 2016
@alexcrichton (Contributor) commented Sep 27, 2016

Ok, this seems reasonable to me. However, I'm going to hold off on merging it until we get closer to the 0.2 release (due to the breaking changes). Does that sound ok?

If the channel is dropped, the receiver may still return EOF, and if the channel is alive, the receiver produces an error.
@stepancheg (Author)

Rebased against master.

@jmlMetaswitch
I'd be keen to see this in a release soon, if possible. It sounds like v0.2 is waiting for the futures crate to stabilize, so could this go into the next v0.1 release? Any idea when that might be, please?

@alexcrichton (Contributor)

Certainly, yeah, we could land this sooner! Right now it's a breaking change so we can't merge it as-is, but it should be possible to add new versions of these functions and deprecate the old ones!

Would y'all be willing to work on a patch that does that?

@jmlMetaswitch
@stepancheg - This is your fix. (Thank you!) Are you happy to fix it up for release?

@stepancheg (Author)

Cannot do it right now, very busy with my new job.

alexcrichton added a commit to alexcrichton/tokio that referenced this pull request Dec 5, 2017
This commit is targeted at solving tokio-rs/tokio-core#12 and incorporates the
solution from tokio-rs/tokio-core#17. Namely the `need_read` and `need_write`
functions on `PollEvented` now return an error when the connected reactor has
gone away and the task cannot be blocked. This will typically naturally
translate to errors being returned by various connected I/O objects and should
help tear down the world in a clean-ish fashion.