Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-39349: Add *cancel_futures* to Executor.shutdown() #18057

Merged

Conversation

aeros
Copy link
Contributor

@aeros aeros commented Jan 19, 2020

@aeros aeros added the type-feature A feature request or enhancement label Jan 19, 2020
@aeros aeros requested a review from gvanrossum January 19, 2020 04:13
@aeros
Copy link
Contributor Author

aeros commented Jan 22, 2020

@brianquinlan

Could you test some more scenarios? Like when wait is also true? When called and the interpreter exits.

The current version of the documentation and tests does cover what happens when cancel_futures and wait are both True, since wait is set to True by default. I figured that the vast majority of users would end up calling it as executor.shutdown(cancel_futures=True).

That being said, I think it might be worth elaborating on the behavior when wait=False and cancel_futures=True. The exact behavior varies a bit between executor implementations:

ThreadPoolExecutor - All work items still in the queue (meaning they haven't been assigned to a thread yet) are removed from the queue, and their associated futures are cancelled. This occurs regardless of the value of wait. As for the ones that have been assigned to a thread and are currently running, wait=True allows them to finish.

When wait=False (regardless of cancel_futures), whether or not running futures finish depends on the delay between the executor.shutdown() call and the shutdown of the interpreter. If the interpreter is shut down before the futures finish (terminating the threads w/o joining them), it gets stuck in an indefinite state of running.

The difference between using wait=False and cancel_futures=False vs wait=False and cancel_futures=True is that the pending futures will be cancelled instead of pending until interpreter shutdown (which I believe is prevented until the futures are completed).

ProcessPoolExecutor - Most of the above applies, but underlying details differ a bit. In PPE, the pending work items are not directly accessible in shutdown(), and instead of using a queue it's a dictionary who's items are transferred into a call queue and then removed from the dict. So for the purposes of cancel_futures, any work item that's still in the dict when shutdown() is called gets cancelled (specifically when executor._cancel_pending_work is set to True and after shutdown is detected in _queue_management_worker()).

This results in a bit of delay from when the flag is set to when the pending futures are cancelled, but I think this is the best way to implement cancel_futures for PPE without causing substantial performance losses. IMO, it's preferable to delay cancelling a few pending futures (allowing them to run) and have better overall performance.

Admittedly though, I don't have a strong understanding of the specifics that occur though when wait=False with PPE, mostly because it's quite difficult to examine in real-time. Even just calling executor.shutdown(wait=False) with PPE leads to deadlocks (not even accounting for cancel_futures), at least as of Python 3.7+.

I suspect it has to do with the way non-joined processes are finalized during interpreter shutdown, but that's not an area I'm particularly knowledgeable with. Maybe @pitrou could clarify?

I just found this out while writing an example to demonstrate the above. In my own personal usage of executor.shutdown(), I've never explicitly set wait to False or had a good reason to consider doing so. It's a separate issue, but I think it's worth addressing. I think the deadlock occurs that occurs for PPE will have to be addressed before I can add tests for executor.shutdown(wait=False). Unless they're specifically just added to separate TPE-only tests, but I tried to make the tests fully generic and applicable to both executors.

Examples:

Interaction between wait and cancel_futures: https://gist.github.com/aeros/2e73c8d6dccc94fd863967715826c78d

PPE deadlock demo: https://gist.github.com/aeros/d1ff62b730426584413bca0c8f2ed99d

Edit: I made the current documentation more explicit about what happens when wait=True and cancel_futures=True.

@aeros
Copy link
Contributor Author

aeros commented Jan 22, 2020

Thanks for the review @brianquinlan! I'll go ahead and make some of the recommend changes while waiting on your response to the above comment.

@gvanrossum
Copy link
Member

Hey Kyle, I know I've promised you a review for this, but I've been distracted by various other things. Now that you've got Brian reviewing your code that's actually much better. Once Brian approves I can rubber-stamp and merge it.

(Let me know if you have specific questions for me.)

@aeros aeros requested a review from brianquinlan January 24, 2020 22:35
Doc/library/concurrent.futures.rst Outdated Show resolved Hide resolved
Lib/concurrent/futures/thread.py Outdated Show resolved Hide resolved
Lib/test/test_concurrent_futures.py Show resolved Hide resolved
Lib/concurrent/futures/thread.py Outdated Show resolved Hide resolved
Lib/concurrent/futures/process.py Outdated Show resolved Hide resolved
@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@aeros
Copy link
Contributor Author

aeros commented Jan 29, 2020

I have made the requested changes; please review again

Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some questions and comments below. Sorry for the last-minute review.

Lib/concurrent/futures/process.py Show resolved Hide resolved
Lib/test/test_concurrent_futures.py Outdated Show resolved Hide resolved
Lib/test/test_concurrent_futures.py Show resolved Hide resolved
@aeros
Copy link
Contributor Author

aeros commented Jan 31, 2020

@brianquinlan

aeros Did you have some sort of agreement with gvanrossum that he would do the merge? Because I'm happy to do it.

Sounds good! I believe @gvanrossum was previously going to merge the PR since he had brought some attention to the feature, and I had agreed to implement it. But I'm perfectly happy with you or @pitrou merging it whenever you think it's ready, especially as the resident experts for concurrent.futures. (:

@pitrou

Some questions and comments below. Sorry for the last-minute review.

No problem, thanks for review. I'd rather make the implementation of cancel_futures as reliable as it reasonably can be (without hindering performance). I just replied to each of the comments/questions, and should have time to make any changes later tonight.

@aeros
Copy link
Contributor Author

aeros commented Feb 1, 2020

To ensure this works across platforms prior to merging the PR, I think it might be a good idea to make use of the new test-with-buildbots label; just to make sure that all of the stable buildbots pass.

Note: I noticed that buildbot/x86 Windows7 3.x and buildbot/AMD64 Windows7 SP1 3.x have been failing in the compilation stage for other commits, so those buildbots will likely not end up passing unless they were fixed recently.

@aeros aeros added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Feb 1, 2020
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @aeros for commit d756264 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Feb 1, 2020
@pitrou
Copy link
Member

pitrou commented Feb 1, 2020

The two buildbot failures as well as the Azure Pipelines failure look unrelated.

Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 from me. Thank you @aeros for doing this !

@pitrou
Copy link
Member

pitrou commented Feb 1, 2020

Can you push a dummy commit to restart the CI?

@aeros
Copy link
Contributor Author

aeros commented Feb 1, 2020

@pitrou

Can you push a dummy commit to restart the CI?

Yep, Steve Dower advised me to close and re-open the PR to reset the CI in python-dev when I asked about the Azure Pipelines failure, and informed me that the two failing buildbots are not working for 3.9, but haven't been disabled yet. I'll go ahead and close/re-open it.

@aeros aeros closed this Feb 1, 2020
@aeros aeros reopened this Feb 1, 2020
@aeros
Copy link
Contributor Author

aeros commented Feb 1, 2020

Thank you @aeros for doing this !

No problem, thanks again for the reviews @brianquinlan and @pitrou!

This issue turned out to be a bit more involved than I had initially anticipated, but I learned quite a lot about the internal implementation details of the executors, especially ProcessPoolExecutor.

@aeros
Copy link
Contributor Author

aeros commented Feb 2, 2020

There seemed to be an unrelated issue in the checkout stage for the most recent GitHub Actions CI for Tests / Ubuntu. Since the PR passed all of the required checks and succeeded for all of the working buildbots, I think it's safe to ignore that one.

@pitrou pitrou merged commit 339fd46 into python:master Feb 2, 2020
aeros pushed a commit that referenced this pull request Sep 1, 2020
* Add cancel_futures parameter to the Executor base class, since it was missed in the original PR (#18057) that added cancel_futures.
hauntsaninja added a commit to hauntsaninja/cpython that referenced this pull request Sep 1, 2020
…ythonGH-22023)

* Add cancel_futures parameter to the Executor base class, since it was missed in the original PR (python#18057) that added cancel_futures.
(cherry picked from commit 17dc1b7)

Co-authored-by: Shantanu <[email protected]>
aeros pushed a commit that referenced this pull request Sep 2, 2020
…H-22023) (GH-22048)

* Add cancel_futures parameter to the Executor base class, since it was missed in the original PR (#18057) that added cancel_futures.
(cherry picked from commit 17dc1b7)
xzy3 pushed a commit to xzy3/cpython that referenced this pull request Oct 18, 2020
…GH-22023)

* Add cancel_futures parameter to the Executor base class, since it was missed in the original PR (python#18057) that added cancel_futures.
jessecomeau87 pushed a commit to jessecomeau87/Python that referenced this pull request May 20, 2024
* Add cancel_futures parameter to the Executor base class, since it was missed in the original PR (python/cpython#18057) that added cancel_futures.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-feature A feature request or enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants