Thread panics in SpawnedTask during shutdown. #12089
Comments
I suggest a fourth option:
Calling @DDtKey and @devinjdangelo for your thoughts. I believe you added the code in question as part of #9422.
Actually, I think the logic existed before #9422. There is a mention about shutdown: datafusion/datafusion/common-runtime/src/common.rs Lines 69 to 72 in e6e1eb2
I believe it's just something we need to handle. In general, I always prefer a fallible API, so we can consider using But ignoring the error will be much easier (with some debug logging at least), and that may be more than enough.
Maybe we can add a
I think this makes sense. If for some reason a caller needs the exact error, they could call join instead.
I totally see the use case for
PR fix is ready for review.
Describe the bug
The existing unreachable assumes that no polling will occur after the runtime begins a shutdown. However, we found that while running DataFusion in its own runtime (its own thread pool) we can actually hit this unreachable code: when we start shutting down the executor, an internal poll can still occur.
We think the way we execute our DataFusion queries is not uncommon, so the shutdown behavior should not cause a thread panic in DataFusion.
To Reproduce
We made a reproducer in this draft PR.
Expected behavior
Don't cause thread panics during shutdown.
Additional context
The stated goal of SpawnedTask is: "Provides guarantees of aborting on Drop to keep it cancel-safe."
The API for SpawnedTask::join_unwind() returns the generic R with the Result wrapper removed. That is, this API is called when we presume an error is not going to occur, and it panics if there is an error. A join error due to task cancellation, however, is a "possible" error. So do we:
Result<R>
propagating the Error
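To make the trade-off concrete, here is a minimal, std-only sketch of what a fallible variant could look like. It is not the DataFusion code: `JoinFailure` and `join_unwind_fallible` are hypothetical names invented for illustration, and the enum only models the two failure kinds discussed above (panic vs. cancellation). The idea is that a panic inside the task is still resumed on the caller's thread, while a cancellation (for example, one caused by runtime shutdown) is returned as an error instead of reaching an unreachable branch.

```rust
use std::any::Any;
use std::panic;

/// Hypothetical stand-in for a runtime's join error. It only models the
/// two cases discussed in this issue: the task panicked, or it was cancelled.
pub enum JoinFailure {
    /// The task panicked; the payload is what the panic carried.
    Panicked(Box<dyn Any + Send + 'static>),
    /// The task was cancelled, e.g. because the runtime is shutting down.
    Cancelled,
}

/// Hypothetical fallible counterpart of `join_unwind`.
///
/// Takes the already-awaited outcome of a task so the sketch stays
/// synchronous and free of external crates.
pub fn join_unwind_fallible<R>(outcome: Result<R, JoinFailure>) -> Result<R, JoinFailure> {
    match outcome {
        Ok(value) => Ok(value),
        // A panic in the task is a bug: keep unwinding on the caller's side.
        Err(JoinFailure::Panicked(payload)) => panic::resume_unwind(payload),
        // Cancellation is a "possible" error: hand it back to the caller
        // rather than treating it as unreachable.
        Err(JoinFailure::Cancelled) => Err(JoinFailure::Cancelled),
    }
}
```

A caller that does not care about cancellation during shutdown could then log and drop the `Err`, while a caller that needs the exact error still has it.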