Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: more reliable & extensible ^C REPL interrupt #14032

Closed
wants to merge 2 commits into from

Conversation

vtjnash
Copy link
Member

@vtjnash vtjnash commented Nov 17, 2015

Instead of making task_done_hook aware of the REPL specifically,
this gives clients (such as the REPL) the ability to register a
handler for dealing with any exception that would otherwise not have a handler.

For example, the following code is valid. And while not recommended in this form,
this code is essentially a simple produce/consume pattern.
If you wait a few seconds, it will even finish.

for i = 1:10^6; schedule(current_task()); yieldto(@task ()); end

However, on v0.4, hitting ^C might either accomplish nothing or infinitely cause a stack overflow segfault.

with this change, hitting ^C gets back to the REPL immediately.


This can handle other sorts of exceptions, although
in reality, the only unhandled exceptions are those that occur
when trying to run task_done_hook or early on the root task.
Accordingly, the most likely error is InterruptException,
but not exclusively.

This approach allows the REPL to print an error and return to a prompt,
even when the Task state of eval_user_input is too inconsistent or
unknown to be called directly.

If the last-chance exception handler returns or throws an error,
that that error is then considered to be finally fatal.
The return value when setting the exception handler allows the user
to chain and reset the handler.

…n hack

Instead of making task_done_hook aware of the REPL specifically,
this gives clients (such as the REPL) the ability to register a
handler for any unhandled exception.

In reality, the only unhandled exceptions are those that occur
when trying to run task_done_hook or early on the root task.
Accordingly, the most like error is InterruptException,
but not exclusively.

This approach allows the REPL to print an error and return to a prompt,
even when the Task state of eval_user_input is too inconsistent or
unknown to be called directly.

If the last-chance exception handler returns or throws an error,
that that error is then considered to be finally fatal.
The return value when setting the exception handler allows the user
to chain and reset the handler.
backend.in_eval = false
backend.ans = value
put!(backend.response_channel, (value, nothing))
eval(Main, :(ans = $(Expr(:quote, value))))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you verify that #6763 doesn't crop back up with this change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i hadn't, but trying it now it seems to work as expected. it looks like i need to delete the above comment too.

i was modifying this on the implementation of the similar function in client.jl:
07a0c09

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the changes to this function fix #13955

@stevengj
Copy link
Member

For IJulia, I really want SIGINT to only go to a certain task (the one executing user code); is that possible with this PR?

@vtjnash
Copy link
Member Author

vtjnash commented Nov 18, 2015

no, this would still kill every running task until it eventually managed to strike the user one. i think i may be able to put together a more general solution for that also helps with your IJulia case

@amitmurthy
Copy link
Contributor

Will this PR fix JuliaLang/Distributed.jl#34 ?

@vtjnash
Copy link
Member Author

vtjnash commented Mar 14, 2016

it should help. although calling SIGINT is usually a bad idea regardless.

@amitmurthy
Copy link
Contributor

Why? If we are handling Ctrl-C is the REPL shouldn't kill -2 have the same effect?

@yuyichao
Copy link
Contributor

Ref #2622

@amitmurthy
Copy link
Contributor

That is a different issue - the argument there is about state maintained in interrupted external libraries. My question is shouldn't Ctrl-C at the REPL and an external signal sent have the same behavior?

@yuyichao
Copy link
Contributor

It does have the same behavior, just that currently unless you know the process you are interrupting is in a safe state (which is hard due to the asynchronous nature of signals) it can cause crashes.

@amitmurthy
Copy link
Contributor

Practically it doesn't. On a fresh julia REPL session with no code run, Ctrl-C does nothing. Sending an external kill -2 to this process results in the session exiting with this displayed on screen:

julia> ERROR: InterruptException:

This was not the behavior a while back. interrupt() which effectively sends SIGINT to all workers used to work as expected.

@yuyichao
Copy link
Contributor

Seems that in REPL we actually handle the key event directly without going through the signal and that's why the behaviors are different.

I suppose this PR fixes the handling of SIGINT when no code is running (if it doesn't, it shouldn't be hard to fix either) and what @vtjnash meant by a bad idea is #2622.

@amitmurthy
Copy link
Contributor

no, this would still kill every running task until it eventually managed to strike the user one.

From the comment it appears that all active tasks are killed. Shouldn't we only kill the currently running task? As an example, the parallel code base has a background task initiated at Julia startup that pushes distributed gc messages to other workers. A Ctrl-C / SIGINT should not kill this task (and other user background tasks typically started with a @schedule) unless it was specifically running at the time.

@vtjnash
Copy link
Member Author

vtjnash commented Mar 14, 2016

currently running Task is transiently ill-defined -- there's no such thing as a background task, just currently running task. it's just as likely that you might happen to deliver the signal to the gc task or the client message handling loop and terminate it instead of some user task.

@amitmurthy
Copy link
Contributor

Would it help if we could mark tasks to ignore SIGINT ? This way tasks that wait on a socket (in multi.jl), or the gc task would call, say, sigint_ignore(current_task()) which would just ignore the signal and not crash. User tasks which would not set this would be terminated. The REPL would interrupt the foreground (for lack of a better word) task and display a prompt again.

@vtjnash
Copy link
Member Author

vtjnash commented Mar 14, 2016

that would be equivalent to making sigatomic task-local, which sounds like a good idea to me. we should have throwing sigint set that flag (actually, any error) and reset it in the catch handler, so we don't try to throw a second error while unwinding one error.

@ViralBShah
Copy link
Member

Is this relevant in any way for #35524 or is it really old and can be closed?

@ViralBShah
Copy link
Member

ViralBShah commented Sep 13, 2023

Closing this as old, but please reopen if still something we can use.

@ViralBShah ViralBShah closed this Sep 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants