Cancelling a job doesn't run post hooks or upload artifacts #119
Comments
fyi: |
@graemej we do in fact send a SIGKILL if it's still alive 10 seconds after sending a SIGTERM (see Line 79 in 1636d0e and Line 188 in 76417b9).
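For anyone following along, the SIGTERM-then-SIGKILL behaviour described above can be sketched roughly like this. It's a minimal Python illustration, not the agent's actual code; the function name `stop_with_grace` and the `grace_seconds` parameter are made up for the example, with the default set to the 10 seconds mentioned in this thread.

```python
import subprocess


def stop_with_grace(proc: subprocess.Popen, grace_seconds: float = 10.0) -> int:
    """Ask a process to stop, then force-kill it if it outlives the grace period.

    Hypothetical helper for illustration only; POSIX semantics assumed.
    """
    proc.terminate()  # sends SIGTERM on POSIX
    try:
        # Give the process a chance to clean up and exit on its own.
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL on POSIX, which cannot be caught or ignored
        return proc.wait()
```

A process that handles SIGTERM promptly never sees the SIGKILL; one that ignores it is killed once `grace_seconds` has elapsed, and on POSIX the returned status is then negative (the signal number negated).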
This issue is about having the agent kill the process that's invoked by the |
@graemej I've seen that with docker though! Especially if you don't use the |
This should be fixed now! |
@lox oh right! Since the |
Urgh, y'know, I mis-read this. I was actually looking for specifically this bug report to mention that |
As it stands, cancel will send a SIGTERM to the bootstrap, which will send it to any sub-process-groups it has created and then exit. We need to add some handling so that the bootstrap runs specific hooks after killing its subprocesses but before it exits itself. Which hooks should run? I'm kind of tempted to keep the current behaviour given it's been the status quo for so long. Perhaps we should add a |
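To make the ordering concrete (signal the sub-process group, wait for it, run hooks, then exit), here's a rough Python sketch. It is not how the bootstrap is implemented; `run_with_cancel_hooks`, `post_cancel_hooks` and `grace_seconds` are hypothetical names, and which hooks belong in that list is exactly the open question above.

```python
import os
import signal
import subprocess
import time


def run_with_cancel_hooks(command, post_cancel_hooks, grace_seconds=10.0):
    """Run `command` in its own process group; on SIGTERM, stop it, then run hooks.

    Illustrative sketch only (POSIX, main thread). Returns (exit_status, cancelled).
    """
    cancelled = False

    def on_sigterm(signum, frame):
        nonlocal cancelled
        cancelled = True  # only record the request; the loop below reacts to it

    previous = signal.signal(signal.SIGTERM, on_sigterm)
    try:
        # start_new_session=True makes the child a group leader, so pgid == pid.
        proc = subprocess.Popen(command, start_new_session=True)
        while proc.poll() is None and not cancelled:
            time.sleep(0.05)

        if cancelled and proc.poll() is None:
            os.killpg(proc.pid, signal.SIGTERM)  # forward to the whole group
            try:
                proc.wait(timeout=grace_seconds)
            except subprocess.TimeoutExpired:
                os.killpg(proc.pid, signal.SIGKILL)
                proc.wait()

        if cancelled:
            # Subprocesses are gone, the wrapper has not exited yet: run the hooks.
            for hook in post_cancel_hooks:
                hook()
        return proc.returncode, cancelled
    finally:
        signal.signal(signal.SIGTERM, previous)
```

The handler only sets a flag, so the cleanup runs in normal control flow rather than inside the signal handler, which keeps the hooks free to do ordinary I/O.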
I think unfortunately this one is going to be bumped to 3.1.0. |
Just going to put this here because it might be related: do we have any plans to make the force-kill timeout user-configurable? It seems that ten seconds is not enough time for some applications to clean up after themselves. |
@lox it will be useful to consider Windows when discussing this, since on Windows:
Hence, I think issue #794 may be worth considering.
These separate hooks would be very useful to have. This in addition to #794 would improve the experience on Windows greatly. |
It turns out that we do run |
It's now acting correctly under Windows in #879. |
When we cancel a job, we send a KILL signal to the bootstrap.sh process. This is troublesome because it doesn't allow us to upload artifacts or run any post-script hooks. So the idea I have is to send the kill signals to the underlying script process, not the bootstrap one.
To do this, we first need to find out the PID. I did some bash and came up with this:
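Now the problem is communicating the PID back to the agent, but I think this can be done with buildkite-agent meta-data or something. Once we have that PID, the canceller just kills the process PID, not the bootstrap one. And all the after-script tasks + cleanup should all "just work"... I think!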
or something. Once we have that PID, the canceller just kills the process PID, not the boostrap one. And all the after script tasks + cleanup should all "just work"...I think!The text was updated successfully, but these errors were encountered: