supercronic doesn't like to send SIGTERM/SIGINT to my script #62

Closed
icy opened this issue Apr 19, 2020 · 10 comments

@icy

icy commented Apr 19, 2020

I start the tool with supercronic -debug my.crontab. When I press ^C, I expect SIGTERM/SIGINT to be sent to my script (i.e., supercronic would act as a signal proxy). However, supercronic doesn't send any signal to my script; it just waits for my script to finish.

My configuration

*/5 * * * * * * ./cd.sh sys_sleep

(./cd.sh sys_sleep handles SIGTERM/SIGINT correctly when I run it on its own; see below)

Debug logging

INFO[2020-04-19T14:22:50+02:00] starting                                      iteration=7 job.command="./cd.sh sys_sleep" job.position=0 job.schedule="*/5 * * * * * *"                 
INFO[2020-04-19T14:22:50+02:00] :: sys_sleep: Sleeping 180 second(s). Use 'now' to wake up the sleeping script.  channel=stdout iteration=7 job.command="./cd.sh sys_sleep" job.position=0 job.schedule="*/5 * * * * * *"                                                                                                                                                                                                                 
^CINFO[2020-04-19T14:22:53+02:00] received interrupt, shutting down                                                                                                                                                  
INFO[2020-04-19T14:22:53+02:00] waiting for jobs to finish                                                                                                                                                           
WARN[2020-04-19T14:22:55+02:00] not starting: job is still running since 2020-04-19 14:22:50 +0200 CEST (5s elapsed)  iteration=7 job.command="./cd.sh sys_sleep" job.position=0 job.schedule="*/5 * * * * * *"      
WARN[2020-04-19T14:23:00+02:00] not starting: job is still running since 2020-04-19 14:22:50 +0200 CEST (10s elapsed)  iteration=7 job.command="./cd.sh sys_sleep" job.position=0 job.schedule="*/5 * * * * * *"     

My cd.sh signal trap is working

$ ./cd.sh sys_sleep
:: sys_sleep: Sleeping 180 second(s). Use 'now' to wake up the sleeping script.
^C:: sys_trap: Testing purpose: This is sys_trap
:: rclean: Cleaning up metric file /home/gfg/metrics.txt
:: sys_trap: Testing purpose: This is sys_trap
:: rclean: Cleaning up metric file /home/gfg/metrics.txt
@icy changed the title from "supercronic doesn't like to send SIGTERM to my script" to "supercronic doesn't like to send SIGTERM/SIGINT to my script" on Apr 19, 2020
@icy
Author

icy commented Apr 19, 2020

When I use dumb-init as a signal proxy for my script, everything works well, so I'm fairly sure this is a problem/feature of supercronic.

Using dumb-init --> supercronic --> my script

In this configuration, supercronic sits in front of my script. When I press ^C, dumb-init sends the signal to supercronic, but supercronic gets stuck there. I have to pkill -9 supercronic to finish everything.

$ dumb-init --verbose -- ./testdump.sh 
[dumb-init] Detached from controlling tty, but was not session leader.
[dumb-init] Child spawned with PID 426096.
[dumb-init] Unable to attach to controlling tty (errno=1 Operation not permitted).
[dumb-init] setsid complete.
INFO[2020-04-19T14:38:31+02:00] read crontab: ./supercronic.crontab          
DEBU[2020-04-19T14:38:31+02:00] try parse(7): */5 * * * * * * ./cd.sh sys_sleep[0:15] = */5 * * * * * * 
DEBU[2020-04-19T14:38:31+02:00] job will run next at 2020-04-19 14:38:35 +0200 CEST  job.command="./cd.sh sys_sleep" job.position=0 job.schedule="*/5 * * * * * *"
INFO[2020-04-19T14:38:35+02:00] starting                                      iteration=0 job.command="./cd.sh sys_sleep" job.position=0 job.schedule="*/5 * * * * * *"
INFO[2020-04-19T14:38:35+02:00] :: sys_sleep: Sleeping 180 second(s). Use 'now' to wake up the sleeping script.  channel=stdout iteration=0 job.command="./cd.sh sys_sleep" job.position=0 job.schedule="*/5 * * * * * *"
^C[dumb-init] Received signal 2.
[dumb-init] Forwarded signal 2 to children.
INFO[2020-04-19T14:38:36+02:00] received interrupt, shutting down            
INFO[2020-04-19T14:38:36+02:00] waiting for jobs to finish                   
WARN[2020-04-19T14:38:40+02:00] not starting: job is still running since 2020-04-19 14:38:35 +0200 CEST (5s elapsed)  iteration=0 job.command="./cd.sh sys_sleep" job.position=0 job.schedule="*/5 * * * * * *"
WARN[2020-04-19T14:38:45+02:00] not starting: job is still running since 2020-04-19 14:38:35 +0200 CEST (10s elapsed)  iteration=0 job.command="./cd.sh sys_sleep" job.position=0 job.schedule="*/5 * * * * * *"
./testdump.sh: line 2: 426098 Killed                  ./supercronic -debug ./supercronic.crontab
[dumb-init] Received signal 17.
[dumb-init] A child with PID 426096 exited with exit status 137.
[dumb-init] Forwarded signal 15 to children.
[dumb-init] Child exited with status 137. Goodbye.

Using dumb-init --> my script

This is a simple setup and it works well. I press ^C, dumb-init sends the correct signal to my script, and my script exits correctly as expected.

$ dumb-init --verbose -- ./testdump.sh 
[dumb-init] Detached from controlling tty, but was not session leader.
[dumb-init] Child spawned with PID 426230.
[dumb-init] Unable to attach to controlling tty (errno=1 Operation not permitted).
[dumb-init] setsid complete.
:: sys_sleep: Sleeping 180 second(s). Use 'now' to wake up the sleeping script.
^C[dumb-init] Received signal 2.
[dumb-init] Forwarded signal 2 to children.
:: rclean: Cleaning up metric file /home/gfg/metrics.txt
:: rclean: Cleaning up metric file /home/gfg/metrics.txt
[dumb-init] Received signal 17.
[dumb-init] A child with PID 426230 exited with exit status 130.
[dumb-init] Forwarded signal 15 to children.
[dumb-init] Child exited with status 130. Goodbye.

@krallin
Collaborator

krallin commented Apr 19, 2020

This is by design. When you interrupt Supercronic, it stops scheduling new instances of your jobs, but it doesn't interrupt the current ones. If you want to interrupt the jobs themselves, then you should send them a signal.

IIRC, when signalled, dumb-init forwards signals to its descendant process group, so the signal reaches both Supercronic and your child jobs, and those get terminated.

So, to summarize, it's not that Supercronic doesn't "like" to signal your script; it's just not what it does, because it's usually not a great default for a job runner to unconditionally kill everything when you ask it to stop running new jobs. It's not a great default because you can simply kill everything yourself if that's what you want, which is what you did here.

As an aside, Supercronic doesn't create a new process group for your jobs, so if you ran kill -- -$PID (note the - in front of the PID), that would signal the entire process group, including your jobs.
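
The aside about signalling the whole process group can be spelled out as a small shell sketch. This is only an illustration: signal_group and pgid_of are hypothetical helper names, not part of Supercronic. It uses an explicit signal name and a -- separator, because most shells would otherwise try to read a bare negative number as a signal specification.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Supercronic).

# signal_group SIGNAL_NAME PGID: deliver a signal to every process in a
# process group. "--" ends option parsing, so the negative number is read
# as a process-group target rather than as a signal.
signal_group() {
  kill -s "$1" -- "-$2"
}

# pgid_of PID: print the process group ID that PID belongs to.
pgid_of() {
  ps -o pgid= -p "$1" | tr -d ' '
}
```

For example, signal_group TERM "$(pgid_of "$SUPERCRONIC_PID")" would send SIGTERM to Supercronic and to any jobs that share its process group, where SUPERCRONIC_PID is a placeholder for however you obtain that PID.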

@krallin closed this as completed on Apr 19, 2020
@icy
Author

icy commented Apr 19, 2020

@krallin May I ask how I can support my script in a Docker container?

Let's say I have a Docker container that contains my script, which is working well. Now I want to add supercronic in between, so that I can have a cron-like feature.

My understanding is that supercronic's idea is to support badly behaved scripts (e.g., scripts that don't respect SIGTERM/SIGKILL), but my script / my job isn't such a case.

Thanks so much

@krallin
Collaborator

krallin commented Apr 19, 2020

I think there's a bit of confusion here. It seems to me that you think Supercronic is a process manager. It isn't — it's a cron runner.

My understanding is that supercronic's idea is to support badly behaved scripts (e.g., scripts that don't respect SIGTERM/SIGKILL), but my script / my job isn't such a case.

No; your understanding is incorrect here.

Supercronic's design is that when it is signalled, it'll stop scheduling new jobs, and wait for existing jobs to exit. Supercronic is not a process manager: it's a job runner. In that sense, it does have an expectation that the jobs it is tasked with running will terminate at some point (and if they don't, then those are indeed misbehaved scripts from Supercronic's perspective).

This behavior is not about handling scripts that don't respect signals, it's simply the semantics of what termination means for Supercronic.

Let's say I have a Docker container that contains my script, which is working well. Now I want to add supercronic in between, so that I can have a cron-like feature.

You've provided very little detail about what your cron-like feature is, so it's hard to say with certainty.

That said, Supercronic is designed to run periodic jobs, so if your job is actually a daemon that doesn't exit and expects to be signalled to exit, then perhaps running said job through Supercronic isn't what you should be doing?

If you want to mix some periodic tasks along with a daemon process (I'm guessing this is what you need here?), then I would recommend you use a process manager to start your app AND Supercronic as separate processes. Then, when your process manager is signalled, it'll signal your app to shut down, and Supercronic, which will wait for whichever tasks you have scheduled to finish running.

In general, for simplicity, I would strongly recommend just running your app in one container and your periodic tasks in another container that runs Supercronic. That being said, if that isn't an option for you (e.g. because they need to share some temporary files), then using a process manager is the way to go.

@icy
Author

icy commented Apr 19, 2020

I understand that supercronic is not a process manager.

What actually confused me is the description of the project itself (https://github.com/aptible/supercronic#why-supercronic):

They often don't respond gracefully to SIGINT / SIGTERM, and may leave running jobs orphaned when signaled. Again, this makes sense in a server environment where init will handle the orphan jobs and Cron isn't restarted often anyway, but it's inappropriate in a container environment as it'll result in jobs being forcefully terminated (i.e. SIGKILL'ed) when the container exits.

SIGTERM triggers a graceful shutdown (and so does SIGINT, which you can deliver via CTRL+C when used interactively)

So what exactly is a graceful shutdown here?

@krallin
Collaborator

krallin commented Apr 19, 2020

SIGTERM triggers a graceful shutdown (and so does SIGINT, which you can deliver via CTRL+C when used interactively)

I'm sorry if this was unclear. It means that Supercronic will stop starting new jobs and wait for existing jobs to finish, i.e. nothing gets killed.
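
To make those semantics concrete, here is a toy shell sketch of "stop starting new jobs, then wait, signalling nothing". It is an illustration under my own assumptions, not Supercronic's actual implementation; start_job, request_stop, and shut_down are hypothetical names.

```shell
#!/bin/sh
# Toy illustration only; NOT Supercronic's implementation.
stopping=0
request_stop() { stopping=1; }
trap request_stop TERM INT

# start_job CMD [ARGS...]: launch CMD in the background, unless a stop
# has already been requested.
start_job() {
  if [ "$stopping" -ne 0 ]; then
    echo "not starting: shutting down" >&2
    return 1
  fi
  "$@" &
}

# shut_down: refuse further jobs, then wait for the running ones to exit
# on their own. No signal is sent to them.
shut_down() {
  request_stop
  wait
}
```

In this model a job that never exits keeps shut_down waiting forever, which matches the "waiting for jobs to finish" behaviour reported at the top of the thread.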

@icy
Author

icy commented Apr 19, 2020

I think triggering a graceful shutdown means something else (i.e., sending some signal to the jobs...), so yes, that confused me quite a lot.

Thanks a lot for your time, and I'm very sorry that I can't use the tool in my case. The tool is just so great.

@andersryanc

I was similarly confused by the description of the project and also thought it would help facilitate the passing of the signals down to the running jobs as well...

@icy were you able to find a good solution for this? I'm currently having issues getting my configuration set up correctly to handle a graceful shutdown of a php script which is being run by cron in a debian based docker container...

@krallin is there a reason why supercronic can't pass these signals on to the jobs it's managing? Maybe this could be an option via a flag?

If term/int signals need to be sent to the child processes as well as supercronic itself, do you have a suggestion of how that would be done in the context of docker? or even further in a kubernetes cluster? I'm trying to test this locally by using docker stop, but eventually I would want this to work in kubernetes when we roll out updates to the pods.

@icy
Author

icy commented May 5, 2023

Hi @andersryanc,

Thanks for asking. As I said in my previous comment, signal processing is important for my tool and I can't use supercronic for that purpose. I use dumb-init and also rewrote the tool to act as a cron job, but it relies on an internal counter: when the counter reaches some limit, the program signals itself to exit.

It's interesting, but the "root" reason for us is to avoid some memory "leak"-like issue, as seen in golang/go#20135. I suggested an idea in kubernetes/kubernetes#85752, but not many people have the same issue. I have the option to use a k8s cronjob, or to rewrite the golang program/supporting script, but I decided to keep the golang code in a form that is friendlier to me, and a k8s cronjob is time-based.

Hope that helps, and I hope you find your own solution soon.

@antocorso

Hello everyone, I'm experiencing an issue that bears some similarity to the current discussion.

@krallin, my perspective is that the lack of signal propagation to child processes isn't optimal. Ideally, such a behavior should be controlled by a configurable flag.

This becomes crucial when dealing with cron tasks that are time-intensive and may need intervention in the form of a SIGTERM signal.

To illustrate, in my specific scenario, I have a task set up to generate and email PDF invoices for a customer list. This task is quite lengthy, hence, when a single invoice is completed, and a SIGTERM signal is received, I'd prefer the task to terminate rather than commence with a new invoice. While I can allow for the completion of a single invoice, waiting for the entire process isn't feasible, especially during a scale-in phase. For instance, with AWS ECS, a task must be terminated within a maximum of 120 seconds!
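
The "finish the current invoice, then stop" behaviour described above could look roughly like the following on the job side. This is a minimal sketch with hypothetical names (process_all, request_stop, and whatever handler you pass in), and it only helps if SIGTERM actually reaches the job process, which, per this thread, Supercronic does not do on its own.

```shell
#!/bin/sh
# Hypothetical sketch: complete the item in progress, then stop if a
# SIGTERM arrived in the meantime.
stop_requested=0
request_stop() { stop_requested=1; }
trap request_stop TERM

# process_all HANDLER ITEM...: run HANDLER on each ITEM in turn. Before
# starting each item, check whether a stop was requested; if so, return
# 143 (128 + 15, the conventional exit status for SIGTERM).
process_all() {
  handler="$1"
  shift
  for item in "$@"; do
    if [ "$stop_requested" -ne 0 ]; then
      echo "stop requested, skipping remaining items from $item" >&2
      return 143
    fi
    "$handler" "$item"   # an item that has started is always completed
  done
}
```

The shell runs the trap handler only after the command in the foreground returns, so the check between items is what turns a SIGTERM into an orderly stop rather than an interrupted invoice.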

Therefore, I believe it's essential to have a mechanism that allows sending signals to child processes.

Nevertheless, I appreciate the excellent work you've shared. Thank you!
