-
-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: wait for all child processes to terminate (fixes issue #1476) #1579
Conversation
9e7a2e2
to
f157443
Compare
Is there a test for this that shows it failing and then passing with the change? |
Also, how is this different to the current kill logic? |
The current kill logic is unchanged. After issuing the kill command in run.js:376 (which only sends signals to the other processes), the processes may need some time finishing. |
And I assume this doesn't apply to windows? (I see Windows is being skipped) |
As mentioned in my comment in #1476 I assume that the issue is only occuring on linux platform. |
This issue has been automatically marked as idle and stale because it hasn't had any recent activity. It will be automtically closed if no further activity occurs. If you think this is wrong, or the problem still persists, just pop a reply in the comments and @remy will (try!) to follow up. |
@PePe79 Could you please look through the Travis error? I'm not sure if it is possible to fix it otherwise. |
@Cerberuser yes I will have a look within the next 2 days. |
Reworked this and moved the waitForOldKids function inside the kill function. I changed the behaviour in the way, that at first all kids are killed and only if all kids have been killed successfully (wait for it), killing also the parent process (the sh-process). This triggers the exit-event like before and restarts the monitored exec command. |
I think this may address the issue but it spams the console with messages "waiting for 1 child process to exit". In our system we actually never exit when we receive the signal, we do a hot reload and ignore the signal. I think nodemon should allow this without spamming the console or using a lot of CPU cycles. |
hey @dobesv , ok that sounds interesting. I did not thought about this use-case. It's conceptually the opposite of what is addressed in the issue #1476. I don't think that both use-cases
and
work together without a specific command line switch that tells nodemon what to do. @remy What do you think about theses use-cases? |
lib/monitor/run.js
Outdated
} else { | ||
// make sure we kill from smallest to largest | ||
const pids = kids.concat(child.pid).sort(); | ||
const pids = kids.sort(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are mutating your parameter here, which is kind of a no-no. Perhaps copy the array first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. I fixed that by using .slice()
now.
lib/monitor/run.js
Outdated
@@ -16,6 +16,27 @@ var psTree = require('pstree.remy'); | |||
var path = require('path'); | |||
var signals = require('./signals'); | |||
|
|||
function waitForOldKids(oldKids, callback) { | |||
if (oldKids) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe also check oldKids.length > 0
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, changed this line to if (Array.isArray(oldKids) && oldKids.length > 0) {
lib/monitor/run.js
Outdated
spawn('kill', ['-s', sig, child.pid].concat(kids)) | ||
.on('close', callback); | ||
spawn('kill', ['-s', sig].concat(kids)) | ||
.on('close', waitForOldKids.bind(this, kids, function () { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we filter out child.pid
from kids
here, we can just rely on the exit
event from ChildProcess
to detect the root process exit, which avoids the extra polling in the case that there's no extra child processes. e.g. kids.filter(p => p !== child.pid)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I know what you're meaning. But some commands won't create any sub processes. Maybe it is esoteric but for example nodemon --exec "read VAR"
won't start a subprocess because read
is a shell builtin command.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
read VAR
will start a shell subprocess, although there will be just one subprocess from nodemon's perspective. But in that case the exit
event will be sent to us for that process so we don't need to poll for it to exit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or to clarify what I'm trying to say. nodemon --exec "read VAR"
will at least start /bin/sh
to run the read
built-in. So there's still a child process there, but only one (/bin/sh
). nodemon --exec "python foo.py"
runs two subprocesses - /bin/sh
and python
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, you think kids
contains also the sh
process PID. I thought it doesn't (because of the previous old code in line 346: ..., child.pid].concat(kids)
). I have to clarify what is right here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm actually I might have been confused about this one, I did a quick test and it seems like pstree
is not including the parent id in the result.
e.g.
> child = require('child_process').exec('sleep 300s')
ChildProcess {
> child.pid
22955
> require('pstree.remy')(child.pid, (e, k) => console.log(k))
undefined
> [ 22956 ]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I deleted some of my erroneous comments about this...
The current release of nodemon supports both scenarios perfectly, as long as there is only one child process. I think a few simple changes will make your change more compatible with this:
|
lib/monitor/run.js
Outdated
exec('kill -0 ' + oldKids.join(' '), (error) => { | ||
const returnCode = error ? error.code : 0; | ||
if (returnCode < 126) { // ignore command not found error | ||
const stillRunningKids = oldKids.length - returnCode; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I run kill
it always returns 0
or 1
so I'm not sure if this math is reliable for cases where there are multiple child processes. You might have to run psTree
again for each child to see whether it is still running and whether it has added any new child processes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe what you could do is just poll kids[0]
. If it isn't running any more, call waitForOldKids(oldKids.slice(1), callback)
. Then when oldKids.length === 0
all the kids are done.
Instead of using kill
to poll you could use psTree
. It'll return an empty list if the process has exited, but if the list is not empty you can union that with the list you have so that you can detect any new child processes that may have been spawned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, using psTree
to wait for the child process might work in windows as well, where kill
is not available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually since psTree
doesn't return the pid given as part of the list, you can't use it to tell whether a process is still running.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this part of code I only want to wait for the subprocesses to terminate. I think using pstree here instead of kill -0
is a good idea. I will try that out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah, if you pass in the root pid you can just us pstree to check if there are any children or not instead of trying to kill each child.
lib/monitor/run.js
Outdated
if (stillRunningKids > 0) { | ||
utils.log.status('still waiting for ' + stillRunningKids + | ||
' child process(es) to finish...'); | ||
setTimeout(waitForOldKids.bind(this, oldKids, callback), 100); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A slower polling interval might be nice for processes that aren't actually planning to exit.
760d1d9
to
f6c2841
Compare
I think your assumption is wrong, because nodemon itself starts a |
If you run Currently these bugs don't affect me because I only ran into them while trying pass arguments to Your fix, were it to be merged, might affect me if it changes the behavior of the case that currently works correctly for me (single child process). |
This issue has been automatically marked as idle and stale because it hasn't had any recent activity. It will be automtically closed if no further activity occurs. If you think this is wrong, or the problem still persists, just pop a reply in the comments and @remy will (try!) to follow up. |
@PePe79, is there any progress, or the PR is really stale? |
No progress from my side, because of lack of time. Next steps will be looking at some of the findings of @dobesv and thinking about how to create a testcase that shows the errorcase. Unfortunately there will be no progress from my side in the next few weeks. |
This issue has been automatically marked as idle and stale because it hasn't had any recent activity. It will be automtically closed if no further activity occurs. If you think this is wrong, or the problem still persists, just pop a reply in the comments and @remy will (try!) to follow up. |
Hope this issue isn't stale in fact. |
This issue has been automatically marked as idle and stale because it hasn't had any recent activity. It will be automtically closed if no further activity occurs. If you think this is wrong, or the problem still persists, just pop a reply in the comments and @remy will (try!) to follow up. |
Eagerly awaiting this PR, thanks for everyone's efforts so far. Definitely not stale, issue persists. |
I'm also waiting for this. @remy Could you check if all is okay? |
@Remzi1993 & @PePe79 I'm blocking out time these next few days for nodemon bits - I had one request on this code (I'll add as review comment too), can we rename |
Hi @remy I will take a look at the code this sunday (but I can't promise anything, because I'm very busy with a course/academy till the end of November). Kind regards, |
Thanks for understanding, and cheers.
…On Thu, 24 Oct 2019, 09:18 Remzi Cavdar, ***@***.***> wrote:
Hi @remy <https://github.com/remy>
I'm so sorry to hear this. Maybe I can help renaming everything, I'm also
for logical names and I also avoid in my coding analogies like this to
avoid confusing what it actually might mean.
I will take a look at the code this sunday (but I can't promise anything,
because I'm very busy with a course/academy till the end of November).
Kind regards,
@Remzi1993 <https://github.com/Remzi1993>
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#1579?email_source=notifications&email_token=AAADLBHY7XRMEY6JZVBJ7FDQQFK4BA5CNFSM4HVXYBG2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOECEEXEA#issuecomment-545803152>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAADLBDXIN56T6SBBYGOUA3QQFK4BANCNFSM4HVXYBGQ>
.
|
Hi @remy , I am sorry for you. That's totally understandable and renaming shouldn't be a problem. Sorry that I did not have the time to create a testcase that can be automated. I had some ideas to test this inside docker containers, but hadn't any time to try that... |
f6c2841
to
9a951fa
Compare
I also created a git repo with a testcase to reproduce this issue and to test this fix (see README): https://github.com/pepe79/test-nodemon-issue-1476 |
9a951fa
to
90fc149
Compare
Picking this up this week - thank you @PePe79 for the changes. |
Merged and going live in PR #1632 |
I'm in the process or looking at #1633 in detail because this change isn't working as expected in Ubuntu with a standard setup. I think the problem is when there's more than one sub process, the The logic should either run this function for each sub process or it should revalidate whether the processes are running with pstree. I need to look back through the comments here as to why pstree wasn't used though. |
@remy , sorry didn't see that coming. On my system
Using |
See #1476