server threads die with assertion error #6222
Comments
to give more context, this is very closely related to the discussion in joyent/libuv#826. we are using express 3.1.0 and cluster, which forks up to 7-10 servers that listen on the same port. i would love to get a fix, but the discussion also mentions a workaround with a 5ms delay. any idea if that would work? |
Can you try this?
Please post the trace.log here. |
fwiw, i used this script http://sprunge.us/FPjU with trace. the output is quite large with our code because the socket close event does not occur for ~60 minutes. |
Thanks. Can you post the output of
It seems the file descriptor is supposed to be a listen socket but isn't. A test case would help because I can't exclude the possibility yet that there is a bug in your code somewhere. |
mike@mike-test-1:~$ uname -a |
Thanks. I received a couple of other reports and I'm reasonably sure by now that it's caused by stale events for a file descriptor that's been replaced by a file descriptor from another process. No ETA for a fix yet. |
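To make the "replaced file descriptor" idea concrete, here is a minimal sketch in JavaScript. It is not taken from libuv and does not reproduce the assertion. It only shows that descriptor numbers are handed out again after a close, which is how an event recorded against a number can end up referring to a different resource. The helper name `reopenAfterClose` is made up for illustration, `/dev/null` assumes a POSIX system, and the two numbers being equal is typical behaviour rather than something the code guarantees.

```javascript
'use strict';
const fs = require('fs');

// Hypothetical helper (not part of node or libuv): open one path, close it,
// open a second path, and report both descriptor numbers.
function reopenAfterClose(firstPath, secondPath) {
  const first = fs.openSync(firstPath, 'r');
  fs.closeSync(first);
  // POSIX hands out the lowest free descriptor number, so this open will
  // usually receive the number that was just released.
  const second = fs.openSync(secondPath, 'r');
  fs.closeSync(second);
  return { first, second, reused: first === second };
}

// Assumption: a POSIX system where /dev/null exists.
const result = reopenAfterClose(process.execPath, '/dev/null');
console.log(result);
```

If `reused` is true, anything that still remembers the old number is now looking at `/dev/null` instead of the node binary.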
thanks for looking at it. from what i understand, it looks connected to the way cluster listens to sockets. does it make sense to have a delay when restarting the thread as a workaround? or perhaps there is an alternative to cluster that we can use. |
If by 'thread' you mean 'worker process', then there's no real way to work around it currently (apart from not using the cluster module, of course.) |
yes. thank you! |
hi, saw that this is added to 0.10 milestone. any other ETA info? |
Not really, I'm afraid. I'm on leave currently, I only work on node sporadically. Unless someone else picks this up (unlikely), it'll have to wait until I get back and have time to look into it. |
I'm running into this too. For some weird reason it only happens on one of two identically configured servers (even after re-installing node and npm) and I can't get it to work on the first one. Is there any easy (or dirty) fix for this issue that doesn't involve not using the cluster module? Thanks. |
I am also running into this issue. Just adding to the list if it helps :) general stack is: cluster + express + redis
The symptom is 100% CPU utilization on the master. Running strace shows a bunch of file descriptor errors. In my case, nothing really crashes and the application actually runs OK. It's just noticeable when using Thanks for confirming the issue @bnoordhuis, I can turn off clustering in my case. |
For me it was fixed by listening on the ports immediately instead of waiting for a redis connection etc. Quite ugly, as the server can't really process requests the moment it comes online now. |
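The approach described above can be sketched roughly as follows: bind the port right away and answer 503 until the backing connection is up. This is only an illustration under assumptions, not the commenter's actual code. `createGatedHandler` and `markReady` are hypothetical names, and the redis connection is stood in for by whatever ends up calling `markReady()`.

```javascript
'use strict';

// Hypothetical helper: wraps a request handler so that requests arriving
// before markReady() has been called get a 503 instead of being processed.
function createGatedHandler(handler) {
  let ready = false;
  function gated(req, res) {
    if (!ready) {
      res.statusCode = 503;
      res.setHeader('Retry-After', '1');
      res.end('starting up\n');
      return;
    }
    handler(req, res);
  }
  return { gated, markReady: () => { ready = true; } };
}

// Usage sketch, commented out so the snippet has no side effects.
// `app` and `redisClient` are assumed to exist in the real application.
// const http = require('http');
// const { gated, markReady } = createGatedHandler(app);
// http.createServer(gated).listen(3000);   // listen immediately
// redisClient.on('ready', markReady);      // start serving once redis is up
```

The trade-off is the one the comment mentions: the worker accepts connections before it can do useful work, so clients see a short burst of 503 responses at startup.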
This issue repeated for me when |
Should be fixed by joyent/libuv#971 |
awesome! will try. |
Good |
@mxk1235 can you confirm that the issue has been fixed? |
Fixed in joyent/libuv@bbccafb and joyent/libuv@f50ccd5. Should get into the next v0.10 and v0.11 release. |
I'm still seeing this with 0.10.22 with our setup. What we're seeing is that when we run grunt as a task in Gradle, node dies with an assertion error when trying to launch a process in Karma:
grunt: ../deps/uv/src/unix/stream.c:494: uv__server_io: Assertion `events == 1' failed.
Oddly enough, it only happens if grunt runs a shell command before Karma attempts to run the browser, and only if grunt is run under Gradle. I've run strace over it -- it looks like epoll_ctl starts throwing these soon after the shell execute finishes (search for rm -Rf):
29687 <... epoll_ctl resumed> ) = -1 EBADF (Bad file descriptor)
https://www.dropbox.com/s/geiat4cwb91sqt7/trace.log.gz
When run outside of Gradle (where it works), you can see the epoll running after the rm -Rf is successful: |
This may be related to #6271, as the strace log was showing the same EPOLL_CTL_DEL behaviour. |
Reopening, will look into it soon. |
Also have this issue when running Grunt via a Jenkins Job. I have uploaded a truncated trace, via the following strace:
https://s3.amazonaws.com/grunt.errors/trace.log System:
Node is via the |
I've been seeing this issue in the last couple of weeks when using our Jenkins instance on Cloudbees:
|
I also have this issue when running grunt-contrib-connect from jenkins. |
Thanks @mmastrac, been trying to diagnose this for a day or two, removing a shell command fixes it. |
I got the same issue with Node v0.10.20
Should I upgrade Node version? |
Yes, I would suggest updating node first. |
Same problem here: it seems to reuse the bad file descriptor, see strace:
|
I also receive the following error when running
|
For people having problems with running these commands from jenkins or cron, try adding the --norc flag to your bash script - it worked for us and we were seeing the exact same behavior (worked when running manually, failing on cron and jenkins). |
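For a job that is launched from node rather than from a shell script, the same flag can be passed when spawning bash. The sketch below is an illustration under assumptions, not a confirmed fix for the assertion: `runWithoutRc` is a made-up helper, and it assumes bash is available on PATH.

```javascript
'use strict';
const { spawnSync } = require('child_process');

// Hypothetical helper: run a command line through bash with --norc, which
// asks bash to skip ~/.bashrc. Requires bash to be on PATH.
function runWithoutRc(commandLine) {
  const result = spawnSync('bash', ['--norc', '-c', commandLine], { encoding: 'utf8' });
  if (result.error) throw result.error;
  return { status: result.status, stdout: result.stdout, stderr: result.stderr };
}

console.log(runWithoutRc('echo ok').stdout);
```

In a shell script run by cron or Jenkins, the equivalent would be to put the flag on the line that invokes bash.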
@indutny ... was there a resolution on this one? |
I suppose it is resolved now, didn't see any report except this. |
But no formal resolution either. |
Ok, closing then. Can reopen if new information is received. |
Any news on this one? I'm getting this error when jenkins executes I've tried searching the net for answers, but I got nothing. Slave:
|
Just never use an FD directly. It's not correctly supported, and if you close one, node can still try to use it and crash. That's all from me.
using node 0.10.18 on Ubuntu 12.04 LTS.
../deps/uv/src/unix/stream.c:494: uv__server_io: Assertion `events == 1' failed
happens several times a second, effectively killing the threads. I found an old thread that is closed now because of no repro.
thanks in advance.