-
Notifications
You must be signed in to change notification settings - Fork 29.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pre-mature exit callback or delayed data callback #3676
Comments
I'm inclined to agree that the test erroneously assumes a particular ordering of events but on the other hand, maybe it's a quality-of-implementation issue that we can address without too much difficulty. Does the patch below improve matters? diff --git a/deps/uv/src/unix/aix.c b/deps/uv/src/unix/aix.c
index c90b7e5..b129845 100644
--- a/deps/uv/src/unix/aix.c
+++ b/deps/uv/src/unix/aix.c
@@ -100,6 +100,7 @@ void uv__io_poll(uv_loop_t* loop, int timeout) {
uv__io_t* w;
uint64_t base;
uint64_t diff;
+ int have_signals;
int nevents;
int count;
int nfds;
@@ -207,6 +208,7 @@ void uv__io_poll(uv_loop_t* loop, int timeout) {
goto update_timeout;
}
+ have_signals = 0;
nevents = 0;
assert(loop->watchers != NULL);
@@ -237,10 +239,20 @@ void uv__io_poll(uv_loop_t* loop, int timeout) {
continue;
}
- w->cb(loop, w, pe->revents);
+ /* Run signal watchers last. This also affects child process watchers
+ * because those are implemented in terms of signal watchers.
+ */
+ if (w == &loop->signal_io_watcher)
+ have_signals = 1;
+ else
+ w->cb(loop, w, pe->revents);
+
nevents++;
}
+ if (have_signals != 0)
+ loop->signal_io_watcher.cb(loop, &loop->signal_io_watcher, UV__POLLIN);
+
loop->watchers[loop->nwatchers] = NULL;
loop->watchers[loop->nwatchers + 1] = NULL;
|
@bnoordhuis , thanks! this makes perfect sense. I will try this and get back. |
@bnoordhuis, yes - that resolves the issue: 19 failures out of 1000 runs without the fix, and 2000 runs with no failures with the fix. Thanks for the quick review and patch. |
@gireeshpunathil Some of those tests are old enough that they point to issues in the old repo. Specifically the formatting of the naming gives it away, to us that is. Maybe add comments with links pointing to the old repo? |
|
@Fishrock123, thanks - that explains! |
This looks like it's a libuv issue and that there is a pending libuv PR to fix it. I'm going to close this issue here, but if someone feels that it ought to remain open until the libuv fix happens, by all means either re-open it or comment here. (Just doing some issue-tidying for issue-tidying sake.) |
My bug report below may be inaccurate / irrelevant if I know the background of this test case. Can you please tell me the nomenclature of regression test cases? How do I find out the original problem (GH-1899) for which this test case was developed? #1899 seems to be not related?
The test case test/sequential/test-regress-GH-1899.js implements a simple asynchronous child spawn, which prints a hello world and exits. The parent has a child data callback and exit callback. The essence of the test case is to validate that the child exits gracefully, and the data is indeed received in the parent.
This fails in AIX intermittently with assertion failure:
AssertionError: '' == 'hello, world!\n'
at ChildProcess. (test/sequential/test-regress-GH-1899.js:18:10)
The root cause is that the exit callback is called before the data callback.
I would think that in asynchronous mode, the child's data event and exit event may not receive in the parent in the same order. As both data and exit events arrive as async events in the parent's uv loop, the order can be arbitrary, dictated by the OS based on factors outside the control of node.
If I print the fds appearing in the uv loop, I see that they are not in order always (AIX or Linux):
10 12 14 6
10 14 6 12
Here, fd 6 is for the async child exit event. As you can see, it is not deterministic.
Also, with artificial delay in the code, I am able to make the test fail in Linux.
My point is that while child exit happens chronologically later than the data event in the child, this order may not be maintained in the parent: it can be reversed, or it may trigger together, but depending on the order in which they appear in the resultant array of the epoll call, the execution order can differ, and no program should assume the order as it happen in the child.
Please let me know your views.
The text was updated successfully, but these errors were encountered: