Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate flaky test-worker-prof #26401

Closed
Trott opened this issue Mar 2, 2019 · 86 comments · Fixed by #37372
Closed

Investigate flaky test-worker-prof #26401

Trott opened this issue Mar 2, 2019 · 86 comments · Fixed by #37372
Labels
flaky-test Issues and PRs related to the tests with unstable failures on the CI. worker Issues and PRs related to Worker support.

Comments

@Trott
Copy link
Member

Trott commented Mar 2, 2019

https://ci.nodejs.org/job/node-test-binary-arm/6472/RUN_SUBSET=0,label=pi1-docker/console

test-requireio_ceejbot-debian9-armv6l_pi1p-1

00:34:22 not ok 365 parallel/test-worker-prof
00:34:22   ---
00:34:22   duration_ms: 19.977
00:34:22   severity: fail
00:34:22   exitcode: 1
00:34:22   stack: |-
00:34:22     assert.js:85
00:34:22       throw new AssertionError(obj);
00:34:22       ^
00:34:22     
00:34:22     AssertionError [ERR_ASSERTION]: Expected values to be strictly equal:
00:34:22     
00:34:22     1 !== 2
00:34:22     
00:34:22         at Object.<anonymous> (/home/iojs/build/workspace/node-test-binary-arm/test/parallel/test-worker-prof.js:29:8)
00:34:22         at Module._compile (internal/modules/cjs/loader.js:807:30)
00:34:22         at Object.Module._extensions..js (internal/modules/cjs/loader.js:818:10)
00:34:22         at Module.load (internal/modules/cjs/loader.js:674:32)
00:34:22         at tryModuleLoad (internal/modules/cjs/loader.js:606:12)
00:34:22         at Function.Module._load (internal/modules/cjs/loader.js:598:3)
00:34:22         at Function.Module.runMain (internal/modules/cjs/loader.js:870:12)
00:34:22         at internal/main/run_main_module.js:21:11
00:34:22   ...
@Trott Trott added flaky-test Issues and PRs related to the tests with unstable failures on the CI. worker Issues and PRs related to Worker support. labels Mar 2, 2019
@gireeshpunathil
Copy link
Member

the assertion failure indicates that either the parent or the child - one those did not produce a trace output! (something beyond what we thought could potentially cause flake in this test - i.e., the process timings and the number of ticks etc.)

refack added a commit to refack/node that referenced this issue Mar 10, 2019
PR-URL: nodejs#26557
Refs: nodejs#26401
Reviewed-By: Gireesh Punathil <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
Reviewed-By: Ruben Bridgewater <[email protected]>
@refack
Copy link
Contributor

refack commented Mar 10, 2019

Now it's flaking on Windows
Worker: https://ci.nodejs.org/computer/test-azure_msft-win10-x64-1/

assert.js:340
    throw err;
    ^

AssertionError [ERR_ASSERTION]: 3 >= 15
    at Object.<anonymous> (c:\workspace\node-test-binary-windows\test\parallel\test-worker-prof.js:39:3)
    at Module._compile (internal/modules/cjs/loader.js:813:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:824:10)
    at Module.load (internal/modules/cjs/loader.js:680:32)
    at tryModuleLoad (internal/modules/cjs/loader.js:612:12)
    at Function.Module._load (internal/modules/cjs/loader.js:604:3)
    at Function.Module.runMain (internal/modules/cjs/loader.js:876:12)
    at internal/main/run_main_module.js:21:11

@refack refack added windows Issues and PRs related to the Windows platform. arm Issues and PRs related to the ARM platform. labels Mar 10, 2019
@targos
Copy link
Member

targos commented Mar 11, 2019

Also on Linux without intl:

https://ci.nodejs.org/job/node-test-commit-linux-containered/11274/nodes=ubuntu1604_sharedlibs_withoutintl_x64/testReport/junit/(root)/test/parallel_test_worker_prof/

assert.js:340
    throw err;
    ^

AssertionError [ERR_ASSERTION]: 10 >= 15
    at Object.<anonymous> (/home/iojs/build/workspace/node-test-commit-linux-containered/test/parallel/test-worker-prof.js:39:3)
    at Module._compile (internal/modules/cjs/loader.js:813:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:824:10)
    at Module.load (internal/modules/cjs/loader.js:680:32)
    at tryModuleLoad (internal/modules/cjs/loader.js:612:12)
    at Function.Module._load (internal/modules/cjs/loader.js:604:3)
    at Function.Module.runMain (internal/modules/cjs/loader.js:876:12)
    at internal/main/run_main_module.js:21:11

refack added a commit to refack/node that referenced this issue Mar 12, 2019
BridgeAR pushed a commit that referenced this issue Mar 13, 2019
PR-URL: #26557
Refs: #26401
Reviewed-By: Gireesh Punathil <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
Reviewed-By: Ruben Bridgewater <[email protected]>
BridgeAR pushed a commit that referenced this issue Mar 13, 2019
PR-URL: #26557
Refs: #26401
Reviewed-By: Gireesh Punathil <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
Reviewed-By: Ruben Bridgewater <[email protected]>
BridgeAR pushed a commit that referenced this issue Mar 13, 2019
PR-URL: #26600
Refs: #26401
Reviewed-By: Rich Trott <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
BridgeAR pushed a commit that referenced this issue Mar 14, 2019
PR-URL: #26557
Refs: #26401
Reviewed-By: Gireesh Punathil <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
Reviewed-By: Ruben Bridgewater <[email protected]>
BridgeAR pushed a commit that referenced this issue Mar 14, 2019
PR-URL: #26600
Refs: #26401
Reviewed-By: Rich Trott <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
@Trott
Copy link
Member Author

Trott commented Mar 15, 2019

This was on node-daily-master today:

https://ci.nodejs.org/job/node-test-commit-linux-containered/11398/nodes=ubuntu1604_sharedlibs_withoutintl_x64/console

test-digitalocean-ubuntu1604_sharedlibs_container-x64-10

00:09:19 not ok 2195 parallel/test-worker-prof # TODO : Fix flaky test
00:09:19   ---
00:09:19   duration_ms: 0.528
00:09:19   severity: flaky
00:09:19   exitcode: 1
00:09:19   stack: |-
00:09:19     assert.js:340
00:09:19         throw err;
00:09:19         ^
00:09:19     
00:09:19     AssertionError [ERR_ASSERTION]: 14 >= 15
00:09:19         at Object.<anonymous> (/home/iojs/build/workspace/node-test-commit-linux-containered/test/parallel/test-worker-prof.js:39:3)
00:09:19         at Module._compile (internal/modules/cjs/loader.js:813:30)
00:09:19         at Object.Module._extensions..js (internal/modules/cjs/loader.js:824:10)
00:09:19         at Module.load (internal/modules/cjs/loader.js:680:32)
00:09:19         at tryModuleLoad (internal/modules/cjs/loader.js:612:12)
00:09:19         at Function.Module._load (internal/modules/cjs/loader.js:604:3)
00:09:19         at Function.Module.runMain (internal/modules/cjs/loader.js:876:12)
00:09:19         at internal/main/run_main_module.js:21:11
00:09:19   ...

@Trott
Copy link
Member Author

Trott commented Mar 15, 2019

And for comparison, here's what the failure looks like with the changes in https://github.com/nodejs/node/pull/26608/files/840d31383703fdbc1fe7deb1074c306ee2415aea which is the current change proposed in #26608:

https://ci.nodejs.org/job/node-test-commit-linux-containered/11402/nodes=ubuntu1604_sharedlibs_withoutintl_x64/console

test-digitalocean-ubuntu1604_sharedlibs_container-x64-10

03:27:44 not ok 2198 parallel/test-worker-prof # TODO : Fix flaky test
03:27:44   ---
03:27:44   duration_ms: 0.519
03:27:44   severity: flaky
03:27:44   exitcode: 1
03:27:44   stack: |-
03:27:44     assert.js:85
03:27:44       throw new AssertionError(obj);
03:27:44       ^
03:27:44     
03:27:44     AssertionError [ERR_ASSERTION]: Expected values to be strictly equal:
03:27:44     
03:27:44     null !== 0
03:27:44     
03:27:44         at Object.<anonymous> (/home/iojs/build/workspace/node-test-commit-linux-containered/test/parallel/test-worker-prof.js:54:10)
03:27:44         at Module._compile (internal/modules/cjs/loader.js:813:30)
03:27:44         at Object.Module._extensions..js (internal/modules/cjs/loader.js:824:10)
03:27:44         at Module.load (internal/modules/cjs/loader.js:680:32)
03:27:44         at tryModuleLoad (internal/modules/cjs/loader.js:612:12)
03:27:44         at Function.Module._load (internal/modules/cjs/loader.js:604:3)
03:27:44         at Function.Module.runMain (internal/modules/cjs/loader.js:876:12)
03:27:44         at internal/main/run_main_module.js:21:11
03:27:44   ...

@Trott Trott pinned this issue Mar 15, 2019
@gireeshpunathil
Copy link
Member

thanks @Trott . I have no idea what null exit code means - exited with exception? shouldn't it set a valid exit code then? also, shouldn't we get something on the stderr too? weird. Will try to recreate in a --without-intl configuration.

@Trott
Copy link
Member Author

Trott commented Mar 15, 2019

I have no idea what null exit code means - exited with exception?

Maybe this?

node/src/spawn_sync.cc

Lines 683 to 686 in 169b7f1

if (exit_status_ >= 0) {
if (term_signal_ > 0) {
js_result->Set(context, env()->status_string(),
Null(env()->isolate())).FromJust();

If I'm understanding correctly, maybe it means that the process received a signal?

@Trott
Copy link
Member Author

Trott commented Mar 15, 2019

Looking at test/parallel/test-child-process-spawnsync-kill-signal.js also seems to indicate that status is null when a signal is received.

@gireeshpunathil
Copy link
Member

yes, looks like it is! but who could have sent it! probably an oom-killer from the container? I will see if I can replicate in a container; but is there a way to check what the upper memory limit these containers are configured with?

@Trott
Copy link
Member Author

Trott commented Mar 15, 2019

but is there a way to check what the upper memory limit these containers are configured with?

^^^^ @nodejs/build

@refack
Copy link
Contributor

refack commented Mar 15, 2019

> free
              total        used        free      shared  buff/cache   available
Mem:       32946876     2048256     4308300       17532    26590320    30328076
Swap:             0           0           0

@refack
Copy link
Contributor

refack commented Mar 15, 2019

dmesg (error happened around 10:27:44 UTC)

[Fri Mar 15 10:27:45 2019] do_general_protection: 14 callbacks suppressed

Might hint at a SIGSEGV

[Fri Mar 15 07:10:14 2019] traps: node[19847] trap invalid opcode ip:1622d49 sp:7ffe7ab1aa28 error:0 in node[400000+1dd7000]
[Fri Mar 15 07:10:14 2019] traps: node[19849] trap invalid opcode ip:1622d49 sp:7ffef2278e78 error:0 in node[400000+1dd7000]
[Fri Mar 15 07:10:16 2019] traps: node[19985] trap invalid opcode ip:15e92d9 sp:7f7ad841be48 error:0 in node[400000+19b9000]
[Fri Mar 15 07:10:16 2019] traps: node[20007] trap invalid opcode ip:1622d49 sp:7f08310a4e38 error:0 in node[400000+1dd7000]
[Fri Mar 15 09:48:55 2019] traps: node[31881] trap invalid opcode ip:15e92d9 sp:7ffcd7250008 error:0 in node[400000+19b9000]
[Fri Mar 15 09:48:55 2019] traps: node[31907] trap invalid opcode ip:15e92d9 sp:7ffdda0d4448 error:0 in node[400000+19b9000]
[Fri Mar 15 09:48:55 2019] traps: node[31949] trap invalid opcode ip:15e92d9 sp:7fffc2d46c98 error:0 in node[400000+19b9000]
[Fri Mar 15 09:48:56 2019] traps: node[32058] trap invalid opcode ip:15e92d9 sp:7ffd9662f348 error:0 in node[400000+19b9000]
[Fri Mar 15 09:48:56 2019] traps: node[32086] trap invalid opcode ip:15e92d9 sp:7ffcee647c98 error:0 in node[400000+19b9000]
[Fri Mar 15 09:48:56 2019] traps: node[32088] trap invalid opcode ip:15e92d9 sp:7fff077d8ec8 error:0 in node[400000+19b9000]
[Fri Mar 15 09:48:56 2019] traps: node[32141] trap invalid opcode ip:15e92d9 sp:7ffd8365d438 error:0 in node[400000+19b9000]
[Fri Mar 15 09:48:56 2019] traps: node[32215] trap invalid opcode ip:15e92d9 sp:7ffe30d4b918 error:0 in node[400000+19b9000]
[Fri Mar 15 09:48:56 2019] traps: node[32251] trap invalid opcode ip:15e92d9 sp:7ffe4d7f9838 error:0 in node[400000+19b9000]
[Fri Mar 15 09:48:56 2019] traps: node[32258] trap invalid opcode ip:15e92d9 sp:7ffc7d4d79d8 error:0 in node[400000+19b9000]
[Fri Mar 15 09:49:05 2019] do_trap: 2 callbacks suppressed
[Fri Mar 15 09:49:05 2019] traps: node[2594] trap invalid opcode ip:1622cf9 sp:7ffe30970b58 error:0 in node[400000+1dd7000]
[Fri Mar 15 09:49:05 2019] traps: node[2635] trap invalid opcode ip:1622cf9 sp:7fff1c56e128 error:0 in node[400000+1dd7000]
[Fri Mar 15 09:49:05 2019] traps: node[2701] trap invalid opcode ip:1622cf9 sp:7ffc232b9f18 error:0 in node[400000+1dd7000]
[Fri Mar 15 09:49:05 2019] traps: node[2790] trap invalid opcode ip:1622cf9 sp:7ffeb84c47e8 error:0 in node[400000+1dd7000]
[Fri Mar 15 09:49:05 2019] traps: node[2813] trap invalid opcode ip:1622cf9 sp:7ffd0ae53348 error:0 in node[400000+1dd7000]
[Fri Mar 15 09:49:05 2019] traps: node[2840] trap invalid opcode ip:1622cf9 sp:7ffdb3300df8 error:0 in node[400000+1dd7000]
[Fri Mar 15 09:49:06 2019] traps: node[2904] trap invalid opcode ip:1622cf9 sp:7ffdef97dbc8 error:0 in node[400000+1dd7000]
[Fri Mar 15 09:49:06 2019] traps: node[2969] trap invalid opcode ip:1622cf9 sp:7ffd427fa868 error:0 in node[400000+1dd7000]
[Fri Mar 15 09:49:06 2019] traps: node[2977] trap invalid opcode ip:1622cf9 sp:7fffd67743b8 error:0 in node[400000+1dd7000]
[Fri Mar 15 09:49:06 2019] traps: node[3002] trap invalid opcode ip:1622cf9 sp:7ffc84f22da8 error:0 in node[400000+1dd7000]
[Fri Mar 15 09:50:34 2019] do_general_protection: 2 callbacks suppressed
[Fri Mar 15 09:50:34 2019] traps: node[32317] general protection ip:f67050 sp:7fa64fffd180 error:0 in node[400000+19b9000]
[Fri Mar 15 09:51:29 2019] traps: node[5946] trap invalid opcode ip:15e92d9 sp:7ffddc3e2008 error:0 in node[400000+19b9000]
[Fri Mar 15 09:51:29 2019] traps: node[5947] trap invalid opcode ip:15e92d9 sp:7ffff5791708 error:0 in node[400000+19b9000]
[Fri Mar 15 09:51:31 2019] traps: node[6058] trap invalid opcode ip:15e92d9 sp:7fde286f6e48 error:0 in node[400000+19b9000]
[Fri Mar 15 09:51:42 2019] traps: node[7015] trap invalid opcode ip:1622cf9 sp:7ffde02634f8 error:0 in node[400000+1dd7000]
[Fri Mar 15 09:51:42 2019] traps: node[7021] trap invalid opcode ip:1622cf9 sp:7ffe7d06afc8 error:0 in node[400000+1dd7000]
[Fri Mar 15 09:51:44 2019] traps: node[7173] trap invalid opcode ip:1622cf9 sp:7f58dfff9e38 error:0 in node[400000+1dd7000]
[Fri Mar 15 10:26:05 2019] traps: node[18744] trap invalid opcode ip:15e92d9 sp:7ffc389cfc38 error:0 in node[400000+19b9000]
[Fri Mar 15 10:26:06 2019] traps: node[18801] trap invalid opcode ip:15e92d9 sp:7fffe40df9b8 error:0 in node[400000+19b9000]
[Fri Mar 15 10:26:06 2019] traps: node[18867] trap invalid opcode ip:15e92d9 sp:7ffe320126b8 error:0 in node[400000+19b9000]
[Fri Mar 15 10:26:06 2019] traps: node[18944] trap invalid opcode ip:15e92d9 sp:7ffdb1ea20f8 error:0 in node[400000+19b9000]
[Fri Mar 15 10:26:06 2019] traps: node[18985] trap invalid opcode ip:15e92d9 sp:7ffee0ef7ad8 error:0 in node[400000+19b9000]
[Fri Mar 15 10:26:06 2019] traps: node[19000] trap invalid opcode ip:1622d49 sp:7ffc4d1fd9f8 error:0 in node[400000+1dd7000]
[Fri Mar 15 10:26:06 2019] traps: node[19003] trap invalid opcode ip:15e92d9 sp:7ffc94030618 error:0 in node[400000+19b9000]
[Fri Mar 15 10:26:06 2019] traps: node[19016] trap invalid opcode ip:1622d49 sp:7fffb6cba348 error:0 in node[400000+1dd7000]
[Fri Mar 15 10:26:06 2019] traps: node[19051] trap invalid opcode ip:15e92d9 sp:7ffe4899d318 error:0 in node[400000+19b9000]
[Fri Mar 15 10:26:06 2019] traps: node[19067] trap invalid opcode ip:1622d49 sp:7ffc6ceeaed8 error:0 in node[400000+1dd7000]
[Fri Mar 15 10:27:45 2019] do_general_protection: 14 callbacks suppressed
[Fri Mar 15 10:27:45 2019] traps: node[19419] general protection ip:f67050 sp:7fc5e0864180 error:0 in node[400000+19b9000]
[Fri Mar 15 10:28:40 2019] traps: node[23737] trap invalid opcode ip:15e92d9 sp:7ffd0e8b0458 error:0 in node[400000+19b9000]
[Fri Mar 15 10:28:40 2019] traps: node[23738] trap invalid opcode ip:15e92d9 sp:7ffd8a741718 error:0 in node[400000+19b9000]
[Fri Mar 15 10:28:43 2019] traps: node[23866] trap invalid opcode ip:15e92d9 sp:7fc6d37f8e48 error:0 in node[400000+19b9000]
[Fri Mar 15 10:28:46 2019] traps: node[24299] trap invalid opcode ip:1622d49 sp:7fffa0d9ffa8 error:0 in node[400000+1dd7000]
[Fri Mar 15 10:28:46 2019] traps: node[24303] trap invalid opcode ip:1622d49 sp:7fff88ae03e8 error:0 in node[400000+1dd7000]
[Fri Mar 15 10:28:48 2019] traps: node[24502] trap invalid opcode ip:1622d49 sp:7f0cec7a7e38 error:0 in node[400000+1dd7000]
[Fri Mar 15 11:21:22 2019] traps: node[3697] trap invalid opcode ip:15e92d9 sp:7ffe38f00168 error:0 in node[400000+19b9000]
[Fri Mar 15 11:21:22 2019] traps: node[3746] trap invalid opcode ip:15e92d9 sp:7ffde0081ef8 error:0 in node[400000+19b9000]
[Fri Mar 15 11:21:22 2019] traps: node[3793] trap invalid opcode ip:15e92d9 sp:7ffcac81e8c8 error:0 in node[400000+19b9000]
[Fri Mar 15 11:21:22 2019] traps: node[3864] trap invalid opcode ip:15e92d9 sp:7ffc2ad260c8 error:0 in node[400000+19b9000]
[Fri Mar 15 11:21:23 2019] traps: node[3934] trap invalid opcode ip:15e92d9 sp:7ffdf5e8c828 error:0 in node[400000+19b9000]
[Fri Mar 15 11:21:23 2019] traps: node[3931] trap invalid opcode ip:15e92d9 sp:7ffd7697fce8 error:0 in node[400000+19b9000]
[Fri Mar 15 11:21:23 2019] traps: node[3965] trap invalid opcode ip:15e92d9 sp:7ffd9668c578 error:0 in node[400000+19b9000]
[Fri Mar 15 11:21:23 2019] traps: node[4041] trap invalid opcode ip:15e92d9 sp:7ffd919dc5f8 error:0 in node[400000+19b9000]
[Fri Mar 15 11:21:23 2019] traps: node[4087] trap invalid opcode ip:15e92d9 sp:7ffef0300a38 error:0 in node[400000+19b9000]
[Fri Mar 15 11:21:23 2019] traps: node[4107] trap invalid opcode ip:15e92d9 sp:7ffd31430458 error:0 in node[400000+19b9000]
[Fri Mar 15 11:23:01 2019] do_general_protection: 14 callbacks suppressed
[Fri Mar 15 11:23:01 2019] traps: node[5141] general protection ip:f66fa0 sp:7f5ff44c3180 error:0 in node[400000+19b9000]
[Fri Mar 15 11:23:56 2019] traps: node[9284] trap invalid opcode ip:15e92d9 sp:7ffe2fee1258 error:0 in node[400000+19b9000]
[Fri Mar 15 11:23:56 2019] traps: node[9285] trap invalid opcode ip:15e92d9 sp:7ffe99d0ccd8 error:0 in node[400000+19b9000]
[Fri Mar 15 11:23:58 2019] traps: node[9588] trap invalid opcode ip:15e92d9 sp:7f37a4b48e48 error:0 in node[400000+19b9000]
[Fri Mar 15 11:23:59 2019] traps: node[9702] trap invalid opcode ip:1622bd9 sp:7fff01347178 error:0 in node[400000+1dd7000]
[Fri Mar 15 11:23:59 2019] traps: node[9704] trap invalid opcode ip:1622bd9 sp:7ffe03d0d848 error:0 in node[400000+1dd7000]
[Fri Mar 15 11:24:01 2019] traps: node[9934] trap invalid opcode ip:1622bd9 sp:7fce70567e38 error:0 in node[400000+1dd7000]
root@test-digitalocean-ubuntu1604-docker-x64-2:~# date
Fri Mar 15 13:54:38 UTC 2019

@gireeshpunathil
Copy link
Member

thanks @refack ! the 32 GB total seem to be ample, but the actual available memory to the process when it ran will depend how many processes were running parallel. So unfortunately that data do not throw any clues.

However, the system log shows it all! invalid opcode is indeed OOM-KILLER from the container!

I am having hard time matching the timezones:
ci shows 15:57:44 not ok 2198 parallel/test-worker-prof # TODO : Fix flaky test
the container current time shows Fri Mar 15 13:54:38 UTC 2019 - so how do we map?

Outside of this, we seem to have terminated several processes frequently, through container memory constraints - what do you think of it? I mean, shouldn't we fix that?

@gireeshpunathil
Copy link
Member

and, do_general_protection looks to be a kernel crash, as opposed to user space one? Are there evidence (in any other logs) of the container being restarted around do_general_protection ? there is always minutes of gap between the timestamps around these crash, but that may not be conclusive.

@Trott
Copy link
Member Author

Trott commented Oct 5, 2020

https://ci.nodejs.org/job/node-test-binary-windows-js-suites/6008/RUN_SUBSET=2,nodes=win2012r2-COMPILED_BY-vs2019-x86/console

00:44:05 not ok 754 sequential/test-worker-prof # TODO : Fix flaky test
00:44:05   ---
00:44:05   duration_ms: 30.252
00:44:05   severity: flaky
00:44:05   exitcode: 1
00:44:05   stack: |-
00:44:05     assert.js:104
00:44:05       throw new AssertionError(obj);
00:44:05       ^
00:44:05     
00:44:05     AssertionError [ERR_ASSERTION]: child exited with signal: {
00:44:05       error: Error: spawnSync C:\workspace\node-test-binary-windows-js-suites\node\Release\node.exe ETIMEDOUT
00:44:05           at Object.spawnSync (internal/child_process.js:1066:20)
00:44:05           at spawnSync (child_process.js:622:24)
00:44:05           at Object.<anonymous> (C:\workspace\node-test-binary-windows-js-suites\node\test\sequential\test-worker-prof.js:52:23)
00:44:05           at Module._compile (internal/modules/cjs/loader.js:1072:30)
00:44:05           at Object.Module._extensions..js (internal/modules/cjs/loader.js:1101:10)
00:44:05           at Module.load (internal/modules/cjs/loader.js:937:32)
00:44:05           at Function.Module._load (internal/modules/cjs/loader.js:778:14)
00:44:05           at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:72:12)
00:44:05           at internal/main/run_main_module.js:17:47 {
00:44:05         errno: -4039,
00:44:05         code: 'ETIMEDOUT',
00:44:05         syscall: 'spawnSync C:\\workspace\\node-test-binary-windows-js-suites\\node\\Release\\node.exe',
00:44:05         path: 'C:\\workspace\\node-test-binary-windows-js-suites\\node\\Release\\node.exe',
00:44:05         spawnargs: [
00:44:05           '--prof',
00:44:05           'C:\\workspace\\node-test-binary-windows-js-suites\\node\\test\\sequential\\test-worker-prof.js',
00:44:05           'child'
00:44:05         ]
00:44:05       },
00:44:05       status: null,
00:44:05       signal: 'SIGTERM',
00:44:05       output: [ null, '', '' ],
00:44:05       pid: 2008,
00:44:05       stdout: '',
00:44:05       stderr: ''
00:44:05     }
00:44:05         at Object.<anonymous> (C:\workspace\node-test-binary-windows-js-suites\node\test\sequential\test-worker-prof.js:58:10)
00:44:05         at Module._compile (internal/modules/cjs/loader.js:1072:30)
00:44:05         at Object.Module._extensions..js (internal/modules/cjs/loader.js:1101:10)
00:44:05         at Module.load (internal/modules/cjs/loader.js:937:32)
00:44:05         at Function.Module._load (internal/modules/cjs/loader.js:778:14)
00:44:05         at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:72:12)
00:44:05         at internal/main/run_main_module.js:17:47 {
00:44:05       generatedMessage: false,
00:44:05       code: 'ERR_ASSERTION',
00:44:05       actual: 'SIGTERM',
00:44:05       expected: null,
00:44:05       operator: 'strictEqual'
00:44:05     }
00:44:05   ...

@Trott
Copy link
Member Author

Trott commented Oct 5, 2020

Stress test shows 21 failures in 1000 runs: https://ci.nodejs.org/job/node-stress-single-test/177/nodes=win2012r2-vs2019/console

@Trott
Copy link
Member Author

Trott commented Jan 30, 2021

I haven't seen this in a while and am wondering if something fixed it.

Stress test: https://ci.nodejs.org/job/node-stress-single-test/209/

@Trott
Copy link
Member Author

Trott commented Jan 30, 2021

I haven't seen this in a while and am wondering if something fixed it.

Stress test: https://ci.nodejs.org/job/node-stress-single-test/209/

Hmm, didn't build, let's try again with more platforms: https://ci.nodejs.org/job/node-stress-single-test/210/

@Trott
Copy link
Member Author

Trott commented Jan 31, 2021

Hmm, didn't build, let's try again with more platforms: https://ci.nodejs.org/job/node-stress-single-test/210/

One failure in 1000 runs.

not ok 1 sequential/test-worker-prof
  ---
  duration_ms: 30.309
  severity: fail
  exitcode: 1
  stack: |-
    node:assert:122
      throw new AssertionError(obj);
      ^
    
    AssertionError [ERR_ASSERTION]: child exited with signal: {
      error: Error: spawnSync C:\workspace\node-stress-single-test\nodes\win2012r2-vs2019\out\Release\node.exe ETIMEDOUT
          at Object.spawnSync (node:internal/child_process:1086:20)
          at spawnSync (node:child_process:667:24)
          at Object.<anonymous> (C:\workspace\node-stress-single-test\nodes\win2012r2-vs2019\test\sequential\test-worker-prof.js:52:23)
          at Module._compile (node:internal/modules/cjs/loader:1108:14)
          at Object.Module._extensions..js (node:internal/modules/cjs/loader:1137:10)
          at Module.load (node:internal/modules/cjs/loader:973:32)
          at Function.Module._load (node:internal/modules/cjs/loader:813:14)
          at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:76:12)
          at node:internal/main/run_main_module:17:47 {
        errno: -4039,
        code: 'ETIMEDOUT',
        syscall: 'spawnSync C:\\workspace\\node-stress-single-test\\nodes\\win2012r2-vs2019\\out\\Release\\node.exe',
        path: 'C:\\workspace\\node-stress-single-test\\nodes\\win2012r2-vs2019\\out\\Release\\node.exe',
        spawnargs: [
          '--prof',
          'C:\\workspace\\node-stress-single-test\\nodes\\win2012r2-vs2019\\test\\sequential\\test-worker-prof.js',
          'child'
        ]
      },
      status: null,
      signal: 'SIGTERM',
      output: [ null, '', '' ],
      pid: 5332,
      stdout: '',
      stderr: ''
    }
        at Object.<anonymous> (C:\workspace\node-stress-single-test\nodes\win2012r2-vs2019\test\sequential\test-worker-prof.js:58:10)
        at Module._compile (node:internal/modules/cjs/loader:1108:14)
        at Object.Module._extensions..js (node:internal/modules/cjs/loader:1137:10)
        at Module.load (node:internal/modules/cjs/loader:973:32)
        at Function.Module._load (node:internal/modules/cjs/loader:813:14)
        at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:76:12)
        at node:internal/main/run_main_module:17:47 {
      generatedMessage: false,
      code: 'ERR_ASSERTION',
      actual: 'SIGTERM',
      expected: null,
      operator: 'strictEqual'
    }
  ...

@Trott
Copy link
Member Author

Trott commented Feb 14, 2021

https://ci.nodejs.org/job/node-test-binary-arm-12+/9366/RUN_SUBSET=0,label=pi2-docker/console

00:55:03 not ok 787 sequential/test-worker-prof # TODO : Fix flaky test
00:55:03   ---
00:55:03   duration_ms: 32.289
00:55:06   severity: flaky
00:55:06   exitcode: 1
00:55:06   stack: |-
00:55:06     node:assert:122
00:55:06       throw new AssertionError(obj);
00:55:06       ^
00:55:06     
00:55:07     AssertionError [ERR_ASSERTION]: child exited with signal: {
00:55:07       error: Error: spawnSync /home/iojs/build/workspace/node-test-binary-arm/out/Release/node ETIMEDOUT
00:55:07           at Object.spawnSync (node:internal/child_process:1086:20)
00:55:07           at spawnSync (node:child_process:667:24)
00:55:07           at Object.<anonymous> (/home/iojs/build/workspace/node-test-binary-arm/test/sequential/test-worker-prof.js:52:23)
00:55:07           at Module._compile (node:internal/modules/cjs/loader:1094:14)
00:55:07           at Object.Module._extensions..js (node:internal/modules/cjs/loader:1123:10)
00:55:07           at Module.load (node:internal/modules/cjs/loader:974:32)
00:55:07           at Function.Module._load (node:internal/modules/cjs/loader:815:14)
00:55:07           at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:76:12)
00:55:07           at node:internal/main/run_main_module:17:47 {
00:55:07         errno: -110,
00:55:07         code: 'ETIMEDOUT',
00:55:07         syscall: 'spawnSync /home/iojs/build/workspace/node-test-binary-arm/out/Release/node',
00:55:07         path: '/home/iojs/build/workspace/node-test-binary-arm/out/Release/node',
00:55:07         spawnargs: [
00:55:07           '--prof',
00:55:07           '/home/iojs/build/workspace/node-test-binary-arm/test/sequential/test-worker-prof.js',
00:55:07           'child'
00:55:07         ]
00:55:07       },
00:55:07       status: null,
00:55:07       signal: 'SIGTERM',
00:55:07       output: [ null, '', '' ],
00:55:07       pid: 20403,
00:55:07       stdout: '',
00:55:07       stderr: ''
00:55:07     }
00:55:07         at Object.<anonymous> (/home/iojs/build/workspace/node-test-binary-arm/test/sequential/test-worker-prof.js:58:10)
00:55:07         at Module._compile (node:internal/modules/cjs/loader:1094:14)
00:55:07         at Object.Module._extensions..js (node:internal/modules/cjs/loader:1123:10)
00:55:07         at Module.load (node:internal/modules/cjs/loader:974:32)
00:55:07         at Function.Module._load (node:internal/modules/cjs/loader:815:14)
00:55:07         at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:76:12)
00:55:07         at node:internal/main/run_main_module:17:47 {
00:55:07       generatedMessage: false,
00:55:07       code: 'ERR_ASSERTION',
00:55:07       actual: 'SIGTERM',
00:55:07       expected: null,
00:55:07       operator: 'strictEqual'
00:55:07     }
00:55:07   ...

Trott added a commit to Trott/io.js that referenced this issue Feb 14, 2021
Trott added a commit to Trott/io.js that referenced this issue Feb 14, 2021
Trott added a commit to Trott/io.js that referenced this issue Feb 15, 2021
Trott added a commit to Trott/io.js that referenced this issue Feb 15, 2021
Trott added a commit to Trott/io.js that referenced this issue Feb 15, 2021
Trott added a commit to Trott/io.js that referenced this issue Feb 17, 2021
Trott added a commit to Trott/io.js that referenced this issue Feb 17, 2021
Fixes: nodejs#26401
Co-authored-by: Gireesh Punathil <[email protected]>
@Trott Trott closed this as completed in 04fb597 Feb 20, 2021
targos pushed a commit that referenced this issue Feb 28, 2021
Fixes: #26401
Co-authored-by: Gireesh Punathil <[email protected]>

PR-URL: #37372
Reviewed-By: Antoine du Hamel <[email protected]>
Reviewed-By: Michaël Zasso <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Gireesh Punathil <[email protected]>
@joyeecheung joyeecheung reopened this Aug 21, 2023
@joyeecheung
Copy link
Member

Still flaky: nodejs/reliability#640

@joyeecheung
Copy link
Member

I start to wonder if we actually need 15 ticks in the tests. Maybe it's already good enough to run some tests and check the code-creation events...

nodejs-github-bot pushed a commit that referenced this issue Sep 4, 2023
Use a JS workload instead of repeating FS operations and use a
timer to make it less flaky on machines with little resources.

PR-URL: #49274
Refs: #26401
Refs: nodejs/reliability#640
Reviewed-By: Benjamin Gruenbaum <[email protected]>
ruyadorno pushed a commit that referenced this issue Sep 28, 2023
Use a JS workload instead of repeating FS operations and use a
timer to make it less flaky on machines with little resources.

PR-URL: #49274
Refs: #26401
Refs: nodejs/reliability#640
Reviewed-By: Benjamin Gruenbaum <[email protected]>
alexfernandez pushed a commit to alexfernandez/node that referenced this issue Nov 1, 2023
Use a JS workload instead of repeating FS operations and use a
timer to make it less flaky on machines with little resources.

PR-URL: nodejs#49274
Refs: nodejs#26401
Refs: nodejs/reliability#640
Reviewed-By: Benjamin Gruenbaum <[email protected]>
@shnooshnoo
Copy link

I believe this issue can be closed now #37673 (comment)

@lpinca lpinca closed this as completed Sep 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flaky-test Issues and PRs related to the tests with unstable failures on the CI. worker Issues and PRs related to Worker support.
Projects
None yet