Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flaky sequential/test-net-GH-5504 #19906

Closed
BridgeAR opened this issue Apr 9, 2018 · 10 comments
Closed

Flaky sequential/test-net-GH-5504 #19906

BridgeAR opened this issue Apr 9, 2018 · 10 comments
Labels
flaky-test Issues and PRs related to the tests with unstable failures on the CI. test Issues and PRs related to the tests.

Comments

@BridgeAR
Copy link
Member

BridgeAR commented Apr 9, 2018

https://ci.nodejs.org/job/node-test-binary-arm/286/RUN_SUBSET=1,label=pi1-docker/console

16:34:09 not ok 299 sequential/test-net-GH-5504
16:34:09   ---
16:34:09   duration_ms: 360.115
16:34:09   severity: fail
16:34:09   stack: |-
16:34:09     timeout
@BridgeAR BridgeAR added test Issues and PRs related to the tests. flaky-test Issues and PRs related to the tests with unstable failures on the CI. labels Apr 9, 2018
@BridgeAR
Copy link
Member Author

@BridgeAR
Copy link
Member Author

@Trott
Copy link
Member

Trott commented Apr 17, 2018

Might be worth double-checking that we're getting stdout and stderr in these timeout cases. I suspect not, and I wonder if that might be fixable with a modification to test.py. This test writes a lot to stderr (38 lines or so) and knowing how far it got before timing out would be a pretty good clue as to where the test is getting hung up.

@Trott
Copy link
Member

Trott commented Apr 17, 2018

@nodejs/testing @nodejs/build (on that last comment I left)

@Trott
Copy link
Member

Trott commented Apr 24, 2018

Again: https://ci.nodejs.org/job/node-test-binary-arm/667/RUN_SUBSET=1,label=pi2-docker/console

03:20:10 not ok 302 sequential/test-net-GH-5504
03:20:10   ---
03:20:10   duration_ms: 240.37
03:20:10   severity: fail
03:20:10   exitcode: -15
03:20:10   stack: |-
03:20:10     timeout

@Trott
Copy link
Member

Trott commented Apr 24, 2018

Seems to only happen on the Raspberry Pi devices.

Trott added a commit to Trott/io.js that referenced this issue Apr 24, 2018
When a test times out, the contents of stdout and stderr can often be
highly valuable in debugging. Provide that information.

Refs: nodejs#19906 (comment)
Trott added a commit to Trott/io.js that referenced this issue Apr 27, 2018
When a test times out, the contents of stdout and stderr can often be
highly valuable in debugging. Provide that information.

Refs: nodejs#19906 (comment)

PR-URL: nodejs#20260
Reviewed-By: Ruben Bridgewater <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
Reviewed-By: Michael Dawson <[email protected]>
Reviewed-By: Joyee Cheung <[email protected]>
Reviewed-By: Trivikram Kamat <[email protected]>
Reviewed-By: Gus Caplan <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: Refael Ackermann <[email protected]>
@Trott
Copy link
Member

Trott commented Apr 27, 2018

Now that we log stdout and stderr on timeouts, CI output may give us a better idea as to what's going wrong here:

https://ci.nodejs.org/job/node-test-binary-arm/763/RUN_SUBSET=4,label=pi3-docker/console

12:51:56 not ok 302 sequential/test-net-GH-5504
12:51:56   ---
12:51:56   duration_ms: 240.52
12:51:56   severity: fail
12:51:56   exitcode: -15
12:51:56   stack: |-
12:51:56     timeout
12:51:56     SERVER 1>listening
12:51:56     SERVER 2>NET 16884: setupListenHandle 127.0.0.1 12346 4 false undefined
12:51:56     SERVER 2>NET 16884: setupListenHandle: create a handle
12:51:56     SERVER 2>NET 16884: bind to 127.0.0.1
12:51:56     CLIENT 2>NET 16890: createConnection [ { host: '127.0.0.1', port: 12346 },
12:51:56     CLIENT 2>  [Function],
12:51:56     CLIENT 2>  [Symbol(normalizedArgs)]: true ]
12:51:56     CLIENT 2>NET 16890: pipe false undefined
12:51:56     SERVER 2>NET 16884: onconnection
12:51:56     CLIENT 2>NET 16890: afterConnect
12:51:56     SERVER 2>NET 16884: _read
12:51:56     SERVER 2>NET 16884: Socket._read readStart
12:51:56     SERVER 2>NET 16884: SERVER _emitCloseIfDrained
12:51:56     SERVER 2>NET 16884: SERVER handle? false   connections? 1
12:51:56     CLIENT 2>NET 16890: destroy
12:51:56     CLIENT 2>NET 16890: close
12:51:56     CLIENT 2>NET 16890: close handle
12:51:56     CLIENT 2>NET 16890: _read
12:51:56     CLIENT 2>NET 16890: _read wait for connection
12:51:56     CLIENT 2>NET 16890: emit close
12:51:56   ...

@Trott
Copy link
Member

Trott commented Apr 30, 2018

Another one today: https://ci.nodejs.org/job/node-test-binary-arm/803/RUN_SUBSET=0,label=pi1-docker/console

03:41:49 not ok 303 sequential/test-net-GH-5504
03:41:49   ---
03:41:49   duration_ms: 360.130
03:41:49   severity: fail
03:41:49   exitcode: -15
03:41:49   stack: |-
03:41:49     timeout
03:41:49     SERVER 1>listening
03:41:49     SERVER 2>NET 24453: setupListenHandle 127.0.0.1 12346 4 false undefined
03:41:49     SERVER 2>NET 24453: setupListenHandle: create a handle
03:41:49     SERVER 2>NET 24453: bind to 127.0.0.1
03:41:49     CLIENT 2>NET 24459: createConnection [ { host: '127.0.0.1', port: 12346 },
03:41:49     CLIENT 2>  [Function],
03:41:49     CLIENT 2>  [Symbol(normalizedArgs)]: true ]
03:41:49     CLIENT 2>NET 24459: pipe false undefined
03:41:49     SERVER 2>NET 24453: onconnection
03:41:49     CLIENT 2>NET 24459: afterConnect
03:41:49     SERVER 2>NET 24453: _read
03:41:49     CLIENT 2>NET 24459: destroy
03:41:49     SERVER 2>NET 24453: Socket._read readStart
03:41:49     CLIENT 2>NET 24459: close
03:41:49     CLIENT 2>NET 24459: close handle
03:41:49     SERVER 2>NET 24453: SERVER _emitCloseIfDrained
03:41:49     SERVER 2>NET 24453: SERVER handle? false   connections? 1
03:41:49     CLIENT 2>NET 24459: _read
03:41:49     CLIENT 2>NET 24459: _read wait for connection
03:41:49     CLIENT 2>NET 24459: emit close
03:41:49   ...

For comparison, here's a successful run on my laptop:

SERVER 2>NET 62741: setupListenHandle 127.0.0.1 12346 4 false undefined
SERVER 2>NET 62741: setupListenHandle: create a handle
SERVER 2>NET 62741: bind to 127.0.0.1
SERVER 1>listening
CLIENT 2>NET 62742: createConnection [ { host: '127.0.0.1', port: 12346 },
CLIENT 2>  [Function],
CLIENT 2>  [Symbol(normalizedArgs)]: true ]
CLIENT 2>NET 62742: pipe false undefined
CLIENT 2>NET 62742: afterConnect
SERVER 2>NET 62741: onconnection
CLIENT 2>NET 62742: destroy
SERVER 2>NET 62741: _read
SERVER 2>NET 62741: Socket._read readStart
CLIENT 2>NET 62742: close
CLIENT 2>NET 62742: close handle
SERVER 2>NET 62741: SERVER _emitCloseIfDrained
SERVER 2>NET 62741: SERVER handle? false   connections? 1
CLIENT 2>NET 62742: _read
CLIENT 2>NET 62742: _read wait for connection
CLIENT 2>NET 62742: emit close
SERVER 2>NET 62741: afterWrite -32
SERVER 2>NET 62741: write failure { Error: write EPIPE
SERVER 2>    at WriteWrap.afterWrite [as oncomplete] (net.js:835:14) errno: 'EPIPE', code: 'EPIPE', syscall: 'write' }
SERVER 2>NET 62741: destroy
SERVER 2>NET 62741: close
SERVER 2>NET 62741: close handle
SERVER 2>events.js:167
SERVER 2>      throw er; // Unhandled 'error' event
SERVER 2>      ^
SERVER 2>
SERVER 2>Error: write EPIPE
SERVER 2>    at WriteWrap.afterWrite [as oncomplete] (net.js:835:14)
SERVER 2>Emitted 'error' event at:
SERVER 2>    at onwriteError (_stream_writable.js:444:12)
SERVER 2>    at onwrite (_stream_writable.js:469:5)
SERVER 2>    at _destroy (internal/streams/destroy.js:49:7)
SERVER 2>    at Socket._destroy (net.js:605:3)
SERVER 2>    at Socket.destroy (internal/streams/destroy.js:36:8)
SERVER 2>    at WriteWrap.afterWrite [as oncomplete] (net.js:837:10)

MylesBorins pushed a commit that referenced this issue May 4, 2018
When a test times out, the contents of stdout and stderr can often be
highly valuable in debugging. Provide that information.

Refs: #19906 (comment)

PR-URL: #20260
Reviewed-By: Ruben Bridgewater <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
Reviewed-By: Michael Dawson <[email protected]>
Reviewed-By: Joyee Cheung <[email protected]>
Reviewed-By: Trivikram Kamat <[email protected]>
Reviewed-By: Gus Caplan <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: Refael Ackermann <[email protected]>
@BridgeAR
Copy link
Member Author

@apapirovski
Copy link
Member

I don't think this is just a flaky test? It seems to be failing in a way that might indicate the original issue still exists on Arm?

MylesBorins pushed a commit that referenced this issue Aug 17, 2018
When a test times out, the contents of stdout and stderr can often be
highly valuable in debugging. Provide that information.

Refs: #19906 (comment)

PR-URL: #20260
Reviewed-By: Ruben Bridgewater <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
Reviewed-By: Michael Dawson <[email protected]>
Reviewed-By: Joyee Cheung <[email protected]>
Reviewed-By: Trivikram Kamat <[email protected]>
Reviewed-By: Gus Caplan <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: Refael Ackermann <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flaky-test Issues and PRs related to the tests with unstable failures on the CI. test Issues and PRs related to the tests.
Projects
None yet
Development

No branches or pull requests

3 participants