chore(tests): Disable flaky tests on MacOS #4251
Signed-off-by: ktf <[email protected]>
cc @leebenson
Even if all tests pass, there's #4196 to contend with. I added some additional commentary to that issue this morning. I'm not sure how to probe it further, without being able to SSH into an active runner and do some more extensive stack tracing against the compiled binary. I can't recreate this locally at all. I've seen segfaults in other environments where memory pressure was clearly the issue (i.e. actual "out of memory" errors in the console), so possibly this is a memory pressure thing. Not sure. It could also be a genuine segfault due to Not sure how to probe this any further without choosing a 'proper' CI where we have more control over the underlying hardware, OS and can SSH into the environment.
@leebenson It's a bit hacky, but I've used https://github.com/mxschmitt/action-tmate before to get SSH access to a running GitHub Actions job when debugging a similar issue that only happened on CI.
Thanks @jszwedko, I was looking for something like that! Will give it a shot.
That sounds like a tough issue. In that case I would at least like to either confirm or eliminate the possibility that some smallish subset of tests causes #4196. It's quite possible that some specific tests are corrupting memory, which then causes SEGFAULTs at random times. That said, I'm now aware that the approach in this PR may not work.
@leebenson, as you reported in #4196, the SEGFAULT started to happen after the tests, where whole test groups can be the cause. With the recent bors upgrade this isn't a viable approach, so I'll close the PR.
Ref. #2978, #4056
I'll just keep disabling tests on Mac in this PR until we get a consistently passing check, and then triage them afterwards. This is to avoid broken tests accumulating behind the couple (hopefully) that are failing at the moment.
Tests disabled on Mac:
tcp_stream_detects_disconnect
file_update
topology
tests
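For reference, disabling a test on one platform in Rust is usually done with conditional attributes. The sketch below is not taken from this PR's diff: `parse_port` and both test names are hypothetical stand-ins, and it only illustrates two standard options, marking a test as ignored on macOS versus excluding it from compilation there.

```rust
// Hypothetical helper standing in for the code under test.
fn parse_port(addr: &str) -> Option<u16> {
    // Take the text after the last ':' and try to parse it as a port.
    addr.rsplit(':').next()?.parse().ok()
}

#[cfg(test)]
mod tests {
    use super::*;

    // Option 1: still compiled on macOS, but reported as "ignored" there.
    #[test]
    #[cfg_attr(target_os = "macos", ignore)]
    fn ignored_on_macos() {
        assert_eq!(parse_port("127.0.0.1:9000"), Some(9000));
    }

    // Option 2: not compiled at all on macOS.
    #[cfg(not(target_os = "macos"))]
    #[test]
    fn excluded_on_macos() {
        assert_eq!(parse_port("no-port"), None);
    }
}
```

With the first option, the ignored test can still be run on a Mac on demand via `cargo test -- --ignored`, which helps when triaging the disabled tests later.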