Enable fail-fast for new JVM version matrix actions#24690
Enable fail-fast for new JVM version matrix actions#24690ZacBlanco wants to merge 1 commit intoprestodb:masterfrom
Conversation
|
@czentgr One concern I am noticing here regarding the fail-fast is that with flaky-tests we might be more susceptible to test flakiness causing build failures |
|
@ZacBlanco You mentioned you monitored the pipeline for a bit now. Do we have that many flaky tests to cause a problem in the fail-fast mode vs actual issues? |
|
So from my anec-data in this PR, is that I had to re-run the "failed" action group 4 times before the checks all passed. I suspect without fail-fast it would have been only 1 or 2. The issue lies in that if you have both 8 & 17 tests running, and one fails, the other is instantly failed. So if either 8 or 17 is flaky, you have to re-run both, and hope that neither have flaky tests. When fail-fast is disabled, then both actions can run to completion. Any action without a failure doesn't get re-run when trying to "Re run failed jobs" from the github UI. I understand that fail-fast can potentially save resources on deterministic failures, but given the frequency of flaky tests, I am of the opinion it is actually better to actually leave fail-fast disabled to decrease the number of re-runs required on average |
|
@ZacBlanco I understand the reasoning. Obviously, my assumption was that we don't have that many flaky tests but do recall there was an effort to clean these up. I also investigated one and could not understand how it was failing occasionally. On the other hand we can't take them out of testing because the flaky error is known (issue created) but also if passing shows no regression. I don't really have a problem with using the resources. I suppose we can make this change dependent on having stable tests. By that time we might have reduced the number jobs per java version where it becomes moot. We are running on the "free" github runners so the other concern was queuing if we keep too many resources for longer than necessary. |
Description
Enables fail-fast on matrix builds
Motivation and Context
Should save some CI resources on builds where items fail
Impact
N/A
Test Plan
N/A
Contributor checklist
Release Notes