workflows/tests: enable use of a `CI-force-dependents-tests` label #82220
carlocab wants to merge 1 commit into Homebrew:master

Conversation
Currently, CI skips dependents if the formula tests fail for any reason. This can be undesirable in PRs that require a large number of revision bumps (e.g. `icu4c`, `boost`, or `libffi`), or when failures are due to transient network issues or other issues that can be more easily addressed in separate PRs (e.g. some `livecheck` failures). This should allow us to make use of a label that will force CI to continue attempting to test dependents even when the tests of the modified formula fail.
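A minimal sketch of how such a label check might look in a workflow. The label name is the one proposed in this PR; the job and output names below are illustrative assumptions, not the actual contents of the repository's workflow files. The `always()`, `needs.<job>.result`, and `contains(github.event.pull_request.labels.*.name, …)` expressions are standard GitHub Actions syntax:

```yaml
# Sketch only: job names are hypothetical.
jobs:
  test-dependents:
    needs: test-formula
    # Run even when the formula tests failed, if the forcing label is present.
    if: >-
      always() &&
      (needs.test-formula.result == 'success' ||
       contains(github.event.pull_request.labels.*.name, 'CI-force-dependents-tests'))
```

A maintainer could then opt a PR into this behaviour with, e.g., `gh pr edit <pr-number> --add-label CI-force-dependents-tests`.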
I think we should be fixing these in the relevant PRs. I agree with not fast-failing on dependents being flaky, but if you're revision bumping in a PR anyway: that seems like a very good time to fix flaky tests/livechecks. Alternatively, these should just get removed if they are flaky. The only exception I can see is flaky e.g. homepage/GitHub audits, in which case we should either 1) handle these a bit differently in

As a result, I'm afraid, I think I'm 👎🏻 on this change. It encourages us to keep ignoring flakes rather than fixing them when they are going to be tested by CI anyway.
I think 1 is the best option here. I know casks have some audits as warnings, and I think that could be the solution.
If we can't connect to upstream in this case I think we'll just want to disable homepage audits for specific formulae/domains instead of better supporting audit failures.

It was just an example really. My point was that all failures should be fixed, but maybe not all failures should immediately abort the build. You can bottle something just fine if their homepage is having an outage.
MikeMcQuaid
left a comment
Comment so this doesn't show up as still needing my review 😅
You technically can but: we should stop doing this. Temporary outages of this sort seem really rare (and easy enough to just add to the audit allowlist JSON in the same PR if need be).

(compared to stuff that fails consistently)
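For reference, tap audit exceptions live as JSON lists; a hypothetical allowlist entry for this case might look like the following. The filename and exact schema here are assumptions for illustration, not the tap's actual audit-exception files:

```json
[
  "some-formula-with-flaky-homepage",
  "another-formula"
]
```

The idea being that a one-line addition to such a list in the same PR is cheaper than re-running CI around a transient homepage outage.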
I am not sure. The way I see that going is:
Note it's not just temporary outages that are a problem -- sometimes some tests in large PRs just fail because they get rate-limited. This makes these failures rather complicated to fix in large PRs.

It also complicates fixing issues, because skipping dependent tests because of a failure (whether spurious or real) in a modified formula means that you don't get to address these dependent failures on your next PR run. Instead, you have to wait for a second or a third run to even get around to fixing dependents -- assuming you're not hit by a spurious failure that tends to plague large PRs. This can be days or even weeks later for CI runs that take a long time as they do for

I'd really rather try fixing all the failures in one go, rather than have to do it in several stages, where I fix failures in modified formulae first, then run CI to see if there are dependent failures, and then run CI again to fix those.
I disagree -- I think I see temporary stuff much more often than actual problems, particularly in PRs that modify several formulae.
This is fine with me. The current version is:
These sorts of tests should just be removed IMO. Flaky tests are worse than no tests.
I see this as the opposite. A PR doesn't get to progress to dependent testing (which takes a long, long time) until all the formulae modified in the PR pass their basic tests. What we could consider is making these fail even faster by checking e.g
If you're always fixing every failure in every PR and you literally never merge a PR that's red: I'd agree. Otherwise, I'm afraid I do not.
Sorry, by "temporary" here I meant "the homepage is temporarily down for a few hours" rather than "this test is temporarily failing because it is a bit flaky".
If we can split out the homepage/livecheck/meta audit I'd be fine with the current status. And then I think we could attempt requiring green CI on x86 macOS, since the flaky tests would be in a separate metadata CI run.

This works for me too.

Thanks all ❤️
One thing I realised after trying it -- our current CI setup (and the proposed separation of some network-dependent tests to the tap-syntax job) prevents the proper testing of pre-releases. See #84363. Is there anything else we can do to enable this, or do we just never want to test pre-releases that get caught by our audit? (Not all of them do.)

@carlocab I'm not sure I understand this. Can you elaborate a bit more here?
Sure. If a tag on GitHub is labelled as a pre-release, a CI run for a PR that updates to that tag will fail at

Currently, this failure means that the dependent tests will be skipped. If we move

In both of these cases, trying to test a pre-release would not work. In the status quo, dependents are not tested, and this is often the hard part of pre-release testing. Under the proposed changes to our CI setup, no testing of the formula is done at all.

Testing pre-releases is something we do on occasion -- typically to prepare for the actual release of a new version, hence the

I'm hoping there would still be some way to test pre-releases in CI because it is occasionally useful. It saved a little time in the previous Go version bump PR, for example. Or, if we just never want to be able to do this, it would be helpful to clarify this too.
Gotcha. I think having something like
Yea, seems reasonable. It also corresponds nicely to a label we already have. I'll put it on my to-do list... |
- `brew install --build-from-source <formula>`, where `<formula>` is the name of the formula you're submitting?
- `brew test <formula>`, where `<formula>` is the name of the formula you're submitting?
- `brew audit --strict <formula>` (after doing `brew install --build-from-source <formula>`)? If this is a new formula, does it pass `brew audit --new <formula>`?
Not sure if this does what I intend yet, but if this change makes sense I can test it out to check.