Skip to content

workflow/labels: switch to a scheduled trigger#416808

Merged
philiptaron merged 2 commits intoNixOS:masterfrom
wolfgangwalther:ci-scheduled-approvals
Jun 16, 2025
Merged

workflow/labels: switch to a scheduled trigger#416808
philiptaron merged 2 commits intoNixOS:masterfrom
wolfgangwalther:ci-scheduled-approvals

Conversation

@wolfgangwalther
Copy link
Contributor

@wolfgangwalther wolfgangwalther commented Jun 14, 2025

This came out of the after-merge discussion in #415259.

Instead of relying on workflow_run events, we now trigger the labeling workflow by schedule. This avoids all permission/secrets problems of running in the pull_request_review context - and also gets rid of the "waiting for approval to run workflows" button for new contributors that sometimes comes up right now.

Also, it's more efficient. Previously, the labeling workflow would run on any pull_request_review event, which happens for all comments, too. So quite a few runs.

This will cause a delay of up to 1 hour with the current settings until approval labels are set. Depending on how long the job normally runs we can adjust the frequency. The workflow is written in a way that will work no matter what the frequency ends up being, even when it's interrupted by transient GHA failures: It will always look at all PRs which were updated since the last time the workflow ran successfully.

We also add the ability to run it manually via UI. This is useful:

  • When fixing bugs in the labels workflow to run it all the way back to where the bug was introduced.
  • When introducing new labels, to get a head start for a reasonable number of PRs immediately.

Of course, the workflow is also still run directly after Eval itself. This is also the only case that the actions/labeler steps will run, since they depend on the pull_request context.

As a side-effect this also handles #416719 better. I wouldn't close it with that, because old PRs are not labeled correctly, yet. But at least the job should not fail anymore.

Things done

Of course, the interesting question will be how this behaves at the scale of Nixpkgs, compared to my lonely fork.


Add a 👍 reaction to pull requests you find important.

@github-actions github-actions bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 0 This PR does not cause any packages to rebuild on Linux. 6.topic: continuous integration Affects continuous integration (CI) in Nixpkgs, including Ofborg and GitHub Actions 6.topic: policy discussion Discuss policies to work in and around Nixpkgs backport release-24.11 labels Jun 14, 2025
Copy link
Contributor

@philiptaron philiptaron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fewer changes than I suspected would be needed.

Copy link
Contributor

@MattSturgeon MattSturgeon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My JavaScript is feeling rusty and I haven't really internalized this workflow yet. But here's my initial feedback 🙂

Depending on how long the job normally runs we can adjust the frequency.

I suspect that given how many PR events nixpkgs gets, more or less any value will still be an improvement, even ridiculously small intervals like running every few minutes. Although in reality such short intervals would likely run into issues with concurrency* and/or cancellation.

Depending on how long the workflow actually takes to run, maybe the interval should be some multiplier of a typical run length? E.g. 2x or 3x average run?

*Speaking of concurrency, this makes me wonder if we should use a non-cancelling concurrency group for the scheduled runs, so that if we ever did get a scenario where a run is still going when the next one is scheduled, the next one will wait in the queue while the first continues.

@wolfgangwalther wolfgangwalther force-pushed the ci-scheduled-approvals branch from 7389593 to a2c0033 Compare June 15, 2025 09:18
Copy link
Contributor Author

@wolfgangwalther wolfgangwalther left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the great, detailed feedback, so far!

@wolfgangwalther
Copy link
Contributor Author

*Speaking of concurrency, this makes me wonder if we should use a non-cancelling concurrency group for the scheduled runs, so that if we ever did get a scenario where a run is still going when the next one is scheduled, the next one will wait in the queue while the first continues.

This.. makes a lot of sense, yes. I pushed a fixup that we can discuss. Does this roughly make sense?

@wolfgangwalther wolfgangwalther force-pushed the ci-scheduled-approvals branch 2 times, most recently from caa1516 to 7314661 Compare June 15, 2025 09:26
Instead of relying on workflow_run events, we now trigger the labeling
workflow by schedule. This avoids all permission/secrets problems of
running in the pull_request_review context - and also gets rid of the
"waiting for approval to run workflows" button for new contributors that
sometimes comes up right now.

Also, it's more efficient. Previously, the labeling workflow would run
on *any* pull_request_review event, which happens for all comments, too.
So quite a few runs.

This will cause a delay of up to 1 hour with the current settings until
approval labels are set. Depending on how long the job normally runs we
can adjust the frequency. The workflow is written in a way that will
work no matter what the frequency ends up being, even when it's
interrupted by transient GHA failures: It will always look at all PRs
which were updated since the last time the workflow ran successfully.

We also add the ability to run it manually via UI. This is useful:
- When fixing bugs in the labeler workflow to run it all the way back to
where the bug was introduced.
- When introducing new labels, to get a head start for a reasonable
amount of PRs immediately.

Of course, the workflow is also still run directly after Eval itself.
This is also the only case that the actions/labeler steps will run,
since they depend on the `pull_request` context.
With the previous commit we now have the `before` labels available
already, which allows some simplification.
@wolfgangwalther wolfgangwalther force-pushed the ci-scheduled-approvals branch from 7314661 to 4d53700 Compare June 16, 2025 07:39
Copy link
Contributor

@philiptaron philiptaron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's give it a go. To @MattSturgeon's point about running more frequently, I agree that's the likely next PR, and it's as simple as going from 37 * * * * to 7,22,37,52 * * * *.

@philiptaron philiptaron merged commit 5bd9bdc into NixOS:master Jun 16, 2025
14 of 16 checks passed
@nixpkgs-ci
Copy link
Contributor

nixpkgs-ci bot commented Jun 16, 2025

Successfully created backport PR for release-24.11:

@nixpkgs-ci
Copy link
Contributor

nixpkgs-ci bot commented Jun 16, 2025

Successfully created backport PR for release-25.05:

@github-actions github-actions bot added the 8.has: port to stable This PR already has a backport to the stable release. label Jun 16, 2025
@wolfgangwalther wolfgangwalther deleted the ci-scheduled-approvals branch June 16, 2025 12:41
@wolfgangwalther
Copy link
Contributor Author

I manually triggered a run and it says:

SyntaxError: Unexpected identifier 'name'

https://github.com/NixOS/nixpkgs/actions/runs/15681062125/job/44172374668

Let's see what I did wrong.

@wolfgangwalther
Copy link
Contributor Author

Missing comma in:

                    ...context.repo,
                    issue_number: pull_request.number
                    name

🙈

@wolfgangwalther
Copy link
Contributor Author

Still fails after this fix with something more obscure:

https://github.com/NixOS/nixpkgs/actions/runs/15681261596/job/44173012211

SyntaxError: Unexpected non-whitespace character after JSON at position 424
Error: Unhandled error: SyntaxError: Unexpected non-whitespace character after JSON at position 424
    at JSON.parse (<anonymous>)

Will see whether that affects all PRs or just the scheduled job. If it affects all PRs, we might want to revert. If it only affects the scheduled run and maybe a specific PR, we can keep it in. Looks like it might be specific about a certain PR.

@wolfgangwalther
Copy link
Contributor Author

Will see whether that affects all PRs or just the scheduled job. If it affects all PRs, we might want to revert. If it only affects the scheduled run and maybe a specific PR, we can keep it in. Looks like it might be specific about a certain PR.

PRs are now passing again, so I'll keep it as is and investigate further for which PR exactly it breaks down.

@MattSturgeon
Copy link
Contributor

SyntaxError: Unexpected non-whitespace character after JSON at position 424

Can we debug by adding some try catchs that print more info about what failed (but still then throw another exception)?

I only see two calls to JSON.parse in the workflow, assuming this isn't coming from the JS API bindings.

@wolfgangwalther
Copy link
Contributor Author

I think the best way would be to:

  • Catch and log those errors nicely, ideally with more context.
  • Continue handling all non-failing PRs.
  • Still return an exit code at the end, so the action is marked as failed.

I'll try to set it up this way.

@wolfgangwalther
Copy link
Contributor Author

Not sure how to add more context nicely... but it wouldn't really help in this specific case. The problem is simple and stupid: All PRs download their artifacts to the exact same location comparison/...

This means they overwrite each other and just happen to be reading an incomplete file while it's being saved.

@MattSturgeon
Copy link
Contributor

Nice!

Looks like downloadArtifact takes an optional second argument options, of type DownloadArtifactOptions & FindOptions.

DownloadArtifactOptions has a path property:

path

Optional path: string
Denotes where the artifact will be downloaded to. If not specified then the artifact is download to GITHUB_WORKSPACE

We should be able to use this to specify a unique name.

@wolfgangwalther
Copy link
Contributor Author

Yes - we use that already. I just copied the hardcoded comparison/ value from the actions/download-artifact action that we had set up previously and somehow... this didn't fail in my fork, even though it ran with multiple PRs. I think it had only one artifact to download, the other one was expired and the remaining two PRs didn't have any or so :)

@wolfgangwalther
Copy link
Contributor Author

After the second round of fixes the first manually triggered run was successful: https://github.com/NixOS/nixpkgs/actions/runs/15684644647/job/44184623606

It went through 34 PRs in 7 seconds. Installing the dependency.. takes 22 seconds compared to that.

Also confirmed in #417251 that it actually works and adds some labels.

To @MattSturgeon's point about running more frequently, I agree that's the likely next PR, and it's as simple as going from 37 * * * * to 7,22,37,52 * * * *.

Let's wait for a scheduled run to go off, but yes, this should be easily doable. Maybe even more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

6.topic: continuous integration Affects continuous integration (CI) in Nixpkgs, including Ofborg and GitHub Actions 6.topic: policy discussion Discuss policies to work in and around Nixpkgs 8.has: port to stable This PR already has a backport to the stable release. 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 0 This PR does not cause any packages to rebuild on Linux.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants