workflows/labels: manage stale & merge conflict labels by wolfgangwalther · Pull Request #419481 · NixOS/nixpkgs

wolfgangwalther · 2025-06-24T07:48:07Z

To manage 2.status: stale and 2.status: merge conflict labels, we need to restructure the code a bit: The "list all PRs and stop iterating after updated_at cutoff" approach won't work anymore, because we also need to look at much older PRs.

I thought about being smart and trying to filter out pull requests which certainly won't change, for example when they already have a merge conflict label or when they are already stale. But this quickly becomes quite complicated, considering all edge-cases. Thus, I opted for a simpler approach: We now scan all PRs twice a day. We still operate on the last updated PRs every 10 minutes, that doesn't change. But we additionally operate on a batch of 100 "other" PRs. This batch is rotated through across the 144 workflow runs each day. 7k PRs / 100 per page = 70 pages ~ 2x / day.

To label a merge conflict, the PR needs to be touched twice: The first time around the API GET request will trigger the creation of a new test merge commit. The second time around, this will have completed, the merge_commit_sha state be updated and the label applied. This means that stale labels are updated within 12 hours and merge conflict labels within 24 hours. Which is more than OK.

Of course, this will use a lot more API requests. My estimation is in the range of 3-4k / hour now. But this is not a problem at all, because with #419188 we have about 11k requests / hour to spare without any impact on other jobs. In fact, the "scan all PRs" approach has one more benefit: We don't need to manually backfill anything, the labeling will just fix those labels within 24 hours. Thus, I removed the "update within" input from workflow_dispatch - this trigger is now only useful from forks for testing.

With #419188 and this, the labeler should become much more hands-off and just do its thing. (Not that it was much hands-on so far anyway...).

To make those changes, I had to make a few refactors, which caused huge indentation changes. I tried to split them into separate commits, so that reviewing the PR with "hide whitespace" should work reasonably well. Don't even try to look at this PR with whitespace changes visible and/or at the full diff of all commits...

Refactors:

Using the /search/issues endpoint doesn't allow to find "a single PR", so the "pull request event" case is now handled differently.
Older PRs might not have the right workflow runs or artifacts, but still need to label. Handling this case gracefully now, which also works a bit towards Workflow "Label PR" fails for old PRs ("Artifact has expired") #416719.

Open items:

Some PRs might be hit twice in the same run, when they are both on the current page and recently updated.
Potentially for later: Issues should also be labeled as stale, so extend the logic to handle issues, too?

Note: The settings for stale labeling are taken from .github/stale.yml. It seems like the stale bot has not been working for some time anymore, at least https://github.com/probot/stale is archived and the app seems to be disabled.

Assuming this works well, this should close #100477.

Things done

Tested in my fork, including actual merge-conflict and stale labeling.
Fits CONTRIBUTING.md, pkgs/README.md, maintainers/README.md and other contributing documentation in corresponding paths.

Add a 👍 reaction to pull requests you find important.

MattSturgeon · 2025-06-24T08:02:37Z

Some PRs might be hit twice in the same run, when they are both on the current page and recently updated.

One approach would be to remember which PR numbers this run has looked at in a Set, Array, or Object; i.e. when either the "recent" or "backlog" code path starts work on a PR, it does its work within a java-style visitedPullRequests.computeIfAbsent(), or simply after an includes() check.

If checking for presence in the set and adding to the set aren't done in one atomic transaction, there's still potential for a race condition... But I guess that's pretty unlikely, and not too important; the worst that'd happen is a PR is checked twice.

wolfgangwalther · 2025-06-24T08:22:16Z

One approach would be to remember which PR numbers this run has looked at in a Set, Array, or Object

This would work, but it could also be done without race condition: With the new approach, we don't need to iterate over the response in steps and eventually call done() - the search will return only those PRs that we want to look at anyway. Thus, I am fetching the whole search response at once before. At this stage, we could also filter out duplicates, before we even start looking at them.

I also considered an alternative: The list-of-updated PRs contains this search query updated:>=${cutoff.toISOString()}. It should be enough to add the reverse of that condition to the allOptions query, so that the big list doesn't even contain the duplicates to start with. But this would cause the big list to change more frequently from run-to-run and thus make it more likely to miss some PRs when going from page-to-page in different runs. This problem still exists when older PRs are merged or closed, the whole list will shift a bit and thus some PRs will be lost in the current iteration. But merging/closing happens on a much smaller scale than "every update", so I wouldn't want to do it the filter way.

Edit: I think something like this should do it:

               const items = [].concat(updatedItems, allItems)
+                .filter((thisItem, idx, arr) => idx == arr.findIndex(firstItem => firstItem.number == thisItem.number))

wolfgangwalther · 2025-06-24T08:31:35Z

Edit: I think something like this should do it:

               const items = [].concat(updatedItems, allItems)
+                .filter((thisItem, idx, arr) => idx == arr.findIndex(firstItem => firstItem.number == thisItem.number))

Tested, this works well. Added it locally, but will only push after first round of review to save CI resources.

MattSturgeon · 2025-06-24T09:09:16Z

But this would cause the big list to change more frequently from run-to-run and thus make it more likely to miss some PRs when going from page-to-page in different runs. This problem still exists when older PRs are merged or closed, the whole list will shift a bit and thus some PRs will be lost in the current iteration.

I was going to hold off asking until I'd looked at the diff... But I was curious how you were handling pagination here. Especially given we're dealing with a moving target and between each run it is likely that more PRs were "updated", therefore shifting their position in search results and changing whether they match the search query.

wolfgangwalther · 2025-06-24T09:11:16Z

I was going to hold off asking until I'd looked at the diff... But I was curious how you were handling pagination here. Especially given we're dealing with a moving target and between each run it is likely that more PRs were "updated", therefore shifting their position in search results and changing whether they match the search query.

Yeah, to counter the effect as much as possible the big list is sorted by created in ascending order. That should remove all update-related changes, but as mentioned earlier, when is:open changes, that will change the list.

wolfgangwalther · 2025-06-24T09:14:54Z

An alternative, to counter that problem a bit, would be to create some kind of overlap. For example instead of creating batches of 100 and running a single one, we could make batches of 25 and then run 5 batches instead of 4 per run, with one of them overlapping. This should then catch those PRs that would have moved out of the current batch, because some PRs created earlier were merged in the meantime.

Edit: Or maybe not even in those smaller batches, but still 100. And then run the previous page, too, overlapping. This would have the effect that merge conflicts would be labeled in 12 hours instead of 24, because any PR will be looked at after 10 minutes again - when the merge commit has been computed. Might not be the worst thing to do.

MattSturgeon · 2025-06-24T09:36:57Z

Sounds reasonable. I wonder how much overlap buffer we need for this to be reliable?

wolfgangwalther · 2025-06-24T10:00:31Z

Sounds reasonable. I wonder how much overlap buffer we need for this to be reliable?

If we do the simple approach of just running two pages with 100 items, so overlap=100, then we should have much more than needed, I think.

In general, the effect of a merged PR is bigger the older the PR is, because it shifts more pages. But it's also much less likely for a PR that is really old to be merged. I'd say the effect is negligible for PRs in the stale-range. It only becomes relevant for more recent PRs. But those don't need to be marked stale - and they are also less likely to be merge-conflicted.

I'd say we try it without overlap at first, see how many API requests this burns, see whether anybody notices any strange effect for more recent PRs - and only if this is actually relevant, introduce some overlap.

MattSturgeon

In 28458b3f8dfb58db9cebfa36f50601ae77c4a3eb's commit messages you mention we need to iterate over all PRs for the stale and conflict labels:

the much bigger one in being able to handle merge-conflict and stale labels properly later. For those, it's inevitable to eventually scan through all PRs.

This may be true for conflicts, because I'm not sure you can easily search for them. Even if you could, your comment here would apply:

The attribute merge_commit_sha keeps the old value of null or the hash until the new test merge commit has either successfuly been created or failed so.

For stale issues/PRs, this could also be achieved with the inverse search to our "recently updated" search; i.e. a "not updated since" search. It could even be a search for "not updated since XYZ and also doesn't have stable label" to only get the relevant results.

This probably isn't necessary, because by their nature it isn't important to mark stale items as stale immediately, but just pointing out that it is technically possible.

Even so, would handling adding the stable label separately simplify the main labeling step? I guess the main step would still need to remove the stale label, so maybe not...

I'd say we try it without overlap at first, see how many API requests this burns, see whether anybody notices any strange effect for more recent PRs - and only if this is actually relevant, introduce some overlap.

SGTM. Maybe add brief comments to the relevant parts of the code?

.github/workflows/labels.yml

wolfgangwalther · 2025-06-24T12:06:57Z

For stale issues/PRs, this could also be achieved with the inverse search to our "recently updated" search; i.e. a "not updated since" search. It could even be a search for "not updated since XYZ and also doesn't have stable label" to only get the relevant results.

This probably isn't necessary, because by their nature it isn't important to mark stale items as stale immediately, but just pointing out that it is technically possible.

Even so, would handling adding the stable label separately simplify the main labeling step? I guess the main step would still need to remove the stale label, so maybe not...

The problem is, that this simple search doesn't quite work: updated will also be changed when we add the stale label. That's why it's important to look at the timeline of events for the actually stale-relevant timestamp. We can't search for that. And this can very well be older than the updated, so we must look at all PRs, at least those not labeled stale already.

So essentially, we could only filter out those PRs that are already merge conflicted and stale. "only" is relative of course, because that's still a huge number. The problem is, we probably still need to run at least one go across all PRs, because the labels might be incorrectly set right now. Also, if we introduce new labels, we want them to appear on all PRs if possible, even those marked that way already.

Also consider the case of somebody adding the stale label manually - we'd suddenly skip looking at the PR, even though this PR might not actually be stale according to our rules at all.

So yeah, I think we really need to look at all PRs, even for stale.

Separate commit for better diff.

…ntext We previously ran another list request in this case, but don't need to anymore - we already have the `pull_request` context available.

This doesn't provide much value in itself, yet, but is much more flexible in the next step, when also looking at much older PRs.

This replaces the manual dispatch trigger with a batched run through all pull requests every day. This has the small benefit of not having to worry about backfilling labeling after fixing bugs - and the much bigger one in being able to handle merge-conflict and stale labels properly later. For those, it's inevitable to eventually scan through all PRs. At this stage, the vast majority of PRs will still be skipped, because there won't be an eval run with artifact available. This will be improved in the next step. Technically, the workflow_dispatch trigger is kept to allow easily testing this in forks, where the scheduled jobs are disabled. The triggered job will behave similar to the scheduled job, though, and have no special inputs.

We keep working through the PR, even though we don't have any eval results. This will allow actually managing labels for much older PRs as well. Most importantly, it will allow merge-conflict and stale-labeling next.

This manages the `2. status: stale` label for pull requests only (not issues, yet) with the following conditions: - The last event on the timeline of the Pull Request counts. - Labeling and unlabeling of any kind are ignored. - Older than 180 days are stale. - Security labeled PRs are never stale. To handle this label correctly, it's important to go through all pull requests. Any approach to limit the list of PRs via search are not going to work: - Filtering by `updated` is not going to work, because it includes the last time that *a label was set* on the PR. To actually find out whether a PR is stale or not, the timeline of events needs to be looked at. - Filtering by an existing stale label is not going to work either, because such a label might have been added manually and thus breaking the rules we set up here. Thus any existing label needs to be confirmed as well.

The code comments describe much better what we do then a commit message could ever do.

wolfgangwalther

Thansk for the great, thorough review (as always!). I addressed all your comments.

.github/workflows/labels.yml

wolfgangwalther · 2025-06-24T12:55:40Z

I also gave it another test-run in my fork, still works, didn't break anything.

Potentially for later: Issues should also be labeled as stale, so extend the logic to handle issues, too?

Certainly for later, because it will increase the number of items to look at massively. Especially initially this might be relevant, because:

On the first few iterations, we are probably going to change labels on hundreds of PRs.
Those will appear in the "updated" category again on the next run.

So until we are through the whole thing twice (because of merge-conflict labels, too), this will run a lot more PRs each run. This should still fit in easily with the resource limits, but if we take issues initially as well... then it might get a lot closer to the limit.

Edit: Also.. when we go through all PRs, I wonder how many more edge-cases we're going to hit that we didn't account for, yet :D

MattSturgeon · 2025-06-24T13:23:00Z

Those will appear in the "updated" category again on the next run.

I hadn't considered that. So for the first several runs the "recently updated" results will mostly consist of PRs updated by the prior run, instead of PRs updated by contributors.

This shouldn't be a problem, per se... We don't limit how many "recent" PRs we will process, so it shouldn't leave any PRs un-labelled. It just means the number of pages each run has to iterate through will be higher than usual.

MattSturgeon

Diff LGTM.

🤞 we haven't missed any important edge-cases.
As always, let's be ready to revert if everything suddenly 💥

nixpkgs-ci · 2025-06-24T13:40:14Z

Successfully created backport PR for release-24.11:

[Backport release-24.11] workflows/labels: manage stale & merge conflict labels #419564

nixpkgs-ci · 2025-06-24T13:40:32Z

Successfully created backport PR for release-25.05:

[Backport release-25.05] workflows/labels: manage stale & merge conflict labels #419565

wolfgangwalther · 2025-06-24T13:45:50Z

Stale labeling doesn't work as expected, yet in #238561. This should not be stale!

wolfgangwalther · 2025-06-24T13:48:39Z

Merge conflict was properly set in #415587.

Notice: Processed 112 PRs, made 872 API requests and downloaded 103 artifacts.

This will be more in the range of 5k+ requests / hour - but still fine, I guess.

wolfgangwalther · 2025-06-24T14:00:03Z

Uh, there's a bigger problem here:

Error: Unhandled error: HttpError: Only the first 1000 search results are available

I don't see a quick fix (if at all) for that, so reverting now...

MattSturgeon · 2025-06-24T14:02:00Z

Stale labeling doesn't work as expected, yet in #238561. This should not be stale!

Would be interesting to know what latest_event_at was bound to for that PR...

Inspecting locally

gh api /repos/NixOS/nixpkgs/issues/238561/timeline

Produces: events.json

The latest event is your cross-reference
Then labeled created_at 2025-06-24T13:44:50Z by nixpkgs-ci[bot] (from the run)
Then another labeled created_at 2025-06-24T13:26:45Z by nixpkgs-ci[bot]
Then a reviewed; this doesn't have created_at, but instead has "submitted_at": "2025-06-24T13:10:35Z"

wolfgangwalther · 2025-06-24T14:03:52Z

Inspecting locally

gh api /repos/NixOS/nixpkgs/issues/238561/timeline

Aha!

I was looking at /repos/NixOS/nixpkgs/issues/238561/events and didn't see anything that didn't have created_at. That explains it, then :)

wolfgangwalther · 2025-06-24T14:09:46Z

Merge conflict was properly set in #415587.

Apparently the merge conflict label didn't work when opening a PR, though, as can be seen in #419574.

💥

.github/workflows/labels.yml

mdaniels5757 · 2025-06-24T15:45:50Z

Wait but #419600 still has no rebuild labels?

wolfgangwalther · 2025-06-24T15:49:28Z

Wait but #419600 still has no rebuild labels?

That's indeed odd. Will look into it later.

.github/workflows/labels.yml

wolfgangwalther requested review from MattSturgeon and wegank June 24, 2025 07:48

nix-owners bot requested review from Mic92, philiptaron and zowoq June 24, 2025 07:50

MattSturgeon reviewed Jun 24, 2025

View reviewed changes

wolfgangwalther added 8 commits June 24, 2025 14:30

workflows/labels: refactor into handle() function

042a2fd

Separate commit for better diff.

workflows/labels: refactor moving cutoff downwards

f394b27

workflows/labels: save an API request when running in pull_request co…

8b51015

…ntext We previously ran another list request in this case, but don't need to anymore - we already have the `pull_request` context available.

workflows/labels: refactor to search instead of listing PRs

d9d97fd

This doesn't provide much value in itself, yet, but is much more flexible in the next step, when also looking at much older PRs.

workflows/labels: handle missing eval results gracefully

63b9355

We keep working through the PR, even though we don't have any eval results. This will allow actually managing labels for much older PRs as well. Most importantly, it will allow merge-conflict and stale-labeling next.

workflows/labels: manage merge-conflict label for pull requests

36e9fe9

The code comments describe much better what we do then a commit message could ever do.

wolfgangwalther commented Jun 24, 2025

View reviewed changes

wolfgangwalther force-pushed the ci-labels-with-search branch from 8d857cb to 36e9fe9 Compare June 24, 2025 12:48

MattSturgeon approved these changes Jun 24, 2025

View reviewed changes

wolfgangwalther merged commit 802980b into NixOS:master Jun 24, 2025
44 of 46 checks passed

wolfgangwalther deleted the ci-labels-with-search branch June 24, 2025 13:39

nixpkgs-ci bot mentioned this pull request Jun 24, 2025

[Backport release-24.11] workflows/labels: manage stale & merge conflict labels #419564

Closed

1 task

nixpkgs-ci bot mentioned this pull request Jun 24, 2025

[Backport release-25.05] workflows/labels: manage stale & merge conflict labels #419565

Closed

1 task

github-actions bot added the 8.has: port to stable This PR already has a backport to the stable release. label Jun 24, 2025

wolfgangwalther mentioned this pull request Jun 24, 2025

Revert "workflows/labels: manage stale & merge conflict labels" #419574

Merged

MattSturgeon reviewed Jun 24, 2025

View reviewed changes

.github/workflows/labels.yml Show resolved Hide resolved

MattSturgeon reviewed Jun 24, 2025

View reviewed changes

.github/workflows/labels.yml Show resolved Hide resolved

wolfgangwalther commented Jun 24, 2025

View reviewed changes

.github/workflows/labels.yml Show resolved Hide resolved

This comment was marked as outdated.

Sign in to view

MattSturgeon reviewed Jun 24, 2025

View reviewed changes

.github/workflows/labels.yml Show resolved Hide resolved

wolfgangwalther mentioned this pull request Jun 24, 2025

Reapply "workflows/labels: manage stale & merge conflict labels" #419654

Merged

2 tasks

Uh oh!

Conversation

wolfgangwalther commented Jun 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Things done

Uh oh!

MattSturgeon commented Jun 24, 2025

Uh oh!

wolfgangwalther commented Jun 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

wolfgangwalther commented Jun 24, 2025

Uh oh!

MattSturgeon commented Jun 24, 2025

Uh oh!

wolfgangwalther commented Jun 24, 2025

Uh oh!

wolfgangwalther commented Jun 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

MattSturgeon commented Jun 24, 2025

Uh oh!

wolfgangwalther commented Jun 24, 2025

Uh oh!

MattSturgeon left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

wolfgangwalther commented Jun 24, 2025

Uh oh!

wolfgangwalther left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

wolfgangwalther commented Jun 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

MattSturgeon commented Jun 24, 2025

Uh oh!

MattSturgeon left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

nixpkgs-ci bot commented Jun 24, 2025

Uh oh!

nixpkgs-ci bot commented Jun 24, 2025

Uh oh!

wolfgangwalther commented Jun 24, 2025

Uh oh!

wolfgangwalther commented Jun 24, 2025

Uh oh!

wolfgangwalther commented Jun 24, 2025

Uh oh!

MattSturgeon commented Jun 24, 2025

Uh oh!

wolfgangwalther commented Jun 24, 2025

Uh oh!

wolfgangwalther commented Jun 24, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

This comment was marked as outdated.

wolfgangwalther commented Jun 24, 2025 •

edited

Loading

wolfgangwalther commented Jun 24, 2025 •

edited

Loading

wolfgangwalther commented Jun 24, 2025 •

edited

Loading

MattSturgeon left a comment •

edited

Loading

wolfgangwalther commented Jun 24, 2025 •

edited

Loading