workflows/labels: manage stale & merge conflict labels#419481
workflows/labels: manage stale & merge conflict labels#419481wolfgangwalther merged 8 commits intoNixOS:masterfrom
Conversation
One approach would be to remember which PR numbers this run has looked at in a Set, Array, or Object; i.e. when either the "recent" or "backlog" code path starts work on a PR, it does its work within a java-style If checking for presence in the set and adding to the set aren't done in one atomic transaction, there's still potential for a race condition... But I guess that's pretty unlikely, and not too important; the worst that'd happen is a PR is checked twice. |
This would work, but it could also be done without race condition: With the new approach, we don't need to iterate over the response in steps and eventually call I also considered an alternative: The list-of-updated PRs contains this search query Edit: I think something like this should do it: const items = [].concat(updatedItems, allItems)
+ .filter((thisItem, idx, arr) => idx == arr.findIndex(firstItem => firstItem.number == thisItem.number)) |
Tested, this works well. Added it locally, but will only push after first round of review to save CI resources. |
I was going to hold off asking until I'd looked at the diff... But I was curious how you were handling pagination here. Especially given we're dealing with a moving target and between each run it is likely that more PRs were "updated", therefore shifting their position in search results and changing whether they match the search query. |
Yeah, to counter the effect as much as possible the big list is sorted by |
|
An alternative, to counter that problem a bit, would be to create some kind of overlap. For example instead of creating batches of 100 and running a single one, we could make batches of 25 and then run 5 batches instead of 4 per run, with one of them overlapping. This should then catch those PRs that would have moved out of the current batch, because some PRs created earlier were merged in the meantime. Edit: Or maybe not even in those smaller batches, but still 100. And then run the previous page, too, overlapping. This would have the effect that merge conflicts would be labeled in 12 hours instead of 24, because any PR will be looked at after 10 minutes again - when the merge commit has been computed. Might not be the worst thing to do. |
|
Sounds reasonable. I wonder how much overlap buffer we need for this to be reliable? |
If we do the simple approach of just running two pages with 100 items, so overlap=100, then we should have much more than needed, I think. In general, the effect of a merged PR is bigger the older the PR is, because it shifts more pages. But it's also much less likely for a PR that is really old to be merged. I'd say the effect is negligible for PRs in the stale-range. It only becomes relevant for more recent PRs. But those don't need to be marked stale - and they are also less likely to be merge-conflicted. I'd say we try it without overlap at first, see how many API requests this burns, see whether anybody notices any strange effect for more recent PRs - and only if this is actually relevant, introduce some overlap. |
There was a problem hiding this comment.
In 28458b3f8dfb58db9cebfa36f50601ae77c4a3eb's commit messages you mention we need to iterate over all PRs for the stale and conflict labels:
the much bigger one in being able to handle merge-conflict and stale labels properly later. For those, it's inevitable to eventually scan through all PRs.
This may be true for conflicts, because I'm not sure you can easily search for them. Even if you could, your comment here would apply:
The attribute merge_commit_sha keeps the old value of null or the hash until the new test merge commit has either successfuly been created or failed so.
For stale issues/PRs, this could also be achieved with the inverse search to our "recently updated" search; i.e. a "not updated since" search. It could even be a search for "not updated since XYZ and also doesn't have stable label" to only get the relevant results.
This probably isn't necessary, because by their nature it isn't important to mark stale items as stale immediately, but just pointing out that it is technically possible.
Even so, would handling adding the stable label separately simplify the main labeling step? I guess the main step would still need to remove the stale label, so maybe not...
I'd say we try it without overlap at first, see how many API requests this burns, see whether anybody notices any strange effect for more recent PRs - and only if this is actually relevant, introduce some overlap.
SGTM. Maybe add brief comments to the relevant parts of the code?
The problem is, that this simple search doesn't quite work: So essentially, we could only filter out those PRs that are already merge conflicted and stale. "only" is relative of course, because that's still a huge number. The problem is, we probably still need to run at least one go across all PRs, because the labels might be incorrectly set right now. Also, if we introduce new labels, we want them to appear on all PRs if possible, even those marked that way already. Also consider the case of somebody adding the stale label manually - we'd suddenly skip looking at the PR, even though this PR might not actually be stale according to our rules at all. So yeah, I think we really need to look at all PRs, even for stale. |
Separate commit for better diff.
…ntext We previously ran another list request in this case, but don't need to anymore - we already have the `pull_request` context available.
This doesn't provide much value in itself, yet, but is much more flexible in the next step, when also looking at much older PRs.
This replaces the manual dispatch trigger with a batched run through all pull requests every day. This has the small benefit of not having to worry about backfilling labeling after fixing bugs - and the much bigger one in being able to handle merge-conflict and stale labels properly later. For those, it's inevitable to eventually scan through all PRs. At this stage, the vast majority of PRs will still be skipped, because there won't be an eval run with artifact available. This will be improved in the next step. Technically, the workflow_dispatch trigger is kept to allow easily testing this in forks, where the scheduled jobs are disabled. The triggered job will behave similar to the scheduled job, though, and have no special inputs.
We keep working through the PR, even though we don't have any eval results. This will allow actually managing labels for much older PRs as well. Most importantly, it will allow merge-conflict and stale-labeling next.
This manages the `2. status: stale` label for pull requests only (not issues, yet) with the following conditions: - The last event on the timeline of the Pull Request counts. - Labeling and unlabeling of any kind are ignored. - Older than 180 days are stale. - Security labeled PRs are never stale. To handle this label correctly, it's important to go through all pull requests. Any approach to limit the list of PRs via search are not going to work: - Filtering by `updated` is not going to work, because it includes the last time that *a label was set* on the PR. To actually find out whether a PR is stale or not, the timeline of events needs to be looked at. - Filtering by an existing stale label is not going to work either, because such a label might have been added manually and thus breaking the rules we set up here. Thus any existing label needs to be confirmed as well.
The code comments describe much better what we do then a commit message could ever do.
wolfgangwalther
left a comment
There was a problem hiding this comment.
Thansk for the great, thorough review (as always!). I addressed all your comments.
8d857cb to
36e9fe9
Compare
|
I also gave it another test-run in my fork, still works, didn't break anything.
Certainly for later, because it will increase the number of items to look at massively. Especially initially this might be relevant, because:
So until we are through the whole thing twice (because of merge-conflict labels, too), this will run a lot more PRs each run. This should still fit in easily with the resource limits, but if we take issues initially as well... then it might get a lot closer to the limit. Edit: Also.. when we go through all PRs, I wonder how many more edge-cases we're going to hit that we didn't account for, yet :D |
I hadn't considered that. So for the first several runs the "recently updated" results will mostly consist of PRs updated by the prior run, instead of PRs updated by contributors. This shouldn't be a problem, per se... We don't limit how many "recent" PRs we will process, so it shouldn't leave any PRs un-labelled. It just means the number of pages each run has to iterate through will be higher than usual. |
MattSturgeon
left a comment
There was a problem hiding this comment.
Diff LGTM.
🤞 we haven't missed any important edge-cases.
As always, let's be ready to revert if everything suddenly 💥
|
Successfully created backport PR for |
|
Successfully created backport PR for |
|
Stale labeling doesn't work as expected, yet in #238561. This should not be stale! |
|
Merge conflict was properly set in #415587.
This will be more in the range of 5k+ requests / hour - but still fine, I guess. |
|
Uh, there's a bigger problem here: I don't see a quick fix (if at all) for that, so reverting now... |
Would be interesting to know what Inspecting locally gh api /repos/NixOS/nixpkgs/issues/238561/timelineProduces: events.json
|
Aha! I was looking at |
This comment was marked as outdated.
This comment was marked as outdated.
|
Wait but #419600 still has no rebuild labels? |
That's indeed odd. Will look into it later. |
To manage
2.status: staleand2.status: merge conflictlabels, we need to restructure the code a bit: The "list all PRs and stop iterating after updated_at cutoff" approach won't work anymore, because we also need to look at much older PRs.I thought about being smart and trying to filter out pull requests which certainly won't change, for example when they already have a merge conflict label or when they are already stale. But this quickly becomes quite complicated, considering all edge-cases. Thus, I opted for a simpler approach: We now scan all PRs twice a day. We still operate on the last updated PRs every 10 minutes, that doesn't change. But we additionally operate on a batch of 100 "other" PRs. This batch is rotated through across the 144 workflow runs each day. 7k PRs / 100 per page = 70 pages ~ 2x / day.
To label a merge conflict, the PR needs to be touched twice: The first time around the API GET request will trigger the creation of a new test merge commit. The second time around, this will have completed, the
merge_commit_shastate be updated and the label applied. This means that stale labels are updated within 12 hours and merge conflict labels within 24 hours. Which is more than OK.Of course, this will use a lot more API requests. My estimation is in the range of 3-4k / hour now. But this is not a problem at all, because with #419188 we have about 11k requests / hour to spare without any impact on other jobs. In fact, the "scan all PRs" approach has one more benefit: We don't need to manually backfill anything, the labeling will just fix those labels within 24 hours. Thus, I removed the "update within" input from
workflow_dispatch- this trigger is now only useful from forks for testing.With #419188 and this, the labeler should become much more hands-off and just do its thing. (Not that it was much hands-on so far anyway...).
To make those changes, I had to make a few refactors, which caused huge indentation changes. I tried to split them into separate commits, so that reviewing the PR with "hide whitespace" should work reasonably well. Don't even try to look at this PR with whitespace changes visible and/or at the full diff of all commits...
Refactors:
/search/issuesendpoint doesn't allow to find "a single PR", so the "pull request event" case is now handled differently.Open items:
Note: The settings for stale labeling are taken from
.github/stale.yml. It seems like the stale bot has not been working for some time anymore, at least https://github.com/probot/stale is archived and the app seems to be disabled.Assuming this works well, this should close #100477.
Things done
Add a 👍 reaction to pull requests you find important.