Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add PR triage workflow #17881

Merged
merged 22 commits into from
Dec 10, 2024
Merged

Add PR triage workflow #17881

merged 22 commits into from
Dec 10, 2024

Conversation

mumrah
Copy link
Member

@mumrah mumrah commented Nov 20, 2024

Replaces #17298.

This patch adds a simple label based workflow for PRs. PRs opened by non-committers will have a triage label automatically added. This allows committers to see which PRs need to be triaged. After 7 days, a needs-attention label will be added. This serves as a reminder to committers and allows them to see which PRs have languished.

For now, the triage label will need to be manually removed by a committer. In the future, we could try to automate this by removing the label after a PR has been reviewed, assigned, or commented on.

@github-actions github-actions bot added build Gradle build or GitHub Actions small Small PRs labels Nov 20, 2024
@chia7712
Copy link
Contributor

I totally love this idea!

After 7 days, a needs-attention label will be added. This serves as a reminder to committers and allows them to see which PRs have languished.

Could we just add label "needs-attention" if the PRs from non-committer have no reviewer after 7 days? That may be more simple and it can remind committers to check those languished PRs as well

@fvaleri
Copy link
Contributor

fvaleri commented Nov 21, 2024

I like this, thanks @mumrah.

@mumrah
Copy link
Member Author

mumrah commented Nov 21, 2024

@chia7712

Could we just add label "needs-attention" if the PRs from non-committer have no reviewer after 7 days?

Unfortunately, pull_request_reviewed has the same issues as pull_request, but there is no pull_request_reviewed_target event to allow editing details of the PR. I tried a few things, but ultimately don't think we can achieve much automation as the result of a review being submitted :(

So, for now it would be a manual process to remove the triage label. If the triage label is removed, the needs-attention label will not be applied.

One improvement we can make is to use https://github.com/actions/stale?tab=readme-ov-file#exempt-all-pr-assignees to prevent needs-attention from being added if the PR has an assignee. This might not be that useful since we rarely use assignee.

Edit: Since we already require adding ci-approved, it seems that requiring a removal or triage is not too burdensome for committers. After all, it is only a few clicks in the UI :)

2nd Edit: IF we ever use GitHub Projects (a big "IF" 😄), then labels like this can be automatically removed when transitioning a PR from one state to another which is a drag-and-drop action in the UI

@mumrah mumrah requested a review from chia7712 November 22, 2024 14:01
FOUND_CONTRIBUTOR=$(gh api \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
/repos/apache/kafka/contributors --jq '.[] | select(.login == "'"$PR_USER"'") | .login')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does /repos/apache/kafka/contributors return "committers"? it seems to return top contributors

Copy link
Member Author

@mumrah mumrah Nov 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right 😖

I think we can query the "orgs" API to check if the author is a member of apache, but that won't tell us if they are a Kafka committer (could be any Apache committer). That would at least prevent committers from having the triage label added.


Another approach is to create a list of committers by GH username. This is related to another improvement I've been contemplating. I'd like to separate the "reviewers" (anyone in community) from the "approvers" (must be committer) in our commit messages. A config file of GH usernames + email + preferred name is how I would approach that one. We could use this config file as the input for this task which would solve the problem.

WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another approach is to create a list of committers by GH username.

Maybe we can leverage committers.html?

This is related to another improvement I've been contemplating. I'd like to separate the "reviewers" (anyone in community) from the "approvers" (must be committer) in our commit messages.

I love this idea!

@github-actions github-actions bot removed the small Small PRs label Nov 22, 2024
@mumrah
Copy link
Member Author

mumrah commented Dec 2, 2024

@chia7712 I created a mapping file from the ASF roster and manually matched it up with GitHub user names. https://github.com/mumrah/kafka/blob/asf-committers/README.md

I would like to add this asf-committers as an orphaned branch to this repo so we can use it as a datasource for this PR (and future automations).

WDYT?

@chia7712
Copy link
Contributor

chia7712 commented Dec 3, 2024

I would like to add this asf-committers as an orphaned branch to this repo so we can use it as a datasource for this PR (and future automations).

My main concern is how to sync the file with committers.html 🤔

@mumrah
Copy link
Member Author

mumrah commented Dec 3, 2024

I'm not sure it's that much of a problem TBH. There's already a lot of manual steps involved in becoming a committer. I don't think one more PR is too much of a burden :).

Also, the committers.html is more about personal info (photo, links to social, etc). The dataset I'm proposing is more technical to help power GH automations. If the two are out of sync, I can't think of any real negative outcomes. For the workflow in this PR, the worst outcome is that a committer gets a triage label added to their own PRs.

If we wanted to combine the two data sets, I think that would be do-able as well. This would involve moving the photos to the asf-committers branch along with other committer data and then using that to generate the html. Definitely possible.

I'll start a discussion thread about adding the asf-committers branch, since I can't really a PR for it without first creating the orphaned branch.

@mimaison
Copy link
Member

mimaison commented Dec 4, 2024

In https://kafka.apache.org/committers, we have a <div> tag associated with each committer with their GitHub username. For example:

<td>
    David Arthur<br>
    <div class="github_login" hidden>mumrah<br></div>
    Committer, and PMC member<br>
    <a href="[https://www.linkedin.com/in/davidrarthur](view-source:https://www.linkedin.com/in/davidrarthur)">/in/davidarthur</a><br>
    <a href="[https://github.com/mumrah](view-source:https://github.com/mumrah)">github.com/mumrah</a>
</td>

This is what we currently use in https://github.com/apache/kafka/blob/trunk/committer-tools/refresh_collaborators.py

@mumrah
Copy link
Member Author

mumrah commented Dec 4, 2024

@mimaison good to know! I had no idea we had the GH usernames in a hidden div. Any idea why we don't create a link to the committer's GitHub page? Privacy concerns?

Thanks to @assignUser for the tip on "author_association" 👍

@mumrah
Copy link
Member Author

mumrah commented Dec 4, 2024

We can see this in action on my fork.

PR opened by me

PR opened by collaborator

image

@assignUser
Copy link
Member

As good practice I'd recommend minimising the token permissions: to just what's need.

@mumrah mumrah requested a review from chia7712 December 4, 2024 21:56
@mumrah
Copy link
Member Author

mumrah commented Dec 6, 2024

@chia7712 or @mimaison any more feedback on this one?

# See the License for the specific language governing permissions and
# limitations under the License.

name: Pull Request Reviewed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're starting to have quite a few actions. I think it would be good to add the description field to all of them to clearly state their role, how they work, and the potential relationships between each others.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there is a description field for the workflow files. I've been adding comments here and there to help people navigate the workflow files. I think name is as good as we get unfortunately.

Here's the docs: https://docs.github.com/en/actions/writing-workflows/workflow-syntax-for-github-actions#about-yaml-syntax-for-workflows

And a similar question on SO: https://stackoverflow.com/questions/70941956/description-for-a-workflow-in-github-actions

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad I was looking at https://docs.github.com/en/actions/sharing-automations/creating-actions/metadata-syntax-for-github-actions#description which is not available for workflows.

So let's use comments to describe the workflows. Maybe it's only me (I'm not super familiar with GitHub Actions) but looking at the files (without reading the PR), it's not immediately obvious what each one does.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just added some docs to the README in the .github/workflows directory. PTAL

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks that's very helpful!

Copy link
Member

@mimaison mimaison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mumrah mumrah merged commit b8dadb7 into apache:trunk Dec 10, 2024
8 checks passed
@mumrah
Copy link
Member Author

mumrah commented Dec 10, 2024

Well, something isn't quite right. I opened a test PR and it was labeled as triage. Looking into it.

@mumrah
Copy link
Member Author

mumrah commented Dec 10, 2024

Ok, it is working but there is a little quirk. Filed #18128

@gharris1727
Copy link
Contributor

gharris1727 commented Dec 10, 2024

Another quirk: this is causing workflows to fail after reviewing PRs: https://github.com/apache/kafka/actions/runs/12266421343

It doesn't block anything, it just seems to generate nuisance emails.

I submitted this review: #16754 (review)

name: Pull Request Reviewed

on:
pull_request_review:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This event brings a interesting race condition:

  1. Approve the Pull Request (PR):

    • This action triggers the "Pull Request Reviewed" workflow.
    • Example Run
  2. Add ci-approve:

    • This action attempts to locate the "pending" run_id.
    • it mistakenly retrieves the run_id of the "Pull Request Reviewed" workflow, resulting in an error.
      Error Example

Perhaps the issue is invalid, as pull requests should not be approved before the CI process completes. 😄

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filed #18200 to (hopefully) fix this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build Gradle build or GitHub Actions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants