Skip to content

Store tree-hash inside a comment and compare it before removing lgtm#9138

Merged
k8s-ci-robot merged 1 commit intokubernetes:masterfrom
matthyx:6353
Oct 5, 2018
Merged

Store tree-hash inside a comment and compare it before removing lgtm#9138
k8s-ci-robot merged 1 commit intokubernetes:masterfrom
matthyx:6353

Conversation

@matthyx
Copy link
Contributor

@matthyx matthyx commented Aug 23, 2018

Fixes #6353
I have decided to stick with statuses to avoid spamming reviewers.
Also it makes more sense to me, as a lgtm refers to a commit.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. area/prow Issues or PRs related to prow labels Aug 23, 2018
@matthyx
Copy link
Contributor Author

matthyx commented Aug 23, 2018

@BenTheElder
Copy link
Member

/lint

Copy link
Contributor

@k8s-ci-robot k8s-ci-robot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BenTheElder: 0 warnings.

Details

In response to this:

/lint

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

// https://developer.github.com/v3/repos/commits/#get-a-single-commit
type SingleCommit struct {
URL string `json:"url"`
Sha string `json:"sha"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: s/Sha/SHA/, SHA is an initialism


if hasLGTM == true {
// Get combined status to see if we already know a lgtm tree-hash
combined, err := gc.GetCombinedStatus(org, repo, *pe.PullRequest.MergeSHA)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't be the right SHA because the PR just changed. This is the second point I mentioned here: #6353 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, it wasn't clear on the mobile phone... please discard my previous comment. I will check if I can use another ref (The ref can be a SHA, a branch name, or a tag name.).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem isn't with the ref, it is with the use of status contexts. Using a status context associates the state you want to store with a commit SHA that will necessarily change after a rebase. At that time you won't know what the previous SHA was so you won't be able to look up the state you stored.

@matthyx matthyx changed the title Store tree-hash inside a status and compare it before removing lgtm WIP Store tree-hash inside a status and compare it before removing lgtm Aug 23, 2018
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 23, 2018
@stevekuznetsov
Copy link
Contributor

When someone rebases and force-pushes their branch, don't we lose the commit you're adding this status to?

Copy link
Contributor

@stevekuznetsov stevekuznetsov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not clear the approach will work

@matthyx matthyx changed the title WIP Store tree-hash inside a status and compare it before removing lgtm Store tree-hash inside a label and compare it before removing lgtm Aug 24, 2018
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 24, 2018
@matthyx
Copy link
Contributor Author

matthyx commented Aug 24, 2018

#bikeshed let's use a label

@cjwagner
Copy link
Member

If we store this state in a label we're going to end up creating a new label for the repository every time we add the lgtm label. Is that ok?

@matthyx
Copy link
Contributor Author

matthyx commented Aug 24, 2018

Actually the tree-hash label is removed if the PR has changed code wise, so if you lgtm again, there will always be zero or one label at a time.

@cjwagner
Copy link
Member

Actually the tree-hash label is removed if the PR has changed code wise, so if you lgtm again, there will always be zero or one label at a time.

I'm talking about labels accumulating in the repository, not on the PR. When you add a new label to a PR it also adds it as a new label to the repository. Because the labels include hashes they will always be unique which will create a lot of new labels in the repository.

@matthyx
Copy link
Contributor Author

matthyx commented Aug 24, 2018

Oh, I see. Like what labelsync does? I'm too junior for that... what do other think?
Otherwise comments... We could add the tree-hash to the existing lgtm comment?

@0xmichalis
Copy link
Contributor

0xmichalis commented Aug 25, 2018 via email

@stevekuznetsov
Copy link
Contributor

Not sure we want to make repo-scoped changes for this. What was the argument against recording it in a comment when we see a LGTM label added?

@matthyx
Copy link
Contributor Author

matthyx commented Aug 27, 2018

I think it was spamming the reviewers with the comment notifications...

@stevekuznetsov
Copy link
Contributor

You'd leave one comment when the LGTM label was added, right? I don't think that really constitutes as "spam". If we don't want to do that, we need to invent some other stateful place to throw the data, which is convoluted if it's some other part of GitHub or complicated if it's a database.

@matthyx
Copy link
Contributor Author

matthyx commented Aug 27, 2018

@cjwagner said on #6353

Commenting every time /lgtm is posted will generate a ton of comment spam.
We should get community approval before going forward with this if we are going to store the tree hash on the PR in any way that triggers an email update to users who are following the PR.

Then he asked @cblecker for his advice, with no answer.

@cblecker
Copy link
Member

Sorry, the other issue got lost in my notifications.

Yeah, this really isn't how Github labels are meant to be used, and we're inviting a bunch of potential problems by doing this.

GitHub loads the list of all labels in a repo on so many pages.. the PR list, the issue list, every issue/PR if you have write access. I can see page load times increasing and the number of 🦄s skyrocketing.

I also don't know if GitHub has a limit of labels per repo.

k/k has an average of 64 PRs per day. If each of those is modified/rebased once, that is 128 labels per day, or just shy of 900 labels per week. Even if we cleaned them up after somehow, k/k has 960 open issues, so we'd need ~1000 labels for this.

Not to mention, the hope was the label_sync could one day become authoritative and delete any labels it doesn't recognize.

I don't see how this is tenable for us.

@matthyx
Copy link
Contributor Author

matthyx commented Oct 1, 2018

@cjwagner @stevekuznetsov should be good now!

Copy link
Member

@cjwagner cjwagner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we detect if the bot comment has been edited so that we can reject the comment if it has been modified by a human? If that isn't possible thats fine since only maintainers should have permission to edit other peoples' comments, but it would be a nice additional security check to do.

m := addLGTMLabelNotificationRe.FindStringSubmatch(comment.Body)
if comment.User.Login == botname && m != nil {
log.Info("Removing tree-hash comment.")
if err := gc.DeleteComment(org, repoName, comment.ID); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment deletion can be simplified and made more efficient if you use the CommentPruner client in the plugin client. You can follow the pattern used in the require-matching-label plugin. The shouldPrune function may assume that all bot comments it considers are bot comments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have done so, but I need your (@cjwagner ) help to fix the unit tests... I have found nowhere an example on how to actually fake PruneComments deleting comments.

if err := gc.RemoveLabel(org, repoName, number, LGTMLabel); err != nil {
return err
}
botname, err := gc.BotName()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can skip trying to clean up the "LGTM added." comment in repos where StoreTreeHash is not enabled. This will save API tokens that would be used to list the issue comments.
If we fail to delete a few comments after a config change disables the feature in a repo it isn't a big deal.

}
for _, comment := range comments {
if comment.User.Login == botname && comment.Body == removeLGTMLabelNoti {
if err := gc.DeleteComment(org, repoName, comment.ID); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CommentPruner can be used here as well.

}
hasLGTM = github.HasLabel(LGTMLabel, labels)

if hasLGTM == true {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: s/hasLGTM == true/hasLGTM/

You can actually get rid of hasLGTM entirely and just put github.HasLabel(LGTMLabel, labels) here.

if err != nil {
log.WithError(err).Errorf("Failed to get pull request.")
}
SHA := *pr.MergeSHA
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MergeSHA is a pointer so we need a nil check before accessing it.
In particular, this field is lazily calculated and may not always be specified.

}
}
// Get the current tree-hash
commit, err := gc.GetSingleCommit(org, repo, *pe.PullRequest.MergeSHA)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nil check for MergeSHA.

@matthyx
Copy link
Contributor Author

matthyx commented Oct 2, 2018

For the comment edited detection, we could compare created_at and updated_at?

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 4, 2018
@matthyx
Copy link
Contributor Author

matthyx commented Oct 4, 2018

@cjwagner I think we have all we need now :)
I let you have a look, and if lgty I will squash before we merge, thanks for your time!

}
}
// Get the current tree-hash
if pe.PullRequest.MergeSHA != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check could go before gc.BotName and gc.ListIssueComments to save API tokens.

@cjwagner
Copy link
Member

cjwagner commented Oct 4, 2018

This LGTM. Please squash commits.

@matthyx matthyx force-pushed the 6353 branch 2 times, most recently from b3e0eb2 to 76e2e5e Compare October 5, 2018 10:33
Copy link
Member

@cjwagner cjwagner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @matthyx!
As @stevekuznetsov mentioned:

Will need update from @kubernetes/sig-contributor-experience-pr-reviews and an e-mail blast when rolled out

But this PR only actually enables the plugin for kubernetes/test-infra which we don't need to send out an update for (except maybe a comment in sig-testing slack). We can test out the behavior before notifying the appropriate parties and rolling out to the rest of kubernetes.
Note that merging this PR won't have an affect until we bump Prow.
/lgtm
/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 5, 2018
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 5, 2018
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cjwagner, matthyx, stevekuznetsov

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit a6e5d31 into kubernetes:master Oct 5, 2018
@k8s-ci-robot
Copy link
Contributor

@matthyx: Updated the plugins configmap using the following files:

  • key plugins.yaml using file prow/plugins.yaml
Details

In response to this:

Fixes #6353
I have decided to stick with statuses to avoid spamming reviewers.
Also it makes more sense to me, as a lgtm refers to a commit.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@matthyx matthyx deleted the 6353 branch October 5, 2018 17:11
}
treeHash := commit.Commit.Tree.SHA
log.Info("Adding comment to store tree-hash: " + treeHash)
if err := gc.CreateComment(org, repoName, number, fmt.Sprintf(addLGTMLabelNotification, treeHash)); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it reduce spam if we edited the comment vs pruning and creating every time?

@wking
Copy link
Contributor

wking commented Oct 18, 2018

This is great :). Can we pull the helpers out somewhere so approve can use them too? Should any approve use be gated by a config option, or can we always blow away approval on tree changes? [edit: probably just recycle store_tree_hash there too]

@cjwagner
Copy link
Member

@wking How would approve benefit from this? Approval is not affected by the specific line changes that are made, but rather which files are changed.
When a PR is updated, the set of approvers is not changed, the only reason a PR update would cause the approved label to be removed is if the PR is updated to change additional files that are not approved by the current approver set.

@wking
Copy link
Contributor

wking commented Oct 18, 2018

When a PR is updated, the set of approvers is not changed, the only reason a PR update would cause the approved label to be removed...

This seems like a bug to me. For example, say:

  1. You submit a PR.
  2. I look at my section and /approve your change.
  3. You push something crazy.
  4. Tide sees the approve label and merges.

That's what we have now (as far as the approve plugin is concerned), but it seems... overly trusting ;). For comparison, GitHub allows you to configure "Dismiss stale pull request approvals when new commits are pushed" when you enable required reviews for pull requests. I don't see we wouldn't want a similar option for dismissing stale approvals. For bonus points, it would be nice to only dismiss approvals when the files controlled by that approver changed, but I think an opt-in blanket dismissal would be a good first step.

@cjwagner
Copy link
Member

The approve process is distinct from the lgtm process which is what is used for reviewing actual code changes. The approved label is meant to indicate that a code owner thinks the overall direction or intention of the PR is appropriate, this frequently does not involve a careful review of the code.
These processes are distinct not only because the concepts are distinct, but also so that we can allow a larger set of reviewers than there are code owners. If code owners had to review and re-approve every minor PR change they would become a massive bottleneck to merge velocity so we allow any org member (not just code owners) to do code review and just require general approval from the code owners.

TLDR; it is intentional and part of the design for the set of approvers to not be changed by updates to the PR's code.

@wking
Copy link
Contributor

wking commented Oct 18, 2018

The approved label is meant to indicate that a code owner thinks the overall direction or intention of the PR is appropriate, this frequently does not involve a careful review of the code.

That matches the docs here.

These processes are distinct not only because the concepts are distinct, but also so that we can allow a larger set of reviewers than there are code owners. If code owners had to review and re-approve every minor PR change they would become a massive bottleneck to merge velocity so we allow any org member (not just code owners) to do code review and just require general approval from the code owners.

So basically it's an honor system among members of the organization once the approvers have blessed the general direction at some point in the PR's lifecycle [which may no longer correspond to the current direction]? Would you accept an PR to allow projects to opt-in to tighter handling when they do have sufficiently active/picky/paranoid code-owners?

@cjwagner
Copy link
Member

So basically it's an honor system among members of the organization once the approvers have blessed the general direction at some point in the PR's lifecycle [which may no longer correspond to the current direction]?

To some degree. In practice there is rarely just one reviewer looking at a PR and if a PR was significantly changed without any notice that would be a giant red flag.
A PR author will always need an org member to review their latest changes, but we don't require the code owners to review every incremental PR update because that would hinder merge velocity too much. This has been sufficiently secure in our experience.

Would you accept an PR to allow projects to opt-in to tighter handling when they do have sufficiently active/picky/paranoid code-owners?

I'm open to the idea, but the approve plugin is currently quite a mess (namely it doesn't work properly with regexp filtered owners files) so I'm a little hesitant to compound technical debt by adding more complexity before a rewrite.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/prow Issues or PRs related to prow cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/contributor-experience Categorizes an issue or PR as relevant to SIG Contributor Experience. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants

Comments