Skip to content

Run update-ref-docs (dry-run) on presubmit of istio.io#2607

Merged
istio-testing merged 1 commit intoistio:masterfrom
clarketm:urd-dr
May 4, 2020
Merged

Run update-ref-docs (dry-run) on presubmit of istio.io#2607
istio-testing merged 1 commit intoistio:masterfrom
clarketm:urd-dr

Conversation

@clarketm
Copy link
Copy Markdown
Member

@clarketm clarketm commented Apr 22, 2020

  1. Run update-ref-docs-dry-run on presubmit of istio/istio.io which tests the generation of reference docs and set it as required keep it as optional.
  2. Make the existing update-ref-docs-dry-run presubmit on istio/istio required. Remove the update-ref-docs-dry-run presubmit from istio/istio

/hold
istio/istio.io#7132

@clarketm clarketm requested a review from a team as a code owner April 22, 2020 21:26
@istio-testing istio-testing added the do-not-merge/hold Block automatic merging of a PR. label Apr 22, 2020
@googlebot googlebot added the cla: yes Set by the Google CLA bot to indicate the author of a PR has signed the Google CLA. label Apr 22, 2020
@istio-testing istio-testing added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Apr 22, 2020
@clarketm clarketm changed the title Run update-ref-docs (dr) on presubmit of istio.io and make it required Run update-ref-docs (dry-run) on presubmit of istio.io and make it required Apr 22, 2020
@clarketm clarketm marked this pull request as draft April 22, 2020 21:50
@clarketm clarketm marked this pull request as ready for review April 23, 2020 18:20
@clarketm clarketm removed the do-not-merge/hold Block automatic merging of a PR. label Apr 23, 2020
@ericvn
Copy link
Copy Markdown
Contributor

ericvn commented Apr 24, 2020

My concern with setting up the test here is that it will cause a failure in the istioio pipeline with integrating a PR. And more than likely the problem will be from a change make in another repo that the docs developer won't know much about.

@clarketm
Copy link
Copy Markdown
Member Author

clarketm commented Apr 24, 2020

My concern with setting up the test here is that it will cause a failure in the istioio pipeline with integrating a PR. And more than likely the problem will be from a change make in another repo that the docs developer won't know much about.

@ericvn

Should this test only run inistio/istio presubmit? Alternatively, we can make it optional in istio/istio.io to indicate doc generation is broken but not block PRs. All of these permutations are possible. What do you think makes the most sense (I can modify as needed)?

@ericvn
Copy link
Copy Markdown
Contributor

ericvn commented Apr 27, 2020

@clarketm Let me try and articulate the failures and hopefully that can help with a decision.

Failure #1. Someone makes a change in a document that is pulled from another repo (maybe one in the proxy repo). This document causes the linter to fail in the automated PR, and the PR won't merge until the problem is fixed.

  • istio.io maintainer would notice the automated PR holding out (might be more noticeable if we auto-merged since any outstanding automated PR would be construed as bad). The maintainer looks at the failing tests, determines it's a change in another file in another repo (hopefully) and then engages that team to fix their document. Once that change is merged, the automated PR runs and gets the change, the PR should be ready for merging (or auto-merged).
  • Until that PR merges, lack of a pre-submit dry run, lets the istio.io repo continue on,
  • What happens in his case with a dry-run? If it's just checking if the files can be pulled, and no lint checking after the files are pulled, things will be fine. If the dry-run pulls the files, and we try to do lint checking, then all PRs will fail the test until the other component has their change merged.

Failure #2. There is a change in the script in terms of what files are pulled (maybe we remove the galley files for example).

  • No checking during presubmit, lets the change go in.
  • The automated job will run and create a PR, which will fail listing if the docs still reference some of the removed galley docs.
  • A istio.io maintainer notes the automated PR isn't merging, looks at the tests, and sees that their are some galley references missing. Since it's in the istio.io repo, they can either add to the automated PR or should probably create a new PR with the galley items removed. Once that PR merges, a /retest on the automated PR will pass and the PR merges.
  • What does the dry-run do here? If it runs a update and then runs the linter, then the error would be found before the PR was merged (insrtead of a day or two later when the automated PR was noticed). If it's just verifying that the script works, there is no error caught and the original flow happens.

Failure #3: Someone updates the script incorrectly so it fails. It passes the basic lint tests, but something is wrong with the running of the script. This would be the same as Failure #2 in terms of when things would get noticed and fixed. However, in this case, actually trying to run the script in a pre-submit job would have caught the error earlier.

So, what does the dry-run actually do? If it just runs the make update_ref_docs, then I'm not sure there is much use, other than verifying that the script works (fixes Failure #3). It won't really help with Failures #1 and #2. It will also fail if one of the other repos made a change and it hadn't been through the automated yet.

If it actually pulls files and runs lint, then it hurts istio.io maintainers in the Failure #1 case as all merges will be held until the other team can fix their docs (or revert their breakage). It would prevent #2 from happening as the PRs aren't merged.

I would advocate that the maintainers should have any changes to the script also require a local make update-ref-docs run and check in the changes with it. It there are no additional changes, hold the PR out until it is done. This would be similar to a check that make gen was run. And that's what I expect this dry-run is trying to do. But see my comment earlier about changes in another repo potentially holding our PRs until the automated ran, or someone ran things manually.

Please correct my incorrect thinking, or maybe their is another failure scenario I am missing.

@ericvn
Copy link
Copy Markdown
Contributor

ericvn commented Apr 28, 2020

I'll add a Failure#4:

Someone changes one of the reference files (let's pick spelling) and updates the .spelling file removing the incorrect spelling. In this case, without doing any reference checking and if someone doesn't note that a generated file is update, the PR will go in. As soon as the automator runs, it will find the changed file and attempt to override it with the old one (with the incorrect spelling). This PR will fail lint since the old incorrect spelling, which wasn't changed upstream, is now invalid.

In this case, could the job may have identified that the file was updated and prevent the original PR from going in?

@clarketm clarketm added the do-not-merge/hold Block automatic merging of a PR. label Apr 29, 2020
@clarketm
Copy link
Copy Markdown
Member Author

clarketm commented Apr 29, 2020

@ericvn

I have not fully analyzed the various failure conditions you mentioned above, so I still owe you a more detailed response. But for clarity, below is a response to your above assumption.

I would advocate that the maintainers should have any changes to the script also require a local make update-ref-docs run and check in the changes with it. It there are no additional changes, hold the PR out until it is done. This would be similar to a check that make gen was run. And that's what I expect this dry-run is trying to do. But see my comment earlier about changes in another repo potentially holding our PRs until the automated ran, or someone ran things manually.

In this particular case, the dry-run is verifying that the command being ran, make update-ref-docs, completes successfully (i.e. completes with a zero exit code). If does not actually check for a clean diff. Instead, it is considered dry-run because it does not commit changes. To achieve what you describe, we could additionally run make check-clean-repo in the cmd (or add a native way to do this in automator).

@ericvn
Copy link
Copy Markdown
Contributor

ericvn commented Apr 30, 2020

Thanks for explaining what is planned for an istio.io PR. That helps.

Let's look at the actual failure parsing back through the PRs (correct me if I get this wrong).

  • There was a change removing a directory in istio/istio.
    • everything there passed and was merged.
  • The nightly automated update-ref-docs ran (job specified for the istio.io repo).
    • The job itself failed because it tried to cd to a non-existent directory or maybe failed to build the executable
    • I assume that this failure would happen before we got to the part of the process to do the actual diff and creating a PR in the istio.io repo. The only place we see this failure is possibly on a dashboard (or slack channel). We don't see
  • So no one really knows there was an issue (assuming that not a lot of people watch the dashboard, etc).
  • Thus the issue isn't really looked at and fixed.

So, the problem is someone made a change in the istio/istio rep, it broke an automated job on the istio.io repo and no one really got notified and had a need to fix it.

  • One fix is this one, which would do the portion if the failing job whenever someone checks code into the istio.io repo. So, we are now limiting any merging into the istio.io repo because someone made a change in the istio repo. It does raise the issue to one needing to be fixed quickly. At this point we have several ways to fix this. The istio.io team may be able to see that a PR was merged in another repo and work on fixing the issue. This change might be minimal, but most likely it's larger. In this case, the lack of a file meant things had to be removed in terms of files we are processing. One the files are missing, then the docs that referenced the missing data need to be updated. My guess is that this is a 1 day minimum sort of task. Shutting down the istio.io repo probably isn't the best thing. More than likely, the original PR could have been reverted for the time being. Then either the original developer, or they can work with someone from the doc team, can work on getting everything organized for a successful merge.

My thoughts based on this is the there is really no way to raise the issue that occurred without adding a test somewhere. It would be hard to add them in the other repos, so having the ref update docs in the istio.io repo might be the easiest way to find it.

I don't think having it required is the best option since it will unnecessarily force a revert of a PR or a possible long downtime in merging PRs. If we find that the red X is not getting enough focus, then we can think about making it required.

@clarketm
Copy link
Copy Markdown
Member Author

clarketm commented Apr 30, 2020

Let's look at the actual failure parsing back through the PRs (correct me if I get this wrong).
...

Yes, your analysis and perspective aligns with mine pertaining to update-ref-docs failures in the recent past. And yes, you are also correct in that the failure(s) occur at a step before for diff checking would hypothetically be reached.

My thoughts based on this is the there is really no way to raise the issue that occurred without adding a test somewhere. It would be hard to add them in the other repos, so having the ref update docs in the istio.io repo might be the easiest way to find it.

I would agree that in most cases it is difficult to add and distribute a particular test across several repos (from a maintenance standpoint), although with the robust job generation and tools we have I would say that it is definitely a possibility if we want to extend our coverage with this approach (i.e. adding test to the various repos that docs are sources from).

I don't think having it required is the best option since it will unnecessarily force a revert of a PR or a possible long downtime in merging PRs. If we find that the red X is not getting enough focus, then we can think about making it required.

Okay, to ensure I understand you though process correctly, the plan is:

  1. Add update-ref-docs-dry-run job to istio/istio.io and make it optional. This will provide a layer of visibility that something is awry but not inhibit productivity.
  2. Remove (or do not add) the update-ref-docs-dry-run to any of the various source repos (e.g. istio/istio).

@clarketm clarketm changed the title Run update-ref-docs (dry-run) on presubmit of istio.io and make it required Run update-ref-docs (dry-run) on presubmit of istio.io May 2, 2020
@clarketm clarketm removed the do-not-merge/hold Block automatic merging of a PR. label May 2, 2020
@clarketm
Copy link
Copy Markdown
Member Author

clarketm commented May 2, 2020

/retest

@ericvn
Copy link
Copy Markdown
Contributor

ericvn commented May 3, 2020

I'm fine with this, but would have preferred to leave the test in istio/istio (although no one appears to care when it fails when pushing in PRs) and make it required. That is the initial point of failure for the last failure. If the initial PR was never allowed to go in, there would have been no failure. The author would have to know that they needed to update the doc ref update script before the initial PR was allowed to merge.

@howardjohn
Copy link
Copy Markdown
Member

Eric, I don't think that is how it behaves in my experience. I had a pr that broke the docs and my pr did not fail. The test pulls from master of istio/istio, so it doesn't fail until it's already merged

@ericvn
Copy link
Copy Markdown
Contributor

ericvn commented May 4, 2020

Ah. Since the change isn't yet merged in master, the failure doesn't happen when run. We would have to change the test to run against the current commit. Maybe we can do that at a later time and then add to the repos that the ref updates.

@istio-testing istio-testing merged commit 76f8c99 into istio:master May 4, 2020
@istio-testing
Copy link
Copy Markdown
Collaborator

@clarketm: Updated the job-config configmap in namespace default at cluster default using the following files:

  • key istio.istio.io.master.gen.yaml using file prow/cluster/jobs/istio/istio.io/istio.istio.io.master.gen.yaml
  • key istio.istio.master.gen.yaml using file prow/cluster/jobs/istio/istio/istio.istio.master.gen.yaml
Details

In response to this:

  1. Run update-ref-docs-dry-run on presubmit of istio/istio.io which tests the generation of reference docs and set it as required keep it as optional.
  2. Make the existing update-ref-docs-dry-run presubmit on istio/istio required. Remove the update-ref-docs-dry-run presubmit from istio/istio

/hold
istio/istio.io#7132

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla: yes Set by the Google CLA bot to indicate the author of a PR has signed the Google CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants