cmd/pj-rehearse: Truncate extra rehearsals #1315
Before this commit, pj-rehearse would just give up [1]:

  $ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_release/12959/pull-ci-openshift-release-master-pj-rehearse/1318651994902106112/build-log.txt | tail -n2
  time="2020-10-20T20:36:51Z" level=info msg="Created a rehearsal job to be submitted" org=openshift rehearsal-job=rehearse-12959-pull-ci-openshift-kubernetes-master-e2e-cmd repo=release target-job=pull-ci-openshift-kubernetes-master-e2e-cmd target-repo=openshift/kubernetes
  time="2020-10-20T20:36:51Z" level=info msg="Would rehearse too many jobs, will not proceed" org=openshift rehearsal-jobs=68 rehearsal-threshold=45 repo=release

But skipping rehearsals for a change that touches tons of jobs, like we have done since ef4f47e (Limit rehearsals to 15 jobs, 2019-02-13, openshift/ci-operator-prowgen#78), makes it possible to break a whole lot of things without any failing rehearsal to warn us. With this commit, we pivot from "give up and test nothing" to "test as many of the touched jobs as we can afford".

It would be nice to intelligently truncate, e.g. if we touch a few types of jobs, or a few different workflows. But grouping seems complicated, so for this commit I'm just randomly shuffling and then dumping the tail.

[1]: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_release/12959/pull-ci-openshift-release-master-pj-rehearse/1318651994902106112
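For readers who want to see the "shuffle, then dump the tail" idea concretely, here is a minimal Go sketch. It is not the code from this PR: the function name `truncateRehearsals`, the explicit `seed` parameter, and the job names are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math/rand"
)

// truncateRehearsals is a hypothetical helper, not the actual pj-rehearse code.
// It returns the jobs unchanged when they fit within limit; otherwise it
// shuffles a copy and keeps only the first limit entries.
func truncateRehearsals(jobs []string, limit int, seed int64) []string {
	if limit < 0 {
		limit = 0
	}
	if len(jobs) <= limit {
		return jobs
	}
	// Copy first so the caller's slice keeps its original order.
	shuffled := make([]string, len(jobs))
	copy(shuffled, jobs)
	rng := rand.New(rand.NewSource(seed))
	rng.Shuffle(len(shuffled), func(i, j int) {
		shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
	})
	// Dump the tail: everything past the limit is not rehearsed.
	return shuffled[:limit]
}

func main() {
	// Made-up job names, for illustration only.
	jobs := []string{"e2e-aws", "e2e-gcp", "e2e-cmd", "unit", "lint"}
	kept := truncateRehearsals(jobs, 3, 42)
	fmt.Printf("rehearsing %d of %d jobs: %v\n", len(kept), len(jobs), kept)
}
```

Passing the seed in explicitly is only a convenience for reproducing a run; a real tool would more likely seed from the clock so that repeated runs cover different subsets.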
Force-pushed from 50505f9 to f3352ae.
Folks have been tuning this since b0d60e3 (Bump the limit of rehearsed jobs, 2019-08-19, openshift#4789), most recently in 3f0040e (Raise rehearsal-limit to 45 temporarily, 2020-02-03, openshift#6999). But with [1], we no longer need to fuss with this setting in order to see rehearsals for changes that touch lots of jobs, so let it fall back to pj-rehearse's default of 15.

[1]: openshift/ci-tools#1315
|
e2e: But that seems orthogonal. I think this is ready to land, just needs a new |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: alvaroaleman, stevekuznetsov, wking |
|
I'm not a big fan of this, as it will result in gargantuan bot comments with old, stale rehearsal failures. This is one of the most visible annoyances and sources of confusion for users, and this PR will make that much worse. |
|
@petr-muller what about instead sorting them and using the first N? |
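A minimal Go sketch of the "sort them and use the first N" alternative, for comparison with random shuffling. The helper name `firstNSorted` and the job names are assumptions for illustration, not part of pj-rehearse.

```go
package main

import (
	"fmt"
	"sort"
)

// firstNSorted is a hypothetical helper, not actual pj-rehearse code.
// It sorts a copy of the job names and returns at most n of them, so the
// same input set always yields the same subset.
func firstNSorted(jobs []string, n int) []string {
	if n < 0 {
		n = 0
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := make([]string, len(jobs))
	copy(sorted, jobs)
	sort.Strings(sorted)
	if len(sorted) > n {
		sorted = sorted[:n]
	}
	return sorted
}

func main() {
	// Made-up job names, for illustration only.
	jobs := []string{"unit", "e2e-gcp", "lint", "e2e-aws"}
	fmt.Println(firstNSorted(jobs, 2)) // prints [e2e-aws e2e-gcp]
}
```

The selection is deterministic only for a fixed input: if the set of touched jobs differs between two runs, the first N after sorting can differ as well.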
|
Or pruning stale failures? Since that is an issue, even without this feeding in its churn. |
The standard pruning logic is in Prow and not easily changeable, because changing it would affect a lot of people, and I am not fond of the idea of adding a GitHub client and custom pruning logic to the rehearsal tool.
|
Sorting helps, but not entirely, because the underlying set changes between runs, too.

Pruning stale failures... I understand that keeping them is a wanted feature of Prow itself, to capture that some job ran (possibly with a manual trigger only) in a past revision. With rehearsals it's different and annoying, but changing it would be quite a lot of work, because the rehearsal tool does not currently interact with the GitHub API in any way. |
|
Bloody hell Alvaro beat me :) |
Looks like the |
I personally would like that as well (ref kubernetes/community#3621 (comment)) but:
|
CC @petr-muller