machine-config-operator: Make all tests run when requested #3635

cgwalters · 2019-04-30T08:46:12Z

Today we spawn three clusters every time someone does a git push.
That's pretty nuts, and IMO doing this type of thing across
all the repos is both an unnecessary waste of money, and also
exacerbates issues with AWS resource limits.

I sometimes hesitate at doing git push to fix a typo in a comment
as part of a PR review because I know that push may end up
stealing an AWS NAT limit that's going to be used by an actually
important job.

For the MCO specifically many of our changes are extremely
unlikely to break e2e-aws, and if they did they'd break
e2e-aws-op too.

So we'll do e.g. /test e2e-aws on demand in PRs, or
/test all, etc.

cgwalters · 2019-04-30T08:46:22Z

/hold
Until MCO team approves

wking · 2019-04-30T09:20:10Z

More detail from @cgwalters for folks like me who are shaky on these job properties:

There's a difference in Prow between always_run and optional. The jobs are still not optional, i.e. they are required for merges. This is not at all turning off tests, it's just making them not run on every git push.
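To make the distinction concrete, here is a minimal Python sketch of the intended semantics as described in this comment. It is not Prow's code: the `Job` class and both helper functions are hypothetical, and only the field names `always_run` and `optional` come from the discussion. Whether Prow and Tide really behave this way is what the rest of the thread debates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """Hypothetical stand-in for a presubmit job entry; not Prow's real type."""
    name: str
    always_run: bool  # does the job start automatically on every push?
    optional: bool    # may the PR merge without this job passing?


def runs_on_push(job: Job) -> bool:
    """A job is started by a plain `git push` only if always_run is set."""
    return job.always_run


def required_for_merge(job: Job) -> bool:
    """Intended meaning of optional: false, i.e. the job still gates merging."""
    return not job.optional


# The change proposed in this PR, as described: flip always_run, leave optional alone.
before = Job("e2e-aws", always_run=True, optional=False)
after = Job("e2e-aws", always_run=False, optional=False)
```

Under this model `after` no longer starts on every push but is still meant to gate the merge, which is the behavior the PR is aiming for.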

cgwalters · 2019-04-30T09:32:20Z

We're debating whether or not this actually works. The intended semantics are still to have all the jobs run after slash-lgtm.

My reading of the Prow docs implied it was, but I could be wrong! (Do we have a "test repo" setup where we can play with job configs?)

Or with Prow and this type of config today do we need to do

 / test all
 / lgtm

when setting up to merge?

openshift-ci-robot · 2019-04-30T09:32:20Z

@cgwalters: you cannot LGTM your own PR.

cgwalters · 2019-04-30T09:34:40Z

I'm doing this for a whole lot of reasons; another example is that right now I'd like to iterate on openshift/machine-config-operator#682, and it's launching upgrade and regular e2e-aws jobs too, which is a waste.

sdodson · 2019-04-30T12:37:00Z

/approve
👍 to anything that reduces wasteful cluster builds. We seem to be pushing as many concurrent cluster builds as possible and then wondering why things fail.

As an aside, we should also look into measuring the cloud API requirements of our cluster tests so that we can determine the maximum number of jobs we can run concurrently while still ensuring a high rate of success.
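One way such a measurement could feed a concurrency cap, sketched in Python under loud assumptions: the resource names and every number below are made-up placeholders, not measured values, and `max_concurrent_jobs` is a hypothetical helper. The idea is only that the scarcest resource sets the ceiling.

```python
from typing import Dict


def max_concurrent_jobs(limits: Dict[str, int], per_job: Dict[str, int]) -> int:
    """Return how many jobs fit at once under the given account limits.

    Each job is assumed to consume a fixed amount of every resource listed in
    `per_job`; the resource that runs out first determines the answer.
    """
    return min(limits[name] // cost for name, cost in per_job.items() if cost > 0)


# Placeholder figures for illustration only; real values would have to be measured.
example_limits = {"nat_gateways": 30, "vpcs": 20}
example_per_job = {"nat_gateways": 3, "vpcs": 1}
print(max_concurrent_jobs(example_limits, example_per_job))  # prints 10
```

This ignores headroom for retries and for other users of the same account, so a real cap would presumably sit below the computed value.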

openshift-ci-robot · 2019-04-30T12:37:12Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cgwalters, sdodson

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~ci-operator/config/openshift/machine-config-operator/OWNERS~~ [cgwalters]
~~ci-operator/jobs/openshift/machine-config-operator/OWNERS~~ [cgwalters]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

patrickdillon · 2019-04-30T13:00:32Z

Interesting idea and seems possible. I think you need to create a trigger based on /lgtm: https://github.com/kubernetes/test-infra/blob/master/prow/jobs.md#triggering-jobs-with-comments
You might also need required: true rather than optional: false.
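As a rough illustration of a comment-based trigger, here is a Python sketch of a pattern that would fire on `/lgtm` as well as on an explicit `/test` request. The pattern is hypothetical and not copied from any real job config, and Python's `re` module stands in for whatever regex engine Prow uses, so treat it as a sketch of the idea rather than something to paste into a job definition.

```python
import re

# Hypothetical trigger: fire on "/lgtm", "/test all", or "/test e2e-aws",
# each on a line of its own within the comment body.
TRIGGER = re.compile(r"(?m)^/(?:lgtm|test (?:all|e2e-aws))\s*$")


def comment_triggers(body: str) -> bool:
    """Return True if any line of the comment matches the trigger pattern."""
    return TRIGGER.search(body) is not None
```

With this pattern a comment containing `/lgtm` on its own line would start the job, while `/test e2e-aws-op` or a `/lgtm` in the middle of a sentence would not.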

runcom · 2019-04-30T14:19:31Z

> /hold
> Until MCO team approves

It looks great to me, assuming we clear up the confusion about how triggers work.

cgwalters · 2019-04-30T14:35:18Z

> You might also need required: true

I don't see such a key in the Prow docs nor in any of our jobs.

kikisdeliveryservice · 2019-04-30T14:43:54Z

I'm super on board with not running e2e on each and every push before my PR is ready for review and would love for these tests to be on-demand until I'm looking for a final approval.

patrickdillon · 2019-04-30T14:55:39Z

> I don't see such a key in the Prow docs nor in any of our jobs.

My mistake. Sorry for the confusion. Just reread the docs and you have it right with optional.

smarterclayton · 2019-04-30T23:35:42Z

I don't know why you are worried about this. You aren't the problem, the infra is the problem. Our spend is a tiny fraction of the cost of business. Not a significant worry.

smarterclayton · 2019-04-30T23:56:42Z

/hold

stevekuznetsov · 2019-05-01T00:28:17Z

The doc you want is here: https://github.com/kubernetes/test-infra/blob/master/prow/jobs.md#standard-triggering-and-execution-behavior-for-jobs

stevekuznetsov · 2019-05-01T00:30:55Z

Some weird side effects of what you are doing:

- if a job does not run for some reason, you will not be able to use /retest to trigger it, as Prow does not know it should be there
- tide will only require the statuses if they exist, so you will be able to merge without ever triggering these manually, but once you trigger them manually if they're failed we will not merge

I don't think this is what you want to do.
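The two side effects can be modeled in a few lines of Python. This is only a sketch of the behavior as described in this comment, not Tide's or Prow's implementation; the function names, the status strings, and the dictionary shape are all assumptions made for illustration.

```python
from typing import Dict, List


def can_merge(reported: Dict[str, str]) -> bool:
    """Merge is allowed when every status that exists is 'success'.

    A job that was never triggered has no entry in `reported`, so it cannot
    block the merge.
    """
    return all(state == "success" for state in reported.values())


def retest_targets(reported: Dict[str, str]) -> List[str]:
    """Jobs a blanket retest would rerun: only those with a known failure."""
    return sorted(name for name, state in reported.items() if state == "failure")


print(can_merge({}))                                   # never triggered: True
print(can_merge({"e2e-aws": "failure"}))               # triggered and failed: False
print(retest_targets({"unit": "failure"}))             # e2e-aws is unknown: ['unit']
```

The first call is the surprising case: with no status ever reported, nothing blocks the merge, which is the opposite of what the PR intends.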

smarterclayton · 2019-05-01T00:31:20Z

A cluster run costs us between $0.25 and $0.50

stevekuznetsov · 2019-05-01T00:31:25Z

If we get some idea of how many resources a job uses, we can have a throttle on the total number of concurrent jobs hitting the AWS API in a given zone.
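A throttle of that kind could look roughly like the following Python sketch. Everything here is hypothetical: the `ApiThrottle` class, the notion of a single integer "cost" per job, and the budget are assumptions for illustration, since the thread does not say how resource usage would be measured or enforced.

```python
from typing import Dict


class ApiThrottle:
    """Admit jobs only while their summed cost stays within a fixed budget."""

    def __init__(self, budget: int) -> None:
        self.budget = budget
        self._running: Dict[str, int] = {}  # job id -> cost currently held

    @property
    def in_use(self) -> int:
        """Total cost held by jobs that have started and not yet finished."""
        return sum(self._running.values())

    def try_start(self, job_id: str, cost: int) -> bool:
        """Reserve `cost` for the job; refuse if it would exceed the budget."""
        if job_id in self._running or self.in_use + cost > self.budget:
            return False
        self._running[job_id] = cost
        return True

    def finish(self, job_id: str) -> None:
        """Release whatever the job had reserved (no-op if it is unknown)."""
        self._running.pop(job_id, None)
```

A scheduler would call `try_start` before launching a cluster and queue the job when it returns `False`, then call `finish` once the cluster is torn down.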

smarterclayton · 2019-05-01T00:32:11Z

The current priority is

1. have AWS increase rate limits (real solution)
2. potentially drop to 2 zones in CI (distant second solution)
3. optimize image pulls to come from AWS (where the actual money goes)

wking · 2019-05-01T04:38:25Z

> potentially drop to 2 zones in CI (distant second solution)

But also something we can do ourselves without waiting on AWS; #3615.

sdodson · 2019-05-01T11:29:48Z

> If we get some idea of how many resources a job uses

@stevekuznetsov which team would be best equipped to measure that?

stevekuznetsov · 2019-05-01T15:16:23Z

@sdodson it's work that DPTP may take on for 4.2

openshift-ci-robot · 2019-10-24T16:17:10Z

@cgwalters: The following test failed, say /retest to rerun them all:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/build-farm/build01-dry | `6afea5b` | link | `/test build01-dry` |

petr-muller · 2019-10-25T10:40:21Z

/close

Looks stale, please reopen & rebase if still needed.

openshift-ci-robot · 2019-10-25T10:40:22Z

@petr-muller: Closed this PR.

openshift-ci-robot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) on Apr 30, 2019

openshift-ci-robot added the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) on Apr 30, 2019

openshift-ci-robot requested review from kikisdeliveryservice and runcom April 30, 2019 08:46

openshift-ci-robot added the size/XS label (denotes a PR that changes 0-9 lines, ignoring generated files) on Apr 30, 2019

openshift-ci-robot closed this Oct 25, 2019
