@vrutkovs
Contributor

@vrutkovs vrutkovs commented Jul 3, 2019

Run the openshift/conformance/parallel suite when running RHEL 7 scaleup tests. This suite doesn't include the serial tests, which often flake when run after the parallel tests. As a result, e2e-aws-scaleup-rhel7 should flake much less often.

Follow-up to #4253

/cc @abhinavdahiya @runcom @enxebre

@openshift-ci-robot openshift-ci-robot added the size/M label (denotes a PR that changes 30-99 lines, ignoring generated files) Jul 3, 2019
@vrutkovs
Contributor Author

vrutkovs commented Jul 5, 2019

/test pj-rehearse

@vrutkovs
Contributor Author

/cc @abhinavdahiya @runcom @enxebre

@sdodson
Member

sdodson commented Jul 30, 2019

/retest

@vrutkovs
Contributor Author

/cc @wking

@openshift-ci-robot openshift-ci-robot requested a review from wking July 31, 2019 15:08
@wking
Member

wking commented Jul 31, 2019

Do we understand the flake increase? Are the tests that flake consistent? Why is RHEL scaleup different from provisioning more RHCOS compute machines?

@vrutkovs
Contributor Author

Do we understand the flake increase?

Yes

Are the tests that flake consistent?

Yes

Why is RHEL scaleup different from provisioning more RHCOS compute machines?

The difference is that the scaleup tests run the full suite: first the parallel tests, then the serial ones. These don't play well together. It seems the parallel storage tests don't properly clean up namespaces and PVs, so the serial tests flake a lot more often.
The usual e2e-aws tests run only the parallel tests, so they are not affected.
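
To make the difference concrete, here is a rough sketch of the two modes described above. It assumes the suites are launched as `openshift-tests run <suite>` and that the serial suite is named `openshift/conformance/serial`; neither detail is shown in this thread, and the helper `suite_commands` is purely illustrative, not the actual CI template.

```python
from typing import List

# Suite name taken from the PR description.
PARALLEL_SUITE = "openshift/conformance/parallel"
# Assumed name of the serial suite; not stated in this thread.
SERIAL_SUITE = "openshift/conformance/serial"


def suite_commands(include_serial: bool) -> List[List[str]]:
    """Build the test commands for a job (hypothetical helper).

    The parallel suite always runs. The serial suite is appended only when
    include_serial is True, which mirrors the old scaleup behaviour.
    """
    commands = [["openshift-tests", "run", PARALLEL_SUITE]]
    if include_serial:
        commands.append(["openshift-tests", "run", SERIAL_SUITE])
    return commands


# After this PR the scaleup jobs would behave like include_serial=False.
for argv in suite_commands(include_serial=False):
    print(" ".join(argv))
```

With `include_serial=False` only the parallel command is produced, so leftover namespaces and PVs from the parallel run no longer have a later serial run to disturb.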

@vrutkovs
Contributor Author

/retest

@wking
Member

wking commented Jul 31, 2019

... seems parallel storage tests don't properly cleanup namespaces and PVs, so serial tests are flaking a lot more often.

Is there a ticket for getting this fixed, either the parallel cleanup, serial robustness, or both?

Rehearsals failed with Prometheus issues like:

fail [github.com/openshift/origin/test/extended/prometheus/prometheus_builds.go:167]: Expected <map[string]error | len:1>: { "sum(ALERTS{alertstate=\"firing\"})": { s: "query sum(ALERTS{alertstate=\"firing\"}) for tests []prometheus.metricTest{prometheus.metricTest{labels:map[string]string(nil), greaterThanEqual:false, value:2, success:false}} had results {\"status\":\"success\",\"data\":{\"resultType\":\"vector\",\"result\":[{\"metric\":{},\"value\":[1564594325.614,\"2\"]}]}}", }, } to be empty
...
failed: (13m14s) 2019-07-31T17:32:06 "[Feature:Prometheus][Conformance] Prometheus when installed on the cluster should report less than two alerts in firing or pending state [Suite:openshift/conformance/parallel/minimal]"
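
For readers unfamiliar with the failure format, the check can be approximated as follows. This is a minimal sketch, not the origin test code: it parses the query response quoted in the failure message and applies the "less than two alerts" rule from the test name. The function names are hypothetical, and treating an empty result as zero alerts is an assumption.

```python
import json

# Response body copied from the failure message quoted above.
RESPONSE = (
    '{"status":"success","data":{"resultType":"vector",'
    '"result":[{"metric":{},"value":[1564594325.614,"2"]}]}}'
)


def firing_alert_count(body: str) -> float:
    """Return the sample value of a single-series instant-vector response."""
    data = json.loads(body)
    if data["status"] != "success":
        raise ValueError("query did not succeed: " + data["status"])
    result = data["data"]["result"]
    if not result:
        # Assumption: an empty vector means no alerts matched the query.
        return 0.0
    # Each sample is [timestamp, "value-as-string"].
    return float(result[0]["value"][1])


def fewer_than(count: float, limit: float = 2) -> bool:
    """Apply the rule from the test name: strictly fewer than `limit` alerts."""
    return count < limit


count = firing_alert_count(RESPONSE)
print(count, fewer_than(count))  # prints: 2.0 False
```

The rehearsal reported exactly two firing alerts, so a strict less-than-two check fails.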

/retest

@vrutkovs
Contributor Author

vrutkovs commented Aug 3, 2019

/retest

@vrutkovs
Contributor Author

vrutkovs commented Aug 3, 2019

podman run --rm registry.svc.ci.openshift.org/ci-op-khy5kyqs/release@sha256:501c06cba127fda5e4269f22faf4aa795b532b4c311f0cac87d8d70c24a13dbf image machine-config-daemon
...
  stderr: 'F0803 08:00:17.269481       1 image.go:32] error: error: Unknown name requested, could not find machine-config-daemon in UpdatePayload' 

@smarterclayton @runcom, did this image get renamed in the 4.2 payload?

@cgwalters
Member

We folded the MCO into one image; see openshift/machine-config-operator#850 (and a lot of following PRs). Scaleup now needs to pull machine-config-operator, which has the MCD binary.
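
One way a consumer could cope with such a rename is to try the old image name first and fall back to the new one. The sketch below is only an illustration of that idea and is not what openshift/openshift-ansible does; the `payload` mapping, its pull spec, and the `resolve_image` helper are all hypothetical.

```python
from typing import Iterable, Mapping


def resolve_image(payload: Mapping[str, str], candidates: Iterable[str]) -> str:
    """Return the pull spec of the first candidate name present in the payload."""
    tried = []
    for name in candidates:
        if name in payload:
            return payload[name]
        tried.append(name)
    raise KeyError("none of these names are in the payload: " + ", ".join(tried))


# Hypothetical payload contents after the MCO images were folded together.
payload = {"machine-config-operator": "example.invalid/release@sha256:placeholder"}

# Older payloads carried machine-config-daemon; newer ones only have the operator image.
pullspec = resolve_image(payload, ["machine-config-daemon", "machine-config-operator"])
print(pullspec)
```

Listing the candidates in order of preference keeps the lookup working against both older and newer payloads.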

@vrutkovs
Contributor Author

vrutkovs commented Aug 5, 2019

Oh, I see. openshift/openshift-ansible#11801 would fix that

@vrutkovs
Contributor Author

vrutkovs commented Aug 9, 2019

/retest

@vrutkovs
Contributor Author

vrutkovs commented Aug 9, 2019

/retest

Flakes, but works in general

@sdodson
Member

sdodson commented Aug 9, 2019

/lgtm

We've got to spend some time reviewing which tests flake more frequently here than on RHCOS nodes, so I've put in https://jira.coreos.com/browse/CORS-1176

Never mind, that's not necessary.

@openshift-ci-robot openshift-ci-robot added the lgtm label (indicates that a PR is ready to be merged) Aug 9, 2019
@sdodson
Member

sdodson commented Aug 12, 2019

@vrutkovs can you update OWNERS on installer paths?
ci-operator/populate-owners.sh installer

@vrutkovs
Contributor Author

Done.

/hold

as I'm not sure OWNERS were updated correctly: were installer-owners and -reviewers removed on purpose?

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) Aug 12, 2019
@vrutkovs vrutkovs force-pushed the scaleup-update-other-repos branch from 48bccf5 to 5b407f2 Compare August 12, 2019 14:44
@openshift-ci-robot openshift-ci-robot removed the lgtm label (indicates that a PR is ready to be merged) Aug 12, 2019
@sdodson
Member

sdodson commented Aug 12, 2019

/approve
Yeah, those changes look as expected. The OWNERS files in this repo should resolve aliases in order to avoid conflicts; there's a note about it in the docs somewhere.

@vrutkovs
Contributor Author

/hold cancel

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) Aug 12, 2019
@sdodson
Member

sdodson commented Aug 12, 2019

/assign @enxebre @runcom
for approval

@mtnbikenc
Member

/retest

@sdodson
Member

sdodson commented Aug 14, 2019

/label approved

@sdodson
Member

sdodson commented Aug 14, 2019

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm label (indicates that a PR is ready to be merged) Aug 14, 2019
@sdodson sdodson added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) Aug 14, 2019
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

Approval requirements bypassed by manually added approval.

This pull-request has been approved by: sdodson, vrutkovs

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit 4b9a7db into openshift:master Aug 14, 2019
@openshift-ci-robot
Contributor

@vrutkovs: Updated the following 12 configmaps:

  • ci-operator-master-configs configmap in namespace ci-stg using the following files:
    • key openshift-installer-master.yaml using file ci-operator/config/openshift/installer/openshift-installer-master.yaml
    • key openshift-machine-api-operator-master.yaml using file ci-operator/config/openshift/machine-api-operator/openshift-machine-api-operator-master.yaml
    • key openshift-machine-config-operator-master.yaml using file ci-operator/config/openshift/machine-config-operator/openshift-machine-config-operator-master.yaml
  • ci-operator-4.3-configs configmap in namespace ci using the following files:
    • key openshift-installer-release-4.3.yaml using file ci-operator/config/openshift/installer/openshift-installer-release-4.3.yaml
    • key openshift-machine-api-operator-release-4.3.yaml using file ci-operator/config/openshift/machine-api-operator/openshift-machine-api-operator-release-4.3.yaml
    • key openshift-machine-config-operator-release-4.3.yaml using file ci-operator/config/openshift/machine-config-operator/openshift-machine-config-operator-release-4.3.yaml
  • ci-operator-4.3-configs configmap in namespace ci-stg using the following files:
    • key openshift-installer-release-4.3.yaml using file ci-operator/config/openshift/installer/openshift-installer-release-4.3.yaml
    • key openshift-machine-api-operator-release-4.3.yaml using file ci-operator/config/openshift/machine-api-operator/openshift-machine-api-operator-release-4.3.yaml
    • key openshift-machine-config-operator-release-4.3.yaml using file ci-operator/config/openshift/machine-config-operator/openshift-machine-config-operator-release-4.3.yaml
  • job-config-master-presubmits configmap in namespace ci using the following files:
    • key openshift-installer-master-presubmits.yaml using file ci-operator/jobs/openshift/installer/openshift-installer-master-presubmits.yaml
    • key openshift-machine-api-operator-master-presubmits.yaml using file ci-operator/jobs/openshift/machine-api-operator/openshift-machine-api-operator-master-presubmits.yaml
    • key openshift-machine-config-operator-master-presubmits.yaml using file ci-operator/jobs/openshift/machine-config-operator/openshift-machine-config-operator-master-presubmits.yaml
  • job-config-4.1 configmap in namespace ci using the following files:
    • key openshift-installer-release-4.1-presubmits.yaml using file ci-operator/jobs/openshift/installer/openshift-installer-release-4.1-presubmits.yaml
    • key openshift-machine-api-operator-release-4.1-presubmits.yaml using file ci-operator/jobs/openshift/machine-api-operator/openshift-machine-api-operator-release-4.1-presubmits.yaml
    • key openshift-machine-config-operator-release-4.1-presubmits.yaml using file ci-operator/jobs/openshift/machine-config-operator/openshift-machine-config-operator-release-4.1-presubmits.yaml
  • ci-operator-master-configs configmap in namespace ci using the following files:
    • key openshift-installer-master.yaml using file ci-operator/config/openshift/installer/openshift-installer-master.yaml
    • key openshift-machine-api-operator-master.yaml using file ci-operator/config/openshift/machine-api-operator/openshift-machine-api-operator-master.yaml
    • key openshift-machine-config-operator-master.yaml using file ci-operator/config/openshift/machine-config-operator/openshift-machine-config-operator-master.yaml
  • ci-operator-4.1-configs configmap in namespace ci using the following files:
    • key openshift-installer-release-4.1.yaml using file ci-operator/config/openshift/installer/openshift-installer-release-4.1.yaml
    • key openshift-machine-api-operator-release-4.1.yaml using file ci-operator/config/openshift/machine-api-operator/openshift-machine-api-operator-release-4.1.yaml
    • key openshift-machine-config-operator-release-4.1.yaml using file ci-operator/config/openshift/machine-config-operator/openshift-machine-config-operator-release-4.1.yaml
  • ci-operator-4.1-configs configmap in namespace ci-stg using the following files:
    • key openshift-installer-release-4.1.yaml using file ci-operator/config/openshift/installer/openshift-installer-release-4.1.yaml
    • key openshift-machine-api-operator-release-4.1.yaml using file ci-operator/config/openshift/machine-api-operator/openshift-machine-api-operator-release-4.1.yaml
    • key openshift-machine-config-operator-release-4.1.yaml using file ci-operator/config/openshift/machine-config-operator/openshift-machine-config-operator-release-4.1.yaml
  • ci-operator-4.2-configs configmap in namespace ci using the following files:
    • key openshift-installer-release-4.2.yaml using file ci-operator/config/openshift/installer/openshift-installer-release-4.2.yaml
    • key openshift-machine-api-operator-release-4.2.yaml using file ci-operator/config/openshift/machine-api-operator/openshift-machine-api-operator-release-4.2.yaml
    • key openshift-machine-config-operator-release-4.2.yaml using file ci-operator/config/openshift/machine-config-operator/openshift-machine-config-operator-release-4.2.yaml
  • ci-operator-4.2-configs configmap in namespace ci-stg using the following files:
    • key openshift-installer-release-4.2.yaml using file ci-operator/config/openshift/installer/openshift-installer-release-4.2.yaml
    • key openshift-machine-api-operator-release-4.2.yaml using file ci-operator/config/openshift/machine-api-operator/openshift-machine-api-operator-release-4.2.yaml
    • key openshift-machine-config-operator-release-4.2.yaml using file ci-operator/config/openshift/machine-config-operator/openshift-machine-config-operator-release-4.2.yaml
  • job-config-4.2 configmap in namespace ci using the following files:
    • key openshift-installer-release-4.2-presubmits.yaml using file ci-operator/jobs/openshift/installer/openshift-installer-release-4.2-presubmits.yaml
    • key openshift-machine-api-operator-release-4.2-presubmits.yaml using file ci-operator/jobs/openshift/machine-api-operator/openshift-machine-api-operator-release-4.2-presubmits.yaml
    • key openshift-machine-config-operator-release-4.2-presubmits.yaml using file ci-operator/jobs/openshift/machine-config-operator/openshift-machine-config-operator-release-4.2-presubmits.yaml
  • job-config-4.3 configmap in namespace ci using the following files:
    • key openshift-installer-release-4.3-presubmits.yaml using file ci-operator/jobs/openshift/installer/openshift-installer-release-4.3-presubmits.yaml
    • key openshift-machine-api-operator-release-4.3-presubmits.yaml using file ci-operator/jobs/openshift/machine-api-operator/openshift-machine-api-operator-release-4.3-presubmits.yaml
    • key openshift-machine-config-operator-release-4.3-presubmits.yaml using file ci-operator/jobs/openshift/machine-config-operator/openshift-machine-config-operator-release-4.3-presubmits.yaml

@openshift-ci-robot
Contributor

@vrutkovs: The following tests failed, say /retest to rerun them all:

Test name Commit Details Rerun command
ci/rehearse/openshift/machine-config-operator/release-4.3/e2e-rhel-scaleup 5b407f2 link /test pj-rehearse
ci/rehearse/openshift/machine-api-operator/release-4.3/e2e-rhel-scaleup 5b407f2 link /test pj-rehearse
ci/rehearse/openshift/installer/release-4.3/e2e-aws-scaleup-rhel7 5b407f2 link /test pj-rehearse
ci/rehearse/openshift/machine-config-operator/release-4.2/e2e-rhel-scaleup 5b407f2 link /test pj-rehearse
ci/rehearse/openshift/installer/release-4.1/e2e-aws-scaleup-rhel7 5b407f2 link /test pj-rehearse
ci/rehearse/openshift/machine-api-operator/release-4.2/e2e-rhel-scaleup 5b407f2 link /test pj-rehearse
ci/rehearse/openshift/installer/release-4.2/e2e-aws-scaleup-rhel7 5b407f2 link /test pj-rehearse
ci/prow/pj-rehearse 5b407f2 link /test pj-rehearse

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.



Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

9 participants