@vrutkovs
Contributor

bastion_ssh output may be mangled, so it should not be used to set variable
values. Instead, the first master should collect the necessary info itself,
without passing it back to the test container.
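
To illustrate the concern, here is a minimal sketch rather than the actual CI template code. `noisy_ssh` is a hypothetical stand-in for `bastion_ssh` that runs the command locally and prepends an extra line to stdout. The source does not say how the output gets mangled, so the warning line is only an assumed example. The file names and the `etcd-member-0` value are likewise made up.

```shell
#!/bin/sh
# Hypothetical stand-in for bastion_ssh: runs the command locally and
# prepends an extra line on stdout to mimic mangled session output.
noisy_ssh() {
  echo "Warning: extra output from the SSH session"
  sh -c "$1"
}

workdir="$(mktemp -d)"
echo "etcd-member-0" > "${workdir}/etcd_name"

# Fragile: the captured value holds the extra line as well as the data.
captured="$(noisy_ssh "cat '${workdir}/etcd_name'")"

# Sturdier: the remote command consumes the value itself and stores the
# result in a file on its own side; session output is never parsed.
noisy_ssh "cat '${workdir}/etcd_name' > '${workdir}/result'" > /dev/null

# In this local simulation the "remote" file can be read directly.
result="$(cat "${workdir}/result")"
```

In the sketch, `captured` ends up with two lines while `result` holds only the intended value, which is the reason for keeping the data on the first master.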

@openshift-ci-robot openshift-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label May 30, 2019
@vrutkovs vrutkovs force-pushed the dr-etcd-restore-assemble-fix branch 10 times, most recently from 7b50694 to fcb7649 Compare May 31, 2019 13:37
@vrutkovs vrutkovs force-pushed the dr-etcd-restore-assemble-fix branch 3 times, most recently from ce1db6b to 7c94cc3 Compare May 31, 2019 19:00
@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 31, 2019
@hexfusion
Contributor

/retest

@vrutkovs vrutkovs force-pushed the dr-etcd-restore-assemble-fix branch from 7c94cc3 to a6d8c5c Compare June 1, 2019 08:24
@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 1, 2019
@vrutkovs
Contributor Author

vrutkovs commented Jun 1, 2019

/test pj-rehearse

ssh-bastion remove failed

@openshift-ci-robot openshift-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jun 2, 2019
@vrutkovs vrutkovs force-pushed the dr-etcd-restore-assemble-fix branch from c02234d to 701d530 Compare June 2, 2019 12:15
@openshift-ci-robot openshift-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 2, 2019
@vrutkovs
Contributor Author

vrutkovs commented Jun 2, 2019

/test pj-rehearse

@vrutkovs
Contributor Author

vrutkovs commented Jun 2, 2019

Flakes

/test pj-rehearse

@vrutkovs
Contributor Author

vrutkovs commented Jun 3, 2019

/test pj-rehearse

@vrutkovs
Contributor Author

vrutkovs commented Jun 3, 2019

Additional masters didn't come up

/test pj-rehearse

@vrutkovs
Contributor Author

vrutkovs commented Jun 3, 2019

/test pj-rehearse

@vrutkovs vrutkovs force-pushed the dr-etcd-restore-assemble-fix branch 5 times, most recently from 252acaf to a014af5 Compare June 3, 2019 15:47
@vrutkovs
Contributor Author

vrutkovs commented Jun 3, 2019

Failing tests:

[Feature:Platform] Managed cluster should have no crashlooping pods in core namespaces over two minutes [Suite:openshift/conformance/parallel]
[Feature:Prometheus][Conformance] Prometheus when installed on the cluster should report less than two alerts in firing or pending state [Suite:openshift/conformance/parallel/minimal]
[sig-instrumentation] MetricsGrabber should grab all metrics from a Kubelet. [Suite:openshift/conformance/parallel] [Suite:k8s]

/test pj-rehearse

@openshift-ci-robot
Contributor

openshift-ci-robot commented Jun 3, 2019

@vrutkovs: The following test failed, say /retest to rerun them all:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/rehearse/openshift/machine-config-operator/master/e2e-restore-cluster-state | a014af5b4ef0be22cf4c4417d18cff76d0fb7ffb | link | /test pj-rehearse |


@vrutkovs
Contributor Author

vrutkovs commented Jun 4, 2019

/cc @hexfusion @runcom

PTAL, this fixes etcd restore tests - see https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_release/3919/rehearse-3919-pull-ci-openshift-machine-config-operator-master-e2e-restore-cluster-state/30

[Feature:Platform] Managed cluster should have no crashlooping pods in core namespaces over two minutes [Suite:openshift/conformance/parallel] is still failing. I'll address that in a follow-up PR.

`bastion_ssh` output may be mangled, so it should not be used to set variable
values. Instead, the first master should collect the necessary info itself,
without passing it back to the `test` container.
echo "Assemble etcd connection string"
bastion_ssh "core@${FIRST_MASTER}" 'rm -rf /tmp/etcd/connstring && mapfile -t MASTERS < <(ls /tmp/etcd) && echo ${MASTERS[@]} && for master in "${MASTERS[@]}"; do echo -n "$(cat /tmp/etcd/${master}/etcd_name)=$(cat /tmp/etcd/${master}/etcd_uri)," >> /tmp/etcd/connstring; done && sed -i '"'$ s/.$//'"' /tmp/etcd/connstring'
Contributor

Since etcd is running, we can use it to generate this list. We don't need to change it now; this is just a talking point. It's not perfect, but here is one idea:

etcdctl member list -w json | jq -r '.members[] | [.name,.peerURLs[0]] | "\(.[0])=\(.[1])"' | xargs | sed -e 's/ /,/g'

Contributor Author

Yeah, so I'll be rewriting this into a proper golang e2e test. Hopefully we'll have better tools than cat there and will be able to rework this.
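
For readers following this thread, the connection-string assembly can be sketched in a more readable form. This is an illustrative rewrite, not the code that was merged. It assumes the layout implied by the one-liner (one subdirectory per master containing `etcd_name` and `etcd_uri`). The function name `assemble_connstring` and the demo directory names, member names, and URIs are hypothetical.

```shell
#!/bin/sh
# assemble_connstring DIR
# Expects DIR/<master>/etcd_name and DIR/<master>/etcd_uri for each master
# and writes "name1=uri1,name2=uri2" (no trailing newline) to DIR/connstring.
assemble_connstring() {
  dir="$1"
  connstring=""
  rm -f "${dir}/connstring"
  for entry in "${dir}"/*/; do
    # Skip anything that is not a per-master directory with the expected files.
    [ -f "${entry}etcd_name" ] || continue
    pair="$(cat "${entry}etcd_name")=$(cat "${entry}etcd_uri")"
    # Add a comma only between pairs, so no trailing comma needs stripping.
    connstring="${connstring:+${connstring},}${pair}"
  done
  printf '%s' "${connstring}" > "${dir}/connstring"
}

# Demo with made-up data in a temporary directory.
demo="$(mktemp -d)"
mkdir -p "${demo}/master-0" "${demo}/master-1"
echo "etcd-member-0" > "${demo}/master-0/etcd_name"
echo "https://10.0.0.1:2380" > "${demo}/master-0/etcd_uri"
echo "etcd-member-1" > "${demo}/master-1/etcd_name"
echo "https://10.0.0.2:2380" > "${demo}/master-1/etcd_uri"
assemble_connstring "${demo}"
```

Building the string with a separator between pairs avoids the final `sed` step that strips the last character, and like the one-liner it keeps the result in a file next to the inputs instead of echoing it back through the SSH session.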

@hexfusion
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jun 5, 2019
Contributor

@abhinavdahiya abhinavdahiya left a comment

/approve

@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abhinavdahiya, hexfusion, vrutkovs


@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 5, 2019
@openshift-merge-robot openshift-merge-robot merged commit ccd42f5 into openshift:master Jun 5, 2019
@openshift-ci-robot
Contributor

@vrutkovs: Updated the following 2 configmaps:

  • prow-job-cluster-launch-installer-e2e configmap in namespace ci using the following files:
    • key cluster-launch-installer-e2e.yaml using file ci-operator/templates/openshift/installer/cluster-launch-installer-e2e.yaml
  • prow-job-cluster-launch-installer-e2e configmap in namespace ci-stg using the following files:
    • key cluster-launch-installer-e2e.yaml using file ci-operator/templates/openshift/installer/cluster-launch-installer-e2e.yaml

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.

5 participants