Skip to content

Conversation

@eisraeli
Copy link
Contributor

@eisraeli eisraeli commented Aug 31, 2020

Signed-off-by: Eran Israeli <eisraeli@redhat.com>
@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 31, 2020
@openshift-ci-robot
Copy link
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

set -o errexit
set -o pipefail

echo "************ baremetalds e2e upgrade test command ************" No newline at end of file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This step is no needed, openshift-e2e-test has it

- ref: baremetalds-devscripts-gather
# - ref: baremetalds-packet-teardown
env:
TEST_SUITE: "all"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would override openshift-e2e-test TEST_SUITE, its not required

@@ -0,0 +1,50 @@
#!/bin/bash
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is not required - openshift-e2e-test handles that

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not possible to use the openshift-e2e-test in the current workflow since we're using a different cluster profile (packet) and dev-scripts to setup remotely the cluster (see baremetalds-devscript-setup step). @eisraeli I think anyhow that it could be worth reusing the same approach in baremetalds-e2e-test step, ie running upgrade tests if a variable is set, in order to have a single test step for the baremetal workflow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

openshift-e2e-tests reads KUBECONFIG and doesn't need direct access to the cluster

Copy link
Contributor

@andfasano andfasano Sep 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears that the current implementation support packet profile (see

*) echo >&2 "Unsupported cluster type '${CLUSTER_TYPE}'"; exit 1;;
), and currently our KUBECONFIG points to a file generated by dev-scripts in the remote packet instance (
echo 'export KUBECONFIG=/root/dev-scripts/ocp/ostest/auth/kubeconfig' >> /root/.bashrc
). Moreover, I would not recommend such refactoring within the scope of the current PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'd have to add support for it then. I don't think it has any specific flags, you'd need to reuse existing legacy templates.

currently our KUBECONFIG points to a file generated by dev-scripts in the remote packet instance

Copy it to ${SHARED_DIR} - AWS install does that:

cp \
    -t "${SHARED_DIR}" \
    "${dir}/auth/kubeconfig" \
    "${dir}/metadata.json"

Moreover, I would not recommend such refactoring within the scope of the current PR.

This is necessary for this PR to land and be approved by OTA team

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The API server is not accessible remotely

I'm not seeing any tricks to access it in this commit. If packet requires any I suppose we can fork the upgrade test step, but that is very undesirable and goes against the whole purpose of multistep testing.

What is the OTA team?

Team taking care of upgrades, including CI upgrade tests.

Copy link
Member

@stbenjam stbenjam Sep 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The API server is not accessible remotely

I'm not seeing any tricks to access it in this commit. If packet requires any I suppose we can fork the upgrade test step, but that is very undesirable and goes against the whole purpose of multistep testing.

What do you mean? It's not about Packet. It's because we save significant amounts of money by running baremetal clusters through emulated baremetal with libvirt + vbmc/sushy-tools. The cluster is contained within the libvirt networks on the host, therefore the API is not exposed on the public IP. Maybe we could do it with some iptables/xinetd trickery, but we haven't looked into that yet.

The entire purpose of baremetalds-e2e-upgrade-test-commands is that it runs the e2e-tests suites by SSH'ing to the host. I agree about mulisteps tests should be useful for everyone, and in fact the baremetalds workflows existed long before any other platforms did, and when it came time to create the ipi, upgrade, etc steps our specific use cases weren't considered so most of our steps remained special.

Copy link
Contributor

@vrutkovs vrutkovs Sep 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cluster is contained within the libvirt networks on the host, therefore the API is not exposed on the public IP

This workflow would create pods on api.ci / build01 / build02 CI cluster. How would they reach the created cluster?

This should should not hardcode the release, it should not use oc adm upgrade or any other home-grown way to upgrade the cluster. It should use openshift-tests run-upgrade --to-image=${OPENSHIFT_UPGRADE_RELEASE_IMAGE_OVERRIDE}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • They do not reach directly the cluster, all the operations are ssh'ed to the target Packet host that is currently containing the newly spawned cluster
  • We are in sync to not hardcode the release: when Eran started the activity the feature was not yet available for the multistage tests (the PR is still a draft)
  • The goal is to run an oc adm upgrade - or any other standard test command that will perform the upgrade on the target cluster (ie like the openshift-tests run-upgrade you suggested I suppose)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, okay, that was my concern - upgrade should happen via openshift-tests run-upgrade. This test should be able to reach the target host via ssh (iiuc its already implemented in templates)

Signed-off-by: Eran Israeli <eisraeli@redhat.com>
openshift-tests "${TEST_COMMAND}" "${TEST_SUITE}" ${TEST_ARGS:-} \
# --provider "${TEST_PROVIDER}" \
-o /tmp/artifacts/e2e.log \
--junit-dir /tmp/artifacts/junit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that such logic should be merged within the existing baremetalds-e2e-test command, which already performs some work for filtering and reporting results (probably just reusing ${TEST_COMMAND} could be enough).
In this way, during the upgrade workflow, only the upgrade tests will be executed by the workflow test step, without the need of an additional dedicated step.

Copy link
Contributor Author

@eisraeli eisraeli Oct 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it definitely should be merged with the existing openshift-e2e-test command at some point.
As you indicated before here:
https://github.com/openshift/release/pull/11436/files#r489434008
baremetal-e2e-test command currently does not support packet ATM.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in that comment https://github.com/openshift/release/pull/11436/files#r489434008 I was referring another step (openshift-e2e-test) defined outside the baremetalds workflow.

The baremetads-e2e-test instead it's the step currently being executed by the baremetalds workflow, and it's one that will require to be modified to accommodate upgrade tests execution

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry. I guess I misread that.
For some reason I was thinking of openshift-e2e-test while you referring to baremetalds-e2e-test


echo "Finished upgrading from version $original_version to version $upgrade_version_label !"

END
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This entire command could be removed

Copy link
Contributor Author

@eisraeli eisraeli Oct 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you referring to the commented out section which used 'oc adm upgrade' ? - I removed it.
baremetalds-e2e-upgrade-upgrade-commands.sh now contains only the 'openshift-tests run-upgrade' command.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my previous comment, about reusing the baremetalds-e2e-test step for executing upgrade tests (and reporting correctly back the results). This will align our execution model also with the ones of the other cluster profiles.

@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: eisraeli
To complete the pull request process, please assign andfasano after the PR has been reviewed.
You can assign the PR to them by writing /assign @andfasano in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Name of the CSI driver manifest file to use. Used by the `openshift-tests`
program as TEST_CSI_DRIVER_FILES env. var., see its documentation for
details. The file must be present in ${SHARED_DIR}.
resources:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
resources:
dependencies:
- name: "release:latest"
env: OPENSHIFT_UPGRADE_RELEASE_IMAGE_OVERRIDE
resources:

See https://github.com/openshift/release/blob/master/ci-operator/step-registry/openshift/e2e/test/openshift-e2e-test-ref.yaml#L27-L29 - this sets env var with the release used to upgrade. In the periodic job external script would create release:latest and release:initial

echo "${TESTS}" | grep "${TEST_SKIPS}"
TEST_ARGS="${TEST_ARGS:-} --file /tmp/tests"
fi

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if -n "${OPENSHIFT_UPGRADE_RELEASE_IMAGE_OVERRIDE}" ]]; then
# OPENSHIFT_UPGRADE_RELEASE_IMAGE_OVERRIDE is a pullspec of release:latest imagestreamtag
TEST_ARGS="${TEST_ARGS:-} --to-image=${OPENSHIFT_UPGRADE_RELEASE_IMAGE_OVERRIDE}"
fi

See https://github.com/openshift/release/blob/master/ci-operator/step-registry/openshift/e2e/test/openshift-e2e-test-commands.sh#L73-L76

Signed-off-by: Eran Israeli <eisraeli@redhat.com>
@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 13, 2020
@openshift-ci-robot
Copy link
Contributor

@eisraeli: PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot
Copy link
Contributor

@eisraeli: The following tests failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
ci/prow/ci-testgrid-allow-list cefbbdd link /test ci-testgrid-allow-list
ci/rehearse/openshift/baremetal-operator/master/e2e-metal-ipi 28b0272a157188467b531d28ed4bf8282cc7be9d link /test pj-rehearse
ci/prow/owners 015fa4d link /test owners
ci/build-farm/app-ci-config-dry 015fa4d link /test app-ci-config-dry
ci/prow/pj-rehearse 015fa4d link /test pj-rehearse
ci/build-farm/build01-dry 015fa4d link /test build01-dry
ci/prow/release-controller-config 015fa4d link /test release-controller-config
ci/prow/ci-operator-config-metadata 015fa4d link /test ci-operator-config-metadata
ci/prow/core-dry 015fa4d link /test core-dry
ci/build-farm/build02-dry 015fa4d link /test build02-dry
ci/prow/ci-operator-registry 015fa4d link /test ci-operator-registry
ci/prow/generated-dashboards 015fa4d link /test generated-dashboards
ci/prow/config 015fa4d link /test config
ci/prow/prow-config-semantics 015fa4d link /test prow-config-semantics
ci/prow/boskos-config 015fa4d link /test boskos-config
ci/prow/prow-config 015fa4d link /test prow-config
ci/prow/services-dry 015fa4d link /test services-dry
ci/build-farm/vsphere-dry 015fa4d link /test vsphere-dry
ci/prow/step-registry-metadata 015fa4d link /test step-registry-metadata
ci/prow/yamllint 015fa4d link /test yamllint
ci/prow/generated-config 015fa4d link /test generated-config
ci/prow/ordered-prow-config 015fa4d link /test ordered-prow-config
ci/prow/correctly-sharded-config 015fa4d link /test correctly-sharded-config
ci/prow/step-registry-shellcheck 015fa4d link /test step-registry-shellcheck
ci/prow/ci-operator-config 015fa4d link /test ci-operator-config
ci/prow/prow-config-filenames 015fa4d link /test prow-config-filenames
ci/prow/core-valid 015fa4d link /test core-valid
ci/prow/services-valid 015fa4d link /test services-valid

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@eisraeli
Copy link
Contributor Author

Did some cleanup on this draft and decided to submit a fresh PR

@eisraeli eisraeli closed this Oct 13, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants