@wking
Member

@wking wking commented Oct 6, 2020

Using the openshift-e2e-gcp workflow and overriding the test step per these docs to run our operator tests instead of the usual e2e suite.

I've dropped cincinnati from the job name, because this presubmit only runs in the cincinnati-operator repository. The fact that it is operator-e2e is sufficient to distinguish from other presubmits in that repository.

I've dropped aws from the job name, because we are platform-agnostic.

Generated by editing ci-operator/config and then running:

$ make update

WIP because once we get a green rehearsal I'll extend this to cover 4.6+.

@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress and approved labels Oct 6, 2020
@openshift-ci-robot
Contributor

@wking: The following tests failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/rehearse/openshift/cincinnati-operator/master/operator-e2e | 32e11cd9093b8cfd0dbf8bdd9273ed2e5a160959 | link | /test pj-rehearse |
| ci/prow/pj-rehearse | 32e11cd9093b8cfd0dbf8bdd9273ed2e5a160959 | link | /test pj-rehearse |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@wking
Member Author

wking commented Oct 6, 2020

Hmm, we need oc, our operator source checkout, and a Go toolchain. Maybe that's too much to pack into a single step? Also in this space is OLM-operator CI support.

@vrutkovs
Contributor

vrutkovs commented Oct 7, 2020

> Hmm, we need oc, our operator source checkout, and a Go toolchain. Maybe that's too much to pack into a single step?

Sounds like we need a functests/Dockerfile based on the Go toolchain, with oc included.

@jottofar
Contributor

jottofar commented Oct 7, 2020

Not sure the e2e test should be using deploy.sh. That's just a convenience script for deploying manually. It's also used when running the unit tests. I would think we want our e2e test to exercise the way this will be deployed in the field, which is per step 11 of https://github.com/openshift/cincinnati-operator/blob/master/docs/disconnected-cincinnati-operator.md, or at least along those lines, and/or per the OLM-operator CI support doc you referenced.

@jottofar
Contributor

jottofar commented Oct 7, 2020

Also, I'm in the process of fixing deploy.sh because the files it references have changed locations and names.

@wking wking force-pushed the step-workflow-cincy-operator-e2e branch from 32e11cd to 7abfc1a Compare October 14, 2020 22:32
@wking
Member Author

wking commented Oct 14, 2020

With 32e11cd909 -> 7abfc1a8c4, I've rebased onto master and added cli: initial now that openshift/ci-tools#1296 has given us oc injection.

@wking wking force-pushed the step-workflow-cincy-operator-e2e branch from 7abfc1a to 4eb6956 Compare October 14, 2020 22:50
@wking
Member Author

wking commented Oct 15, 2020

Still failing to build ginkgo:

 	bitbucket.org/ww/[email protected]: reading https://api.bitbucket.org/2.0/repositories/ww/goautoneg?fields=scm: 404 Not Found
hack/functest.sh: line 17: /go/bin/ginkgo: No such file or directory
hack/functest.sh: line 19: /go/bin/ginkgo: No such file or directory
make: *** [func-test] Error 127
error: failed to execute wrapped command: exit status 2

@wking
Member Author

wking commented Oct 20, 2020

openshift/cincinnati-operator#69 landed.

/retest

@wking
Member Author

wking commented Oct 21, 2020

/test pj-rehearse

@wking wking force-pushed the step-workflow-cincy-operator-e2e branch from 4eb6956 to f547d44 Compare October 22, 2020 23:58
@wking
Member Author

wking commented Oct 23, 2020

Rehearsal:

$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_release/12486/rehearse-12486-pull-ci-openshift-cincinnati-operator-master-operator-e2e/1318745953997426688/build-log.txt | grep -1 'into stable\|panic'
2020/10/21 02:51:30 Build cincinnati-graph-data-container succeeded after 1m28s
2020/10/21 02:51:30 Tagging cincinnati-graph-data-container into stable
2020/10/21 02:51:49 Build cincinnati-operator succeeded after 1m47s
2020/10/21 02:51:49 Tagging cincinnati-operator into stable
2020/10/21 02:51:50 Create release image registry.build01.ci.openshift.org/ci-op-qwfkx85i/release:latest
--
I1021 03:29:37.177795    4880 utils.go:121] Waiting for full availability of cincinnati-operator deployment (0/1)
panic: test timed out after 10m0s

because of:

$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_release/12486/rehearse-12486-pull-ci-openshift-cincinnati-operator-master-operator-e2e/1318745953997426688/artifacts/operator-e2e/gather-extra/pods.json | jq -r '.items[] | select(.metadata.name | startswith("cincinnati-operator-")).status.containerStatuses[].state.waiting.message' 
Back-off pulling image "registry.svc.ci.openshift.org/ci-op-qwfkx85i/stable:cincinnati-operator"

Like here. The failure is because the operator-repo-hard-coded registry.svc.ci.openshift.org default does not match the registry.build01.ci.openshift.org where the CI operator was injecting the images. 4eb69563b3 -> f547d44f93 moves to explicit dependency images, dropping our reliance on the unreliable operator-repo-hard-coded values. A few more minor details are documented in the f547d44f93 commit message.
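For readers without jq handy, the pods.json query above can be approximated in Python. This is only a sketch, not part of the PR: the function name `waiting_messages` and the sample document (including the pod names) are made up for illustration, and only the field names and the back-off message are taken from the jq filter and its output.

```python
import json


def waiting_messages(pods_json: str, prefix: str = "cincinnati-operator-") -> list:
    """Collect container waiting messages for pods whose name starts with prefix.

    Approximates the jq filter above. Containers that are not in a waiting
    state yield None, which `jq -r` would print as "null".
    """
    messages = []
    for pod in json.loads(pods_json).get("items", []):
        if not pod["metadata"]["name"].startswith(prefix):
            continue
        for status in pod.get("status", {}).get("containerStatuses", []):
            waiting = status.get("state", {}).get("waiting") or {}
            messages.append(waiting.get("message"))
    return messages


# Hypothetical, heavily trimmed stand-in for the gather-extra pods.json artifact.
SAMPLE_PODS = json.dumps({
    "items": [
        {
            "metadata": {"name": "cincinnati-operator-example"},
            "status": {"containerStatuses": [{"state": {"waiting": {
                "message": 'Back-off pulling image "registry.svc.ci.openshift.org/ci-op-qwfkx85i/stable:cincinnati-operator"',
            }}}]},
        },
        {
            "metadata": {"name": "unrelated-pod"},
            "status": {"containerStatuses": [{"state": {"running": {}}}]},
        },
    ]
})

for message in waiting_messages(SAMPLE_PODS):
    print(message)
```

Unlike the jq one-liner, this sketch tolerates pods that have no containerStatuses yet instead of failing on them.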

@wking wking force-pushed the step-workflow-cincy-operator-e2e branch from f547d44 to 268a47a Compare October 23, 2020 00:06
@wking
Member Author

wking commented Oct 23, 2020

Hrm:

$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_release/12486/rehearse-12486-pull-ci-openshift-cincinnati-operator-master-operator-e2e/1319430438657200128/build-log.txt | grep -1 'into stable\|panic'
2020/10/23 00:15:02 Build cincinnati-operator succeeded after 3m51s
2020/10/23 00:15:02 Tagging cincinnati-operator into stable
2020/10/23 00:26:03 Build cincinnati-graph-data-container succeeded after 14m52s
2020/10/23 00:26:03 Tagging cincinnati-graph-data-container into stable
2020/10/23 00:26:03 Create release image registry.build01.ci.openshift.org/ci-op-q1invh8m/release:latest
--
I1023 01:21:33.184314    4893 utils.go:121] Waiting for full availability of cincinnati-operator deployment (0/1)
panic: test timed out after 10m0s

$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_release/12486/rehearse-12486-pull-ci-openshift-cincinnati-operator-master-operator-e2e/1319430438657200128/artifacts/operator-e2e/gather-extra/pods.json | jq -r '.items[] | select(.metadata.name | startswith("cincinnati-operator-")).status.containerStatuses[].state.waiting.message'
container create failed: time="2020-10-23T01:26:41Z" level=error msg="container_linux.go:366: starting container process caused: exec: \"cincinnati-operator\": executable file not found in $PATH"

Progress 😆

@wking
Member Author

wking commented Oct 23, 2020

I think that's good enough, and that we should land this as it stands, go clean some stuff up in the operator repo, and then come back and polish off the remaining hacks on this side.

@wking wking changed the title from "WIP: ci-operator/config/openshift/cincinnati-operator: Move e2e-operator to multi-step" to "ci-operator/config/openshift/cincinnati-operator: Move e2e-operator to multi-step" Oct 23, 2020
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress label Oct 23, 2020
…o multi-step

Using the openshift-e2e-gcp workflow and overriding the test step per
[1] to run our operator tests instead of the usual e2e suite.

I've dropped "cincinnati" from the job name, because this presubmit
only runs in the cincinnati-operator repository.  The fact that it is
operator-e2e is sufficient to distinguish from other presubmits in
that repository.

I've dropped "aws" from the job name, because we are platform-agnostic
(see ci-operator/platform-balance).

The 'cli: initial' property injects 'oc' into the step container
[2,3], because we need 'oc', a Go toolchain, and our source checkout
to run the CI suite.

The dependencies avoid [4]:

  $ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_release/12486/rehearse-12486-pull-ci-openshift-cincinnati-operator-master-operator-e2e/1318745953997426688/build-log.txt | grep -1 'into stable\|panic'
  2020/10/21 02:51:30 Build cincinnati-graph-data-container succeeded after 1m28s
  2020/10/21 02:51:30 Tagging cincinnati-graph-data-container into stable
  2020/10/21 02:51:49 Build cincinnati-operator succeeded after 1m47s
  2020/10/21 02:51:49 Tagging cincinnati-operator into stable
  2020/10/21 02:51:50 Create release image registry.build01.ci.openshift.org/ci-op-qwfkx85i/release:latest
  --
  I1021 03:29:37.177795    4880 utils.go:121] Waiting for full availability of cincinnati-operator deployment (0/1)
  panic: test timed out after 10m0s

because of:

  $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_release/12486/rehearse-12486-pull-ci-openshift-cincinnati-operator-master-operator-e2e/1318745953997426688/artifacts/operator-e2e/gather-extra/pods.json | jq -r '.items[] | select(.metadata.name | startswith("cincinnati-operator-")).status.containerStatuses[].state.waiting.message'
  Back-off pulling image "registry.svc.ci.openshift.org/ci-op-qwfkx85i/stable:cincinnati-operator"

The failure is because the operator-repo-hard-coded
registry.svc.ci.openshift.org default does not match the
registry.build01.ci.openshift.org where the CI operator was injecting
the images.  By using explicit dependency images, we drop our reliance
on the unreliable operator-repo-hard-coded values.

I'm also setting OPERAND_IMAGE to the most recent published image:

  $ skopeo inspect docker://quay.io/app-sre/cincinnati@sha256:d1d2f881bce1a1375ec8470133ee0a912164b8a7ecce19aac24d24e623aef59b | jq -r .Created
  2020-10-12T17:08:41.179845937Z

In a future pivot we'll pull the operand image out of CI too, instead
of hard-coding.  But with this change we at least move the hard-coding
into the CI repository.

And I'm clearing OPENSHIFT_BUILD_NAMESPACE, because hack/deploy.sh
uses it to clobber both OPERATOR_IMAGE and GRAPH_DATA_IMAGE [4], and
we don't want those clobbered anymore.  Once we have green CI, we can
update the operator repo to simplify the logic.
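The clobbering described above can be modelled roughly as follows. This is a hypothetical Python model, not a copy of hack/deploy.sh: the function name `resolve_images` is invented, and the pull-spec pattern is inferred from the back-off message and build log quoted earlier, so treat both as assumptions.

```python
# Assumed default registry, taken from the failing pull spec quoted above.
HARD_CODED_REGISTRY = "registry.svc.ci.openshift.org"


def resolve_images(env: dict) -> dict:
    """Return the images a deploy script with this clobbering logic would use.

    If OPENSHIFT_BUILD_NAMESPACE is non-empty, both images are overwritten
    with hard-coded-registry pull specs, regardless of what the caller set.
    """
    images = {
        "OPERATOR_IMAGE": env.get("OPERATOR_IMAGE", ""),
        "GRAPH_DATA_IMAGE": env.get("GRAPH_DATA_IMAGE", ""),
    }
    namespace = env.get("OPENSHIFT_BUILD_NAMESPACE", "")
    if namespace:
        images["OPERATOR_IMAGE"] = (
            f"{HARD_CODED_REGISTRY}/{namespace}/stable:cincinnati-operator"
        )
        images["GRAPH_DATA_IMAGE"] = (
            f"{HARD_CODED_REGISTRY}/{namespace}/stable:cincinnati-graph-data-container"
        )
    return images


# Placeholder pull specs standing in for the explicit dependency images.
explicit = {
    "OPERATOR_IMAGE": "example.invalid/operator@sha256:placeholder",
    "GRAPH_DATA_IMAGE": "example.invalid/graph-data@sha256:placeholder",
}

# With the namespace set, the explicit values are lost.
print(resolve_images({**explicit, "OPENSHIFT_BUILD_NAMESPACE": "ci-op-qwfkx85i"}))
# With the namespace cleared, the explicit values survive.
print(resolve_images({**explicit, "OPENSHIFT_BUILD_NAMESPACE": ""}))
```

Under this model, clearing OPENSHIFT_BUILD_NAMESPACE is what lets the explicitly injected images reach the deployment.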

Generated by editing ci-operator/config and then running:

  $ make update

[1]: https://steps.ci.openshift.org/help#config
[2]: openshift/ci-tools#1296
[3]: https://docs.ci.openshift.org/docs/architecture/step-registry/#injecting-the-oc-cli
[4]: https://github.com/openshift/cincinnati-operator/blob/8fce9de9dfe004249b9b19a83d1cbec3c4095965/hack/deploy.sh#L11
@wking wking force-pushed the step-workflow-cincy-operator-e2e branch from 268a47a to efcafb6 Compare October 23, 2020 02:31
@wking
Member Author

wking commented Oct 23, 2020

/assign @jottofar

@openshift-merge-robot
Contributor

@wking: The following tests failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/rehearse/openshift/cincinnati-operator/release-4.5/operator-e2e | efcafb6 | link | /test pj-rehearse |
| ci/rehearse/openshift/cincinnati-operator/release-4.6/operator-e2e | efcafb6 | link | /test pj-rehearse |
| ci/rehearse/openshift/cincinnati-operator/release-4.8/operator-e2e | efcafb6 | link | /test pj-rehearse |
| ci/rehearse/openshift/cincinnati-operator/master/operator-e2e | efcafb6 | link | /test pj-rehearse |
| ci/rehearse/openshift/cincinnati-operator/release-4.7/operator-e2e | efcafb6 | link | /test pj-rehearse |
| ci/prow/pj-rehearse | efcafb6 | link | /test pj-rehearse |

@jottofar
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm label Oct 23, 2020
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jottofar, wking

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot openshift-merge-robot merged commit 37939f1 into openshift:master Oct 23, 2020
@openshift-ci-robot
Contributor

@wking: Updated the following 15 configmaps:

  • job-config-4.6 configmap in namespace ci at cluster app.ci using the following files:
    • key openshift-cincinnati-operator-release-4.6-presubmits.yaml using file ci-operator/jobs/openshift/cincinnati-operator/openshift-cincinnati-operator-release-4.6-presubmits.yaml
  • job-config-4.8 configmap in namespace ci at cluster app.ci using the following files:
    • key openshift-cincinnati-operator-release-4.8-presubmits.yaml using file ci-operator/jobs/openshift/cincinnati-operator/openshift-cincinnati-operator-release-4.8-presubmits.yaml
  • ci-operator-4.7-configs configmap in namespace ci at cluster app.ci using the following files:
    • key openshift-cincinnati-operator-release-4.7.yaml using file ci-operator/config/openshift/cincinnati-operator/openshift-cincinnati-operator-release-4.7.yaml
  • job-config-master configmap in namespace ci at cluster app.ci using the following files:
    • key openshift-cincinnati-operator-master-presubmits.yaml using file ci-operator/jobs/openshift/cincinnati-operator/openshift-cincinnati-operator-master-presubmits.yaml
  • job-config-4.5 configmap in namespace ci at cluster app.ci using the following files:
    • key openshift-cincinnati-operator-release-4.5-presubmits.yaml using file ci-operator/jobs/openshift/cincinnati-operator/openshift-cincinnati-operator-release-4.5-presubmits.yaml
  • job-config-4.6 configmap in namespace ci at cluster api.ci using the following files:
    • key openshift-cincinnati-operator-release-4.6-presubmits.yaml using file ci-operator/jobs/openshift/cincinnati-operator/openshift-cincinnati-operator-release-4.6-presubmits.yaml
  • job-config-4.7 configmap in namespace ci at cluster app.ci using the following files:
    • key openshift-cincinnati-operator-release-4.7-presubmits.yaml using file ci-operator/jobs/openshift/cincinnati-operator/openshift-cincinnati-operator-release-4.7-presubmits.yaml
  • ci-operator-4.5-configs configmap in namespace ci at cluster app.ci using the following files:
    • key openshift-cincinnati-operator-release-4.5.yaml using file ci-operator/config/openshift/cincinnati-operator/openshift-cincinnati-operator-release-4.5.yaml
  • ci-operator-4.6-configs configmap in namespace ci at cluster app.ci using the following files:
    • key openshift-cincinnati-operator-release-4.6.yaml using file ci-operator/config/openshift/cincinnati-operator/openshift-cincinnati-operator-release-4.6.yaml
  • job-config-4.8 configmap in namespace ci at cluster api.ci using the following files:
    • key openshift-cincinnati-operator-release-4.8-presubmits.yaml using file ci-operator/jobs/openshift/cincinnati-operator/openshift-cincinnati-operator-release-4.8-presubmits.yaml
  • job-config-master configmap in namespace ci at cluster api.ci using the following files:
    • key openshift-cincinnati-operator-master-presubmits.yaml using file ci-operator/jobs/openshift/cincinnati-operator/openshift-cincinnati-operator-master-presubmits.yaml
  • job-config-4.7 configmap in namespace ci at cluster api.ci using the following files:
    • key openshift-cincinnati-operator-release-4.7-presubmits.yaml using file ci-operator/jobs/openshift/cincinnati-operator/openshift-cincinnati-operator-release-4.7-presubmits.yaml
  • job-config-4.5 configmap in namespace ci at cluster api.ci using the following files:
    • key openshift-cincinnati-operator-release-4.5-presubmits.yaml using file ci-operator/jobs/openshift/cincinnati-operator/openshift-cincinnati-operator-release-4.5-presubmits.yaml
  • ci-operator-master-configs configmap in namespace ci at cluster app.ci using the following files:
    • key openshift-cincinnati-operator-master.yaml using file ci-operator/config/openshift/cincinnati-operator/openshift-cincinnati-operator-master.yaml
  • ci-operator-4.8-configs configmap in namespace ci at cluster app.ci using the following files:
    • key openshift-cincinnati-operator-release-4.8.yaml using file ci-operator/config/openshift/cincinnati-operator/openshift-cincinnati-operator-release-4.8.yaml

@wking wking deleted the step-workflow-cincy-operator-e2e branch October 23, 2020 14:16
wking added a commit to wking/openshift-release that referenced this pull request Oct 23, 2020
…incy

The Cincinnati image is the operand, not the operator.  Fixes a typo
from efcafb6 (ci-operator/config/openshift/cincinnati-operator:
Move e2e-operator to multi-step, 2020-10-06, openshift#12486).
wking added a commit to wking/openshift-release that referenced this pull request Nov 3, 2023
…p-published-graph-data, etc.

Moving to a recent Go builder, based on [1] and:

  $ oc -n ocp get -o json imagestream builder | jq -r '.status.tags[] | select(.items | length > 0) | .items[0].created + " " + .tag' | sort | grep golang
  ...
  2023-11-02T19:53:15Z rhel-8-golang-1.18-openshift-4.11
  2023-11-02T19:53:23Z rhel-8-golang-1.17-openshift-4.11
  2023-11-02T20:49:19Z rhel-8-golang-1.19-openshift-4.13
  2023-11-02T20:49:25Z rhel-9-golang-1.19-openshift-4.13
  2023-11-02T21:54:25Z rhel-9-golang-1.20-openshift-4.14
  2023-11-02T21:54:46Z rhel-8-golang-1.20-openshift-4.14
  2023-11-02T21:55:24Z rhel-8-golang-1.19-openshift-4.14
  2023-11-02T21:55:29Z rhel-9-golang-1.19-openshift-4.14
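The same listing can be produced from the imagestream JSON without jq, sort, and grep. The following is a rough Python equivalent under stated assumptions: the function name `golang_tags_by_creation` is invented, and the sample JSON is a tiny hand-written stand-in (with two non-Go tags made up for illustration), not real `oc` output.

```python
import json


def golang_tags_by_creation(imagestream_json: str) -> list:
    """List "<created> <tag>" lines for golang tags, oldest first.

    Follows the pipeline above: skip tags with no items, take the first
    item's creation time, sort the lines, and keep those containing "golang".
    """
    lines = []
    for tag in json.loads(imagestream_json)["status"]["tags"]:
        items = tag.get("items") or []
        if items:
            lines.append(f'{items[0]["created"]} {tag["tag"]}')
    # ISO 8601 UTC timestamps sort chronologically as plain strings.
    return [line for line in sorted(lines) if "golang" in line]


SAMPLE_IMAGESTREAM = json.dumps({
    "status": {"tags": [
        {"tag": "rhel-9-golang-1.20-openshift-4.14",
         "items": [{"created": "2023-11-02T21:54:25Z"}]},
        {"tag": "rhel-8-golang-1.18-openshift-4.11",
         "items": [{"created": "2023-11-02T19:53:15Z"}]},
        # Hypothetical entries to exercise the two filters.
        {"tag": "not-a-go-builder", "items": [{"created": "2023-11-02T20:00:00Z"}]},
        {"tag": "rhel-8-golang-unpopulated", "items": []},
    ]}
})

for line in golang_tags_by_creation(SAMPLE_IMAGESTREAM):
    print(line)
```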

I'd tried dropping the build_root stanza, because we didn't seem to
need the functionality it delivers [2].  But that removal caused
failures like [3]:

  Failed to load CI Operator configuration" error="invalid ci-operator config: invalid configuration: when 'images' are specified 'build_root' is required and must have image_stream_tag, project_image or from_repository set" source-file=ci-operator/config/openshift/cincinnati-operator/openshift-cincinnati-operator-master.yaml

And [2] docs a need for Git, which apparently the UBI images don't
have.  So I'm using a Go image here still, even though we don't need
Go, and although that means some tedious bumping to keep up with RHEL
and Go versions instead of floating.
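The rule behind that error message can be sketched as a small check. This Python model only restates the message text quoted above; it is not ci-operator's actual validation code, and the function name `validate_build_root` and the sample configurations are illustrative assumptions.

```python
BUILD_ROOT_SOURCES = ("image_stream_tag", "project_image", "from_repository")


def validate_build_root(config: dict) -> list:
    """Return error strings for a config that has 'images' but no usable 'build_root'."""
    errors = []
    if config.get("images"):
        build_root = config.get("build_root") or {}
        if not any(build_root.get(key) for key in BUILD_ROOT_SOURCES):
            errors.append(
                "when 'images' are specified 'build_root' is required and must "
                "have image_stream_tag, project_image or from_repository set"
            )
    return errors


# Dropping build_root while keeping images reproduces the failure.
without_root = {"images": [{"to": "cincinnati-operator"}]}
print(validate_build_root(without_root))

# Pointing build_root at one of the Go builder tags listed earlier passes.
with_root = {
    "images": [{"to": "cincinnati-operator"}],
    "build_root": {"image_stream_tag": {
        "namespace": "ocp",
        "name": "builder",
        "tag": "rhel-9-golang-1.20-openshift-4.14",
    }},
}
print(validate_build_root(with_root))
```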

The operators stanza doc'ed in [4] remains largely unchanged, although
I did rename 'cincinnati_operand_latest' to 'cincinnati-operand',
because these tests use a single operand image, and there is no need
to distinguish between multiple operand images with "latest".

The image used for operator-sdk (which I bump to an OpenShift 4.14
base) and its use are doc'ed in [5].  The 4.14 cluster-claim pool I'm
transitioning to is listed as healthy in [6].

For the end-to-end tests, we install the operator via the test suite,
so we do not need the SDK bits.  I've dropped OPERATOR_IMAGE, because
we are well past the transition initiated by eae9d38
(ci-operator/config/openshift/cincinnati-operator: Set
RELATED_IMAGE_*, 2021-04-05, openshift#17435) and
openshift/cincinnati-operator@799d18525b (Changing the name to make
OSBS auto repo/registry replacements to work, 2021-04-06,
openshift/cincinnati-operator#104).

I'm consistently using the current Cincinnati operand instead of the
pinned one, because we ship the OpenShift Update Service Operator as a
bundle with the operator and operand, and while it might be useful to
grow update-between-OSUS-releases test coverage, we do not expect long
durations of new operators coexisting with old-image operand pods.
And we never expect new operators to touch Deployments with old
operand images, except to bump them to new operand images.  We'd been
using digest-pinned operand images here since efcafb6
(ci-operator/config/openshift/cincinnati-operator: Move e2e-operator
to multi-step, 2020-10-06, openshift#12486), where I said:

  In a future pivot we'll pull the operand image out of CI too,
  instead of hard-coding.  But with this change we at least move the
  hard-coding into the CI repository.

4f46d7e (cincinnati-operator: test operator against released OSUS
version and latest master, 2022-01-11, openshift#25152) brought in that
floating operand image but, for reasons that I am not clear on, did
not drop the digest-pinned operand.  I'm dropping it now.

With "which operand image" removed as a differentiator, the remaining
differentiators for the end-to-end tests are:

* Which host OpenShift?
  * To protect from "new operators require new platform capabilities
    not present in older OpenShift releases", we have an old-ocp job.
    It's currently 4.11 for the oldest supported release [7].
  * To protect from "new operators still use platform capabilities
    that have been removed from development branches of OpenShift", we
    have a new-ocp job.  It's currently 4.14, as the most modern
    openshift-ci pool in [6], but if there were a 4.15 openshift-ci
    pool I'd use that to ensure we work on dev-branch engineering
    candidates like 4.15.0-ec.1.
  * To protect against "HyperShift does something the operator does
    not expect", we have a hypershift job.  I'd prefer to defer "which
    version?" to the workflow, because we do not expect
    HyperShift-specific differences to evolve much between 4.y
    releases, while the APIs used by the operator (Deployments,
    Services, Routes, etc.) might.  But perhaps I'm wrong, and we will
    see more API evolution during HyperShift minor versions.  And in
    any case, today 4.14 fails with [8]:

      Unable to apply 4.14.1: some cluster operators are not available

    so in the short term I'm going with 4.13, but with a generic name
    so we only have to bump one place as HyperShift support improves.
  * I'm not worrying about enumerating all the current 4.y options
    like we had done before.  That is more work to maintain, and
    renaming required jobs confuses Prow and requires an /override of
    the removed job.  It seems unlikely that we work on 4.old, break
    on some 4.middle, and work again on 4.dev.  Again, we can always
    revisit this if we change our minds about the exposure.

* Which graph-data?
  * To protect against "I updated my OSUS without changing the
    graph-data image, and it broke", we have published-graph-data
    jobs.  These consume images that were built by previous
    postsubmits in the cincinnati-graph-data repository.
  * We could theoretically also add coverage for older forms of
    graph-data images we suspect customers might be using.  I'm
    punting this kind of thing to possible future work, if we decide
    the exposure is significant enough to warrant ongoing CI coverage.
  * To allow testing new features like serving signatures, we have a
    local-graph-data job.  This consumes a graph-data image built from
    steps in the operator repository, allowing convenient testing of
    changes that simultaneously tweak the operator and how the
    graph-data image is built.  For example, [9] injects an image
    signature into graph-data, and updates graph-data to serve it.
    I'm setting a GRAPH_DATA environment variable to 'local' to allow
    the test suite to easily distinguish this case.
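To make the two axes concrete, here is a small illustrative sketch in Python. It is not the operator's test suite: the helper names are invented, the job-name pattern is extrapolated from the single rehearsal name visible in [8], and the full cross product shown here is not necessarily the set of jobs actually configured.

```python
import os

# The two differentiators described above.
HOSTS = ("old-ocp", "new-ocp", "hypershift")
GRAPH_DATA_SOURCES = ("published", "local")


def job_name(host: str, graph_data: str) -> str:
    """Build a job name following the pattern seen in the rehearsal URL in [8]."""
    return f"operator-e2e-{host}-{graph_data}-graph-data"


def uses_local_graph_data(env=None) -> bool:
    """Report whether GRAPH_DATA is set to 'local', as the local-graph-data job does."""
    env = os.environ if env is None else env
    return env.get("GRAPH_DATA") == "local"


# Enumerate every combination of the two axes (an upper bound on the job set).
for host in HOSTS:
    for source in GRAPH_DATA_SOURCES:
        print(job_name(host, source))
```

A test that depends on locally built graph-data, such as a signature-serving check, could consult something like `uses_local_graph_data()` and skip itself otherwise.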

[1]: https://docs.ci.openshift.org/docs/architecture/images/#ci-images
[2]: https://docs.ci.openshift.org/docs/architecture/ci-operator/#build-root-image
[3]: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_release/45245/pull-ci-openshift-release-master-generated-config/1720218786344210432
[4]: https://docs.ci.openshift.org/docs/how-tos/testing-operator-sdk-operators/#building-operator-bundles
[5]: https://docs.ci.openshift.org/docs/how-tos/testing-operator-sdk-operators/#simple-operator-installation
[6]: https://docs.ci.openshift.org/docs/how-tos/cluster-claim/#existing-cluster-pools
[7]: https://access.redhat.com/support/policy/updates/openshift/#dates
[8]: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_release/45245/rehearse-45245-pull-ci-openshift-cincinnati-operator-master-operator-e2e-hypershift-local-graph-data/1720287506777247744
[9]: openshift/cincinnati-operator#176
openshift-merge-bot bot pushed a commit that referenced this pull request Nov 7, 2023
…p-published-graph-data, etc. (#45245)