@wking
Member

@wking wking commented Apr 12, 2022

We've had config-operator rendering on the bootstrap node since 9994d37 (#1187). The motivation for that commit isn't clear to me; this comment suggests it may have been to keep CRDs out of the installer repository. But we've run a rendered cluster-version operator on the bootstrap machine since 63e2750 (#330), so config manifests should already have been getting pushed at bootstrap time via the CVO. Drop the config rendering, so we can see if things work without the config-rendered cluster-bootstrap pushes racing the bootstrap CVO pushes.

@openshift-ci openshift-ci bot requested review from AnnaZivkovic and r4f4 April 12, 2022 16:30
@wking
Member Author

wking commented Apr 12, 2022

e2e-gcp:

$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_installer/5800/pull-ci-openshift-installer-master-e2e-gcp/1513917400393715712/artifacts/e2e-gcp/ipi-install-install/artifacts/log-bundle-20220412175035.tar | tar xz --strip-components=1
$ grep -B2 'Main process exited' bootstrap/journals/bootkube.log | sed -n 's/^Apr 12 [0-9:]* //p'  | sed 's/[[][0-9]*]/[...]/' | sort | uniq -c
     86 ci-op-l44cr94r-15937-7d7tq-bootstrap bootkube.sh[...]: Error: error creating container storage: the container name "mco-render" is already in use by "877829c1add4c25f797255405478c4b8f75da308bc5cb3037e171487eec0ff37". You have to remove that container to be able to reuse that name.: that name is already in use
      1 ci-op-l44cr94r-15937-7d7tq-bootstrap bootkube.sh[...]: F0412 17:31:21.586273       1 bootstrap.go:118] error rendering bootstrap manifests: failed to load the cloud provider config: open /assets/config-bootstrap/cloud-provider-config-generated.yaml: no such file or directory
     86 ci-op-l44cr94r-15937-7d7tq-bootstrap bootkube.sh[...]: Rendering MCO manifests...
      1 ci-op-l44cr94r-15937-7d7tq-bootstrap mco-render[...]: F0412 17:31:21.586273       1 bootstrap.go:118] error rendering bootstrap manifests: failed to load the cloud provider config: open /assets/config-bootstrap/cloud-provider-config-generated.yaml: no such file or directory
     86 ci-op-l44cr94r-15937-7d7tq-bootstrap systemd[...]: bootkube.service: Main process exited, code=exited, status=125/n/a
      1 ci-op-l44cr94r-15937-7d7tq-bootstrap systemd[...]: bootkube.service: Main process exited, code=exited, status=255/n/a

Seems like we could use a larger clear to avoid the "You have to remove that container to be able to reuse that name" error. I'll see if I can work that up, and then circle back.

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 12, 2022
@wking
Member Author

wking commented Apr 13, 2022

Ah, also in bootkube.log:

Apr 12 17:31:21 ci-op-l44cr94r-15937-7d7tq-bootstrap mco-render[6402]: I0412 17:31:21.584043       1 bootstrap.go:84] Version: machine-config-daemon-4.6.0-202006240615.p0-1372-ge9d8ff52-dirty (e9d8ff5274da6bdf98f9bc0a05bea12035030151)
Apr 12 17:31:21 ci-op-l44cr94r-15937-7d7tq-bootstrap bootkube.sh[2632]: I0412 17:31:21.584043       1 bootstrap.go:84] Version: machine-config-daemon-4.6.0-202006240615.p0-1372-ge9d8ff52-dirty (e9d8ff5274da6bdf98f9bc0a05bea12035030151)
Apr 12 17:31:21 ci-op-l44cr94r-15937-7d7tq-bootstrap mco-render[6402]: F0412 17:31:21.586273       1 bootstrap.go:118] error rendering bootstrap manifests: failed to load the cloud provider config: open /assets/config-bootstrap/cloud-provider-config-generated.yaml: no such file or directory
Apr 12 17:31:21 ci-op-l44cr94r-15937-7d7tq-bootstrap bootkube.sh[2632]: F0412 17:31:21.586273       1 bootstrap.go:118] error rendering bootstrap manifests: failed to load the cloud provider config: open /assets/config-bootstrap/cloud-provider-config-generated.yaml: no such file or directory
Apr 12 17:31:21 ci-op-l44cr94r-15937-7d7tq-bootstrap systemd[1]: bootkube.service: Main process exited, code=exited, status=255/n/a

So we need some kind of "if the name exists, remove it" guard up around here for the "You have to remove that container to be able to reuse that name" issue, and then I need to trace out how cloud-provider-config-generated.yaml comes in.
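
For illustration, an "if the name exists, remove it" guard could look roughly like the sketch below. The helper name remove_container_if_exists and the CONTAINER_RUNTIME override are hypothetical and not taken from bootkube.sh; the only Podman subcommands assumed are `podman container exists` and `podman rm --force`.

```shell
# Hypothetical helper: remove a leftover container with the given name,
# so a later "podman run --name NAME ..." does not fail with
# "that name is already in use".
remove_container_if_exists() {
    name="$1"
    # CONTAINER_RUNTIME is an assumption for this sketch; it defaults to podman.
    runtime="${CONTAINER_RUNTIME:-podman}"
    # "container exists" exits 0 only when a container with that name exists.
    if "$runtime" container exists "$name"; then
        "$runtime" rm --force "$name"
    fi
}
```

Calling `remove_container_if_exists mco-render` just before the render step would clear a container left behind by an earlier bootkube round, and would be a no-op on the first round.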

wking added a commit to wking/openshift-installer that referenced this pull request Apr 13, 2022
Avoid [1]:

  Error: error creating container storage: the container name "mco-render" is already in use by "877829c1add4c25f797255405478c4b8f75da308bc5cb3037e171487eec0ff37". You have to remove that container to be able to reuse that name.: that name is already in use

and similar for other containers by removing inconsistent --rm options
and baking that in at the bootkube_podman_run level.

Also add an etcd-bootstrap rm call, to clear out any cruft from a
previous bootkube round before calling Podman for a fresh etcd render.

[1]: openshift#5800 (comment)
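
A minimal sketch of what baking --rm into a shared wrapper might look like. The function name bootkube_podman_run_sketch and the PODMAN override are hypothetical, and the real bootkube_podman_run may pass other options; the point is only that callers no longer choose --rm themselves.

```shell
# Hypothetical wrapper: every container started through it gets --rm,
# so individual call sites cannot forget (or inconsistently set) it.
bootkube_podman_run_sketch() {
    # PODMAN is an assumption for this sketch; it defaults to podman.
    "${PODMAN:-podman}" run --rm "$@"
}
```

A call such as `bootkube_podman_run_sketch --name mco-render IMAGE ARGS...` would then have Podman remove the container once it exits.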
@wking wking force-pushed the drop-config-render branch from 66bfad5 to 934c879 Compare April 13, 2022 04:10
@wking wking force-pushed the drop-config-render branch 2 times, most recently from 7b8ba14 to 24f8445 Compare April 13, 2022 06:36
@openshift-bot
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 13, 2023
@openshift-bot
Contributor

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 14, 2023
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 14, 2023
@openshift-bot
Contributor

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@openshift-ci openshift-ci bot closed this Jun 13, 2023
@openshift-ci
Contributor

openshift-ci bot commented Jun 13, 2023

@openshift-bot: Closed this PR.

@wking wking reopened this Oct 12, 2023
@openshift-ci
Contributor

openshift-ci bot commented Oct 12, 2023

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign andfasano for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

We've had config-operator rendering on the bootstrap node since
9994d37 (bootkube: render config.openshift.io resources,
2019-02-12, openshift#1187).  Motivation for that commit isn't clear to me; [1]
suggests maybe keeping CRDs out of the installer repository.  But we've
run a rendered cluster-version operator on the bootstrap machine since
63e2750 (ignition: add CVO render to bootkube.sh, 2018-09-27, openshift#330),
so we should be able to push resources at bootstrap time via the CVO.
Remove CRDs from the config rendering, so we can see if things work
without the config-rendered cluster-bootstrap pushes racing the
bootstrap CVO pushes, or the config-rendered pushes not realizing they
should filter out manifests annotated for capabilities that are not
enabled.

[1]: openshift#1187 (comment)
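
To illustrate the capability filtering mentioned above, here is a rough sketch that lists manifests annotated for a capability that is not enabled. It assumes the capability.openshift.io/name annotation, one capability per manifest, and one manifest per *.yaml file; the function name is hypothetical, and neither the CVO nor cluster-bootstrap is claimed to work this way.

```shell
# Hypothetical helper: print manifests whose capability annotation names a
# capability that is not in the enabled set.
# $1: directory of manifests; remaining arguments: enabled capability names.
manifests_for_disabled_capabilities() {
    dir="$1"
    shift
    enabled=" $* "
    for manifest in "$dir"/*.yaml; do
        [ -e "$manifest" ] || continue
        # Take the first capability annotation in the file, if any
        # (simplification: values listing several capabilities are not split).
        cap="$(sed -n 's|^ *capability\.openshift\.io/name: *"\{0,1\}\([^"]*\)"\{0,1\} *$|\1|p' "$manifest" | head -n 1)"
        # Manifests without the annotation are always kept.
        [ -n "$cap" ] || continue
        case "$enabled" in
            *" $cap "*) ;;
            *) printf '%s\n' "$manifest" ;;
        esac
    done
}
```

A push step that skipped the files printed by this function would avoid creating resources for capabilities the cluster has not enabled.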
@wking wking force-pushed the drop-config-render branch from 24f8445 to 1f115ab Compare October 12, 2023 06:48
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 12, 2023
@openshift-ci
Contributor

openshift-ci bot commented Oct 12, 2023

@wking: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-gcp-shared-vpc | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | false | /test e2e-gcp-shared-vpc |
| ci/prow/e2e-aws-workers-rhel8 | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | false | /test e2e-aws-workers-rhel8 |
| ci/prow/e2e-libvirt | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | false | /test e2e-libvirt |
| ci/prow/e2e-openstack-parallel | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | false | /test e2e-openstack-parallel |
| ci/prow/e2e-metal-ipi-ovn-ipv6 | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | true | /test e2e-metal-ipi-ovn-ipv6 |
| ci/prow/okd-e2e-aws | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | false | /test okd-e2e-aws |
| ci/prow/e2e-alibaba | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | false | /test e2e-alibaba |
| ci/prow/e2e-metal-single-node-live-iso | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | false | /test e2e-metal-single-node-live-iso |
| ci/prow/e2e-openstack | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | false | /test e2e-openstack |
| ci/prow/e2e-ovirt | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | false | /test e2e-ovirt |
| ci/prow/e2e-ibmcloud | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | false | /test e2e-ibmcloud |
| ci/prow/openstack-manifests | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | true | /test openstack-manifests |
| ci/prow/e2e-aws-single-node | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | false | /test e2e-aws-single-node |
| ci/prow/e2e-openstack-proxy | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | false | /test e2e-openstack-proxy |
| ci/prow/e2e-aws-disruptive | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | false | /test e2e-aws-disruptive |
| ci/prow/e2e-openstack-kuryr | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | false | /test e2e-openstack-kuryr |
| ci/prow/e2e-aws-shared-vpc | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | false | /test e2e-aws-shared-vpc |
| ci/prow/e2e-azurestack | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | false | /test e2e-azurestack |
| ci/prow/e2e-azure-shared-vpc | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | false | /test e2e-azure-shared-vpc |
| ci/prow/e2e-gcp-upi-xpn | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | false | /test e2e-gcp-upi-xpn |
| ci/prow/e2e-crc | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | false | /test e2e-crc |
| ci/prow/e2e-aws-fips | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | false | /test e2e-aws-fips |
| ci/prow/e2e-azure-resourcegroup | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | false | /test e2e-azure-resourcegroup |
| ci/prow/e2e-aws-proxy | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | false | /test e2e-aws-proxy |
| ci/prow/e2e-gcp-upi | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | true | /test e2e-gcp-upi |
| ci/prow/e2e-azure-upi | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | true | /test e2e-azure-upi |
| ci/prow/e2e-aws-upi | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | true | /test e2e-aws-upi |
| ci/prow/e2e-gcp | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | true | /test e2e-gcp |
| ci/prow/e2e-aws | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | true | /test e2e-aws |
| ci/prow/e2e-vsphere | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | true | /test e2e-vsphere |
| ci/prow/e2e-azure | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | true | /test e2e-azure |
| ci/prow/e2e-vsphere-ovn | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | true | /test e2e-vsphere-ovn |
| ci/prow/e2e-azure-ovn | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | true | /test e2e-azure-ovn |
| ci/prow/e2e-gcp-ovn | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | true | /test e2e-gcp-ovn |
| ci/prow/e2e-agent-compact-ipv4 | 24f8445ce1ca3d644506840f0442f72cb559b97c | link | true | /test e2e-agent-compact-ipv4 |
| ci/prow/e2e-aws-ovn | 1f115ab | link | true | /test e2e-aws-ovn |
| ci/prow/okd-e2e-aws-ovn-upgrade | 1f115ab | link | false | /test okd-e2e-aws-ovn-upgrade |
| ci/prow/e2e-aws-custom-security-groups | 1f115ab | link | false | /test e2e-aws-custom-security-groups |
| ci/prow/okd-scos-e2e-aws-ovn | 1f115ab | link | false | /test okd-scos-e2e-aws-ovn |
| ci/prow/okd-e2e-aws-ovn | 1f115ab | link | false | /test okd-e2e-aws-ovn |

@openshift-bot
Contributor

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@openshift-ci openshift-ci bot closed this Nov 12, 2023
@openshift-ci
Contributor

openshift-ci bot commented Nov 12, 2023

@openshift-bot: Closed this PR.

Labels

do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

3 participants