@andfasano
Contributor

Do not merge, just for internal testing on CI

JM1 added 2 commits October 27, 2023 19:25
OCP requires DNS records api.<cluster_domain> and
*.apps.<cluster_domain> to be externally resolvable (<cluster_domain> is
<cluster_name>.<base_domain>). For SNO this list also includes DNS
record api-int.<cluster_domain>.
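
The record names above can be derived mechanically from the cluster name and base domain. The following is a minimal shell sketch, not code from the installer; the function name `required_dns_records` and the example values are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: print the DNS records that must be externally resolvable.
# The function name and its arguments are illustrative assumptions.
required_dns_records() {
    local cluster_name="$1" base_domain="$2" sno="${3:-false}"
    local cluster_domain="${cluster_name}.${base_domain}"

    echo "api.${cluster_domain}"
    echo "*.apps.${cluster_domain}"
    # For SNO the list also includes api-int.
    if [ "$sno" = "true" ]; then
        echo "api-int.${cluster_domain}"
    fi
}

required_dns_records "mycluster" "example.com" true
```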

However, OCP does not enforce ownership of all subdomains of
<cluster_domain>. For example, it is allowed to host a disconnected
image registry at <registry_hostname>.<cluster_domain> and OCP shall
be able to resolve it using the user-supplied external DNS resolver.

PR openshift#7516 changed the systemd-resolved config of the bootstrap node /
rendezvous host to associate the complete <cluster_domain> with the
DNS server at 127.0.0.1 where CoreDNS is supposed to be listening.

When a disconnected image registry is used for cluster installation,
the registry is hosted at <registry_hostname>.<cluster_domain>, and
the bootstrap node / rendezvous host does not retrieve its domain
from the DHCP server, the registry's DNS name cannot be resolved.
The reason is that the CoreDNS image itself has to be pulled from the
disconnected registry, so the registry must be reachable before
CoreDNS can start. The split DNS mechanism of systemd-resolved sends
DNS requests for <registry_hostname>.<cluster_domain> to 127.0.0.1,
where CoreDNS is expected to be listening but is not running.
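
The failure mode can be illustrated with a small shell sketch. It approximates routing-domain selection as a plain suffix match, which is a simplification of what systemd-resolved actually does. The function name `matches_routing_domain` and the example hostnames are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: succeeds if "name" equals "domain" or is a subdomain of it.
matches_routing_domain() {
    local name="$1" domain="$2"
    case "$name" in
        "$domain" | *".$domain") return 0 ;;
        *) return 1 ;;
    esac
}

# With the complete cluster domain associated with 127.0.0.1, the
# registry name matches as well, although CoreDNS cannot answer for it.
if matches_routing_domain "registry.mycluster.example.com" "mycluster.example.com"; then
    echo "registry lookup is routed to 127.0.0.1"
fi
```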

When a bootstrap node / rendezvous host retrieves its domain
<cluster_domain> from a DHCP server (e.g. dnsmasq's '--domain'
option) then systemd-resolved would associate <cluster_domain> not
only with 127.0.0.1 but also with the physical network interface,
causing DNS requests for <registry_hostname>.<cluster_domain> to be
sent out to 127.0.0.1 as well as the external DNS resolver.

This patch mitigates the DNS issue for other network setups. It
changes the systemd-resolved config to forward DNS requests to
CoreDNS only for domains which are resolvable by CoreDNS:

* api.<cluster_domain>
* api-int.<cluster_domain>
* apps.<cluster_domain>

DNS requests for <registry_hostname>.<cluster_domain> and other
subdomains of <cluster_domain> will be sent out to the external
DNS resolver.
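
As an illustration, the narrowed configuration could look like the drop-in printed by the sketch below. This is not the template shipped by the installer. It relies on the systemd-resolved convention that a `~` prefix in `Domains=` marks a routing-only domain; the function name and the example domain are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: print a systemd-resolved drop-in that forwards just the
# names served by CoreDNS to 127.0.0.1.
render_resolved_dropin() {
    local cluster_domain="$1"
    printf '[Resolve]\n'
    printf 'DNS=127.0.0.1\n'
    # A leading "~" marks a routing-only domain.
    printf 'Domains=~api.%s ~api-int.%s ~apps.%s\n' \
        "$cluster_domain" "$cluster_domain" "$cluster_domain"
}

render_resolved_dropin "mycluster.example.com"
```

Because <registry_hostname>.<cluster_domain> matches none of the three listed domains, lookups for it would fall through to the resolver configured on the physical interface.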

Fixes openshift#7516
…d Installer

OKD/FCOS uses FCOS as its bootimage, i.e. when booting cluster nodes
the first time during installation. FCOS does not provide tools such
as OpenShift Client (oc) or crio.service which Agent-based Installer
uses at the rendezvous host, e.g. to launch the bootstrap control
plane.

RHCOS and SCOS include these tools, but FCOS has to pivot the root fs
[1] to okd-machine-os [2] first in order to make those tools available.

Pivoting uses 'rpm-ostree rebase', but the rendezvous host is first
booted from a FCOS Live ISO where the root fs and /sysroot are mounted
read-only. Thus 'rpm-ostree rebase' fails and the necessary tools will
not be available, causing the setup to stall.

Until rpm-ostree has implemented support for rebasing Live ISOs [3],
this patch adapts the workaround for SNO installations [4] to also
support Agent-based Installer.

In particular, the Go conditional {{- if .BootstrapInPlace }} which
is used to mark a SNO install has been replaced with a shell if-else
which checks at runtime whether the system has been launched from a
Live ISO.
Most code in the OpenShift ecosystem is written with RHCOS in mind
and often assumes that tools like oc or crio.service are available.
These assumptions can be satisfied by applying this workaround to all
Live ISO boots. It will not remove functionality or overwrite
configuration files in /etc and thus side effects should be minimal.
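
A runtime check of this kind might be sketched as follows. The marker path /run/ostree-live is an assumption about how a Live ISO boot can be detected and is not taken from this patch; the function name `is_live_iso` and the `LIVE_MARKER` override are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: decide at runtime whether the system runs from a Live ISO.
# ASSUMPTION: a marker file exists on Live ISO boots; the default path
# below is illustrative and can be overridden via LIVE_MARKER.
is_live_iso() {
    [ -e "${LIVE_MARKER:-/run/ostree-live}" ]
}

if is_live_iso; then
    echo "Live ISO boot: root fs is read-only, applying the workaround"
else
    echo "Installed system: pivoting with rpm-ostree rebase"
fi
```

Evaluating the condition in the shell script rather than in the Go template means the same rendered script can serve both SNO and Agent-based Installer boots.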

The Go conditional {{- if .BootstrapInPlace }} in
release-image-pivot.service has been dropped completely. This service
is used in OKD only, so OCP will not be impacted at all. The 'Before='
option will not cause systemd to fail if a service does not exist. So,
in case bootkube.service or kubelet.service do not exist, the option
will have no effect.
When bootkube.service or kubelet.service do exist, it must always be
ensured that release-image-pivot.service is started first because it
might reboot the system or change /usr in the Live ISO use case.
So it is safe to drop the Go conditional and ask systemd to always
launch release-image-pivot.service before bootkube.service and
kubelet.service.
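
The unconditional ordering described above could be expressed as in the sketch below, which only prints an illustrative `[Unit]` fragment. It is not the actual content of release-image-pivot.service, and the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: print an ordering fragment without any template conditional.
# 'Before=' is an ordering hint; units that do not exist are ignored.
render_pivot_ordering() {
    cat <<'EOF'
[Unit]
Before=bootkube.service kubelet.service
EOF
}

render_pivot_ordering
```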

[0] https://github.com/openshift/installer/blob/master/data/data/bootstrap/files/usr/local/bin/bootkube.sh.template
[1] https://github.com/openshift/installer/blob/master/data/data/bootstrap/files/usr/local/bin/bootstrap-pivot.sh.template
[2] https://github.com/openshift/okd-machine-os
[3] coreos/rpm-ostree#4547
[4] openshift#7445
@andfasano andfasano changed the title from "[DNM] Jm1 ultra deluxe fx 32" to "[DNM] OKD: Combined test of PR #7484 and PR #7634" Oct 30, 2023
@andfasano andfasano changed the title from "[DNM] OKD: Combined test of PR #7484 and PR #7634" to "[DNM] [Test] OKD: Combined test of PR #7484 and PR #7634" Oct 30, 2023
@andfasano
Contributor Author

/draft
/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold label (Indicates that a PR should not merge because someone has issued a /hold command.) Oct 30, 2023
@openshift-ci openshift-ci bot requested review from elfosardo and r4f4 October 30, 2023 13:54
@openshift-ci
Contributor

openshift-ci bot commented Oct 30, 2023

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from andfasano. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@r4f4
Contributor

r4f4 commented Oct 30, 2023

/uncc

@openshift-ci openshift-ci bot removed the request for review from r4f4 October 30, 2023 14:00
@andfasano
Contributor Author

/test ci/prow/okd-e2e-agent-compact-ipv4

@openshift-ci
Contributor

openshift-ci bot commented Oct 30, 2023

@andfasano: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

  • /test agent-integration-tests
  • /test altinfra-images
  • /test aro-unit
  • /test e2e-agent-compact-ipv4
  • /test e2e-aws-ovn
  • /test e2e-aws-ovn-upi
  • /test e2e-azure-ovn
  • /test e2e-azure-ovn-upi
  • /test e2e-gcp-ovn
  • /test e2e-gcp-ovn-upi
  • /test e2e-metal-ipi-ovn-ipv6
  • /test e2e-openstack-ovn
  • /test e2e-vsphere-ovn
  • /test e2e-vsphere-upi
  • /test gofmt
  • /test golint
  • /test govet
  • /test images
  • /test okd-images
  • /test okd-scos-images
  • /test okd-unit
  • /test okd-verify-codegen
  • /test openstack-manifests
  • /test shellcheck
  • /test tf-lint
  • /test unit
  • /test verify-codegen
  • /test verify-vendor
  • /test yaml-lint

The following commands are available to trigger optional jobs:

  • /test altinfra-e2e-aws-ovn
  • /test altinfra-e2e-aws-ovn-imdsv2
  • /test altinfra-e2e-aws-ovn-localzones
  • /test altinfra-e2e-aws-ovn-shared-vpc
  • /test altinfra-e2e-aws-ovn-shared-vpc-localzones
  • /test altinfra-e2e-azure-ovn
  • /test altinfra-e2e-azure-ovn-resourcegroup
  • /test altinfra-e2e-azure-ovn-shared-vpc
  • /test e2e-agent-compact-ipv4-appliance
  • /test e2e-agent-compact-ipv4-none-platform
  • /test e2e-agent-ha-dualstack
  • /test e2e-agent-sno-ipv4-pxe
  • /test e2e-agent-sno-ipv6
  • /test e2e-alibaba
  • /test e2e-aws-custom-security-groups
  • /test e2e-aws-ovn-fips
  • /test e2e-aws-ovn-imdsv2
  • /test e2e-aws-ovn-localzones
  • /test e2e-aws-ovn-proxy
  • /test e2e-aws-ovn-public-subnets
  • /test e2e-aws-ovn-shared-vpc
  • /test e2e-aws-ovn-shared-vpc-localzones
  • /test e2e-aws-ovn-single-node
  • /test e2e-aws-ovn-upgrade
  • /test e2e-aws-ovn-workers-rhel8
  • /test e2e-aws-upi-proxy
  • /test e2e-azure-ovn-resourcegroup
  • /test e2e-azure-ovn-shared-vpc
  • /test e2e-azurestack
  • /test e2e-azurestack-upi
  • /test e2e-crc
  • /test e2e-gcp-ovn-shared-vpc
  • /test e2e-gcp-ovn-xpn
  • /test e2e-gcp-secureboot
  • /test e2e-gcp-upgrade
  • /test e2e-gcp-upi-xpn
  • /test e2e-ibmcloud-ovn
  • /test e2e-libvirt
  • /test e2e-metal-assisted
  • /test e2e-metal-ipi-ovn-dualstack
  • /test e2e-metal-ipi-sdn
  • /test e2e-metal-ipi-sdn-swapped-hosts
  • /test e2e-metal-ipi-sdn-virtualmedia
  • /test e2e-metal-single-node-live-iso
  • /test e2e-nutanix-ovn
  • /test e2e-nutanix-sdn
  • /test e2e-openstack-ccpmso
  • /test e2e-openstack-ccpmso-zone
  • /test e2e-openstack-dualstack-techpreview
  • /test e2e-openstack-externallb
  • /test e2e-openstack-nfv-intel
  • /test e2e-openstack-proxy
  • /test e2e-openstack-sdn-parallel
  • /test e2e-openstack-upi
  • /test e2e-vsphere-static-ovn
  • /test e2e-vsphere-upi-zones
  • /test e2e-vsphere-zones
  • /test e2e-vsphere-zones-techpreview
  • /test okd-e2e-agent-compact-ipv4
  • /test okd-e2e-agent-ha-dualstack
  • /test okd-e2e-agent-sno-ipv6
  • /test okd-e2e-aws-ovn
  • /test okd-e2e-aws-ovn-upgrade
  • /test okd-e2e-gcp
  • /test okd-e2e-gcp-ovn-upgrade
  • /test okd-e2e-vsphere
  • /test okd-scos-e2e-agent-compact-ipv4
  • /test okd-scos-e2e-agent-sno-ipv6
  • /test okd-scos-e2e-aws-ovn
  • /test okd-scos-e2e-aws-upgrade
  • /test okd-scos-e2e-gcp
  • /test okd-scos-e2e-gcp-ovn-upgrade
  • /test okd-scos-e2e-vsphere
  • /test okd-scos-unit
  • /test okd-scos-verify-codegen
  • /test tf-fmt

Use /test all to run the following jobs that were automatically triggered:

  • pull-ci-openshift-installer-master-altinfra-images
  • pull-ci-openshift-installer-master-aro-unit
  • pull-ci-openshift-installer-master-e2e-aws-custom-security-groups
  • pull-ci-openshift-installer-master-e2e-aws-ovn
  • pull-ci-openshift-installer-master-e2e-metal-assisted
  • pull-ci-openshift-installer-master-e2e-metal-ipi-ovn-dualstack
  • pull-ci-openshift-installer-master-e2e-metal-ipi-ovn-ipv6
  • pull-ci-openshift-installer-master-e2e-metal-ipi-sdn
  • pull-ci-openshift-installer-master-e2e-metal-ipi-sdn-swapped-hosts
  • pull-ci-openshift-installer-master-e2e-metal-ipi-sdn-virtualmedia
  • pull-ci-openshift-installer-master-e2e-metal-single-node-live-iso
  • pull-ci-openshift-installer-master-gofmt
  • pull-ci-openshift-installer-master-golint
  • pull-ci-openshift-installer-master-govet
  • pull-ci-openshift-installer-master-images
  • pull-ci-openshift-installer-master-okd-e2e-aws-ovn
  • pull-ci-openshift-installer-master-okd-e2e-aws-ovn-upgrade
  • pull-ci-openshift-installer-master-okd-images
  • pull-ci-openshift-installer-master-okd-scos-e2e-aws-ovn
  • pull-ci-openshift-installer-master-okd-scos-images
  • pull-ci-openshift-installer-master-okd-scos-unit
  • pull-ci-openshift-installer-master-okd-scos-verify-codegen
  • pull-ci-openshift-installer-master-okd-unit
  • pull-ci-openshift-installer-master-okd-verify-codegen
  • pull-ci-openshift-installer-master-shellcheck
  • pull-ci-openshift-installer-master-tf-fmt
  • pull-ci-openshift-installer-master-tf-lint
  • pull-ci-openshift-installer-master-unit
  • pull-ci-openshift-installer-master-verify-codegen
  • pull-ci-openshift-installer-master-verify-vendor
  • pull-ci-openshift-installer-master-yaml-lint

In response to this:

/test ci/prow/okd-e2e-agent-compact-ipv4

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@andfasano
Contributor Author

/test okd-e2e-agent-compact-ipv4

4 similar comments
@andfasano
Contributor Author

/test okd-e2e-agent-compact-ipv4

@andfasano
Contributor Author

/test okd-e2e-agent-compact-ipv4

@andfasano
Contributor Author

/test okd-e2e-agent-compact-ipv4

@andfasano
Contributor Author

/test okd-e2e-agent-compact-ipv4

@andfasano
Contributor Author

/test okd-e2e-agent-compact-ipv4
/test okd-e2e-agent-sno-ipv6

1 similar comment
@andfasano
Contributor Author

/test okd-e2e-agent-compact-ipv4
/test okd-e2e-agent-sno-ipv6

@andfasano
Contributor Author

/test okd-e2e-agent-ha-dualstack

@andfasano
Contributor Author

/test okd-e2e-agent-compact-ipv4
/test okd-e2e-agent-ha-dualstack

1 similar comment
@andfasano
Contributor Author

/test okd-e2e-agent-compact-ipv4
/test okd-e2e-agent-ha-dualstack

@andfasano
Contributor Author

/test okd-e2e-agent-ha-dualstack

@openshift-ci
Contributor

openshift-ci bot commented Nov 6, 2023

@andfasano: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/okd-e2e-aws-ovn | 715bb1c | link | false | /test okd-e2e-aws-ovn |
| ci/prow/okd-e2e-aws-ovn-upgrade | 715bb1c | link | false | /test okd-e2e-aws-ovn-upgrade |
| ci/prow/e2e-metal-single-node-live-iso | 715bb1c | link | false | /test e2e-metal-single-node-live-iso |
| ci/prow/okd-e2e-agent-ha-dualstack | 715bb1c | link | false | /test okd-e2e-agent-ha-dualstack |
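
Each rerun command in the table is the test name without its ci/prow/ prefix, which is also why the earlier /test ci/prow/okd-e2e-agent-compact-ipv4 was rejected. A minimal shell sketch of that mapping follows; the function name `prow_test_command` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: turn a Prow status context into the matching /test command.
prow_test_command() {
    local context="$1"
    # Strip a leading "ci/prow/" if present; other input passes through.
    echo "/test ${context#ci/prow/}"
}

prow_test_command "ci/prow/okd-e2e-agent-compact-ipv4"
# prints: /test okd-e2e-agent-compact-ipv4
```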

Full PR test history. Your PR dashboard.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@andfasano
Contributor Author

Not required anymore, this PR was just for additional testing of #7641

@andfasano andfasano closed this Nov 7, 2023
