@hardys hardys commented Dec 20, 2019

Some WIP to share cc @russellb @derekhiggins

This uses metal3-io/metal3-dev-env#160, applied in my environment to the checkout referenced via METAL3_DEV_ENV.

export WORKING_DIR="/home/dev-scripts"
export KNI_INSTALL_FROM_GIT="true"
export METAL3_DEV_ENV="/home/shardy/metal3-dev-env"
export EXTERNAL_SUBNET="fd2e:6f44:5dd8:c956:0:0:0:1/120"
export DNS_VIP="fd2e:6f44:5dd8:c956:0:0:0:2"

I also had to set accept_ra=2, as mentioned in https://goodsquishy.com/upload/4a067f9d9677b6f770c7; that's not yet handled by the metal3-dev-env PR.
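For reference, a minimal sketch of how the accept_ra setting could be applied. The interface name `baremetal` is an assumption (the comment does not say which interface needed the change), and `accept_ra_key` is a hypothetical helper. A value of 2 tells the kernel to accept router advertisements even when forwarding is enabled on the interface.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the sysctl key for an interface's accept_ra setting.
accept_ra_key() {
    printf 'net.ipv6.conf.%s.accept_ra\n' "$1"
}

# Assumed interface name; adjust to whichever interface receives the RAs.
IFACE="${IFACE:-baremetal}"

# Print the setting that would be applied.
echo "$(accept_ra_key "$IFACE")=2"

# To apply it for real (needs root):
#   sudo sysctl -w "$(accept_ra_key "$IFACE")=2"
```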

I converted the MACs to DUIDs by prepending 01: to each, but it sounds like @derekhiggins may have run into issues that needed a manually configured dnsmasq, ref https://goodsquishy.com/upload/2ee1a771c2ad31f435ea

We also need the openshift-kni/installer fork rebased to include openshift/installer#2846

@hardys
Author

hardys commented Dec 20, 2019

This probably also needs rebasing on #856, as we can't connect to upstream quay via IPv6.

@derekhiggins
Collaborator

See #833 for provisioning over IPv6

@yrobla
Contributor

yrobla commented Jan 8, 2020

The EXTERNAL_SUBNET setting in the description actually needs to be: export EXTERNAL_SUBNET="fd2e:6f44:5dd8:c956::/120"

@hardys hardys force-pushed the wip_ipv6 branch 2 times, most recently from 7247290 to 79635f9 Compare January 9, 2020 11:13
@yrobla
Contributor

yrobla commented Jan 9, 2020

When using a custom tag for the image (the IPv6 one), I had to modify 04_setup_ironic.sh and change

oc adm release mirror \
    --insecure=true \
    -a ${COMBINED_AUTH_FILE} \
    --from ${OPENSHIFT_RELEASE_IMAGE} \
    --to-release-image ${LOCAL_REGISTRY_DNS_NAME}:${LOCAL_REGISTRY_PORT}/localimages/local-release-image:${TAG}

to

oc adm release mirror \
    --insecure=true \
    -a ${COMBINED_AUTH_FILE} \
    --from ${OPENSHIFT_RELEASE_IMAGE} \
    --to-release-image ${LOCAL_REGISTRY_DNS_NAME}:${LOCAL_REGISTRY_PORT}/localimages/local-release-image:latest
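One way to avoid hardcoding either form would be to build the target pullspec with a tag that defaults to latest. This is only a sketch: `local_release_pullspec` is a hypothetical helper, and the port 5000 in the example call is an assumed value, not something stated in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the --to-release-image pullspec.
# The third argument (tag) is optional and defaults to "latest".
local_release_pullspec() {
    local host="$1" port="$2" tag="${3:-latest}"
    printf '%s:%s/localimages/local-release-image:%s\n' "$host" "$port" "$tag"
}

# Example call; the port is an assumption for illustration only.
local_release_pullspec "virthost.ostest.test.metalkube.org" "5000"
```

The result could then be passed to `oc adm release mirror` as the `--to-release-image` value.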

@hardys
Author

hardys commented Jan 13, 2020

Updated config for anyone wanting to replicate testing. Note that local checkouts of metal3-dev-env and baremetal-runtimecfg are required, along with some manual customization of the images, which the scripts don't currently handle (see comments below):

It's also necessary to add fd2e:6f44:5dd8:c956::1 virthost.ostest.test.metalkube.org to /etc/hosts
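The /etc/hosts entry can be added idempotently so that re-running the setup does not duplicate it. The sketch below uses a hypothetical helper, `ensure_hosts_entry`, and runs against a temporary file rather than the real /etc/hosts, which would need root to modify.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append "IP NAME" to a hosts-style file only if that
# exact entry is not already present.
ensure_hosts_entry() {
    local file="$1" ip="$2" name="$3"
    grep -qF "$ip $name" "$file" 2>/dev/null || printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Demo against a temporary file; calling twice still leaves a single entry.
hosts_file="$(mktemp)"
ensure_hosts_entry "$hosts_file" "fd2e:6f44:5dd8:c956::1" "virthost.ostest.test.metalkube.org"
ensure_hosts_entry "$hosts_file" "fd2e:6f44:5dd8:c956::1" "virthost.ostest.test.metalkube.org"
cat "$hosts_file"
rm -f "$hosts_file"
```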

export WORKING_DIR="/home/dev-scripts"
# Note: KNI_INSTALL_FROM_GIT is broken when using MIRROR_IMAGES or *_LOCAL_IMAGE, ref
# https://github.com/openshift-metal3/dev-scripts/issues/880
#export KNI_INSTALL_FROM_GIT="true"
# local checkout has https://github.com/openshift/baremetal-runtimecfg/pull/38/ applied
export BAREMETAL_RUNTIMECFG_LOCAL_IMAGE="https://github.com/openshift/baremetal-runtimecfg"
# Local checkout with https://github.com/metal3-io/metal3-dev-env/pull/160 applied
export METAL3_DEV_ENV="/home/shardy/metal3-dev-env"
export MIRROR_IMAGES=true
export OPENSHIFT_RELEASE_IMAGE="registry.svc.ci.openshift.org/ipv6/release:4.3.0-0.nightly-2020-01-09-234847-ipv6.2"
export EXTERNAL_SUBNET="fd2e:6f44:5dd8:c956::/120"
export DNS_VIP="fd2e:6f44:5dd8:c956:0:0:0:2"

# Modify downloaded rhcos images to work around:
# https://bugzilla.redhat.com/show_bug.cgi?id=1787620
# 
# cd $WORKING_DIR/ironic/html/images
# gunzip rhcos-43.81.201912131630.0-qemu.x86_64.qcow2.gz
# virt-edit -a rhcos-43.81.201912131630.0-qemu.x86_64.qcow2 -m /dev/sda1 -e "s/ip=any/ip=ens3:dhcp6/g" /grub2/grub.cfg 
# sha256sum rhcos-43.81.201912131630.0-qemu.x86_64.qcow2
# gzip rhcos-43.81.201912131630.0-qemu.x86_64.qcow2
# sha256sum rhcos-43.81.201912131630.0-qemu.x86_64.qcow2.gz
# gunzip rhcos-43.81.201912131630.0-openstack.x86_64.qcow2.gz
# virt-edit -a rhcos-43.81.201912131630.0-openstack.x86_64.qcow2 -m /dev/sda1 -e "s/ip=dhcp/ip=ens3:dhcp6/g" /grub2/grub.cfg 
# sha256sum rhcos-43.81.201912131630.0-openstack.x86_64.qcow2
# gzip rhcos-43.81.201912131630.0-openstack.x86_64.qcow2
# sha256sum rhcos-43.81.201912131630.0-openstack.x86_64.qcow2.gz
# vim rhcos-43.81.201912131630.0-openstack.x86_64.qcow2.gz.sha256sum
# 
export MACHINE_OS_BOOTSTRAP_IMAGE_UNCOMPRESSED_SHA256="2222263876b77b1bd73bfa7b169ca331f4e8dfd650afc00000d02cf2e548b3ea"
export MACHINE_OS_BOOTSTRAP_IMAGE_SHA256="64a96229e09839100251406434962048744c643113f5083cc22c2bd6aeb62d42"
export MACHINE_OS_IMAGE_UNCOMPRESSED_SHA256="61f44ed0b5d52bf8e5813f7401bdc9de94a8bc09861f9c01253e59772aadf020"
export MACHINE_OS_IMAGE_SHA256="896e05424a266ed910861568286a7ecf31ee853fe3216ffff05b9f96951b2337"
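After modifying an image, the four checksum variables above have to be regenerated by hand. A small sketch of how that could be scripted follows; `print_image_checksums` is a hypothetical helper, and the mapping of the BOOTSTRAP variables to the qemu image and the other pair to the openstack image is my assumption from the order of the commented steps.

```shell
#!/usr/bin/env bash
# Hypothetical helper: given a gzipped image and a variable prefix, print
# export lines for the compressed and uncompressed SHA256 checksums.
print_image_checksums() {
    local gz="$1" prefix="$2"
    local compressed uncompressed
    # Checksum of the .gz file as it sits on disk.
    compressed="$(sha256sum "$gz" | cut -d' ' -f1)"
    # Checksum of the decompressed stream, without writing it to disk.
    uncompressed="$(gunzip -c "$gz" | sha256sum | cut -d' ' -f1)"
    printf 'export %s_UNCOMPRESSED_SHA256="%s"\n' "$prefix" "$uncompressed"
    printf 'export %s_SHA256="%s"\n' "$prefix" "$compressed"
}

# Usage (assumed mapping of prefixes to images):
#   cd "$WORKING_DIR/ironic/html/images"
#   print_image_checksums rhcos-43.81.201912131630.0-qemu.x86_64.qcow2.gz MACHINE_OS_BOOTSTRAP_IMAGE
#   print_image_checksums rhcos-43.81.201912131630.0-openstack.x86_64.qcow2.gz MACHINE_OS_IMAGE
```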

Steven Hardy and others added 7 commits January 13, 2020 18:30
So we can access it via ipv6

Instead of a regex which only accepts ipv4 addresses

Since the oc adm mirror command appears to reject target registries
and pullspecs that contain ipv6 addresses[1] we can work around this
by using a name instead

[1] openshift/oc#239

This is needed to work with ipv6 image, as it doesn't have
latest tag and bootstrap tries to use it.

According to the manpage [fd00::] gets expanded to the non link-local
address

@hardys
Author

hardys commented Jan 16, 2020

OK, so the DUID handling works with openshift/machine-config-operator#1375, and I pushed a corresponding update to metal3-io/metal3-dev-env#160

I'm now running with the MCO PR applied locally and

export MACHINE_CONFIG_OPERATOR_LOCAL_IMAGE="https://github.com/openshift/machine-config-operator"

This results in the expected leases:

$ sudo virsh net-dhcp-leases baremetal | grep master-
 2020-01-16 11:52:53   00:14:f0:50:6b:b4   ipv6       fd2e:6f44:5dd8:c956::14/120   master-0   00:03:00:01:00:14:f0:50:6b:b4
 2020-01-16 11:52:58   00:14:f0:50:6b:b8   ipv6       fd2e:6f44:5dd8:c956::15/120   master-1   00:03:00:01:00:14:f0:50:6b:b8
 2020-01-16 11:53:00   00:14:f0:50:6b:bc   ipv6       fd2e:6f44:5dd8:c956::16/120   master-2   00:03:00:01:00:14:f0:50:6b:bc

The hostname also seems to be set correctly (with no manual NetworkManager restart, @derekhiggins)

[core@master-0 ~]$ hostname -f
master-0.ostest.test.metalkube.org
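The client IDs in the lease output above follow the DUID-LL layout: DUID type 3 (00:03), hardware type 1 for Ethernet (00:01), then the MAC address. A minimal sketch of that mapping, with `mac_to_duid_ll` as a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive a DUID-LL from an Ethernet MAC address.
# Layout: 00:03 (DUID-LL type) + 00:01 (Ethernet hardware type) + MAC.
mac_to_duid_ll() {
    printf '00:03:00:01:%s\n' "$1" | tr 'A-F' 'a-f'
}

# MAC of master-0 from the lease listing above.
mac_to_duid_ll "00:14:f0:50:6b:b4"
# prints 00:03:00:01:00:14:f0:50:6b:b4
```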

@mcornea
Contributor

mcornea commented Jan 17, 2020

I've been testing this change in my environment and I'm seeing the following issues:

  • When the master nodes boot, they have 'localhost' set as the hostname even though the FQDN was provided via DHCP options. After running 'systemctl restart NetworkManager', the correct hostname gets set:
[core@localhost ~]$ hostname -f
localhost
[core@localhost ~]$ sudo systemctl restart NetworkManager
[core@localhost ~]$ hostname -f
master-1.ocp-edge-cluster.qe.lab.redhat.com
  • mdns-publisher fails to start because bind_address is misconfigured in /etc/mdns/config.hcl:
grep bind_address /etc/mdns/config.hcl
bind_address = "fd2e:6f44:5dd8:c956::"
  • The discovery container fails because it cannot resolve the _etcd-server-ssl._tcp hostnames, as resolv.conf only includes the nameserver received over DHCP:
[root@master-0 core]# crictl logs 8ba696a5f18a3
I0117 18:03:07.778748       1 run.go:108] Version: machine-config-daemon-4.3.0-201910280117-166-gefc540d6 (efc540d6b210ece75296943113e7b1593d18c950)
I0117 18:03:07.779164       1 run.go:123] KUBERNETES_SERVICE_HOST or KUBERNETES_SERVICE_PORT contain no value, running in standalone mode.
E0117 18:03:07.780507       1 run.go:462] error looking up self for candidate IP 172.22.0.60: lookup _etcd-server-ssl._tcp.ocp-edge-cluster.qe.lab.redhat.com on [fd2e:6f44:5dd8:c956::1]:53: no such host
[root@master-0 core]# cat /etc/resolv.conf
# Generated by NetworkManager
search ocp-edge-cluster.qe.lab.redhat.com
nameserver fd2e:6f44:5dd8:c956::1
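When debugging the resolver issue on a node, it can help to list just the nameservers in use. The sketch below uses a hypothetical helper, `list_nameservers`, and runs it against a temporary copy of the resolv.conf shown above rather than the live file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the nameserver addresses from a
# resolv.conf-style file, one per line, in file order.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "$1"
}

# Demo against a temporary copy of the resolv.conf from the report.
demo="$(mktemp)"
cat > "$demo" <<'EOF'
# Generated by NetworkManager
search ocp-edge-cluster.qe.lab.redhat.com
nameserver fd2e:6f44:5dd8:c956::1
EOF
list_nameservers "$demo"
rm -f "$demo"
```

On a node, `list_nameservers /etc/resolv.conf | head -n 1` would show which resolver is consulted first.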

russellb and others added 6 commits January 17, 2020 21:42
On RHEL 8, when ifdown is run on a bridge's only (or last up) interface,
then the bridge is deleted. However, when ifup is run on the bridge's
interface, it is not correspondingly run on the bridge itself.

See:
https://github.com/fedora-sysv/initscripts/blob/rhel8-branch/network-scripts/ifdown-eth#L144

Since the provisioning interface is bounced with ifup/ifdown after
bringing the bridge up, then the bridge itself ends up not existing.
This patch adds an additional call to ifup the provisioning bridge after
bouncing the interface.

Signed-off-by: James Slagle <[email protected]>

Both MIRROR_IMAGES=true and *_LOCAL_IMAGE rely on this
image being built so we should build it in both cases.
We had only been building it if *_LOCAL_IMAGE was set.

Fixes: openshift-metal3#880

This should not be needed since we now prepend the DNS VIP
via the MCO, but I'm not clear yet if we should leave this
anyway as it stops NetworkManager adding the link-local IP
to resolv.conf?

These can get left as the --remove-all-storage doesn't remove them
(they're passed via -fw-cfg not strictly owned by the domain).

Default to the values from the installer for ipv4, but we
can override like this for ipv6:

export CLUSTER_SUBNET="fd01::/48"
export CLUSTER_HOST_PREFIX="64"
export SERVICE_SUBNET="fd02::/112"

@hardys
Author

hardys commented Jan 20, 2020

@russellb FYI, I had a similar commit for the configurable cluster/service networks. When you rebase, you'll need to add some variables like:

export NETWORK_TYPE="OVNKubernetes"
export CLUSTER_SUBNET="fd01::/48"
export CLUSTER_HOST_PREFIX="64"
export SERVICE_SUBNET="fd02::/112"

We could perhaps just add the defaults for testing this PR as a WIP commit on this branch if that would be easier?

This script includes the current set of workarounds needed to get a
working etcd cluster.
@mcornea
Contributor

mcornea commented Jan 20, 2020

Note: I had to adjust 02_configure_host.sh so that the address received on the $INT_IF interface is set on the baremetal bridge:

@@ -104,9 +101,14 @@ if [ "$MANAGE_INT_BRIDGE" == "y" ]; then
     # external access so we need to make sure we maintain dhcp config if its available
     if [ "$INT_IF" ]; then
         echo -e "DEVICE=$INT_IF\nTYPE=Ethernet\nONBOOT=yes\nNM_CONTROLLED=no\nBRIDGE=baremetal" | sudo dd of=/etc/sysconfig/network-scripts/ifcfg-$INT_IF
-        if sudo nmap --script broadcast-dhcp-discover -e $INT_IF | grep "IP Offered" ; then
-            grep -q BOOTPROTO /etc/sysconfig/network-scripts/ifcfg-baremetal || (echo -e "\nBOOTPROTO=dhcp\n" | sudo tee -a /etc/sysconfig/network-scripts/ifcfg-baremetal)
-        fi
+       if [[ $EXTERNAL_SUBNET =~ .*:.* ]]; then
+            sudo firewall-cmd --zone=libvirt --add-service=dhcpv6-client
+            grep -q BOOTPROTO /etc/sysconfig/network-scripts/ifcfg-baremetal || (echo -e "BOOTPROTO=none\nIPV6INIT=yes\nIPV6_AUTOCONF=yes\nDHCPV6C=yes\nDHCPV6C_OPTIONS='-D LL'\n" | sudo tee -a /etc/sysconfig/network-scripts/ifcfg-baremetal)
+       else
+          if sudo nmap --script broadcast-dhcp-discover -e $INT_IF | grep "IP Offered" ; then
+              grep -q BOOTPROTO /etc/sysconfig/network-scripts/ifcfg-baremetal || (echo -e "\nBOOTPROTO=dhcp\n" | sudo tee -a /etc/sysconfig/network-scripts/ifcfg-baremetal)
+          fi
+       fi
         sudo systemctl restart network
     fi
 fi
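The diff above branches on whether EXTERNAL_SUBNET contains a colon. That test can be pulled into a small function so other scripts can reuse it; `is_ipv6_subnet` is a hypothetical name, and the IPv4 subnet in the demo is an illustrative value, not one taken from this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the given subnet looks like IPv6.
# Same idea as the regex in the diff: an IPv6 CIDR contains a colon,
# an IPv4 CIDR never does.
is_ipv6_subnet() {
    [[ "$1" == *:* ]]
}

# Demo with the subnet from this PR and an assumed IPv4 example.
for subnet in "fd2e:6f44:5dd8:c956::/120" "192.168.111.0/24"; do
    if is_ipv6_subnet "$subnet"; then
        echo "$subnet: IPv6"
    else
        echo "$subnet: IPv4"
    fi
done
```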

Add a reference config file where we keep track of the settings being
used by those testing the IPv6 test release images.

MCO has been updated as of 4.3.0-0.nightly-2020-01-16-123848-ipv6.6 to
work around this issue without external workarounds.

The default configuration listens on IPv4 only.  With this
configuration, it seems to be listening on both IPv4 and IPv6
(localhost), so this should be a safe default in all cases for
dev-scripts.

This is handled automatically as of
4.3.0-0.nightly-2020-01-21-205041-ipv6.3

This is no longer needed as of
4.3.0-0.nightly-2020-01-21-205041-ipv6.4

@hardys
Author

hardys commented Jan 28, 2020

The non-WIP parts merged via #902, so closing this.

@hardys hardys closed this Jan 28, 2020

6 participants