diff --git a/PRIMARY-UDN-CHANGES.md b/PRIMARY-UDN-CHANGES.md new file mode 100644 index 00000000000..d1dc36dc578 --- /dev/null +++ b/PRIMARY-UDN-CHANGES.md @@ -0,0 +1,995 @@ +# Primary UDN Support for HyperShift Hosted Clusters + +This document lists all code changes made to the HyperShift codebase to enable Primary User-Defined Network (UDN) support for KubeVirt hosted clusters. + +## What is Primary UDN? + +Primary UDN is an OVN-Kubernetes feature that isolates a namespace's pods on a separate network from the management cluster's default network. For HyperShift: +- Control plane pods get Primary UDN IPs as their default interface +- Worker VMs get Primary UDN IPs +- Both are isolated from the management cluster's default network + +--- + +## Setup Requirements for Primary UDN Hosted Clusters + +To enable Primary UDN for a HyperShift hosted cluster, the following configuration must be in place **before** creating the HostedCluster: + +### 1. Namespace Configuration + +The hosted cluster namespace must have the Primary UDN annotation: + +```yaml +apiVersion: v1 +kind: Namespace +metadata: + name: clusters- + annotations: + k8s.ovn.org/primary-user-defined-network: hcp- +``` + +### 2. UserDefinedNetwork Resource + +A UserDefinedNetwork resource with `role: Primary` must be created in the namespace: + +```yaml +apiVersion: k8s.ovn.org/v1 +kind: UserDefinedNetwork +metadata: + name: hcp- + namespace: clusters- +spec: + topology: Layer2 + layer2: + role: Primary # This makes it a Primary UDN + subnets: + - 10.150.0.0/16 # IP range for pods/VMs + ipam: + mode: Enabled + lifecycle: Persistent +``` + +### 3. HostedCluster Platform Configuration + +The HostedCluster must be configured with the KubeVirt platform and include a JSONPatch annotation to modify the VM template for Primary UDN compatibility: + +```yaml +apiVersion: hypershift.openshift.io/v1beta1 +kind: HostedCluster +metadata: + name: + namespace: clusters- + annotations: + hypershift.openshift.io/kubevirt-vm-jsonpatch: | + - op: replace + path: /spec/template/spec/domain/devices/interfaces/0/bridge + value: null + - op: add + path: /spec/template/spec/domain/devices/interfaces/0/binding + value: + name: l2bridge + - op: remove + path: /spec/template/metadata/annotations/kubevirt.io~1allow-pod-bridge-network-live-migration +spec: + platform: + type: KubeVirt + kubevirt: + # ... other kubevirt config +``` + +**What These JSONPatches Do**: + +1. **Remove bridge binding**: Removes the default `bridge` network binding (incompatible with Primary UDN) +2. **Add l2bridge binding**: Adds `l2bridge` binding which gives VMs direct access to the OVN L2 network and allows them to get their own Primary UDN IP +3. **Remove live migration annotation**: Removes `kubevirt.io/allow-pod-bridge-network-live-migration` annotation which causes OVN-K CNI to skip IP configuration on eth0, breaking route discovery for l2bridge on Primary UDN + +**Why These Changes Are Required**: +- `bridge` binding shares the pod's IP with the VM - doesn't work with Primary UDN's dual-network model +- `l2bridge` binding gives the VM direct access to the OVN L2 network, allowing it to get its own Primary UDN IP +- The live migration annotation interferes with Primary UDN networking setup and must be removed + +--- + +## Testing with Custom Images + +Since the code changes are not yet in official releases, custom operator images must be deployed for testing: + +### 1. Custom Control Plane Operator Image + +The HostedCluster needs an annotation to use a custom CPO image with the UDN fixes: + +```bash +oc annotate hostedcluster -n clusters \ + hypershift.openshift.io/control-plane-operator-image= \ + --overwrite +``` + +**Example**: +```bash +oc annotate hostedcluster test-primary-udn -n clusters \ + hypershift.openshift.io/control-plane-operator-image=quay.io/ramlavi/hypershift-control-plane-operator:udn-fix-20251130 \ + --overwrite +``` + +**What This Does**: +- Tells the hypershift-operator to use your custom CPO image instead of the default +- The CPO pod will automatically restart and apply changes from the custom image +- This image contains the etcd statefulset fix, ignition server certificate fix, and ignition endpoint logic + +### 2. Custom KubeVirt CAPI Provider Image (since I don't have access to the capi image URL) + +The hypershift-operator deployment needs an environment variable override for the KubeVirt CAPI provider: + +```bash +oc set env deployment/operator -n hypershift \ + IMAGE_KUBEVIRT_CAPI_PROVIDER= +``` + +**Example - Use image cloned from the original capi image**: +```bash +oc set env deployment/operator -n hypershift \ + IMAGE_KUBEVIRT_CAPI_PROVIDER=quay.io/ramlavi/cluster-api-provider-kubevirt:4.18 +``` + +**What This Does**: +- Overrides the default KubeVirt CAPI provider image used by the operator +- Required if there are any CAPI provider changes needed for Primary UDN support +- The operator will restart to pick up the new environment variable + +### 3. Custom HyperShift Operator Image + +If there are changes to the hypershift-operator itself (network policies, ignition endpoint detection): + +```bash +oc set image deployment/operator -n hypershift "*=" +``` + +**Example**: +```bash +oc set image deployment/operator -n hypershift "*=quay.io/ramlavi/hypershift-operator:udn-fix-20251130" +``` + +**What This Does**: +- Updates all containers in the operator deployment to use the custom image +- Required for testing network policy changes and Primary UDN detection logic + +--- + +## Problem Statement + +Even with the above setup in place, without the code changes below, HyperShift hosted clusters cannot use Primary UDN because: +1. Worker VMs cannot reach the ignition server during boot (wrong endpoint URL) +2. TLS certificate verification fails (wrong CA and server certificates) +3. Network policies block required service access (DNS, ClusterIP services) +4. Operator lacks permissions for new Kubernetes resources + +--- + +## Code Changes + +### Change 1: ETCD EndpointSlice Mirroring (Manual Workaround) + +**What Broke**: +After the hosted cluster is created and etcd starts, control plane components like kube-apiserver fail to connect to etcd with connection timeout errors. + +**Root Cause**: +Headless services (services without a ClusterIP) don't get automatic EndpointSlice mirroring by OVN-Kubernetes for Primary UDN networks because: +- There's no ClusterIP for OVN-K to mirror/program +- It's expected that the operator managing the headless service (in this case HyperShift) generates the appropriate EndpointSlices + +This causes: +- DNS lookups for `etcd-client.clusters-.svc` return the default network IP (e.g., `10.128.x.x`) +- Control plane pods using Primary UDN cannot reach etcd at the default network IP (infrastructure-locked) +- kube-apiserver and other components fail to start + +**What Needed to Change**: +Manual mirror EndpointSlices must be created for `etcd-client` and `etcd-discovery` services that point to etcd's UDN IP instead of the default network IP. The original EndpointSlices must be "orphaned" (removed from DNS) so only the UDN IP is returned. + +**Code/Workaround Required**: + +This is a **manual workaround** applied after cluster creation (automated via script in the spike, but not part of HyperShift code): + +1. Extract the UDN IP from the etcd pod: +```bash +oc -n clusters- get pod etcd-0 \ + -o jsonpath='{.metadata.annotations.k8s\.v1\.cni\.cncf\.io/network-status}' | \ + jq -r '.[] | select(.name == "ovn-kubernetes" and .default == true) | .ips[0]' +``` + +2. Create mirror EndpointSlices with special OVN-K labels pointing to the UDN IP: +```yaml +apiVersion: discovery.k8s.io/v1 +kind: EndpointSlice +metadata: + name: etcd-client-hcp--mirror + namespace: clusters- + labels: + endpointslice.kubernetes.io/managed-by: manual-workaround + kubernetes.io/service-name: etcd-client + k8s.ovn.org/service-name: etcd-client + ownerReferences: + - apiVersion: v1 + kind: Service + name: etcd-client + uid: +addressType: IPv4 +endpoints: +- addresses: + - 10.150.0.x # UDN IP instead of default 10.128.x.x + conditions: + ready: true + targetRef: + kind: Pod + name: etcd-0 + uid: +ports: +- name: etcd-client + port: 2379 + protocol: TCP +``` + +3. "Orphan" the original EndpointSlice by removing its labels: +```bash +# Stop the EndpointSlice controller from managing it +oc -n clusters- label endpointslice \ + endpointslice.kubernetes.io/managed-by- + +# Hide it from DNS (so only the mirror with UDN IP is used) +oc -n clusters- label endpointslice \ + kubernetes.io/service-name- +``` + +4. Repeat for `etcd-discovery` service + +**Impact**: +Without this workaround, the hosted control plane cannot start because kube-apiserver and other control plane components cannot reach etcd. + +**Note**: +This is a temporary workaround. Since headless services have no ClusterIP, OVN-Kubernetes cannot mirror them automatically - it's the responsibility of the operator (HyperShift) to create the appropriate EndpointSlices with the correct UDN IPs. + +--- + +### Change 2: ETCD Listen on All Interfaces for KubeVirt + +**What Broke**: +Even with the mirror EndpointSlices created, etcd was still unreachable from control plane pods because etcd was only listening on a specific IP (from the `POD_IP` environment variable), not on all interfaces. + +**Root Cause**: +The etcd StatefulSet configuration was using `POD_IP` environment variable for `ETCD_LISTEN_CLIENT_URLS`: +``` +ETCD_LISTEN_CLIENT_URLS="https://$(POD_IP):2379,https://localhost:2379" +``` + +In Primary UDN setups: +- The pod has two IPs: default network IP (e.g., `10.128.x.x`) and UDN IP (e.g., `10.150.x.x`) +- `POD_IP` only contains one of these IPs +- etcd was only listening on that single IP +- Control plane pods on Primary UDN needed to connect via the UDN IP + +**What Needed to Change**: +For KubeVirt platforms, etcd should listen on all interfaces (`10.128.x.x` AND `10.150.x.x`, or just `0.0.0.0`), so it can accept connections from both the default network and Primary UDN. + +**Code Changed**: + +**File**: `control-plane-operator/controllers/hostedcontrolplane/v2/etcd/statefulset.go` + +```go +// Detect KubeVirt platform +isKubeVirt := hcp.Spec.Platform.Type == hyperv1.KubevirtPlatform + +if !ipv4 { + // IPv6 logic... +} else if isKubeVirt { + // IPv4 KubeVirt: listen on all interfaces + // Note: 0.0.0.0 includes localhost (127.0.0.1), so no need to specify it separately + util.UpsertEnvVar(c, corev1.EnvVar{ + Name: "ETCD_LISTEN_CLIENT_URLS", + Value: "https://0.0.0.0:2379", + }) +} +``` + +**Before** (non-KubeVirt): +``` +ETCD_LISTEN_CLIENT_URLS="https://$(POD_IP):2379,https://localhost:2379" +``` + +**After** (KubeVirt): +``` +ETCD_LISTEN_CLIENT_URLS="https://0.0.0.0:2379" +``` + +**Impact**: +This code change makes etcd automatically listen on all interfaces for KubeVirt platforms, allowing control plane pods to reach etcd via any of the pod's IPs (default or UDN). + +**Note**: +Using `0.0.0.0` is temporary, and should be replaced by a init-container that fetches the primaryUDN IP and sets it as a env var or something like that. +--- + +### Change 3: Ignition Server Endpoint Selection + +**What Broke**: +Worker VMs failed to boot with the error on console: +``` +ignition[843]: GET https://ignition-server-clusters-test-primary-udn.apps.hypershift.qinqon.corp/ignition: +dial tcp: lookup ignition-server-clusters-test-primary-udn.apps.hypershift.qinqon.corp on 172.30.0.10:53: +read udp 10.150.0.120:39990->172.30.0.10:53: i/o timeout +``` + +The VM was trying to reach an external route hostname during early boot, but Primary UDN's network isolation prevented it from resolving external DNS names or reaching routes before kubelet established full connectivity. + +**Why it broke** +[Openshift ingress router](https://github.com/openshift/router) pod does not support primary UDN yet. + +**What Needed to Change**: +The ignition endpoint URL needed to be set to the internal ClusterIP service DNS name (`ignition-server.{namespace}.svc.cluster.local`) instead of the external route hostname for Primary UDN namespaces. + +**Code Added**: + +**File**: `hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go` + +```go +// Detect Primary UDN by checking namespace label +isPrimaryUDN := false +if hcluster.Spec.Platform.Type == hyperv1.KubevirtPlatform { + namespace := &corev1.Namespace{} + if err := r.Client.Get(ctx, client.ObjectKey{Name: hcluster.Namespace}, namespace); err == nil { + if label, exists := namespace.Labels["k8s.ovn.org/primary-user-defined-network"]; exists && label != "" { + isPrimaryUDN = true + } + } +} + +// Set ignition endpoint based on UDN type +if isPrimaryUDN { + hcluster.Status.IgnitionEndpoint = fmt.Sprintf("ignition-server.%s.svc.cluster.local", hcluster.Namespace) +} else { + hcluster.Status.IgnitionEndpoint = servicePublishingStrategyMapping.Route.Hostname +} +``` + +--- + +### Change 4: Service Network Access Policy + +**What Broke**: +Worker VMs failed to reach the internal ignition service with DNS timeout errors: +``` +dial tcp: lookup ignition-server.clusters-test-primary-udn.svc.cluster.local on 172.30.0.10:53: +read udp 10.150.0.120:39990->172.30.0.10:53: i/o timeout +``` + +The NetworkPolicy for `virt-launcher` pods was blocking ALL egress traffic to the service CIDR (`172.30.0.0/16`), which includes: +- DNS service at `172.30.0.10:53` +- All ClusterIP services including `ignition-server` + +**What Needed to Change**: +The `reconcileVirtLauncherNetworkPolicy()` function was adding the service CIDR to the blocked networks list. This needed to be removed because: +- Primary UDN namespaces have an "infrastructure-locked" default network interface that needs service access +- DNS resolution is essential for reaching services by name +- Primary UDN already provides network isolation at the OVN level, so blocking service CIDR was redundant and overly restrictive + +**Code Changed**: + +**File**: `hypershift-operator/controllers/hostedcluster/network_policies.go` + +```go +// BEFORE (wrong - blocked service network): +for _, network := range managementClusterNetwork.Spec.ServiceNetwork { + blockedIPv4Networks, blockedIPv6Networks = addToBlockedNetworks(network, blockedIPv4Networks, blockedIPv6Networks) +} + +// AFTER (correct - allow service network): +// Removed the loop entirely - service network is no longer added to blocked list +``` + +--- + +### Change 5: Ignition CA Certificate Reference + +**What Broke**: +Worker VMs failed to fetch ignition with TLS certificate verification error: +``` +ignition[843]: GET error: Get "https://ignition-server.clusters-test-primary-udn.svc.cluster.local/ignition": +tls: failed to verify certificate: x509: certificate signed by unknown authority +``` + +Even though the DNS resolution and network connectivity worked, the TLS handshake failed because the worker VM's ignition configuration contained the wrong CA certificate to verify the server. + +**What Needed to Change**: +The `getIgnitionCACert()` function was injecting the `ignition-server-ca-cert` secret into the worker's userdata, but the actual `ignition-server` TLS certificate is signed by the `root-ca`, not by `ignition-server-ca-cert`. The function needed to reference the correct CA secret. + +**Code Changed**: + +**File**: `hypershift-operator/controllers/nodepool/token.go` + +```go +// BEFORE (wrong): +caSecret := ignitionserver.IgnitionCACertSecret(t.controlplaneNamespace) + +// AFTER (correct): +caSecret := &corev1.Secret{ + ObjectMeta: metav1.ObjectMeta{ + Namespace: t.controlplaneNamespace, + Name: "root-ca", + }, +} +``` + +--- + +### Change 6: Ignition Server TLS Certificate + +**What Broke**: +After fixing the CA certificate issue, worker VMs still failed TLS verification with a different error - the server's certificate didn't match the DNS name being requested. The `ignition-server-proxy` was using a certificate that only contained the external route hostname in its SANs, not the internal service DNS name. + +**What Needed to Change**: +The `ignition-server-proxy` HAProxy deployment needed to mount a different certificate secret - one that includes the internal service DNS name (`ignition-server.{namespace}.svc.cluster.local`) in its Subject Alternative Names. The correct secret is `ignition-server`, which contains both external and internal DNS names. + +**Code Changed**: + +**File**: `control-plane-operator/controllers/hostedcontrolplane/v2/assets/ignition-server-proxy/deployment.yaml` + +```yaml +volumes: +- name: serving-cert + secret: + defaultMode: 416 + secretName: ignition-server # Changed from: ignition-server-serving-cert +``` + +--- + +### Change 7: Internal DNS for Apps Domain (Ingress Fix) + +**What Broke**: + +The `ingress` and `console` cluster operators are degraded due to **DNS resolving to unreachable IPs**. + +**Symptoms**: +- `ingress` operator: `Degraded=True` with `CanaryChecksRepetitiveFailures` +- `console` operator: `Degraded=True` with OAuth authentication failures + +**Error Messages**: +```bash +# Get ingress operator status +oc --kubeconfig ${CLUSTER_NAME}-kubeconfig get co ingress -o jsonpath='{.status.conditions[?(@.type=="Degraded")].message}' +# Output: +# Canary route checks for the default ingress controller are failing. +# error sending canary HTTP Request: Timeout: Get "https://canary-openshift-ingress-canary.apps.test-primary-udn.apps.hypershift.qinqon.corp": +# context deadline exceeded (Client.Timeout exceeded while awaiting headers) + +# Get console operator status +oc --kubeconfig ${CLUSTER_NAME}-kubeconfig get co console -o jsonpath='{.status.conditions[?(@.type=="Degraded")].message}' +# Output: +# Error initializing authenticator: failed to construct OAuth endpoint cache: +# failed to setup an async cache - caching func returned error: context deadline exceeded +``` + +**Root Cause - DNS Resolution Points to Wrong Network**: + +The core issue is that **DNS queries return IPs on the wrong network**: + +| What Resolves | Returns | Actually Needed | +|---------------|---------|-----------------| +| `*.apps.test-primary-udn.apps...` | `192.168.122.253` (external VIP) | `172.31.x.x` (guest ClusterIP) | +| `oauth-clusters-test-primary-udn.apps...` | `192.168.122.253` (external VIP) | Internal service IP | + +**Why External VIP Doesn't Work**: + +When DNS returns the external IP (`192.168.122.253`), the traffic flow becomes: + +1. Control plane pod resolves hostname β†’ gets `192.168.122.253` (external VIP) +2. Request goes to management cluster router (passthrough route) +3. Router forwards to EndpointSlice target: `10.150.0.111` (worker VM's **Primary UDN IP**) +4. **πŸ’₯ FAILS**: Management cluster cannot route to Primary UDN (`10.150.0.0/16`) - network is isolated + +**Network Flow Diagram (Before Fix)**: +``` +ingress-operator pod + | + | DNS: "canary-openshift-ingress-canary.apps.test-primary-udn.apps..." + | + v +❌ Management cluster DNS β†’ Returns: 192.168.122.253 (external VIP) + | + v +Management cluster router β†’ Passthrough route β†’ NodePort service + | + v +EndpointSlice target: 10.150.0.111 (Primary UDN IP) + | + X UNREACHABLE - Primary UDN is isolated from management cluster network +``` + +**Verification Commands**: + +```bash +# Check what DNS returns from inside a UDN pod (what konnectivity-https-proxy sees) +oc -n clusters-${CLUSTER_NAME} exec deploy/konnectivity-agent -- \ + nslookup canary-openshift-ingress-canary.apps.test-primary-udn.apps.hypershift.qinqon.corp + +# Returns: 192.168.122.253 (external VIP - WRONG!) + +# Check what the passthrough service points to +oc get endpointslice -n clusters-${CLUSTER_NAME} -l kubernetes.io/service-name=router-default -o yaml + +# Shows: 10.150.0.111 (Primary UDN IP - UNREACHABLE from management cluster) + +# Get the internal ClusterIP that WOULD work +oc --kubeconfig ${CLUSTER_NAME}-kubeconfig get svc -n openshift-ingress router-internal-default \ + -o jsonpath='{.spec.clusterIP}' + +# Returns: 172.31.88.98 (ClusterIP - REACHABLE via konnectivity tunnel) + +# Test from UDN pod - external VIP FAILS +oc -n clusters-${CLUSTER_NAME} exec deploy/konnectivity-agent -- \ + curl -k --connect-timeout 5 https://192.168.122.253/healthz +# Result: timeout + +# Test from UDN pod - internal ClusterIP WORKS +oc -n clusters-${CLUSTER_NAME} exec deploy/konnectivity-agent -- \ + curl -k --connect-timeout 5 https://172.31.88.98/healthz +# Result: success +``` + +**The Fix - Override DNS Resolution**: + +The problem is DNS returning the wrong IP. The solution is to **make DNS return the internal ClusterIP instead**. + +**Solution Implemented** (for Ingress Operator): + +The `konnectivity-https-proxy` already has `ResolveFromGuestClusterDNS=true` [enabled](https://github.com/openshift/hypershift/blob/9ee4ac7c2d9303e92e4b282262142df28efc2ff7/konnectivity-https-proxy/cmd.go#L47), meaning it queries the guest cluster's CoreDNS for hostname resolution. The fix leverages this by configuring the guest cluster's DNS to return the **internal router ClusterIP** instead of the external VIP. + +**How It Works**: +1. Create a custom CoreDNS deployment inside the guest cluster (`internal-apps-dns` namespace) +2. Configure it with `hosts` entries that map apps domain hostnames to the internal router ClusterIP +3. Patch the DNS operator to forward apps domain queries to this internal DNS server +4. Konnectivity proxy resolves apps hostnames β†’ gets internal router IP β†’ traffic stays internal + +**Implementation** (added to `cluster-sync.sh`): + +```bash +# 1. Get the internal router ClusterIP from guest cluster +ROUTER_IP=$(oc --kubeconfig ${GUEST_KUBECONFIG} get svc router-internal-default \ + -n openshift-ingress -o jsonpath='{.spec.clusterIP}') + +# 2. Create internal DNS namespace +oc --kubeconfig ${GUEST_KUBECONFIG} create ns internal-apps-dns + +# 3. Create CoreDNS ConfigMap with hosts entries +cat <`) is a **Primary UDN** namespace and the mirrored management Service is intentionally **selectorless**. + +**Symptoms**: + +- LoadBalancer VIP is assigned (e.g., `192.168.122.x`), but infra-client pod traffic to the VIP fails (commonly `Connection refused`). +- OVN-Kubernetes programs the service load balancer with `reject=true` because it sees **no usable UDN endpoints** for the Service. + +**Root Cause - EndpointSlice label/annotation mismatch for Primary UDN**: + +In Primary UDN namespaces, OVN-Kubernetes' UDN service controller **does not consume** the β€œdefault” EndpointSlices labeled `kubernetes.io/service-name`. Instead, it expects **UDN EndpointSlices** labeled and annotated as described in the OVN-K docs: + +- label: `k8s.ovn.org/service-name: ` +- annotation: `k8s.ovn.org/endpointslice-network: ` + +**Note**: The OVN-K EndpointSlice mirror controller is behaving as designed here. It only mirrors EndpointSlices created by the Kubernetes default controller (`endpointslice-controller.k8s.io`), and it does **not** process custom/out-of-band EndpointSlices. Since our passthrough Service is selectorless by design, there is nothing for the mirror controller to mirror. + +The actual problem is that HyperShift's hostedclusterconfigoperator creates these passthrough EndpointSlices only in the default form (label `kubernetes.io/service-name`) and **does not add** the UDN label/annotation (`k8s.ovn.org/service-name` + `k8s.ovn.org/endpointslice-network`). As a result, OVN-K's UDN service controller ignores them and programs the LoadBalancer VIP with no usable UDN backends. + +**Manual workaround (deploy UDN-style EndpointSlices)**: + +Create **one EndpointSlice per VM** (mirroring the existing per-VM EndpointSlices) but add the minimal OVN-K metadata so the UDN service controller consumes it: + +- label: `k8s.ovn.org/service-name: ` +- annotation: `k8s.ovn.org/endpointslice-network: ` (for HyperShift/KubeVirt this is typically `clusters-_hcp-`) + +**Manifest used** (apply once per VM / `virt-launcher`): + +```yaml +apiVersion: discovery.k8s.io/v1 +kind: EndpointSlice +metadata: + # One per VM. Pick a stable name (example shown). + name: udn---ipv4 + namespace: clusters- + labels: + endpointslice.kubernetes.io/managed-by: manual-workaround + k8s.ovn.org/service-name: + annotations: + k8s.ovn.org/endpointslice-network: clusters-_hcp- +addressType: IPv4 +ports: +- name: http + protocol: TCP + # Use the mirrored management Service's .spec.ports[?port==8080].targetPort + port: +endpoints: +- addresses: + # Use the VM / virt-launcher **Primary UDN** IP (role:"primary" from k8s.ovn.org/pod-networks) + - + conditions: + ready: true + serving: true + terminating: false + # Optional (recommended; matches OVN mirror output): + nodeName: + targetRef: + kind: Pod + namespace: clusters- + name: + uid: +``` + +**Note (proper fix)**: + +This manual workaround should be replaced by a real fix in HyperShift HCCO, where these selectorless per-VM EndpointSlices are created today: + +- `control-plane-operator/hostedclusterconfigoperator/controllers/machine/machine.go` + +Key code change (add UDN label/annotation so OVN-K consumes these EndpointSlices on Primary UDN): + +```go +endpointSlice.Labels["k8s.ovn.org/service-name"] = cpService.Name +endpointSlice.Annotations["k8s.ovn.org/endpointslice-network"] = "_hcp->" +delete(endpointSlice.Labels, discoveryv1.LabelServiceName) // remove kubernetes.io/service-name +``` + +--- + +## Origin conformance: KubeVirt services (Primary UDN) + +This section is a **validation checklist** for the Origin `[sig-kubevirt] services` conformance tests when the HyperShift **hosted control plane namespace** (`clusters-`) is configured as a **Primary UDN namespace**. + +**Validation notes**: + +- Tests were run **serially, one-by-one** (to eliminate cross-talk / parallel interference between tests). +- Tests were run with **minimal monitors** by setting `MONITORS="event-collector"` (i.e., disabling the default broader monitor set). With the broader/default monitors enabled, additional monitor failures can occur and may not reflect the actual connectivity assertions of the test itself. + +| Test (full string) | Current status on Primary UDN | Why it fails (when it fails) | Change needed to pass | +|---|---:|---|---| +| `[sig-kubevirt] services when running openshift cluster on KubeVirt virtual machines should allow connections to pods from guest cluster PodNetwork pod via LoadBalancer service across different guest nodes [Suite:openshift/conformance/parallel]` | βœ… passed | - | - | +| `[sig-kubevirt] services when running openshift cluster on KubeVirt virtual machines should allow connections to pods from guest hostNetwork pod via NodePort across different guest nodes [Suite:openshift/conformance/parallel]` | βœ… passed | - | - | +| `[sig-kubevirt] services when running openshift cluster on KubeVirt virtual machines should allow connections to pods from guest podNetwork pod via NodePort across different guest nodes [Suite:openshift/conformance/parallel]` | βœ… passed | - | - | +| `[sig-kubevirt] services when running openshift cluster on KubeVirt virtual machines should allow direct connections to pods from guest cluster pod in pod network across different guest nodes [Suite:openshift/conformance/parallel]` | βœ… passed | - | - | +| `[sig-kubevirt] services when running openshift cluster on KubeVirt virtual machines should allow direct connections to pods from guest cluster pod in host network across different guest nodes [Suite:openshift/conformance/parallel]` | βœ… passed | - | - | +| `[sig-kubevirt] services when running openshift cluster on KubeVirt virtual machines should allow connections to pods from infra cluster pod via LoadBalancer service across different guest nodes [Suite:openshift/conformance/parallel]` | βœ… passed | - | -| +| `[sig-kubevirt] services when running openshift cluster on KubeVirt virtual machines should allow connections to pods from infra cluster pod via NodePort across different infra nodes [Suite:openshift/conformance/parallel]` | ⚠️ passed with test alteration | The test uses `serverVMPod.Status.PodIP` as the β€œguest node IP”, which in Primary UDN environments is **not** the Primary UDN IP. | **Origin test change**: when Primary UDN is configured, use the `virt-launcher` **Primary UDN** IP (from `k8s.ovn.org/pod-networks`, `role:"primary"`) instead of `pod.status.podIP`.

**Repro**: `repro-kubevirt-infra-nodeport.sh` | + +--- + +## Summary + +**Total Changes**: 5 HyperShift code files modified + 2 workarounds in cluster-sync.sh + +### Completed Changes: +1. **ETCD EndpointSlice mirroring** (manual workaround) - Create manual mirror EndpointSlices pointing to etcd's UDN IP so control plane can reach etcd +2. **ETCD listen address** - Listen on all interfaces for KubeVirt platforms +3. **Ignition endpoint** - Use internal DNS for Primary UDN +4. **Network policy** - Allow service CIDR access for DNS and ClusterIP services +5. **CA certificate** - Use correct CA (`root-ca`) in worker userdata +6. **Server certificate** - Use cert with internal DNS name in SANs +7. **Internal DNS for apps domain** (cluster-sync.sh) - Create internal CoreDNS in guest cluster that returns internal router ClusterIP for apps routes, fixing ingress operator canary checks +8. **Console OAuth bridge** (cluster-sync.sh) - Create Service/Endpoints in guest cluster pointing to OAuth pod's Primary UDN IP, with DNS override + +**Current Result**: +- βœ… Control plane can start and connect to etcd +- βœ… Worker VMs can boot, fetch ignition, and join the hosted cluster +- βœ… `ingress` operator healthy (`Available=True`, `Degraded=False`) +- βœ… `console` operator healthy (`Available=True`, `Degraded=False`) - OAuth bridge to pod's Primary UDN IP + +--- +## Architecture Notes + +### Primary UDN Network Model + +**Control Plane Pods**: +- Primary UDN IP on `ovn-udn1` interface (default route) +- Infrastructure-locked IP on `eth0` interface (restricted) + +**Worker VMs**: +- Primary UDN IP (`10.150.0.0/16` range) +- Direct L2 connectivity via `l2bridge` binding +- Routes to service CIDR (`172.30.0.0/16`) for ClusterIP access + +**Key Constraint**: Primary UDN provides network isolation at OVN level. Pods/VMs on Primary UDN cannot reach management cluster default network without explicit routes. + +### Required Configuration + +For Primary UDN hosted clusters: +1. **Namespace** must have label: `k8s.ovn.org/primary-user-defined-network=` +2. **UserDefinedNetwork** resource with `role: Primary` must exist +3. **VMs** must use `l2bridge` network binding (not `bridge` or `masquerade`) +4. **Routes** must be configured to service CIDR and host network if needed + diff --git a/api/hypershift/v1beta1/hostedcluster_types.go b/api/hypershift/v1beta1/hostedcluster_types.go index 92e1852fb2c..be74b9485d6 100644 --- a/api/hypershift/v1beta1/hostedcluster_types.go +++ b/api/hypershift/v1beta1/hostedcluster_types.go @@ -278,6 +278,13 @@ const ( // JSONPatchAnnotation allow modifying the kubevirt VM template using jsonpatch JSONPatchAnnotation = "hypershift.openshift.io/kubevirt-vm-jsonpatch" + // PrimaryUDNNameAnnotation enables Primary UDN for KubeVirt hosted clusters by specifying the + // UserDefinedNetwork name to be used as the primary network for the hosted control plane namespace. + PrimaryUDNNameAnnotation = "hypershift.openshift.io/primary-udn-name" + // PrimaryUDNSubnetAnnotation specifies the subnet CIDR for the Primary UDN to be created/ensured + // in the hosted control plane namespace (e.g. "10.150.0.0/16"). + PrimaryUDNSubnetAnnotation = "hypershift.openshift.io/primary-udn-subnet" + // KubeAPIServerGOGCAnnotation allows modifying the kube-apiserver GOGC environment variable to impact how often // the GO garbage collector runs. This can be used to reduce the memory footprint of the kube-apiserver. KubeAPIServerGOGCAnnotation = "hypershift.openshift.io/kube-apiserver-gogc" diff --git a/cluster-sync.sh b/cluster-sync.sh new file mode 100755 index 00000000000..2af38667b6e --- /dev/null +++ b/cluster-sync.sh @@ -0,0 +1,866 @@ +#!/bin/bash + +set -euo pipefail + +# NOTE: These scripts started life as a one-off spike. Keep them runnable in +# other environments by allowing env overrides (and only defaulting when unset). +: "${KUBECONFIG:=/root/.kcli/clusters/hypershift/auth/kubeconfig}" +: "${CLUSTER_NAME:=test-primary-udn}" +: "${UDN_NAME:=hcp-${CLUSTER_NAME}}" +: "${PULL_SECRET:=/root/ralavi/merged-pull-secret.json}" +: "${MEM:=10Gi}" +: "${CPU:=4}" +: "${WORKER_COUNT:=1}" +: "${CLUSTER_SUBNET:=10.128.0.0/14}" +: "${WAIT_GUEST_NODES_TIMEOUT_SECONDS:=2400}" +: "${WAIT_GUEST_NODES_READY_MIN:=1}" + +export KUBECONFIG + +if ! command -v jq >/dev/null 2>&1; then + echo "ERROR: jq is required but was not found in PATH." + exit 1 +fi + +if [ -z "${RELEASE_IMAGE:-}" ]; then + RELEASE_IMAGE="$(curl -s https://amd64.ocp.releases.ci.openshift.org/api/v1/releasestream/4-stable/latest | jq -r .pullSpec)" +fi + +# Read custom CPO image if available (built by cluster-up.sh) +if [ -f .cpo-image ]; then + export CPO_IMAGE=$(cat .cpo-image) + echo "Using custom control-plane-operator image: ${CPO_IMAGE}" + echo "" +else + if [ -n "${CPO_IMAGE}" ]; then + echo "Using provided control-plane-operator image: ${CPO_IMAGE}" + echo "" + else + echo "ERROR: No custom control-plane-operator image found." + echo "Run ./cluster-up.sh first (it writes .cpo-image), or export CPO_IMAGE to a valid image reference." + exit 1 + fi +fi + +# Destroy existing cluster if it exists +if oc get hostedcluster $CLUSTER_NAME -n clusters &>/dev/null; then + echo "Hosted cluster $CLUSTER_NAME already exists. Destroying it first..." + + cleanup_finalizers_best_effort() { + # Best-effort cleanup. Safe to run multiple times. + echo "Cleaning up stuck finalizers (best-effort)..." + + # Remove finalizers from VirtualMachineInstances + for vmi in $(oc get virtualmachineinstances.kubevirt.io -n "clusters-${CLUSTER_NAME}" -o name 2>/dev/null); do + echo " Removing finalizers from $vmi" + oc patch "$vmi" -n "clusters-${CLUSTER_NAME}" --type=merge -p '{"metadata":{"finalizers":null}}' 2>/dev/null || true + done + + # Remove finalizers from VirtualMachines + for vm in $(oc get virtualmachines.kubevirt.io -n "clusters-${CLUSTER_NAME}" -o name 2>/dev/null); do + echo " Removing finalizers from $vm" + oc patch "$vm" -n "clusters-${CLUSTER_NAME}" --type=merge -p '{"metadata":{"finalizers":null}}' 2>/dev/null || true + done + + # Remove finalizers from Machines + for machine in $(oc get machines.cluster.x-k8s.io -n "clusters-${CLUSTER_NAME}" -o name 2>/dev/null); do + echo " Removing finalizers from $machine" + oc patch "$machine" -n "clusters-${CLUSTER_NAME}" --type=merge -p '{"metadata":{"finalizers":null}}' 2>/dev/null || true + done + + # Remove finalizers from MachineSets + for ms in $(oc get machinesets.cluster.x-k8s.io -n "clusters-${CLUSTER_NAME}" -o name 2>/dev/null); do + echo " Removing finalizers from $ms" + oc patch "$ms" -n "clusters-${CLUSTER_NAME}" --type=merge -p '{"metadata":{"finalizers":null}}' 2>/dev/null || true + done + + # Remove finalizers from Cluster (cluster.x-k8s.io) + for cluster in $(oc get clusters.cluster.x-k8s.io -n "clusters-${CLUSTER_NAME}" -o name 2>/dev/null); do + echo " Removing finalizers from $cluster" + oc patch "$cluster" -n "clusters-${CLUSTER_NAME}" --type=merge -p '{"metadata":{"finalizers":null}}' 2>/dev/null || true + done + + # Remove finalizers from HostedControlPlane + for hcp in $(oc get hostedcontrolplanes.hypershift.openshift.io -n "clusters-${CLUSTER_NAME}" -o name 2>/dev/null); do + echo " Removing finalizers from $hcp" + oc patch "$hcp" -n "clusters-${CLUSTER_NAME}" --type=merge -p '{"metadata":{"finalizers":null}}' 2>/dev/null || true + done + + # Remove finalizers from NodePools + for np in $(oc get nodepools.hypershift.openshift.io -n clusters -o name 2>/dev/null | grep "$CLUSTER_NAME"); do + echo " Removing finalizers from $np" + oc patch "$np" -n clusters --type=merge -p '{"metadata":{"finalizers":null}}' 2>/dev/null || true + done + + # Remove finalizers from HostedCluster itself + oc patch "hostedcluster/${CLUSTER_NAME}" -n clusters --type=merge -p '{"metadata":{"finalizers":null}}' 2>/dev/null || true + } + + wait_for_gone() { + local kind="$1" + local name="$2" + local namespace="${3:-}" + local timeout_seconds="${4:-900}" + local elapsed=0 + + while true; do + if [ -n "${namespace}" ]; then + if ! oc -n "${namespace}" get "${kind}" "${name}" >/dev/null 2>&1; then + return 0 + fi + else + if ! oc get "${kind}" "${name}" >/dev/null 2>&1; then + return 0 + fi + fi + + if [ "${elapsed}" -ge "${timeout_seconds}" ]; then + echo "ERROR: Timed out waiting for ${kind}/${name} to be deleted (ns=${namespace:-})." >&2 + return 1 + fi + + # Keep trying finalizer cleanup while we wait; some deletions only unblock after retries. + # (run every ~30s to reduce API spam) + if [ $((elapsed % 30)) -eq 0 ]; then + cleanup_finalizers_best_effort + fi + + echo "Still waiting for ${kind}/${name} to be deleted... (${elapsed}/${timeout_seconds}s)" + sleep 10 + elapsed=$((elapsed + 10)) + done + } + + # Run destroy in background so we can clean up finalizers in parallel + hypershift destroy cluster kubevirt --name $CLUSTER_NAME & + DESTROY_PID=$! + + # Wait a bit for deletion to start + sleep 10 + + # Remove stuck finalizers while destroy is running (and again while waiting below). + cleanup_finalizers_best_effort + + # Wait for destroy to complete + echo "Waiting for destroy to complete..." + wait $DESTROY_PID || true + + echo "" + echo "Waiting for HostedCluster and namespace deletion to complete..." + # One more cleanup pass after destroy returns (in case it exited early). + cleanup_finalizers_best_effort + # 1) hostedcluster object in the clusters namespace + wait_for_gone hostedcluster "${CLUSTER_NAME}" "clusters" 1200 + # 2) the hosted control plane namespace itself + wait_for_gone namespace "clusters-${CLUSTER_NAME}" "" 1200 +fi + +# Create namespace with PRIMARY UDN label (IMMUTABLE - must be set at creation!) +cat </dev/null; do + if [ $elapsed -ge $timeout ]; then + echo "ERROR: Timeout waiting for etcd StatefulSet to be created" + exit 1 + fi + echo "Still waiting for etcd StatefulSet... ($elapsed/$timeout seconds)" + sleep 10 + elapsed=$((elapsed + 10)) +done + +echo "βœ“ etcd StatefulSet created" +echo "" + +# Wait for etcd-0 pod to be running and ready +echo "Waiting for etcd-0 pod to be running and ready..." +oc -n "$NAMESPACE" wait --for=condition=Ready pod/etcd-0 --timeout=600s + +echo "βœ“ etcd-0 pod is ready" +echo "" + +# Apply the EndpointSlice workaround +echo "==========================================" +echo "Applying etcd EndpointSlice Workaround" +echo "==========================================" +echo "" + +if [ -f "./create-etcd-mirrors.sh" ]; then + ./create-etcd-mirrors.sh + + if [ $? -eq 0 ]; then + echo "" + echo "βœ“ EndpointSlice workaround applied successfully" + else + echo "" + echo "WARNING: EndpointSlice workaround script failed" + echo "You may need to run it manually: ./create-etcd-mirrors.sh" + fi +else + echo "ERROR: create-etcd-mirrors.sh not found in current directory" + echo "Please ensure the script is present and run it manually" +fi + +echo "" +echo "==========================================" +echo "Waiting for Hosted Cluster to be Available" +echo "==========================================" +echo "" + +# Wait for the HostedCluster to have Available=True condition +echo "Waiting for hosted cluster to be available..." +timeout=1800 # 30 minutes - cluster creation takes time +elapsed=0 +while true; do + AVAILABLE=$(oc get hostedcluster ${CLUSTER_NAME} -n clusters -o jsonpath='{.status.conditions[?(@.type=="Available")].status}' 2>/dev/null) + if [ "$AVAILABLE" = "True" ]; then + echo "βœ“ Hosted cluster is available!" + break + fi + + if [ $elapsed -ge $timeout ]; then + echo "WARNING: Timeout waiting for hosted cluster to be available." + echo "DNS fix may need manual setup after cluster is ready." + echo "Current cluster status:" + oc get hostedcluster ${CLUSTER_NAME} -n clusters -o jsonpath='{.status.conditions[*].type}={.status.conditions[*].status}' 2>/dev/null || true + break + fi + + # Show progress + PROGRESS=$(oc get hostedcluster ${CLUSTER_NAME} -n clusters -o jsonpath='{.status.conditions[?(@.type=="Available")].message}' 2>/dev/null | head -c 80) + echo "Waiting... ($elapsed/$timeout seconds) - ${PROGRESS:-initializing}" + sleep 30 + elapsed=$((elapsed + 30)) +done + +echo "" +echo "==========================================" +echo "Setting Up Internal DNS for Primary UDN" +echo "==========================================" +echo "" + +# Extract guest cluster kubeconfig +echo "Extracting guest cluster kubeconfig..." +if ! oc get secret/${CLUSTER_NAME}-admin-kubeconfig -n clusters &>/dev/null; then + echo "WARNING: Kubeconfig secret not found. Cluster may not be ready yet." + echo "Skipping DNS setup - you may need to run this manually later." +else + oc extract -n clusters secret/${CLUSTER_NAME}-admin-kubeconfig --to=. --confirm + mv kubeconfig ${CLUSTER_NAME}-kubeconfig + export GUEST_KUBECONFIG="${PWD}/${CLUSTER_NAME}-kubeconfig" + echo "Guest kubeconfig saved to: ${GUEST_KUBECONFIG}" + + echo "" + echo "Waiting for guest nodes to be Ready (required to schedule router/DNS pods)..." + timeout="${WAIT_GUEST_NODES_TIMEOUT_SECONDS}" + elapsed=0 + while true; do + ready_nodes="$( + oc --kubeconfig "${GUEST_KUBECONFIG}" get nodes -o json 2>/dev/null | \ + jq -r '[.items[] | select(any(.status.conditions[]?; .type=="Ready" and .status=="True"))] | length' 2>/dev/null || echo 0 + )" + if [ "${ready_nodes}" -ge "${WAIT_GUEST_NODES_READY_MIN}" ]; then + echo "βœ“ Guest Ready nodes: ${ready_nodes}" + break + fi + if [ "${elapsed}" -ge "${timeout}" ]; then + echo "WARNING: Only ${ready_nodes} Ready guest nodes after ${timeout}s. Skipping internal DNS/OAuth setup for now." + echo "This is expected right after control plane becomes Available=True." + echo "Re-run this script later once nodes join." + goto_after_dns_setup=true + break + fi + echo "Still waiting for guest Ready nodes (have ${ready_nodes}, need >= ${WAIT_GUEST_NODES_READY_MIN})... (${elapsed}/${timeout} seconds)" + sleep 15 + elapsed=$((elapsed + 15)) + done + if [ "${goto_after_dns_setup:-false}" = "true" ]; then + # Close the kubeconfig extraction block early (skip DNS/OAuth setup). + echo "" + echo "Skipping DNS/OAuth setup due to missing guest nodes." + else + + # Discover the guest cluster apps domain (avoid hardcoding env-specific suffixes) + APPS_DOMAIN="$(oc --kubeconfig "${GUEST_KUBECONFIG}" get ingresses.config.openshift.io cluster -o jsonpath='{.spec.domain}' 2>/dev/null || true)" + if [ -z "${APPS_DOMAIN}" ]; then + # Backward-compatible fallback (old spike default) + APPS_DOMAIN="apps.${CLUSTER_NAME}.apps.hypershift.qinqon.corp" + echo "WARNING: Could not determine guest apps domain from guest cluster." + echo "Falling back to: ${APPS_DOMAIN}" + else + echo "Guest cluster apps domain: ${APPS_DOMAIN}" + fi + +# Wait for router to be available in guest cluster +echo "" +echo "Waiting for guest cluster router to be available..." +timeout=600 +elapsed=0 +while ! oc --kubeconfig ${GUEST_KUBECONFIG} get svc router-internal-default -n openshift-ingress &>/dev/null; do + if [ $elapsed -ge $timeout ]; then + echo "WARNING: Timeout waiting for router service. DNS fix may need manual setup." + break + fi + echo "Still waiting for router service... ($elapsed/$timeout seconds)" + sleep 30 + elapsed=$((elapsed + 30)) +done + +# Get the internal router ClusterIP +ROUTER_IP=$(oc --kubeconfig ${GUEST_KUBECONFIG} get svc router-internal-default -n openshift-ingress -o jsonpath='{.spec.clusterIP}' 2>/dev/null) +if [ -z "$ROUTER_IP" ]; then + echo "WARNING: Could not get router ClusterIP. DNS fix may need manual setup." +else + echo "Internal router ClusterIP: ${ROUTER_IP}" + + echo "" + echo "Creating internal DNS server in guest cluster..." + + # Create namespace, ConfigMap, Deployment, and Service for internal DNS + cat </dev/null | jq -e '.subsets | length > 0' >/dev/null; then + echo "WARNING: internal-apps-dns Service has no Endpoints yet." + echo "Not patching the guest DNS operator to avoid breaking cluster DNS." + skip_dns_operator_patch=true + fi + fi + + if [ "${skip_dns_operator_patch:-false}" != "true" ]; then + echo "Configuring DNS operator to forward selected zones to internal DNS..." + + # Update DNS operator servers idempotently without clobbering existing entries. + # We forward: + # - the guest apps domain (so console/canary/downloads resolve to internal router) + # - (later) the OAuth zone (so oauth host resolves to oauth-bridge) + DNS_ZONES_JSON="$(jq -c -n --arg apps "${APPS_DOMAIN}" '[ $apps ]')" + + update_dns_operator_servers() { + local zones_json="$1" + local upstream="${2}" + + local current_servers + current_servers="$(oc --kubeconfig "${GUEST_KUBECONFIG}" get dns.operator.openshift.io default -o json 2>/dev/null | jq -c '.spec.servers // []' || echo '[]')" + + local updated_servers + updated_servers="$(jq -c --argjson zones "${zones_json}" --arg upstream "${upstream}" ' + . as $servers + | (map(.name) | index("internal-apps")) as $idx + | if $idx == null then + ($servers + [{"name":"internal-apps","zones":$zones,"forwardPlugin":{"upstreams":[$upstream]}}]) + else + ($servers | map(if .name=="internal-apps" then (.zones=$zones | .forwardPlugin.upstreams=[$upstream]) else . end)) + end + ' <<<"${current_servers}")" + + oc --kubeconfig "${GUEST_KUBECONFIG}" patch dns.operator.openshift.io default --type=merge \ + -p "$(jq -c -n --argjson servers "${updated_servers}" '{spec:{servers:$servers}}')" + } + + update_dns_operator_servers "${DNS_ZONES_JSON}" "${DNS_SVC_IP}:5353" + + echo "" + echo "βœ“ Internal DNS configured for Primary UDN" + echo " - Apps routes (console, canary, downloads) will resolve to internal router: ${ROUTER_IP}" + echo " - This fixes the ingress ClusterOperator health checks" + + echo "" + echo "Verifying apps domain forwarding is effective..." + DNSPOD="$(oc --kubeconfig "${GUEST_KUBECONFIG}" -n openshift-dns get pod -o json 2>/dev/null | jq -r '.items[0].metadata.name' || true)" + if [ -n "${DNSPOD}" ] && [ -n "${APPS_DOMAIN}" ]; then + RESOLVED_IP="$(oc --kubeconfig "${GUEST_KUBECONFIG}" -n openshift-dns exec "${DNSPOD}" -- sh -c "nslookup canary-openshift-ingress-canary.${APPS_DOMAIN} 2>/dev/null | sed -n 's/^Address: \\([0-9.]*\\).*/\\1/p' | tail -n1" || true)" + echo "canary-openshift-ingress-canary.${APPS_DOMAIN} -> ${RESOLVED_IP:-}" + if [ -n "${RESOLVED_IP}" ] && [ "${RESOLVED_IP}" != "${ROUTER_IP}" ]; then + echo "WARNING: Canary hostname did not resolve to router-internal ClusterIP (${ROUTER_IP})." + echo "Ingress canary checks from the management control plane may still fail until this matches." + fi + else + echo "WARNING: Could not verify DNS resolution (openshift-dns pod or apps domain missing)." + fi + fi +fi + +# ========================================== +# OAuth Bridge Setup for Console Access +# ========================================== +# Problem: Console pod in guest cluster needs to reach OAuth server, but: +# - Console runs in guest VM (Primary UDN only: 10.150.0.0/16) +# - OAuth server runs in HCP namespace on management cluster +# - Guest VM has no route to management cluster network +# +# Solution: All HCP namespace pods (including OAuth) get a Primary UDN IP. +# Create a Service in the guest cluster pointing directly to OAuth pod's Primary UDN IP. +# No proxy needed - direct connection! +# ========================================== + +echo "" +echo "==========================================" +echo "Setting Up OAuth Bridge for Console Access" +echo "==========================================" +echo "" + +# Discover the OAuth route host from the management cluster (avoid hardcoding) +OAUTH_HOST="$( + oc get route -n "clusters-${CLUSTER_NAME}" -o json 2>/dev/null | \ + jq -r --arg cn "${CLUSTER_NAME}" ' + [.items[] + | select(.spec.host != null) + | select(.spec.host | test("^oauth-")) + | select(.spec.host | contains("clusters-" + $cn)) + | .spec.host][0] // empty + ' 2>/dev/null +)" +if [ -z "${OAUTH_HOST}" ]; then + # Backward-compatible fallback (old spike default) + OAUTH_HOST="oauth-clusters-${CLUSTER_NAME}.apps.hypershift.qinqon.corp" + echo "WARNING: Could not discover OAuth route host from management cluster; falling back to: ${OAUTH_HOST}" +else + echo "OAuth external hostname (discovered): ${OAUTH_HOST}" +fi + +OAUTH_ZONE="${OAUTH_HOST#*.}" + +# Get OAuth pod's Primary UDN IP from the pod-networks annotation +echo "Getting OAuth pod's Primary UDN IP..." +OAUTH_UDN_IP="" +timeout=600 +elapsed=0 +while [ $elapsed -lt $timeout ]; do + OAUTH_UDN_IP="$(oc get pod -n "clusters-${CLUSTER_NAME}" -l app=oauth-openshift \ + -o jsonpath='{.items[0].metadata.annotations.k8s\.ovn\.org/pod-networks}' 2>/dev/null | \ + jq -r ".\"clusters-${CLUSTER_NAME}/hcp-${CLUSTER_NAME}\".ip_address" 2>/dev/null | cut -d/ -f1 || true)" + if [ -n "${OAUTH_UDN_IP}" ] && [ "${OAUTH_UDN_IP}" != "null" ]; then + break + fi + echo "Still waiting for OAuth pod Primary UDN IP... ($elapsed/$timeout seconds)" + sleep 15 + elapsed=$((elapsed + 15)) +done + +if [ -z "$OAUTH_UDN_IP" ] || [ "$OAUTH_UDN_IP" = "null" ]; then + echo "WARNING: Could not get OAuth pod Primary UDN IP." + echo "OAuth setup may need manual configuration after OAuth pod is running." +else + if [ "${skip_dns_operator_patch:-false}" = "true" ]; then + echo "WARNING: Internal DNS is not ready; skipping OAuth DNS forwarding setup." + echo "You can rerun this script later once internal-apps-dns is Available." + fi + echo "OAuth pod Primary UDN IP: ${OAUTH_UDN_IP}" + + # Create service in guest cluster pointing directly to OAuth pod + echo "" + echo "Creating OAuth bridge service in guest cluster..." + cat < ${OAUTH_UDN_IP}:6443" + echo " - No proxy needed - direct connection to OAuth pod!" + echo " - This allows console to authenticate users" +fi + +fi # Close the guest-nodes gating if/else + +fi # Close the kubeconfig extraction if block (or the early-skip branch) + +# ========================================== +# L2 Connectivity Verification +# ========================================== +# Phase 1 of Full Isolation: Verify that control plane pods and worker VMs +# can communicate directly on the Primary L2 UDN (10.150.0.0/16) +# This is a prerequisite for full isolation where both CP and workers +# share the same isolated L2 network segment. +# ========================================== + +echo "" +echo "==========================================" +echo "Verifying L2 Connectivity (Full Isolation Phase 1)" +echo "==========================================" +echo "" + +NAMESPACE="clusters-${CLUSTER_NAME}" + +# Function to get Primary UDN IP from pod-networks annotation +get_udn_ip() { + local pod_name=$1 + local namespace=$2 + oc get pod "$pod_name" -n "$namespace" \ + -o jsonpath='{.metadata.annotations.k8s\.ovn\.org/pod-networks}' 2>/dev/null | \ + jq -r "to_entries[] | select(.value.role==\"primary\") | .value.ip_address" 2>/dev/null | \ + cut -d/ -f1 +} + +# Get a control plane pod (kube-apiserver) +echo "Finding control plane pod..." +CP_POD=$(oc get pod -n ${NAMESPACE} -l app=kube-apiserver -o name 2>/dev/null | head -1) +if [ -z "$CP_POD" ]; then + echo "WARNING: No kube-apiserver pod found. Skipping L2 verification." +else + CP_POD_NAME=$(echo "$CP_POD" | sed 's|pod/||') + CP_UDN_IP=$(get_udn_ip "$CP_POD_NAME" "$NAMESPACE") + echo "Control plane pod: ${CP_POD_NAME}" + echo "Control plane Primary UDN IP: ${CP_UDN_IP:-not found}" + + # Get a worker VM (virt-launcher pod) + echo "" + echo "Finding worker VM pod..." + WORKER_POD=$(oc get pod -n ${NAMESPACE} -l kubevirt.io=virt-launcher -o name 2>/dev/null | head -1) + if [ -z "$WORKER_POD" ]; then + echo "WARNING: No virt-launcher pod found. Worker VMs may not be running yet." + echo "Skipping L2 connectivity test." + else + WORKER_POD_NAME=$(echo "$WORKER_POD" | sed 's|pod/||') + WORKER_UDN_IP=$(get_udn_ip "$WORKER_POD_NAME" "$NAMESPACE") + echo "Worker VM pod: ${WORKER_POD_NAME}" + echo "Worker VM Primary UDN IP: ${WORKER_UDN_IP:-not found}" + + if [ -n "$CP_UDN_IP" ] && [ -n "$WORKER_UDN_IP" ] && [ "$CP_UDN_IP" != "null" ] && [ "$WORKER_UDN_IP" != "null" ]; then + echo "" + echo "Testing L2 connectivity: Control Plane (${CP_UDN_IP}) -> Worker VM (${WORKER_UDN_IP})" + echo "---" + + # Test connectivity from control plane to worker + # Note: kube-apiserver container may not have ping, try with different methods + L2_TEST_RESULT="unknown" + + # Try ping first (may not be available in all containers) + if oc exec -n ${NAMESPACE} ${CP_POD} -c kube-apiserver -- ping -c 3 -W 5 ${WORKER_UDN_IP} 2>/dev/null; then + L2_TEST_RESULT="success" + else + # Try with nc (netcat) if available + if oc exec -n ${NAMESPACE} ${CP_POD} -c kube-apiserver -- nc -zv -w 5 ${WORKER_UDN_IP} 22 2>/dev/null; then + L2_TEST_RESULT="success" + else + # Try with curl timeout to any port + if oc exec -n ${NAMESPACE} ${CP_POD} -c kube-apiserver -- timeout 5 bash -c "echo > /dev/tcp/${WORKER_UDN_IP}/22" 2>/dev/null; then + L2_TEST_RESULT="success" + else + L2_TEST_RESULT="failed" + fi + fi + fi + + echo "" + if [ "$L2_TEST_RESULT" = "success" ]; then + echo "βœ“ L2 connectivity verified!" + echo " Control plane pods and worker VMs can communicate on Primary UDN." + echo " This is the foundation for full isolation." + else + echo "⚠ L2 connectivity test inconclusive." + echo " The test tools may not be available in the container." + echo " Manual verification recommended:" + echo " oc exec -n ${NAMESPACE} ${CP_POD} -c kube-apiserver -- ping -c 3 ${WORKER_UDN_IP}" + fi + + echo "" + echo "Network Summary:" + echo " Primary UDN Subnet: 10.150.0.0/16" + echo " Control Plane IP: ${CP_UDN_IP}" + echo " Worker VM IP: ${WORKER_UDN_IP}" + echo "" + echo "Both are on the same L2 segment - direct communication possible!" + else + echo "" + echo "WARNING: Could not determine UDN IPs for both pods." + echo "L2 connectivity verification skipped." + fi + fi +fi + +echo "" +echo "==========================================" +echo "Cluster Setup Complete" +echo "==========================================" +echo "" +if [ -n "${GUEST_KUBECONFIG}" ] && [ -f "${GUEST_KUBECONFIG}" ]; then + echo "Guest cluster kubeconfig: ${GUEST_KUBECONFIG}" + echo "" + echo "To access the guest cluster:" + echo " export KUBECONFIG=${GUEST_KUBECONFIG}" + echo " oc get nodes" +else + echo "Guest cluster kubeconfig not extracted yet." + echo "To extract it manually:" + echo " oc extract -n clusters secret/${CLUSTER_NAME}-admin-kubeconfig --to=. --confirm" + echo " mv kubeconfig ${CLUSTER_NAME}-kubeconfig" +fi +echo "" + +echo "Rescale nodepool to 2 replicas..." +NODEPOOL_NAME=$CLUSTER_NAME +NODEPOOL_REPLICAS=2 + +oc scale nodepool/$NODEPOOL_NAME --namespace clusters --replicas=$NODEPOOL_REPLICAS + +echo "Waiting for guest nodes to be Ready (required to rescale nodepool)..." +timeout="${WAIT_GUEST_NODES_TIMEOUT_SECONDS}" +elapsed=0 +while true; do + ready_nodes="$( + oc --kubeconfig "${GUEST_KUBECONFIG}" get nodes -o json 2>/dev/null | \ + jq -r '[.items[] | select(any(.status.conditions[]?; .type=="Ready" and .status=="True"))] | length' 2>/dev/null || echo 0 + )" + if [ "${ready_nodes}" -ge "${NODEPOOL_REPLICAS}" ]; then + echo "βœ“ Guest Ready nodes: ${ready_nodes}" + break + fi + if [ "${elapsed}" -ge "${timeout}" ]; then + echo "WARNING: Only ${ready_nodes} Ready guest nodes after ${timeout}s. Skipping internal DNS/OAuth setup for now." + echo "This is expected right after control plane becomes Available=True." + echo "Re-run this script later once nodes join." + goto_after_dns_setup=true + break + fi + echo "Still waiting for guest Ready nodes (have ${ready_nodes}, need >= ${NODEPOOL_REPLICAS})... (${elapsed}/${timeout} seconds)" + sleep 15 + elapsed=$((elapsed + 15)) +done + +echo "Decrease hosted cluster Prometheus retention period to 8 hours to prevent over logging..." +oc --kubeconfig "${CLUSTER_NAME}-kubeconfig" -n openshift-monitoring patch prometheus k8s --type=merge -p '{"spec":{"retention":"8h"}}' + +# oc login --server=https://api.hypershift.qinqon.corp:6443 -u kubeadmin -p SFs6e-gkwGC-H8Uk8-w3TTU diff --git a/cluster-up.sh b/cluster-up.sh new file mode 100755 index 00000000000..e4522bc53f8 --- /dev/null +++ b/cluster-up.sh @@ -0,0 +1,183 @@ +#!/bin/bash + +set -e + +export KUBECONFIG=/root/.kcli/clusters/hypershift/auth/kubeconfig + +echo "" +echo "==========================================" +echo "Building HyperShift Binaries" +echo "==========================================" +echo "" + +# Build all binaries using make +make hypershift product-cli control-plane-operator hypershift-operator + +# Install CLIs +sudo install -m 0755 -o root -g root bin/hypershift /usr/local/bin/hypershift +sudo install -m 0755 -o root -g root bin/hcp /usr/local/bin/hcp + +echo "" +echo "βœ“ Binaries built and installed" +echo "" + +echo "" +echo "==========================================" +echo "Installing HyperShift Operator" +echo "==========================================" +echo "" + +hypershift install \ + --enable-conversion-webhook=true \ + --wait-until-available + +# Patch hypershift-operator deployment to use custom kubevirt CAPI provider image +oc set env deployment/operator -n hypershift \ + IMAGE_KUBEVIRT_CAPI_PROVIDER=quay.io/ramlavi/cluster-api-provider-kubevirt:4.18 + +# Wait for the operator to restart +oc rollout status deployment/operator -n hypershift + +# Force rollout of capi-provider in all hosted cluster namespaces +echo "Rolling out capi-provider in all hosted cluster namespaces..." +for ns in $(oc get ns -l hypershift.openshift.io/hosted-control-plane -o jsonpath='{.items[*].metadata.name}'); do + if oc get deployment capi-provider -n "$ns" &>/dev/null; then + echo " Restarting capi-provider in namespace: $ns" + oc rollout restart deployment/capi-provider -n "$ns" + oc rollout status deployment/capi-provider -n "$ns" --timeout=60s || true + fi +done + +echo "Done! All capi-provider deployments updated." + +echo "" +echo "==========================================" +echo "Building Custom HyperShift Operator Image" +echo "==========================================" +echo "" + +# Build and push custom hypershift-operator with UDN support +export QUAY_REPO_OP="quay.io/ramlavi/hypershift-operator" +export OP_IMAGE_TAG="udn-fix-$(date +%Y%m%d-%H%M%S)" +export OP_IMAGE="${QUAY_REPO_OP}:${OP_IMAGE_TAG}" +export REGISTRY_AUTH_FILE=/root/ralavi/merged-pull-secret.json + +echo "Building container image: ${OP_IMAGE}" +# Explicitly pass auth file to podman build +REGISTRY_AUTH_FILE=/root/ralavi/merged-pull-secret.json podman build . -f Dockerfile -t ${OP_IMAGE} + +echo "" +echo "Pushing to Quay.io..." +# Try to use existing podman auth, fall back to docker config if needed +if [ -f "/run/user/$(id -u)/containers/auth.json" ]; then + podman push --authfile "/run/user/$(id -u)/containers/auth.json" ${OP_IMAGE} +elif [ -f "$HOME/.docker/config.json" ]; then + podman push --authfile "$HOME/.docker/config.json" ${OP_IMAGE} +else + podman push ${OP_IMAGE} +fi + +echo "" +echo "Cleaning up old local images..." +# Remove old local images with the same repo but different tags (keep the one we just built) +podman images --filter "reference=${QUAY_REPO_OP}:udn-fix-*" --format "{{.ID}} {{.Repository}}:{{.Tag}}" | \ + grep -v "${OP_IMAGE_TAG}" | awk '{print $1}' | xargs -r podman rmi -f 2>/dev/null || true + +echo "" +echo "βœ“ Custom operator image pushed: ${OP_IMAGE}" +echo "" + +# Update the running hypershift-operator deployment +echo "Updating hypershift-operator deployment to use new image..." +# Use wildcard to update ALL containers at once (simpler and more reliable) +oc set image deployment/operator -n hypershift "*=${OP_IMAGE}" +oc rollout status deployment/operator -n hypershift + +echo "" +echo "βœ“ HyperShift operator updated with UDN fix!" +echo "" + +echo "" +echo "==========================================" +echo "Building Custom Control Plane Operator Image" +echo "==========================================" +echo "" + +# Build and push custom control-plane-operator with UDN support +export QUAY_REPO="quay.io/ramlavi/hypershift-control-plane-operator" +export CPO_IMAGE_TAG="udn-fix-$(date +%Y%m%d-%H%M%S)" +export CPO_IMAGE="${QUAY_REPO}:${CPO_IMAGE_TAG}" + +echo "Building container image: ${CPO_IMAGE}" +# Explicitly pass auth file to podman build +REGISTRY_AUTH_FILE=/root/ralavi/merged-pull-secret.json podman build -f Dockerfile.control-plane -t ${CPO_IMAGE} . + +echo "" +echo "Pushing to Quay.io..." +# Try to use existing podman auth, fall back to docker config if needed +if [ -f "/run/user/$(id -u)/containers/auth.json" ]; then + podman push --authfile "/run/user/$(id -u)/containers/auth.json" ${CPO_IMAGE} +elif [ -f "$HOME/.docker/config.json" ]; then + podman push --authfile "$HOME/.docker/config.json" ${CPO_IMAGE} +else +podman push ${CPO_IMAGE} +fi + +echo "" +echo "Cleaning up old local images..." +# Remove old local images with the same repo but different tags (keep the one we just built) +podman images --filter "reference=${QUAY_REPO}:udn-fix-*" --format "{{.ID}} {{.Repository}}:{{.Tag}}" | \ + grep -v "${CPO_IMAGE_TAG}" | awk '{print $1}' | xargs -r podman rmi -f 2>/dev/null || true + +echo "" +echo "βœ“ Custom CPO image pushed: ${CPO_IMAGE}" +echo "" + +# Save the image reference for cluster-sync.sh to use +echo "${CPO_IMAGE}" > ${PWD}/.cpo-image +echo "Image reference saved to .cpo-image" + +echo "" +echo "==========================================" +echo "Updating Existing Clusters" +echo "" + +# Apply the CPO image annotation to all existing HostedClusters +# This tells the hypershift-operator to use our custom CPO image +echo "Checking for existing hosted clusters..." +for hc in $(oc get hostedcluster --all-namespaces -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name}{"\n"}{end}'); do + namespace=$(echo "$hc" | cut -d'/' -f1) + name=$(echo "$hc" | cut -d'/' -f2) + + echo " Updating HostedCluster: $namespace/$name" + oc annotate hostedcluster "$name" -n "$namespace" \ + hypershift.openshift.io/control-plane-operator-image="${CPO_IMAGE}" \ + --overwrite + + # The hypershift-operator will automatically update the CPO deployment when it sees the annotation change + # But we can speed things up by restarting the CPO pod + if oc get pod -n "clusters-${name}" -l app=control-plane-operator &>/dev/null; then + echo " Restarting control-plane-operator pod..." + oc delete pod -n "clusters-${name}" -l app=control-plane-operator + echo " Waiting for new pod to become ready..." + oc wait --for=condition=ready pod -n "clusters-${name}" -l app=control-plane-operator --timeout=120s || echo " Warning: CPO pod not ready yet, continuing..." + fi +done + +echo "" +echo "βœ“ All hosted clusters updated to use new CPO image: ${CPO_IMAGE}" +echo "" + +echo "" +echo "==========================================" +echo "Cleaning Up Dangling Images" +echo "==========================================" +echo "" + +# Remove dangling images (intermediate build layers with tag) +echo "Removing dangling/intermediate build images..." +podman image prune -f + +echo "" +echo "βœ“ Cleanup complete!" +echo "" diff --git a/cmd/cluster/kubevirt/create.go b/cmd/cluster/kubevirt/create.go index 0382f6eda4a..dd2a061d4ea 100644 --- a/cmd/cluster/kubevirt/create.go +++ b/cmd/cluster/kubevirt/create.go @@ -3,6 +3,7 @@ package kubevirt import ( "context" "fmt" + "net/netip" "os" "strings" @@ -37,6 +38,8 @@ func bindCoreOptions(opts *RawCreateOptions, flags *pflag.FlagSet) { flags.StringVar(&opts.InfraNamespace, "infra-namespace", opts.InfraNamespace, "The namespace in the external infra cluster that is used to host the KubeVirt virtual machines. The namespace must exist prior to creating the HostedCluster") flags.StringArrayVar(&opts.InfraStorageClassMappings, "infra-storage-class-mapping", opts.InfraStorageClassMappings, "KubeVirt CSI mapping of an infra StorageClass to a guest cluster StorageCluster. Mapping is structured as /. Example, mapping the infra storage class ocs-storagecluster-ceph-rbd to a guest storage class called ceph-rdb. --infra-storage-class-mapping=ocs-storagecluster-ceph-rbd/ceph-rdb. Group storage classes and volumesnapshot classes by adding ,group=") flags.StringArrayVar(&opts.InfraVolumeSnapshotClassMappings, "infra-volumesnapshot-class-mapping", opts.InfraVolumeSnapshotClassMappings, "KubeVirt CSI mapping of an infra VolumeSnapshotClass to a guest cluster VolumeSnapshotCluster. Mapping is structured as /. Example, mapping the infra volume snapshot class ocs-storagecluster-rbd-snap to a guest volume snapshot class called rdb-snap. --infra-volumesnapshot-class-mapping=ocs-storagecluster-rbd-snap/rdb-snap. Group storage classes and volumesnapshot classes by adding ,group=") + flags.StringVar(&opts.PrimaryUDNName, "primary-udn-name", opts.PrimaryUDNName, "Enable Primary UDN for the hosted control plane namespace by specifying the UserDefinedNetwork name (KubeVirt)") + flags.StringVar(&opts.PrimaryUDNSubnet, "primary-udn-subnet", opts.PrimaryUDNSubnet, "Subnet CIDR for the Primary UDN to create/ensure (e.g. 10.150.0.0/16). Required when --primary-udn-name is set") } func BindDeveloperOptions(opts *RawCreateOptions, flags *pflag.FlagSet) { @@ -55,10 +58,33 @@ type RawCreateOptions struct { InfraNamespace string InfraStorageClassMappings []string InfraVolumeSnapshotClassMappings []string + PrimaryUDNName string + PrimaryUDNSubnet string NodePoolOpts *kubevirtnodepool.RawKubevirtPlatformCreateOptions } +func shouldDefaultPrimaryUDNForCIConformanceJob(createOpts *core.CreateOptions) bool { + // This is intentionally narrow-scoped. It's meant as a temporary spike mechanism to + // enable Primary UDN in a single CI lane without changing openshift/release. + if os.Getenv("OPENSHIFT_CI") != "true" { + return false + } + jobName := os.Getenv("JOB_NAME") + if jobName == "" { + return false + } + // HyperShift CI lanes that should default to Primary UDN. + if !strings.Contains(jobName, "e2e-kubevirt-metal-conformance") && + !strings.Contains(jobName, "e2e-azure-kubevirt-ovn") { + return false + } + if createOpts == nil || createOpts.Name == "" { + return false + } + return true +} + // validatedCreateOptions is a private wrapper that enforces a call of Validate() before Complete() can be invoked. type validatedCreateOptions struct { *RawCreateOptions @@ -107,6 +133,23 @@ func (o *RawCreateOptions) Validate(ctx context.Context, opts *core.CreateOption return nil, fmt.Errorf("external infra cluster kubeconfig was provided but an infra namespace is missing") } + // CI spike: default Primary UDN for a specific conformance lane only when the user didn't + // provide any Primary UDN configuration. + if o.PrimaryUDNName == "" && o.PrimaryUDNSubnet == "" && shouldDefaultPrimaryUDNForCIConformanceJob(opts) { + o.PrimaryUDNName = fmt.Sprintf("hcp-%s", opts.Name) + o.PrimaryUDNSubnet = "10.150.0.0/16" + } + + // Primary UDN enablement is atomic: either both name+subnet are set, or neither. + if o.PrimaryUDNName != "" || o.PrimaryUDNSubnet != "" { + if o.PrimaryUDNName == "" || o.PrimaryUDNSubnet == "" { + return nil, fmt.Errorf("--primary-udn-name and --primary-udn-subnet must be set together") + } + if _, err := netip.ParsePrefix(o.PrimaryUDNSubnet); err != nil { + return nil, fmt.Errorf("invalid --primary-udn-subnet %q: %w", o.PrimaryUDNSubnet, err) + } + } + validOpts := &ValidatedCreateOptions{ validatedCreateOptions: &validatedCreateOptions{ RawCreateOptions: o, @@ -187,6 +230,14 @@ func (o *CreateOptions) ApplyPlatformSpecifics(cluster *hyperv1.HostedCluster) e Kubevirt: &hyperv1.KubevirtPlatformSpec{}, } + if o.PrimaryUDNName != "" { + if cluster.Annotations == nil { + cluster.Annotations = map[string]string{} + } + cluster.Annotations[hyperv1.PrimaryUDNNameAnnotation] = o.PrimaryUDNName + cluster.Annotations[hyperv1.PrimaryUDNSubnetAnnotation] = o.PrimaryUDNSubnet + } + if len(o.InfraKubeConfigFile) > 0 { cluster.Spec.Platform.Kubevirt.Credentials = &hyperv1.KubevirtPlatformCredentials{ InfraKubeConfigSecret: &hyperv1.KubeconfigSecretRef{ diff --git a/cmd/install/assets/hypershift_operator.go b/cmd/install/assets/hypershift_operator.go index 65520093ff2..cede0270253 100644 --- a/cmd/install/assets/hypershift_operator.go +++ b/cmd/install/assets/hypershift_operator.go @@ -1391,16 +1391,26 @@ func (o HyperShiftOperatorClusterRole) Build() *rbacv1.ClusterRole { Verbs: []string{rbacv1.VerbAll}, }, { - APIGroups: []string{"admissionregistration.k8s.io"}, - Resources: []string{"validatingwebhookconfigurations"}, - Verbs: []string{"delete"}, - ResourceNames: []string{hyperv1.GroupVersion.Group}, - }, - { - APIGroups: []string{"certificates.k8s.io"}, - Resources: []string{"certificatesigningrequests"}, - Verbs: []string{"get", "list", "watch"}, + APIGroups: []string{"k8s.ovn.org"}, + Resources: []string{"userdefinednetworks"}, + Verbs: []string{"get", "list", "watch", "create", "update", "patch", "delete"}, }, + { + APIGroups: []string{"admissionregistration.k8s.io"}, + Resources: []string{"validatingwebhookconfigurations"}, + Verbs: []string{"delete"}, + ResourceNames: []string{hyperv1.GroupVersion.Group}, + }, + { + APIGroups: []string{"admissionregistration.k8s.io"}, + Resources: []string{"validatingadmissionpolicies", "validatingadmissionpolicybindings"}, + Verbs: []string{"get", "list", "watch", "create", "update", "patch", "delete"}, + }, + { + APIGroups: []string{"certificates.k8s.io"}, + Resources: []string{"certificatesigningrequests"}, + Verbs: []string{"get", "list", "watch"}, + }, { APIGroups: []string{"certificates.k8s.io"}, Resources: []string{"certificatesigningrequests/status"}, diff --git a/control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.go b/control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.go index df118a065fe..ab00f623e45 100644 --- a/control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.go +++ b/control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.go @@ -43,6 +43,7 @@ import ( cvov2 "github.com/openshift/hypershift/control-plane-operator/controllers/hostedcontrolplane/v2/cvo" dnsoperatorv2 "github.com/openshift/hypershift/control-plane-operator/controllers/hostedcontrolplane/v2/dnsoperator" endpointresolverv2 "github.com/openshift/hypershift/control-plane-operator/controllers/hostedcontrolplane/v2/endpoint_resolver" + etcdudn "github.com/openshift/hypershift/control-plane-operator/hostedclusterconfigoperator/controllers/etcd" etcdv2 "github.com/openshift/hypershift/control-plane-operator/controllers/hostedcontrolplane/v2/etcd" fgv2 "github.com/openshift/hypershift/control-plane-operator/controllers/hostedcontrolplane/v2/fg" ignitionserverv2 "github.com/openshift/hypershift/control-plane-operator/controllers/hostedcontrolplane/v2/ignitionserver" @@ -1196,6 +1197,17 @@ func (r *HostedControlPlaneReconciler) reconcileCPOV2(ctx context.Context, hcp * } } + // On Primary UDN namespaces, the default etcd-client/etcd-discovery EndpointSlices + // carry default-network IPs that are unreachable from UDN pods. Create UDN-annotated + // EndpointSlices and orphan the defaults so kube-apiserver can reach etcd. + // This MUST run in the CPO (not the HCCO) because the HCCO is only deployed after + // kube-apiserver is available, creating a chicken-and-egg problem. + if hcp.Spec.Platform.Type == hyperv1.KubevirtPlatform && hcp.Annotations["hypershift.openshift.io/primary-udn"] == "true" { + if err := etcdudn.ReconcileUDNEndpointSlices(ctx, r.Client, upsert.New(r.EnableCIDebugOutput), hcp.Namespace); err != nil { + errs = append(errs, fmt.Errorf("failed to reconcile etcd UDN EndpointSlices: %w", err)) + } + } + return utilerrors.NewAggregate(errs) } diff --git a/control-plane-operator/controllers/hostedcontrolplane/infra/infra.go b/control-plane-operator/controllers/hostedcontrolplane/infra/infra.go index 78efcd80a0d..ea5533dc523 100644 --- a/control-plane-operator/controllers/hostedcontrolplane/infra/infra.go +++ b/control-plane-operator/controllers/hostedcontrolplane/infra/infra.go @@ -544,6 +544,18 @@ func (r *Reconciler) reconcileKonnectivityServiceStatus(ctx context.Context, hcp err = fmt.Errorf("failed to get konnectivity service: %w", err) return } + + // For KubeVirt with Primary UDN, use internal service endpoint + // Worker VMs on Primary UDN cannot reach external routes + if hcp.Spec.Platform.Type == hyperv1.KubevirtPlatform { + if hcp.Annotations != nil && hcp.Annotations["hypershift.openshift.io/primary-udn"] == "true" { + // Primary UDN detected - use internal ClusterIP service endpoint + host = fmt.Sprintf("konnectivity-server.%s.svc.cluster.local", hcp.Namespace) + port = 8091 + return + } + } + var route *routev1.Route if serviceStrategy.Type == hyperv1.Route { route = manifests.KonnectivityServerRoute(hcp.Namespace) diff --git a/control-plane-operator/controllers/hostedcontrolplane/testdata/hosted-cluster-config-operator/GCP/zz_fixture_TestControlPlaneComponents_hosted_cluster_config_operator_role.yaml b/control-plane-operator/controllers/hostedcontrolplane/testdata/hosted-cluster-config-operator/GCP/zz_fixture_TestControlPlaneComponents_hosted_cluster_config_operator_role.yaml index 8b606059f31..2491604177f 100644 --- a/control-plane-operator/controllers/hostedcontrolplane/testdata/hosted-cluster-config-operator/GCP/zz_fixture_TestControlPlaneComponents_hosted_cluster_config_operator_role.yaml +++ b/control-plane-operator/controllers/hostedcontrolplane/testdata/hosted-cluster-config-operator/GCP/zz_fixture_TestControlPlaneComponents_hosted_cluster_config_operator_role.yaml @@ -53,6 +53,8 @@ rules: - get - list - watch + - patch + - update - apiGroups: - hypershift.openshift.io resources: @@ -89,6 +91,18 @@ rules: - watch - patch - update +- apiGroups: + - discovery.k8s.io + resources: + - endpointslices + verbs: + - get + - list + - watch + - create + - patch + - update + - delete - apiGroups: - cluster.x-k8s.io resources: diff --git a/control-plane-operator/controllers/hostedcontrolplane/testdata/hosted-cluster-config-operator/IBMCloud/zz_fixture_TestControlPlaneComponents_hosted_cluster_config_operator_role.yaml b/control-plane-operator/controllers/hostedcontrolplane/testdata/hosted-cluster-config-operator/IBMCloud/zz_fixture_TestControlPlaneComponents_hosted_cluster_config_operator_role.yaml index 8b606059f31..2491604177f 100644 --- a/control-plane-operator/controllers/hostedcontrolplane/testdata/hosted-cluster-config-operator/IBMCloud/zz_fixture_TestControlPlaneComponents_hosted_cluster_config_operator_role.yaml +++ b/control-plane-operator/controllers/hostedcontrolplane/testdata/hosted-cluster-config-operator/IBMCloud/zz_fixture_TestControlPlaneComponents_hosted_cluster_config_operator_role.yaml @@ -53,6 +53,8 @@ rules: - get - list - watch + - patch + - update - apiGroups: - hypershift.openshift.io resources: @@ -89,6 +91,18 @@ rules: - watch - patch - update +- apiGroups: + - discovery.k8s.io + resources: + - endpointslices + verbs: + - get + - list + - watch + - create + - patch + - update + - delete - apiGroups: - cluster.x-k8s.io resources: diff --git a/control-plane-operator/controllers/hostedcontrolplane/testdata/hosted-cluster-config-operator/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_hosted_cluster_config_operator_role.yaml b/control-plane-operator/controllers/hostedcontrolplane/testdata/hosted-cluster-config-operator/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_hosted_cluster_config_operator_role.yaml index 8b606059f31..2491604177f 100644 --- a/control-plane-operator/controllers/hostedcontrolplane/testdata/hosted-cluster-config-operator/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_hosted_cluster_config_operator_role.yaml +++ b/control-plane-operator/controllers/hostedcontrolplane/testdata/hosted-cluster-config-operator/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_hosted_cluster_config_operator_role.yaml @@ -53,6 +53,8 @@ rules: - get - list - watch + - patch + - update - apiGroups: - hypershift.openshift.io resources: @@ -89,6 +91,18 @@ rules: - watch - patch - update +- apiGroups: + - discovery.k8s.io + resources: + - endpointslices + verbs: + - get + - list + - watch + - create + - patch + - update + - delete - apiGroups: - cluster.x-k8s.io resources: diff --git a/control-plane-operator/controllers/hostedcontrolplane/testdata/hosted-cluster-config-operator/zz_fixture_TestControlPlaneComponents_hosted_cluster_config_operator_role.yaml b/control-plane-operator/controllers/hostedcontrolplane/testdata/hosted-cluster-config-operator/zz_fixture_TestControlPlaneComponents_hosted_cluster_config_operator_role.yaml index 8b606059f31..2491604177f 100644 --- a/control-plane-operator/controllers/hostedcontrolplane/testdata/hosted-cluster-config-operator/zz_fixture_TestControlPlaneComponents_hosted_cluster_config_operator_role.yaml +++ b/control-plane-operator/controllers/hostedcontrolplane/testdata/hosted-cluster-config-operator/zz_fixture_TestControlPlaneComponents_hosted_cluster_config_operator_role.yaml @@ -53,6 +53,8 @@ rules: - get - list - watch + - patch + - update - apiGroups: - hypershift.openshift.io resources: @@ -89,6 +91,18 @@ rules: - watch - patch - update +- apiGroups: + - discovery.k8s.io + resources: + - endpointslices + verbs: + - get + - list + - watch + - create + - patch + - update + - delete - apiGroups: - cluster.x-k8s.io resources: diff --git a/control-plane-operator/controllers/hostedcontrolplane/v2/assets/hosted-cluster-config-operator/role.yaml b/control-plane-operator/controllers/hostedcontrolplane/v2/assets/hosted-cluster-config-operator/role.yaml index 35f3de2cd9a..c23e8e40c99 100644 --- a/control-plane-operator/controllers/hostedcontrolplane/v2/assets/hosted-cluster-config-operator/role.yaml +++ b/control-plane-operator/controllers/hostedcontrolplane/v2/assets/hosted-cluster-config-operator/role.yaml @@ -45,6 +45,8 @@ rules: - get - list - watch + - patch + - update - apiGroups: - hypershift.openshift.io resources: @@ -81,6 +83,18 @@ rules: - watch - patch - update +- apiGroups: + - discovery.k8s.io + resources: + - endpointslices + verbs: + - get + - list + - watch + - create + - patch + - update + - delete - apiGroups: - cluster.x-k8s.io resources: diff --git a/control-plane-operator/controllers/hostedcontrolplane/v2/assets/ignition-server-proxy/deployment.yaml b/control-plane-operator/controllers/hostedcontrolplane/v2/assets/ignition-server-proxy/deployment.yaml index 06e866ddc15..f740cd9fc55 100644 --- a/control-plane-operator/controllers/hostedcontrolplane/v2/assets/ignition-server-proxy/deployment.yaml +++ b/control-plane-operator/controllers/hostedcontrolplane/v2/assets/ignition-server-proxy/deployment.yaml @@ -64,7 +64,7 @@ spec: - name: serving-cert secret: defaultMode: 416 - secretName: ignition-server-serving-cert + secretName: ignition-server - configMap: defaultMode: 420 name: root-ca diff --git a/control-plane-operator/controllers/hostedcontrolplane/v2/etcd/statefulset.go b/control-plane-operator/controllers/hostedcontrolplane/v2/etcd/statefulset.go index 0fe9d4bd5e6..da3ea1b95e5 100644 --- a/control-plane-operator/controllers/hostedcontrolplane/v2/etcd/statefulset.go +++ b/control-plane-operator/controllers/hostedcontrolplane/v2/etcd/statefulset.go @@ -39,18 +39,45 @@ func adaptStatefulSet(cpContext component.WorkloadContext, sts *appsv1.StatefulS }, ) + // For KubeVirt platform, use 0.0.0.0 to listen on all interfaces + // This is required for Primary UDN support where pods have multiple IPs + // and the default route goes through the UDN network + isKubeVirt := hcp.Spec.Platform.Type == hyperv1.KubevirtPlatform + if !ipv4 { - util.UpsertEnvVar(c, corev1.EnvVar{ - Name: "ETCD_LISTEN_PEER_URLS", - Value: "https://[$(POD_IP)]:2380", - }) + if isKubeVirt { + // IPv6 KubeVirt: listen on all interfaces + // Note: [::] includes localhost, so no need to specify it separately + util.UpsertEnvVar(c, corev1.EnvVar{ + Name: "ETCD_LISTEN_CLIENT_URLS", + Value: "https://[::]:2379", + }) + util.UpsertEnvVar(c, corev1.EnvVar{ + Name: "ETCD_LISTEN_METRICS_URLS", + Value: "https://[::]:2382", + }) + } else { + // IPv6 non-KubeVirt: use POD_IP + util.UpsertEnvVar(c, corev1.EnvVar{ + Name: "ETCD_LISTEN_PEER_URLS", + Value: "https://[$(POD_IP)]:2380", + }) + util.UpsertEnvVar(c, corev1.EnvVar{ + Name: "ETCD_LISTEN_CLIENT_URLS", + Value: "https://[$(POD_IP)]:2379,https://localhost:2379", + }) + util.UpsertEnvVar(c, corev1.EnvVar{ + Name: "ETCD_LISTEN_METRICS_URLS", + Value: "https://[::]:2382", + }) + } + } else if isKubeVirt { + // IPv4 KubeVirt: listen on all interfaces + // Note: 0.0.0.0 includes localhost (127.0.0.1), so no need to specify it separately + // Adding localhost would cause "bind: address already in use" error util.UpsertEnvVar(c, corev1.EnvVar{ Name: "ETCD_LISTEN_CLIENT_URLS", - Value: "https://[$(POD_IP)]:2379,https://localhost:2379", - }) - util.UpsertEnvVar(c, corev1.EnvVar{ - Name: "ETCD_LISTEN_METRICS_URLS", - Value: "https://[::]:2382", + Value: "https://0.0.0.0:2379", }) } }) diff --git a/control-plane-operator/hostedclusterconfigoperator/cmd.go b/control-plane-operator/hostedclusterconfigoperator/cmd.go index d5a34ea14fb..3360365db4c 100644 --- a/control-plane-operator/hostedclusterconfigoperator/cmd.go +++ b/control-plane-operator/hostedclusterconfigoperator/cmd.go @@ -23,12 +23,14 @@ import ( "github.com/openshift/hypershift/control-plane-operator/hostedclusterconfigoperator/configmetrics" "github.com/openshift/hypershift/control-plane-operator/hostedclusterconfigoperator/controllers/cmca" "github.com/openshift/hypershift/control-plane-operator/hostedclusterconfigoperator/controllers/drainer" + "github.com/openshift/hypershift/control-plane-operator/hostedclusterconfigoperator/controllers/etcd" "github.com/openshift/hypershift/control-plane-operator/hostedclusterconfigoperator/controllers/globalps" "github.com/openshift/hypershift/control-plane-operator/hostedclusterconfigoperator/controllers/hcpstatus" "github.com/openshift/hypershift/control-plane-operator/hostedclusterconfigoperator/controllers/inplaceupgrader" "github.com/openshift/hypershift/control-plane-operator/hostedclusterconfigoperator/controllers/machine" "github.com/openshift/hypershift/control-plane-operator/hostedclusterconfigoperator/controllers/node" "github.com/openshift/hypershift/control-plane-operator/hostedclusterconfigoperator/controllers/nodecount" + "github.com/openshift/hypershift/control-plane-operator/hostedclusterconfigoperator/controllers/primaryudn" "github.com/openshift/hypershift/control-plane-operator/hostedclusterconfigoperator/controllers/resources" "github.com/openshift/hypershift/control-plane-operator/hostedclusterconfigoperator/controllers/spotremediation" "github.com/openshift/hypershift/control-plane-operator/hostedclusterconfigoperator/operator" @@ -67,6 +69,8 @@ var controllerFuncs = map[string]operator.ControllerSetupFunc{ "node": node.Setup, nodecount.ControllerName: nodecount.Setup, "machine": machine.Setup, + etcd.ControllerName: etcd.Setup, + primaryudn.ControllerName: primaryudn.Setup, "drainer": drainer.Setup, hcpstatus.ControllerName: hcpstatus.Setup, spotremediation.ControllerName: spotremediation.Setup, diff --git a/control-plane-operator/hostedclusterconfigoperator/controllers/etcd/endpointslices.go b/control-plane-operator/hostedclusterconfigoperator/controllers/etcd/endpointslices.go new file mode 100644 index 00000000000..cf385bb499c --- /dev/null +++ b/control-plane-operator/hostedclusterconfigoperator/controllers/etcd/endpointslices.go @@ -0,0 +1,333 @@ +package etcd + +import ( + "context" + "encoding/json" + "fmt" + "net/netip" + "strings" + "time" + + hyperv1 "github.com/openshift/hypershift/api/hypershift/v1beta1" + "github.com/openshift/hypershift/support/upsert" + + corev1 "k8s.io/api/core/v1" + discoveryv1 "k8s.io/api/discovery/v1" + apierrors "k8s.io/apimachinery/pkg/api/errors" + metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" + "k8s.io/apimachinery/pkg/types" + "k8s.io/utils/ptr" + ctrl "sigs.k8s.io/controller-runtime" + "sigs.k8s.io/controller-runtime/pkg/client" + "sigs.k8s.io/controller-runtime/pkg/log" + "sigs.k8s.io/controller-runtime/pkg/predicate" + "sigs.k8s.io/controller-runtime/pkg/reconcile" +) + +const ( + managedByValue = "control-plane-operator.hypershift.openshift.io" + ovnPodNetworksAnnotationKey = "k8s.ovn.org/pod-networks" + ovnPrimaryRole = "primary" + ovnUDNServiceNameLabelKey = "k8s.ovn.org/service-name" + ovnUDNEndpointSliceNetworkAnnoKey = "k8s.ovn.org/endpointslice-network" + + etcdClientServiceName = "etcd-client" + etcdDiscoveryServiceName = "etcd-discovery" +) + +type podNetworksAnnotationEntry struct { + Role string `json:"role"` + IPAddress string `json:"ip_address"` +} + +type reconciler struct { + client client.Client + hcpKey types.NamespacedName + upsert.CreateOrUpdateProvider +} + +func (r *reconciler) Reconcile(ctx context.Context, req reconcile.Request) (reconcile.Result, error) { + logger := log.FromContext(ctx).WithValues("controller", ControllerName, "hcp", r.hcpKey.String()) + + hcp := &hyperv1.HostedControlPlane{} + if err := r.client.Get(ctx, r.hcpKey, hcp); err != nil { + if apierrors.IsNotFound(err) { + return ctrl.Result{}, nil + } + return ctrl.Result{}, err + } + if hcp.Spec.Platform.Type != hyperv1.KubevirtPlatform { + return ctrl.Result{}, nil + } + + etcdPods, err := r.listEtcdPods(ctx) + if err != nil { + return ctrl.Result{}, err + } + if len(etcdPods) == 0 { + return ctrl.Result{}, nil + } + + udnNetworkName, endpointsByFamily, ok, shouldRequeue := desiredEtcdEndpointsFromPods(etcdPods) + if !ok { + if shouldRequeue { + logger.V(1).Info("etcd pods found but primary UDN network not detected yet, requeueing") + return ctrl.Result{RequeueAfter: 10 * time.Second}, nil + } + return ctrl.Result{}, nil + } + + for _, svcName := range []string{etcdClientServiceName, etcdDiscoveryServiceName} { + if err := r.reconcileUDNEndpointSlicesForService(ctx, svcName, udnNetworkName, endpointsByFamily); err != nil { + return ctrl.Result{}, err + } + } + + return ctrl.Result{}, nil +} + +func (r *reconciler) listEtcdPods(ctx context.Context) ([]corev1.Pod, error) { + pods := &corev1.PodList{} + if err := r.client.List(ctx, pods, client.InNamespace(r.hcpKey.Namespace), client.MatchingLabels{"app": "etcd"}); err != nil { + return nil, err + } + return pods.Items, nil +} + +func (r *reconciler) reconcileUDNEndpointSlicesForService(ctx context.Context, serviceName, udnNetworkName string, endpointsByFamily map[discoveryv1.AddressType][]discoveryv1.Endpoint) error { + svc := &corev1.Service{} + if err := r.client.Get(ctx, types.NamespacedName{Namespace: r.hcpKey.Namespace, Name: serviceName}, svc); err != nil { + if apierrors.IsNotFound(err) { + return nil + } + return err + } + + ports := serviceEndpointPorts(svc) + + for addressType, endpoints := range endpointsByFamily { + // Only create slices that have endpoints for this family. + if len(endpoints) == 0 { + continue + } + + name := fmt.Sprintf("%s-udn-%s", serviceName, strings.ToLower(string(addressType))) + es := &discoveryv1.EndpointSlice{ObjectMeta: metav1.ObjectMeta{ + Namespace: r.hcpKey.Namespace, + Name: name, + }} + + _, err := r.CreateOrUpdate(ctx, r.client, es, func() error { + es.Labels = ensureStringMap(es.Labels) + es.Annotations = ensureStringMap(es.Annotations) + + es.Labels[discoveryv1.LabelManagedBy] = managedByValue + es.Labels[discoveryv1.LabelServiceName] = svc.Name + es.Labels[ovnUDNServiceNameLabelKey] = svc.Name + es.Annotations[ovnUDNEndpointSliceNetworkAnnoKey] = udnNetworkName + + es.AddressType = addressType + es.Ports = ports + es.Endpoints = endpoints + return nil + }) + if err != nil { + return err + } + } + + return nil +} + +func desiredEtcdEndpointsFromPods(pods []corev1.Pod) (udnNetworkName string, endpointsByFamily map[discoveryv1.AddressType][]discoveryv1.Endpoint, ok bool, shouldRequeue bool) { + endpointsByFamily = map[discoveryv1.AddressType][]discoveryv1.Endpoint{ + discoveryv1.AddressTypeIPv4: {}, + discoveryv1.AddressTypeIPv6: {}, + } + + anyAnnotation := false + anyParsed := false + + for i := range pods { + pod := pods[i] + networkName, ip, isPrimaryUDN, hasAnnotation, parsed := primaryUDNInfoFromPodNetworks(&pod) + anyAnnotation = anyAnnotation || hasAnnotation + anyParsed = anyParsed || parsed + + if !isPrimaryUDN || ip == "" { + continue + } + if udnNetworkName == "" { + udnNetworkName = networkName + } + + addr, err := netip.ParseAddr(ip) + if err != nil { + continue + } + + addressType := discoveryv1.AddressTypeIPv4 + if addr.Is6() { + addressType = discoveryv1.AddressTypeIPv6 + } + + ep := discoveryv1.Endpoint{ + Addresses: []string{ip}, + Conditions: discoveryv1.EndpointConditions{ + Ready: ptr.To(podReady(&pod)), + Serving: ptr.To(podReady(&pod)), + }, + TargetRef: &corev1.ObjectReference{ + APIVersion: "v1", + Kind: "Pod", + Namespace: pod.Namespace, + Name: pod.Name, + UID: pod.UID, + }, + } + if pod.Spec.Hostname != "" { + ep.Hostname = ptr.To(pod.Spec.Hostname) + } + endpointsByFamily[addressType] = append(endpointsByFamily[addressType], ep) + } + + if udnNetworkName != "" { + return udnNetworkName, endpointsByFamily, true, false + } + + // If we can parse pod-networks annotations and they don't indicate Primary UDN, we're in a non-UDN namespace. + if anyAnnotation && anyParsed { + return "", endpointsByFamily, false, false + } + + // Pods exist but annotations aren't available yet; retry a few times to avoid races early in rollout. + return "", endpointsByFamily, false, true +} + +func primaryUDNInfoFromPodNetworks(pod *corev1.Pod) (networkName, ip string, isPrimaryUDN, hasAnnotation, parsed bool) { + raw := "" + if pod.Annotations != nil { + raw = pod.Annotations[ovnPodNetworksAnnotationKey] + } + if raw == "" { + return "", "", false, false, false + } + + hasAnnotation = true + m := map[string]podNetworksAnnotationEntry{} + if err := json.Unmarshal([]byte(raw), &m); err != nil { + return "", "", false, true, false + } + parsed = true + + for networkKey, v := range m { + if v.Role != ovnPrimaryRole { + continue + } + if networkKey == "default" { + continue + } + ip = strings.SplitN(v.IPAddress, "/", 2)[0] + if ip == "" { + continue + } + return strings.ReplaceAll(networkKey, "/", "_"), ip, true, true, true + } + return "", "", false, true, true +} + +func podReady(pod *corev1.Pod) bool { + for i := range pod.Status.Conditions { + c := pod.Status.Conditions[i] + if c.Type == corev1.PodReady { + return c.Status == corev1.ConditionTrue + } + } + return false +} + +func serviceEndpointPorts(svc *corev1.Service) []discoveryv1.EndpointPort { + out := make([]discoveryv1.EndpointPort, 0, len(svc.Spec.Ports)) + for i := range svc.Spec.Ports { + p := svc.Spec.Ports[i] + out = append(out, discoveryv1.EndpointPort{ + Name: ptr.To(p.Name), + Protocol: ptr.To(p.Protocol), + Port: ptr.To(p.Port), + }) + } + return out +} + +func ensureStringMap(in map[string]string) map[string]string { + if in == nil { + return map[string]string{} + } + return in +} + +func ReconcileUDNEndpointSlices(ctx context.Context, c client.Client, createOrUpdateProvider upsert.CreateOrUpdateProvider, namespace string) error { + pods := &corev1.PodList{} + if err := c.List(ctx, pods, client.InNamespace(namespace), client.MatchingLabels{"app": "etcd"}); err != nil { + return err + } + if len(pods.Items) == 0 { + return nil + } + + udnNetworkName, endpointsByFamily, ok, _ := desiredEtcdEndpointsFromPods(pods.Items) + if !ok { + return nil + } + + r := &reconciler{ + client: c, + hcpKey: types.NamespacedName{Namespace: namespace}, + CreateOrUpdateProvider: createOrUpdateProvider, + } + + for _, svcName := range []string{etcdClientServiceName, etcdDiscoveryServiceName} { + if err := r.reconcileUDNEndpointSlicesForService(ctx, svcName, udnNetworkName, endpointsByFamily); err != nil { + return err + } + } + return nil +} + +func isEtcdPodInNamespace(namespace string) predicate.Predicate { + return predicate.NewPredicateFuncs(func(obj client.Object) bool { + return obj.GetNamespace() == namespace && obj.GetLabels()["app"] == "etcd" + }) +} + +func isEtcdServiceInNamespace(namespace string) predicate.Predicate { + return predicate.NewPredicateFuncs(func(obj client.Object) bool { + if obj.GetNamespace() != namespace { + return false + } + switch obj.GetName() { + case etcdClientServiceName, etcdDiscoveryServiceName: + return true + default: + return false + } + }) +} + +func isEtcdEndpointSliceInNamespace(namespace string) predicate.Predicate { + return predicate.NewPredicateFuncs(func(obj client.Object) bool { + if obj.GetNamespace() != namespace { + return false + } + labels := obj.GetLabels() + if labels == nil { + return false + } + switch labels[discoveryv1.LabelServiceName] { + case etcdClientServiceName, etcdDiscoveryServiceName: + return true + default: + return false + } + }) +} diff --git a/control-plane-operator/hostedclusterconfigoperator/controllers/etcd/endpointslices_test.go b/control-plane-operator/hostedclusterconfigoperator/controllers/etcd/endpointslices_test.go new file mode 100644 index 00000000000..bac861dc8aa --- /dev/null +++ b/control-plane-operator/hostedclusterconfigoperator/controllers/etcd/endpointslices_test.go @@ -0,0 +1,151 @@ +package etcd + +import ( + "context" + "testing" + + hyperv1 "github.com/openshift/hypershift/api/hypershift/v1beta1" + "github.com/openshift/hypershift/control-plane-operator/hostedclusterconfigoperator/api" + "github.com/openshift/hypershift/support/upsert" + + corev1 "k8s.io/api/core/v1" + discoveryv1 "k8s.io/api/discovery/v1" + metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" + "k8s.io/apimachinery/pkg/types" + "sigs.k8s.io/controller-runtime/pkg/client" + "sigs.k8s.io/controller-runtime/pkg/client/fake" + "sigs.k8s.io/controller-runtime/pkg/reconcile" +) + +func TestReconcileCreatesUDNEndpointSliceAndOrphansDefault(t *testing.T) { + ctx := context.Background() + ns := "clusters-test" + hcpName := "hcp" + + hcp := &hyperv1.HostedControlPlane{ + ObjectMeta: metav1.ObjectMeta{ + Namespace: ns, + Name: hcpName, + }, + Spec: hyperv1.HostedControlPlaneSpec{ + Platform: hyperv1.PlatformSpec{ + Type: hyperv1.KubevirtPlatform, + }, + }, + } + + etcdPod := &corev1.Pod{ + ObjectMeta: metav1.ObjectMeta{ + Namespace: ns, + Name: "etcd-0", + UID: types.UID("pod-uid"), + Labels: map[string]string{"app": "etcd"}, + Annotations: map[string]string{ + ovnPodNetworksAnnotationKey: `{"default":{"ip_address":"192.168.0.10/24","role":"secondary"},"` + ns + `/myudn":{"ip_address":"10.150.0.10/24","role":"primary"}}`, + }, + }, + Status: corev1.PodStatus{ + Conditions: []corev1.PodCondition{ + {Type: corev1.PodReady, Status: corev1.ConditionTrue}, + }, + }, + } + + etcdClientSvc := &corev1.Service{ + ObjectMeta: metav1.ObjectMeta{ + Namespace: ns, + Name: etcdClientServiceName, + UID: types.UID("svc-client-uid"), + }, + Spec: corev1.ServiceSpec{ + ClusterIP: "None", + Ports: []corev1.ServicePort{ + {Name: "etcd-client", Protocol: corev1.ProtocolTCP, Port: 2379}, + {Name: "metrics", Protocol: corev1.ProtocolTCP, Port: 2381}, + }, + Selector: map[string]string{"app": "etcd"}, + }, + } + + etcdDiscoverySvc := &corev1.Service{ + ObjectMeta: metav1.ObjectMeta{ + Namespace: ns, + Name: etcdDiscoveryServiceName, + UID: types.UID("svc-discovery-uid"), + }, + Spec: corev1.ServiceSpec{ + ClusterIP: "None", + Ports: []corev1.ServicePort{ + {Name: "peer", Protocol: corev1.ProtocolTCP, Port: 2380}, + {Name: "etcd-client", Protocol: corev1.ProtocolTCP, Port: 2379}, + }, + Selector: map[string]string{"app": "etcd"}, + }, + } + + defaultSlice := &discoveryv1.EndpointSlice{ + ObjectMeta: metav1.ObjectMeta{ + Namespace: ns, + Name: "etcd-client-default", + Labels: map[string]string{ + discoveryv1.LabelServiceName: etcdClientServiceName, + discoveryv1.LabelManagedBy: defaultEndpointSliceController, + }, + }, + AddressType: discoveryv1.AddressTypeIPv4, + Endpoints: []discoveryv1.Endpoint{ + {Addresses: []string{"192.168.0.10"}}, + }, + } + + c := fake.NewClientBuilder(). + WithScheme(api.Scheme). + WithObjects(hcp, etcdPod, etcdClientSvc, etcdDiscoverySvc, defaultSlice). + Build() + + r := &reconciler{ + client: c, + hcpKey: types.NamespacedName{Namespace: ns, Name: hcpName}, + CreateOrUpdateProvider: upsert.New(false), + } + + if _, err := r.Reconcile(ctx, reconcile.Request{NamespacedName: r.hcpKey}); err != nil { + t.Fatalf("reconcile failed: %v", err) + } + + udnSlice := &discoveryv1.EndpointSlice{} + if err := c.Get(ctx, client.ObjectKey{Namespace: ns, Name: "etcd-client-udn-ipv4"}, udnSlice); err != nil { + t.Fatalf("expected UDN EndpointSlice to be created: %v", err) + } + if udnSlice.Labels[discoveryv1.LabelServiceName] != etcdClientServiceName { + t.Fatalf("expected kubernetes.io/service-name label to be %q, got %q", etcdClientServiceName, udnSlice.Labels[discoveryv1.LabelServiceName]) + } + if udnSlice.Labels[ovnUDNServiceNameLabelKey] != etcdClientServiceName { + t.Fatalf("expected ovn service-name label to be %q, got %q", etcdClientServiceName, udnSlice.Labels[ovnUDNServiceNameLabelKey]) + } + if udnSlice.Annotations[ovnUDNEndpointSliceNetworkAnnoKey] != ns+"_myudn" { + t.Fatalf("expected endpointslice-network annotation to be %q, got %q", ns+"_myudn", udnSlice.Annotations[ovnUDNEndpointSliceNetworkAnnoKey]) + } + if len(udnSlice.Endpoints) != 1 || len(udnSlice.Endpoints[0].Addresses) != 1 || udnSlice.Endpoints[0].Addresses[0] != "10.150.0.10" { + t.Fatalf("expected UDN endpoint address to be 10.150.0.10, got %#v", udnSlice.Endpoints) + } + if udnSlice.Endpoints[0].Conditions.Ready == nil || *udnSlice.Endpoints[0].Conditions.Ready != true { + t.Fatalf("expected endpoint to be ready, got %#v", udnSlice.Endpoints[0].Conditions.Ready) + } + if udnSlice.Endpoints[0].Conditions.Serving == nil || *udnSlice.Endpoints[0].Conditions.Serving != true { + t.Fatalf("expected endpoint to be serving, got %#v", udnSlice.Endpoints[0].Conditions.Serving) + } + + updatedDefault := &discoveryv1.EndpointSlice{} + if err := c.Get(ctx, client.ObjectKeyFromObject(defaultSlice), updatedDefault); err != nil { + t.Fatalf("expected default EndpointSlice to still exist: %v", err) + } + if updatedDefault.Labels != nil { + if _, ok := updatedDefault.Labels[discoveryv1.LabelServiceName]; ok { + t.Fatalf("expected default slice to be orphaned (no service-name label), got labels=%v", updatedDefault.Labels) + } + if _, ok := updatedDefault.Labels[discoveryv1.LabelManagedBy]; ok { + t.Fatalf("expected default slice to be orphaned (no managed-by label), got labels=%v", updatedDefault.Labels) + } + } +} diff --git a/control-plane-operator/hostedclusterconfigoperator/controllers/etcd/setup.go b/control-plane-operator/hostedclusterconfigoperator/controllers/etcd/setup.go new file mode 100644 index 00000000000..584b9d9195a --- /dev/null +++ b/control-plane-operator/hostedclusterconfigoperator/controllers/etcd/setup.go @@ -0,0 +1,59 @@ +package etcd + +import ( + "context" + "fmt" + + hyperv1 "github.com/openshift/hypershift/api/hypershift/v1beta1" + corev1 "k8s.io/api/core/v1" + discoveryv1 "k8s.io/api/discovery/v1" + "k8s.io/apimachinery/pkg/types" + ctrl "sigs.k8s.io/controller-runtime" + "sigs.k8s.io/controller-runtime/pkg/client" + "sigs.k8s.io/controller-runtime/pkg/controller" + "sigs.k8s.io/controller-runtime/pkg/handler" + "sigs.k8s.io/controller-runtime/pkg/reconcile" + "sigs.k8s.io/controller-runtime/pkg/source" + + "github.com/openshift/hypershift/control-plane-operator/hostedclusterconfigoperator/operator" +) + +const ( + ControllerName = "etcd-udn-endpointslices" +) + +func Setup(ctx context.Context, opts *operator.HostedClusterConfigOperatorConfig) error { + // Primary UDN is currently only supported/used with the KubeVirt platform. + if opts.PlatformType != hyperv1.KubevirtPlatform { + return nil + } + + r := &reconciler{ + client: opts.CPCluster.GetClient(), + hcpKey: types.NamespacedName{Namespace: opts.Namespace, Name: opts.HCPName}, + CreateOrUpdateProvider: opts.TargetCreateOrUpdateProvider, + } + + enqueueHCP := handler.EnqueueRequestsFromMapFunc(func(ctx context.Context, o client.Object) []reconcile.Request { + return []reconcile.Request{{NamespacedName: r.hcpKey}} + }) + + c, err := controller.New(ControllerName, opts.Manager, controller.Options{Reconciler: r}) + if err != nil { + return fmt.Errorf("failed to construct controller: %w", err) + } + + if err := c.Watch(source.Kind[client.Object](opts.CPCluster.GetCache(), &corev1.Pod{}, enqueueHCP, isEtcdPodInNamespace(opts.Namespace))); err != nil { + return fmt.Errorf("failed to watch etcd Pods: %w", err) + } + if err := c.Watch(source.Kind[client.Object](opts.CPCluster.GetCache(), &corev1.Service{}, enqueueHCP, isEtcdServiceInNamespace(opts.Namespace))); err != nil { + return fmt.Errorf("failed to watch etcd Services: %w", err) + } + if err := c.Watch(source.Kind[client.Object](opts.CPCluster.GetCache(), &discoveryv1.EndpointSlice{}, enqueueHCP, isEtcdEndpointSliceInNamespace(opts.Namespace))); err != nil { + return fmt.Errorf("failed to watch etcd EndpointSlices: %w", err) + } + + logger := ctrl.LoggerFrom(ctx) + logger.Info("Setup", "controller", ControllerName) + return nil +} diff --git a/control-plane-operator/hostedclusterconfigoperator/controllers/machine/machine.go b/control-plane-operator/hostedclusterconfigoperator/controllers/machine/machine.go index c0f9876fe8e..34d09d05f5a 100644 --- a/control-plane-operator/hostedclusterconfigoperator/controllers/machine/machine.go +++ b/control-plane-operator/hostedclusterconfigoperator/controllers/machine/machine.go @@ -2,6 +2,7 @@ package machine import ( "context" + "encoding/json" "fmt" "net/netip" "strings" @@ -28,6 +29,12 @@ import ( const ( managedByValue = "control-plane-operator.hypershift.openshift.io" + + // OVN-Kubernetes UDN EndpointSlice plumbing. + ovnUDNServiceNameLabelKey = "k8s.ovn.org/service-name" + ovnUDNEndpointSliceNetworkAnnoKey = "k8s.ovn.org/endpointslice-network" + ovnPodNetworksAnnotationKey = "k8s.ovn.org/pod-networks" + ovnPrimaryRole = "primary" ) func (r *reconciler) Reconcile(ctx context.Context, req reconcile.Request) (reconcile.Result, error) { @@ -190,8 +197,25 @@ func (r *reconciler) reconcileKubevirtPassthroughServiceEndpointsByIPFamily(ctx if endpointSlice.Labels == nil { endpointSlice.Labels = map[string]string{} } - endpointSlice.Labels[discoveryv1.LabelServiceName] = cpService.Name endpointSlice.Labels[discoveryv1.LabelManagedBy] = managedByValue + + udnNetworkName, isPrimaryUDN := r.primaryUDNNetworkName(ctx, cpService.Namespace) + + if isPrimaryUDN { + if endpointSlice.Annotations == nil { + endpointSlice.Annotations = map[string]string{} + } + endpointSlice.Labels[ovnUDNServiceNameLabelKey] = cpService.Name + endpointSlice.Annotations[ovnUDNEndpointSliceNetworkAnnoKey] = udnNetworkName + delete(endpointSlice.Labels, discoveryv1.LabelServiceName) // remove kubernetes.io/service-name + } else { + endpointSlice.Labels[discoveryv1.LabelServiceName] = cpService.Name + delete(endpointSlice.Labels, ovnUDNServiceNameLabelKey) + if endpointSlice.Annotations != nil { + delete(endpointSlice.Annotations, ovnUDNEndpointSliceNetworkAnnoKey) + } + } + endpointSlice.AddressType = discoveryv1.AddressType(ipFamily) if len(machineAddresses) > 0 { endpointSlice.Endpoints = []discoveryv1.Endpoint{{ @@ -211,6 +235,46 @@ func (r *reconciler) reconcileKubevirtPassthroughServiceEndpointsByIPFamily(ctx return nil } +type podNetworksAnnotationEntry struct { + Role string `json:"role"` +} + +// primaryUDNNetworkName determines if the namespace is using a Primary UDN network and returns the OVN-K +// endpointslice-network value (_) when it is. +// +// Note: HCCO runs with namespaced RBAC, so it cannot reliably read Namespace labels (cluster-scoped). +// Instead, we infer Primary UDN by inspecting the OVN pod-networks annotation on virt-launcher pods. +func (r *reconciler) primaryUDNNetworkName(ctx context.Context, namespace string) (string, bool) { + derived, ok := r.detectPrimaryUDNNetworkNameFromVirtLauncher(ctx, namespace) + return derived, ok +} + +// detectPrimaryUDNNetworkNameFromVirtLauncher derives the OVN-K primary network name used in +// k8s.ovn.org/endpointslice-network, from any virt-launcher pod's k8s.ovn.org/pod-networks annotation. +// The annotation JSON key is typically "/" which OVN-K expects as "_". +func (r *reconciler) detectPrimaryUDNNetworkNameFromVirtLauncher(ctx context.Context, namespace string) (string, bool) { + podList := &corev1.PodList{} + if err := r.kubevirtInfraClient.List(ctx, podList, client.InNamespace(namespace), client.MatchingLabels{"kubevirt.io": "virt-launcher"}); err != nil { + return "", false + } + for i := range podList.Items { + raw := podList.Items[i].Annotations[ovnPodNetworksAnnotationKey] + if raw == "" { + continue + } + var m map[string]podNetworksAnnotationEntry + if err := json.Unmarshal([]byte(raw), &m); err != nil { + continue + } + for networkKey, v := range m { + if v.Role == ovnPrimaryRole && networkKey != "default" { + return strings.ReplaceAll(networkKey, "/", "_"), true + } + } + } + return "", false +} + func serviceHasIPFamily(service *corev1.Service, ipFamilyToFind corev1.IPFamily) bool { for _, ipFamily := range service.Spec.IPFamilies { if ipFamily == ipFamilyToFind { @@ -237,6 +301,10 @@ func (r *reconciler) removeOrphanKubevirtPassthroughEndpointSlices(ctx context.C for _, endpointSlice := range endpointSliceList.Items { serviceName := endpointSlice.Labels[discoveryv1.LabelServiceName] + if serviceName == "" { + // UDN EndpointSlices use a different service name label. + serviceName = endpointSlice.Labels[ovnUDNServiceNameLabelKey] + } if serviceName == "" { continue } diff --git a/control-plane-operator/hostedclusterconfigoperator/controllers/machine/machine_test.go b/control-plane-operator/hostedclusterconfigoperator/controllers/machine/machine_test.go index 54455beb85a..53edef5d00b 100644 --- a/control-plane-operator/hostedclusterconfigoperator/controllers/machine/machine_test.go +++ b/control-plane-operator/hostedclusterconfigoperator/controllers/machine/machine_test.go @@ -220,6 +220,21 @@ func TestReconcileDefaultIngressEndpoints(t *testing.T) { } } + asPrimaryUDN := func(serviceName, udnNetworkName string) func(eps discoveryv1.EndpointSlice) discoveryv1.EndpointSlice { + return func(eps discoveryv1.EndpointSlice) discoveryv1.EndpointSlice { + if eps.Labels == nil { + eps.Labels = map[string]string{} + } + if eps.Annotations == nil { + eps.Annotations = map[string]string{} + } + eps.Labels[ovnUDNServiceNameLabelKey] = serviceName + eps.Annotations[ovnUDNEndpointSliceNetworkAnnoKey] = udnNetworkName + delete(eps.Labels, discoveryv1.LabelServiceName) + return eps + } + } + defaultIngressEndpointSliceIPv4 := func(machine capiv1.Machine, vm kubevirtv1.VirtualMachine, endpointSliceTransform ...func(discoveryv1.EndpointSlice) discoveryv1.EndpointSlice) discoveryv1.EndpointSlice { endpointSlice := discoveryv1.EndpointSlice{ ObjectMeta: metav1.ObjectMeta{ @@ -387,6 +402,7 @@ func TestReconcileDefaultIngressEndpoints(t *testing.T) { name string machines []capiv1.Machine virtualMachines []kubevirtv1.VirtualMachine + infraPods []corev1.Pod services []corev1.Service endpointSlices []discoveryv1.EndpointSlice hcp *hyperv1.HostedControlPlane @@ -438,6 +454,30 @@ func TestReconcileDefaultIngressEndpoints(t *testing.T) { }, hcp: kubevirtHCP, }, + { + name: "With Primary UDN namespace should create UDN endpointslices", + machines: pairOfDualStackRunningMachines, + virtualMachines: pairOfVirtualMachines, + infraPods: []corev1.Pod{{ + ObjectMeta: metav1.ObjectMeta{ + Namespace: "ns1", + Name: "virt-launcher-vm-worker1-abcde", + Labels: map[string]string{"kubevirt.io": "virt-launcher"}, + Annotations: map[string]string{ + ovnPodNetworksAnnotationKey: `{"ns1/hcp-cluster1":{"role":"primary"}}`, + }, + }, + }}, + services: []corev1.Service{defaultIngressService}, + expectedServices: []corev1.Service{defaultIngressService}, + expectedIngressEndpointSlices: []discoveryv1.EndpointSlice{ + defaultIngressEndpointSliceIPv4(pairOfDualStackRunningMachines[0], pairOfVirtualMachines[0], asPrimaryUDN(defaultIngressService.Name, "ns1_hcp-cluster1")), + defaultIngressEndpointSliceIPv4(pairOfDualStackRunningMachines[1], pairOfVirtualMachines[1], asPrimaryUDN(defaultIngressService.Name, "ns1_hcp-cluster1")), + defaultIngressEndpointSliceIPv6(pairOfDualStackRunningMachines[0], pairOfVirtualMachines[0], asPrimaryUDN(defaultIngressService.Name, "ns1_hcp-cluster1")), + defaultIngressEndpointSliceIPv6(pairOfDualStackRunningMachines[1], pairOfVirtualMachines[1], asPrimaryUDN(defaultIngressService.Name, "ns1_hcp-cluster1")), + }, + hcp: kubevirtHCP, + }, { name: "Should remove orphan endpoint slices", machines: pairOfDualStackRunningMachines, @@ -464,6 +504,10 @@ func TestReconcileDefaultIngressEndpoints(t *testing.T) { for _, tc := range testCases { t.Run(tc.name, func(t *testing.T) { kubevirtInfraClusterObjects := []client.Object{} + for _, p := range tc.infraPods { + pod := p // golang bug referencing for loop vars + kubevirtInfraClusterObjects = append(kubevirtInfraClusterObjects, &pod) + } for _, vm := range tc.virtualMachines { virtualMachine := vm // golang bug referencing for loop vars kubevirtInfraClusterObjects = append(kubevirtInfraClusterObjects, &virtualMachine) diff --git a/control-plane-operator/hostedclusterconfigoperator/controllers/primaryudn/dns_operator.go b/control-plane-operator/hostedclusterconfigoperator/controllers/primaryudn/dns_operator.go new file mode 100644 index 00000000000..33781d6511a --- /dev/null +++ b/control-plane-operator/hostedclusterconfigoperator/controllers/primaryudn/dns_operator.go @@ -0,0 +1,46 @@ +package primaryudn + +import ( + "context" + + operatorv1 "github.com/openshift/api/operator/v1" + + metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" + "sigs.k8s.io/controller-runtime/pkg/client" + "sigs.k8s.io/controller-runtime/pkg/controller/controllerutil" +) + +const internalAppsDNSServerName = "internal-apps" + +func ensureDNSOperatorForwarding(ctx context.Context, c client.Client, zones []string, upstream string) error { + dns := &operatorv1.DNS{ObjectMeta: metav1.ObjectMeta{Name: "default"}} + _, err := controllerutil.CreateOrUpdate(ctx, c, dns, func() error { + dns.Spec.Servers = upsertDNSServer(dns.Spec.Servers, operatorv1.Server{ + Name: internalAppsDNSServerName, + Zones: zones, + ForwardPlugin: operatorv1.ForwardPlugin{ + Upstreams: []string{upstream}, + }, + }) + return nil + }) + return err +} + +func upsertDNSServer(servers []operatorv1.Server, desired operatorv1.Server) []operatorv1.Server { + out := make([]operatorv1.Server, 0, len(servers)+1) + found := false + for i := range servers { + s := servers[i] + if s.Name == desired.Name { + out = append(out, desired) + found = true + } else { + out = append(out, s) + } + } + if !found { + out = append(out, desired) + } + return out +} diff --git a/control-plane-operator/hostedclusterconfigoperator/controllers/primaryudn/guest_info.go b/control-plane-operator/hostedclusterconfigoperator/controllers/primaryudn/guest_info.go new file mode 100644 index 00000000000..95e39050dff --- /dev/null +++ b/control-plane-operator/hostedclusterconfigoperator/controllers/primaryudn/guest_info.go @@ -0,0 +1,50 @@ +package primaryudn + +import ( + "context" + "fmt" + + configv1 "github.com/openshift/api/config/v1" + + appsv1 "k8s.io/api/apps/v1" + corev1 "k8s.io/api/core/v1" + metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" + "sigs.k8s.io/controller-runtime/pkg/client" +) + +func guestIngressDomain(ctx context.Context, c client.Client) (string, error) { + ing := &configv1.Ingress{ObjectMeta: metav1.ObjectMeta{Name: "cluster"}} + if err := c.Get(ctx, client.ObjectKeyFromObject(ing), ing); err != nil { + return "", err + } + if ing.Spec.Domain == "" { + return "", fmt.Errorf("guest ingress domain is empty") + } + return ing.Spec.Domain, nil +} + +func guestRouterInternalClusterIP(ctx context.Context, c client.Client) (string, error) { + const ( + routerNamespace = "openshift-ingress" + routerService = "router-internal-default" + ) + svc := &corev1.Service{ObjectMeta: metav1.ObjectMeta{Namespace: routerNamespace, Name: routerService}} + if err := c.Get(ctx, client.ObjectKeyFromObject(svc), svc); err != nil { + return "", err + } + if svc.Spec.ClusterIP == "" || svc.Spec.ClusterIP == "None" { + return "", fmt.Errorf("%s/%s has no ClusterIP yet", routerNamespace, routerService) + } + return svc.Spec.ClusterIP, nil +} + +func guestDNSImage(ctx context.Context, c client.Client) (string, error) { + ds := &appsv1.DaemonSet{ObjectMeta: metav1.ObjectMeta{Namespace: "openshift-dns", Name: "dns-default"}} + if err := c.Get(ctx, client.ObjectKeyFromObject(ds), ds); err != nil { + return "", err + } + if len(ds.Spec.Template.Spec.Containers) == 0 || ds.Spec.Template.Spec.Containers[0].Image == "" { + return "", fmt.Errorf("openshift-dns/dns-default has no container image") + } + return ds.Spec.Template.Spec.Containers[0].Image, nil +} diff --git a/control-plane-operator/hostedclusterconfigoperator/controllers/primaryudn/internal_dns.go b/control-plane-operator/hostedclusterconfigoperator/controllers/primaryudn/internal_dns.go new file mode 100644 index 00000000000..6655e1593e2 --- /dev/null +++ b/control-plane-operator/hostedclusterconfigoperator/controllers/primaryudn/internal_dns.go @@ -0,0 +1,159 @@ +package primaryudn + +import ( + "context" + "fmt" + "sort" + "strings" + + appsv1 "k8s.io/api/apps/v1" + corev1 "k8s.io/api/core/v1" + apierrors "k8s.io/apimachinery/pkg/api/errors" + metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" + "k8s.io/apimachinery/pkg/util/intstr" + "k8s.io/utils/ptr" + "sigs.k8s.io/controller-runtime/pkg/client" + "sigs.k8s.io/controller-runtime/pkg/controller/controllerutil" +) + +const ( + internalAppsDNSNamespace = "internal-apps-dns" + internalAppsDNSConfigMap = "coredns-config" + internalAppsDNSDeploy = "internal-apps-dns" + internalAppsDNSSvc = "internal-apps-dns" + internalAppsDNSPort = 5353 +) + +func ensureInternalAppsDNSBase(ctx context.Context, c client.Client, dnsImage string) error { + ns := &corev1.Namespace{ObjectMeta: metav1.ObjectMeta{Name: internalAppsDNSNamespace}} + // Namespace is cluster-scoped; we only need to ensure it exists. + if err := c.Create(ctx, ns); err != nil && !apierrors.IsAlreadyExists(err) { + return err + } + + cm := &corev1.ConfigMap{ObjectMeta: metav1.ObjectMeta{Namespace: internalAppsDNSNamespace, Name: internalAppsDNSConfigMap}} + if _, err := controllerutil.CreateOrUpdate(ctx, c, cm, func() error { + if cm.Data == nil { + cm.Data = map[string]string{} + } + // Default Corefile is installed separately once we know the desired host mappings. + if cm.Data["Corefile"] == "" { + cm.Data["Corefile"] = ".:5353 {\n errors\n reload\n whoami\n}\n" + } + return nil + }); err != nil { + return err + } + + deploy := &appsv1.Deployment{ObjectMeta: metav1.ObjectMeta{Namespace: internalAppsDNSNamespace, Name: internalAppsDNSDeploy}} + if _, err := controllerutil.CreateOrUpdate(ctx, c, deploy, func() error { + labels := map[string]string{"app": internalAppsDNSDeploy} + deploy.Spec.Replicas = ptr.To[int32](1) + deploy.Spec.Selector = &metav1.LabelSelector{MatchLabels: labels} + deploy.Spec.Template.ObjectMeta.Labels = labels + deploy.Spec.Template.Spec.Containers = []corev1.Container{{ + Name: "coredns", + Image: dnsImage, + Args: []string{"-conf", "/etc/coredns/Corefile"}, + Ports: []corev1.ContainerPort{ + {Name: "dns-udp", ContainerPort: internalAppsDNSPort, Protocol: corev1.ProtocolUDP}, + {Name: "dns-tcp", ContainerPort: internalAppsDNSPort, Protocol: corev1.ProtocolTCP}, + }, + VolumeMounts: []corev1.VolumeMount{{Name: "config", MountPath: "/etc/coredns"}}, + }} + deploy.Spec.Template.Spec.Volumes = []corev1.Volume{{ + Name: "config", + VolumeSource: corev1.VolumeSource{ + ConfigMap: &corev1.ConfigMapVolumeSource{LocalObjectReference: corev1.LocalObjectReference{Name: internalAppsDNSConfigMap}}, + }, + }} + return nil + }); err != nil { + return err + } + + svc := &corev1.Service{ObjectMeta: metav1.ObjectMeta{Namespace: internalAppsDNSNamespace, Name: internalAppsDNSSvc}} + if _, err := controllerutil.CreateOrUpdate(ctx, c, svc, func() error { + svc.Spec.Selector = map[string]string{"app": internalAppsDNSDeploy} + svc.Spec.Ports = []corev1.ServicePort{ + {Name: "dns-udp", Port: internalAppsDNSPort, TargetPort: intstr.FromInt(internalAppsDNSPort), Protocol: corev1.ProtocolUDP}, + {Name: "dns-tcp", Port: internalAppsDNSPort, TargetPort: intstr.FromInt(internalAppsDNSPort), Protocol: corev1.ProtocolTCP}, + } + return nil + }); err != nil { + return err + } + + return nil +} + +func internalAppsDNSUpstream(ctx context.Context, c client.Client) (upstream string, ready bool, err error) { + clusterIP, ready, err := serviceClusterIPAndReadyEndpoints(ctx, c, internalAppsDNSNamespace, internalAppsDNSSvc) + if err != nil { + return "", false, err + } + if clusterIP == "" || !ready { + return "", false, nil + } + return fmt.Sprintf("%s:%d", clusterIP, internalAppsDNSPort), true, nil +} + +func ensureInternalAppsDNSCorefile(ctx context.Context, c client.Client, hosts map[string]string) error { + corefile := renderInternalAppsDNSCorefile(hosts) + cm := &corev1.ConfigMap{ObjectMeta: metav1.ObjectMeta{Namespace: internalAppsDNSNamespace, Name: internalAppsDNSConfigMap}} + _, err := controllerutil.CreateOrUpdate(ctx, c, cm, func() error { + if cm.Data == nil { + cm.Data = map[string]string{} + } + cm.Data["Corefile"] = corefile + return nil + }) + return err +} + +func renderInternalAppsDNSCorefile(hosts map[string]string) string { + // This CoreDNS only serves the specific host overrides. + // No forward plugin: unmatched queries get NXDOMAIN, which is correct because + // dns-default handles all other zones. Avoiding forward prevents the server from + // deadlocking on upstream DNS timeouts. + var b strings.Builder + b.WriteString(".:5353 {\n") + b.WriteString(" errors\n") + b.WriteString(" reload\n") + b.WriteString(" hosts {\n") + keys := make([]string, 0, len(hosts)) + for k := range hosts { + keys = append(keys, k) + } + sort.Strings(keys) + for _, host := range keys { + b.WriteString(fmt.Sprintf(" %s %s\n", hosts[host], host)) + } + b.WriteString(" }\n") + b.WriteString("}\n") + return b.String() +} + +func serviceClusterIPAndReadyEndpoints(ctx context.Context, c client.Client, namespace, name string) (clusterIP string, ready bool, err error) { + svc := &corev1.Service{ObjectMeta: metav1.ObjectMeta{Namespace: namespace, Name: name}} + if err := c.Get(ctx, client.ObjectKeyFromObject(svc), svc); err != nil { + return "", false, err + } + if svc.Spec.ClusterIP == "" || svc.Spec.ClusterIP == "None" { + return "", false, nil + } + + ep := &corev1.Endpoints{ObjectMeta: metav1.ObjectMeta{Namespace: namespace, Name: name}} + if err := c.Get(ctx, client.ObjectKeyFromObject(ep), ep); err != nil { + if apierrors.IsNotFound(err) { + return svc.Spec.ClusterIP, false, nil + } + return "", false, err + } + for i := range ep.Subsets { + if len(ep.Subsets[i].Addresses) > 0 { + return svc.Spec.ClusterIP, true, nil + } + } + return svc.Spec.ClusterIP, false, nil +} diff --git a/control-plane-operator/hostedclusterconfigoperator/controllers/primaryudn/mgmt_oauth.go b/control-plane-operator/hostedclusterconfigoperator/controllers/primaryudn/mgmt_oauth.go new file mode 100644 index 00000000000..f154a56ad27 --- /dev/null +++ b/control-plane-operator/hostedclusterconfigoperator/controllers/primaryudn/mgmt_oauth.go @@ -0,0 +1,85 @@ +package primaryudn + +import ( + "context" + "encoding/json" + "fmt" + "net/netip" + "strings" + + routev1 "github.com/openshift/api/route/v1" + + corev1 "k8s.io/api/core/v1" + metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" + "sigs.k8s.io/controller-runtime/pkg/client" +) + +const ( + oauthRouteName = "oauth" + oauthPodLabelValue = "oauth-openshift" + ovnPodNetworksAnnotation = "k8s.ovn.org/pod-networks" + ovnPrimaryRole = "primary" +) + +type podNetworksEntry struct { + Role string `json:"role"` + IPAddress string `json:"ip_address"` +} + +func mgmtOAuthRouteHost(ctx context.Context, c client.Client, namespace string) (string, error) { + route := &routev1.Route{ObjectMeta: metav1.ObjectMeta{Namespace: namespace, Name: oauthRouteName}} + if err := c.Get(ctx, client.ObjectKeyFromObject(route), route); err != nil { + return "", err + } + if route.Spec.Host == "" { + return "", fmt.Errorf("oauth route has empty host") + } + return route.Spec.Host, nil +} + +func mgmtOAuthPrimaryUDNIP(ctx context.Context, c client.Client, namespace string) (string, error) { + pods := &corev1.PodList{} + if err := c.List(ctx, pods, client.InNamespace(namespace), client.MatchingLabels{"app": oauthPodLabelValue}); err != nil { + return "", err + } + for i := range pods.Items { + ip, ok := primaryUDNIPFromPodNetworks(&pods.Items[i]) + if ok { + return ip, nil + } + } + return "", fmt.Errorf("no oauth pod with primary UDN IP found") +} + +func primaryUDNIPFromPodNetworks(pod *corev1.Pod) (string, bool) { + raw := "" + if pod.Annotations != nil { + raw = pod.Annotations[ovnPodNetworksAnnotation] + } + if raw == "" { + return "", false + } + + m := map[string]podNetworksEntry{} + if err := json.Unmarshal([]byte(raw), &m); err != nil { + return "", false + } + + for networkKey, v := range m { + if v.Role != ovnPrimaryRole { + continue + } + if networkKey == "default" { + continue + } + ip := strings.SplitN(v.IPAddress, "/", 2)[0] + if ip == "" { + continue + } + if _, err := netip.ParseAddr(ip); err != nil { + continue + } + return ip, true + } + return "", false +} diff --git a/control-plane-operator/hostedclusterconfigoperator/controllers/primaryudn/oauth_bridge.go b/control-plane-operator/hostedclusterconfigoperator/controllers/primaryudn/oauth_bridge.go new file mode 100644 index 00000000000..6df87c0f3a7 --- /dev/null +++ b/control-plane-operator/hostedclusterconfigoperator/controllers/primaryudn/oauth_bridge.go @@ -0,0 +1,54 @@ +package primaryudn + +import ( + "context" + + corev1 "k8s.io/api/core/v1" + metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" + "k8s.io/apimachinery/pkg/util/intstr" + "sigs.k8s.io/controller-runtime/pkg/client" + "sigs.k8s.io/controller-runtime/pkg/controller/controllerutil" +) + +const ( + oauthBridgeNamespace = "openshift-authentication" + oauthBridgeName = "oauth-bridge" + oauthListenPort = 6443 +) + +func ensureGuestOAuthBridge(ctx context.Context, c client.Client, oauthUDNIP string) (serviceClusterIP string, err error) { + svc := &corev1.Service{ObjectMeta: metav1.ObjectMeta{Namespace: oauthBridgeNamespace, Name: oauthBridgeName}} + if _, err := controllerutil.CreateOrUpdate(ctx, c, svc, func() error { + svc.Spec.Ports = []corev1.ServicePort{{ + Name: "https", + Port: 443, + TargetPort: intstr.FromInt(oauthListenPort), + Protocol: corev1.ProtocolTCP, + }} + // Selectorless service; endpoints are managed separately. + svc.Spec.Selector = nil + return nil + }); err != nil { + return "", err + } + + ep := &corev1.Endpoints{ObjectMeta: metav1.ObjectMeta{Namespace: oauthBridgeNamespace, Name: oauthBridgeName}} + if _, err := controllerutil.CreateOrUpdate(ctx, c, ep, func() error { + ep.Subsets = []corev1.EndpointSubset{{ + Addresses: []corev1.EndpointAddress{{IP: oauthUDNIP}}, + Ports: []corev1.EndpointPort{{Name: "https", Port: oauthListenPort, Protocol: corev1.ProtocolTCP}}, + }} + return nil + }); err != nil { + return "", err + } + + // Re-get service to observe allocated ClusterIP. + if err := c.Get(ctx, client.ObjectKeyFromObject(svc), svc); err != nil { + return "", err + } + if svc.Spec.ClusterIP == "" || svc.Spec.ClusterIP == "None" { + return "", nil + } + return svc.Spec.ClusterIP, nil +} diff --git a/control-plane-operator/hostedclusterconfigoperator/controllers/primaryudn/reconciler.go b/control-plane-operator/hostedclusterconfigoperator/controllers/primaryudn/reconciler.go new file mode 100644 index 00000000000..1fc0b724394 --- /dev/null +++ b/control-plane-operator/hostedclusterconfigoperator/controllers/primaryudn/reconciler.go @@ -0,0 +1,129 @@ +package primaryudn + +import ( + "context" + "fmt" + "time" + + ctrl "sigs.k8s.io/controller-runtime" + "sigs.k8s.io/controller-runtime/pkg/client" + "sigs.k8s.io/controller-runtime/pkg/event" + "sigs.k8s.io/controller-runtime/pkg/predicate" +) + +type reconciler struct { + guestClient client.Client + mgmtClient client.Client + namespace string + hcpName string +} + +func (r *reconciler) Reconcile(ctx context.Context, _ ctrl.Request) (ctrl.Result, error) { + log := ctrl.LoggerFrom(ctx).WithValues("controller", ControllerName) + + ingressDomain, err := guestIngressDomain(ctx, r.guestClient) + if err != nil { + log.Info("waiting for guest ingress domain") + return ctrl.Result{RequeueAfter: 30 * time.Second}, nil + } + routerIP, err := guestRouterInternalClusterIP(ctx, r.guestClient) + if err != nil { + log.Info("waiting for guest router-internal ClusterIP") + return ctrl.Result{RequeueAfter: 30 * time.Second}, nil + } + dnsImage, err := guestDNSImage(ctx, r.guestClient) + if err != nil { + log.Info("waiting for guest dns-default DaemonSet") + return ctrl.Result{RequeueAfter: 30 * time.Second}, nil + } + + if err := ensureInternalAppsDNSBase(ctx, r.guestClient, dnsImage); err != nil { + return ctrl.Result{}, err + } + + internalUpstream, ready, err := internalAppsDNSUpstream(ctx, r.guestClient) + if err != nil { + return ctrl.Result{}, err + } + if !ready { + log.Info("waiting for internal-apps-dns endpoints") + return ctrl.Result{RequeueAfter: 10 * time.Second}, nil + } + + consoleHost := fmt.Sprintf("console-openshift-console.%s", ingressDomain) + canaryHost := fmt.Sprintf("canary-openshift-ingress-canary.%s", ingressDomain) + downloadsHost := fmt.Sprintf("downloads-openshift-console.%s", ingressDomain) + + hosts := map[string]string{ + consoleHost: routerIP, + canaryHost: routerIP, + downloadsHost: routerIP, + } + if err := ensureInternalAppsDNSCorefile(ctx, r.guestClient, hosts); err != nil { + return ctrl.Result{}, err + } + if err := ensureDNSOperatorForwarding(ctx, r.guestClient, []string{ingressDomain}, internalUpstream); err != nil { + return ctrl.Result{}, err + } + + oauthHost, err := mgmtOAuthRouteHost(ctx, r.mgmtClient, r.namespace) + if err != nil { + log.Info("waiting for OAuth route host") + return ctrl.Result{RequeueAfter: 30 * time.Second}, nil + } + oauthUDNIP, err := mgmtOAuthPrimaryUDNIP(ctx, r.mgmtClient, r.namespace) + if err != nil { + log.Info("waiting for OAuth pod primary UDN IP") + return ctrl.Result{RequeueAfter: 10 * time.Second}, nil + } + + oauthBridgeSvcIP, err := ensureGuestOAuthBridge(ctx, r.guestClient, oauthUDNIP) + if err != nil { + return ctrl.Result{}, err + } + if oauthBridgeSvcIP == "" { + return ctrl.Result{RequeueAfter: 10 * time.Second}, nil + } + + hosts[oauthHost] = oauthBridgeSvcIP + if err := ensureInternalAppsDNSCorefile(ctx, r.guestClient, hosts); err != nil { + return ctrl.Result{}, err + } + + zones := uniqueStrings([]string{ingressDomain, dnsZoneFromHostname(oauthHost)}) + if err := ensureDNSOperatorForwarding(ctx, r.guestClient, zones, internalUpstream); err != nil { + return ctrl.Result{}, err + } + + return ctrl.Result{}, nil +} + +func isOAuthPodInNamespace(namespace string) predicate.Funcs { + match := func(obj client.Object) bool { + if obj.GetNamespace() != namespace { + return false + } + if labels := obj.GetLabels(); labels != nil { + return labels["app"] == oauthPodLabelValue + } + return false + } + return predicate.Funcs{ + CreateFunc: func(e event.CreateEvent) bool { return match(e.Object) }, + UpdateFunc: func(e event.UpdateEvent) bool { return match(e.ObjectNew) }, + DeleteFunc: func(e event.DeleteEvent) bool { return match(e.Object) }, + GenericFunc: func(e event.GenericEvent) bool { return match(e.Object) }, + } +} + +func isOAuthRouteInNamespace(namespace string) predicate.Funcs { + match := func(obj client.Object) bool { + return obj.GetNamespace() == namespace && obj.GetName() == oauthRouteName + } + return predicate.Funcs{ + CreateFunc: func(e event.CreateEvent) bool { return match(e.Object) }, + UpdateFunc: func(e event.UpdateEvent) bool { return match(e.ObjectNew) }, + DeleteFunc: func(e event.DeleteEvent) bool { return match(e.Object) }, + GenericFunc: func(e event.GenericEvent) bool { return match(e.Object) }, + } +} diff --git a/control-plane-operator/hostedclusterconfigoperator/controllers/primaryudn/setup.go b/control-plane-operator/hostedclusterconfigoperator/controllers/primaryudn/setup.go new file mode 100644 index 00000000000..4c48e8435f1 --- /dev/null +++ b/control-plane-operator/hostedclusterconfigoperator/controllers/primaryudn/setup.go @@ -0,0 +1,74 @@ +package primaryudn + +import ( + "context" + "fmt" + + hyperv1 "github.com/openshift/hypershift/api/hypershift/v1beta1" + "github.com/openshift/hypershift/control-plane-operator/hostedclusterconfigoperator/operator" + hyperapi "github.com/openshift/hypershift/support/api" + + routev1 "github.com/openshift/api/route/v1" + + corev1 "k8s.io/api/core/v1" + "k8s.io/apimachinery/pkg/types" + ctrl "sigs.k8s.io/controller-runtime" + "sigs.k8s.io/controller-runtime/pkg/client" + "sigs.k8s.io/controller-runtime/pkg/controller" + "sigs.k8s.io/controller-runtime/pkg/handler" + "sigs.k8s.io/controller-runtime/pkg/reconcile" + "sigs.k8s.io/controller-runtime/pkg/source" +) + +const ControllerName = "primary-udn-guest-fixups" + +func Setup(ctx context.Context, opts *operator.HostedClusterConfigOperatorConfig) error { + if opts.PlatformType != hyperv1.KubevirtPlatform { + return nil + } + + logger := ctrl.LoggerFrom(ctx) + + // The Primary UDN annotation on the HCP is immutable, so checking once at setup + // is sufficient. Use a direct API client to read the HCP from the management + // cluster β€” it's namespace-scoped, so the HCCO SA has access. + directClient, err := client.New(opts.Config, client.Options{Scheme: hyperapi.Scheme}) + if err != nil { + return fmt.Errorf("failed to create direct client: %w", err) + } + hcp := &hyperv1.HostedControlPlane{} + if err := directClient.Get(ctx, types.NamespacedName{Namespace: opts.Namespace, Name: opts.HCPName}, hcp); err != nil { + return fmt.Errorf("failed to get HostedControlPlane %s/%s: %w", opts.Namespace, opts.HCPName, err) + } + if hcp.Annotations == nil || hcp.Annotations["hypershift.openshift.io/primary-udn"] != "true" { + logger.Info("Skipping controller: HCP is not Primary UDN", "controller", ControllerName) + return nil + } + + r := &reconciler{ + guestClient: opts.Manager.GetClient(), + mgmtClient: opts.CPCluster.GetClient(), + namespace: opts.Namespace, + hcpName: opts.HCPName, + } + + enqueueHCP := handler.EnqueueRequestsFromMapFunc(func(ctx context.Context, o client.Object) []reconcile.Request { + return []reconcile.Request{{NamespacedName: types.NamespacedName{Namespace: r.namespace, Name: r.hcpName}}} + }) + + c, err := controller.New(ControllerName, opts.Manager, controller.Options{Reconciler: r}) + if err != nil { + return fmt.Errorf("failed to construct controller: %w", err) + } + + // Re-reconcile when OAuth pods or routes change in the management cluster. + if err := c.Watch(source.Kind[client.Object](opts.CPCluster.GetCache(), &corev1.Pod{}, enqueueHCP, isOAuthPodInNamespace(opts.Namespace))); err != nil { + return fmt.Errorf("failed to watch OAuth pods: %w", err) + } + if err := c.Watch(source.Kind[client.Object](opts.CPCluster.GetCache(), &routev1.Route{}, enqueueHCP, isOAuthRouteInNamespace(opts.Namespace))); err != nil { + return fmt.Errorf("failed to watch OAuth routes: %w", err) + } + + logger.Info("Setup", "controller", ControllerName) + return nil +} diff --git a/control-plane-operator/hostedclusterconfigoperator/controllers/primaryudn/util.go b/control-plane-operator/hostedclusterconfigoperator/controllers/primaryudn/util.go new file mode 100644 index 00000000000..5151633c024 --- /dev/null +++ b/control-plane-operator/hostedclusterconfigoperator/controllers/primaryudn/util.go @@ -0,0 +1,30 @@ +package primaryudn + +import "strings" + +func dnsZoneFromHostname(host string) string { + if host == "" { + return "" + } + if i := strings.IndexByte(host, '.'); i > 0 && i+1 < len(host) { + return host[i+1:] + } + return host +} + +func uniqueStrings(in []string) []string { + seen := map[string]struct{}{} + out := make([]string, 0, len(in)) + for _, s := range in { + s = strings.TrimSpace(s) + if s == "" { + continue + } + if _, ok := seen[s]; ok { + continue + } + seen[s] = struct{}{} + out = append(out, s) + } + return out +} diff --git a/control-plane-operator/hostedclusterconfigoperator/operator/config.go b/control-plane-operator/hostedclusterconfigoperator/operator/config.go index e59c3b97463..7cf2035f7fe 100644 --- a/control-plane-operator/hostedclusterconfigoperator/operator/config.go +++ b/control-plane-operator/hostedclusterconfigoperator/operator/config.go @@ -22,6 +22,7 @@ import ( operatorv1 "github.com/openshift/api/operator/v1" admissionregistrationv1 "k8s.io/api/admissionregistration/v1" + appsv1 "k8s.io/api/apps/v1" corev1 "k8s.io/api/core/v1" "k8s.io/apimachinery/pkg/labels" "k8s.io/apimachinery/pkg/util/wait" @@ -134,6 +135,11 @@ func Mgr(ctx context.Context, cfg, cpConfig *rest.Config, namespace string, hcpN // Needed for inplace upgrader. &corev1.Node{}: allSelector, + // Needed for primary UDN guest DNS/OAuth fixups. + &appsv1.DaemonSet{}: allSelector, + &corev1.Endpoints{}: allSelector, + &operatorv1.DNS{}: allSelector, + // Needed for resource cleanup &corev1.Service{}: allSelector, &corev1.PersistentVolume{}: allSelector, diff --git a/hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go b/hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go index 36c362fc9ab..f0c0281cd4a 100644 --- a/hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go +++ b/hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go @@ -43,6 +43,7 @@ import ( platformaws "github.com/openshift/hypershift/hypershift-operator/controllers/hostedcluster/internal/platform/aws" "github.com/openshift/hypershift/hypershift-operator/controllers/hostedcluster/internal/proxy" hcmetrics "github.com/openshift/hypershift/hypershift-operator/controllers/hostedcluster/metrics" + "github.com/openshift/hypershift/hypershift-operator/controllers/hostedcluster/primaryudn" validations "github.com/openshift/hypershift/hypershift-operator/controllers/hostedcluster/validations" "github.com/openshift/hypershift/hypershift-operator/controllers/manifests" "github.com/openshift/hypershift/hypershift-operator/controllers/manifests/clusterapi" @@ -1042,6 +1043,21 @@ func (r *HostedClusterReconciler) reconcile(ctx context.Context, req ctrl.Reques } switch serviceStrategy.Type { case hyperv1.Route: + // For KubeVirt with Primary UDN, use internal service DNS instead of external route + // because Primary UDN namespace isolation prevents external ingress from reaching backend pods. + // Workers on Primary UDN can reach ClusterIP services directly via the UDN gateway. + if hcluster.Spec.Platform.Type == hyperv1.KubevirtPlatform { + // Check if namespace has Primary UDN label + if err := r.Client.Get(ctx, client.ObjectKeyFromObject(controlPlaneNamespace), controlPlaneNamespace); err == nil { + if _, hasPrimaryUDN := controlPlaneNamespace.Labels["k8s.ovn.org/primary-user-defined-network"]; hasPrimaryUDN { + // Use internal service DNS instead of external route + hcluster.Status.IgnitionEndpoint = fmt.Sprintf("ignition-server.%s.svc.cluster.local", controlPlaneNamespace.GetName()) + // Skip the rest of the Route logic + break + } + } + } + if serviceStrategy.Route != nil && serviceStrategy.Route.Hostname != "" { hcluster.Status.IgnitionEndpoint = serviceStrategy.Route.Hostname } else { @@ -1331,14 +1347,67 @@ func (r *HostedClusterReconciler) reconcile(ctx context.Context, req ctrl.Reques _, controlPlanePKIOperatorSignsCSRs := controlPlaneOperatorImageLabels[controlPlanePKIOperatorSignsCSRsLabel] _, useRestrictedPSA := controlPlaneOperatorImageLabels[useRestrictedPodSecurityLabel] _, defaultToControlPlaneV2 := controlPlaneOperatorImageLabels[defaultToControlPlaneV2Label] + primaryUDNName := "" + primaryUDNSubnet := "" + if hcluster.Spec.Platform.Type == hyperv1.KubevirtPlatform && hcluster.Annotations != nil { + primaryUDNName = hcluster.Annotations[hyperv1.PrimaryUDNNameAnnotation] + primaryUDNSubnet = hcluster.Annotations[hyperv1.PrimaryUDNSubnetAnnotation] + } + // Primary UDN enablement is atomic: either both name+subnet are set, or neither. + if primaryUDNName != "" || primaryUDNSubnet != "" { + if primaryUDNName == "" || primaryUDNSubnet == "" { + return ctrl.Result{}, fmt.Errorf("hostedcluster %s/%s must set both %q and %q (no partial Primary UDN configuration allowed)", hcluster.Namespace, hcluster.Name, hyperv1.PrimaryUDNNameAnnotation, hyperv1.PrimaryUDNSubnetAnnotation) + } + } + + // HACK(CI): The CI script pre-creates the HCP namespace without Primary UDN metadata. + // OVN-K requires the k8s.ovn.org/primary-user-defined-network label+annotation at namespace + // creation time, so we cannot simply patch them on. Instead, delete the stale namespace and + // let the createOrUpdate below recreate it with the correct metadata. + if primaryUDNName != "" && primaryUDNSubnet != "" { + const ovnPrimaryUDNKey = "k8s.ovn.org/primary-user-defined-network" + existingNS := &corev1.Namespace{} + if err := r.Client.Get(ctx, client.ObjectKeyFromObject(controlPlaneNamespace), existingNS); err == nil { + if existingNS.DeletionTimestamp != nil { + log.Info("waiting for namespace deletion to complete for Primary UDN re-creation", "namespace", existingNS.Name) + return ctrl.Result{RequeueAfter: 5 * time.Second}, nil + } + existingLabel := existingNS.Labels[ovnPrimaryUDNKey] + existingAnno := "" + if existingNS.Annotations != nil { + existingAnno = existingNS.Annotations[ovnPrimaryUDNKey] + } + if existingLabel != primaryUDNName || existingAnno != primaryUDNName { + log.Info("namespace missing required Primary UDN metadata; deleting for re-creation", + "namespace", existingNS.Name, + "expectedUDN", primaryUDNName, + "currentLabel", existingLabel, + "currentAnnotation", existingAnno) + if err := r.Client.Delete(ctx, existingNS); err != nil { + return ctrl.Result{}, fmt.Errorf("failed to delete namespace %s for Primary UDN re-creation: %w", existingNS.Name, err) + } + return ctrl.Result{RequeueAfter: 5 * time.Second}, nil + } + } + } // Reconcile the hosted cluster namespace _, err = createOrUpdate(ctx, r.Client, controlPlaneNamespace, func() error { + const ovnPrimaryUDNKey = "k8s.ovn.org/primary-user-defined-network" + if controlPlaneNamespace.Labels == nil { controlPlaneNamespace.Labels = make(map[string]string) } controlPlaneNamespace.Labels[ControlPlaneNamespaceLabelKey] = "true" + if primaryUDNName != "" && primaryUDNSubnet != "" { + controlPlaneNamespace.Labels[ovnPrimaryUDNKey] = primaryUDNName + if controlPlaneNamespace.Annotations == nil { + controlPlaneNamespace.Annotations = make(map[string]string) + } + controlPlaneNamespace.Annotations[ovnPrimaryUDNKey] = primaryUDNName + } + // Set pod security labels on HCP namespace psaOverride := hcluster.Annotations[hyperv1.PodSecurityAdmissionLabelOverrideAnnotation] if psaOverride != "" { @@ -1385,6 +1454,13 @@ func (r *HostedClusterReconciler) reconcile(ctx context.Context, req ctrl.Reques return ctrl.Result{}, fmt.Errorf("failed to reconcile namespace: %w", err) } + if primaryUDNName != "" && primaryUDNSubnet != "" { + if err := primaryudn.EnsureUserDefinedNetwork(ctx, r.Client, controlPlaneNamespace.Name, primaryUDNName, primaryUDNSubnet); err != nil { + return ctrl.Result{}, fmt.Errorf("failed to ensure primary UDN %q in namespace %q: %w", primaryUDNName, controlPlaneNamespace.Name, err) + } + + } + p, err := platform.GetPlatform(ctx, hcluster, releaseProvider, utilitiesImage, pullSecretBytes) if err != nil { return ctrl.Result{}, err @@ -1778,12 +1854,23 @@ func (r *HostedClusterReconciler) reconcile(ctx context.Context, req ctrl.Reques } hcp = controlplaneoperator.HostedControlPlane(controlPlaneNamespace.Name, hcluster.Name) _, err = createOrUpdate(ctx, r.Client, hcp, func() error { - return reconcileHostedControlPlane(hcp, hcluster, isAutoscalingNeeded, isAWSNodeTerminationHandlerNeeded, + if err := reconcileHostedControlPlane(hcp, hcluster, isAutoscalingNeeded, isAWSNodeTerminationHandlerNeeded, annotationsForCertRenewal(log, hcp, shouldCheckForStaleCerts(hcluster, defaultToControlPlaneV2), r.kasServingCertHashFromSecret(ctx, hcp), - r.kasServingCertHashFromEndpoint(kasHostAndPortFromHCP(hcp)))) + r.kasServingCertHashFromEndpoint(kasHostAndPortFromHCP(hcp)))); err != nil { + return err + } + // Set Primary UDN annotation on HCP if namespace has Primary UDN label + if hcluster.Spec.Platform.Type == hyperv1.KubevirtPlatform { + if _, hasPrimaryUDN := controlPlaneNamespace.Labels["k8s.ovn.org/primary-user-defined-network"]; hasPrimaryUDN { + hcp.Annotations["hypershift.openshift.io/primary-udn"] = "true" + } else { + delete(hcp.Annotations, "hypershift.openshift.io/primary-udn") + } + } + return nil }) if err != nil { return ctrl.Result{}, fmt.Errorf("failed to reconcile hostedcontrolplane: %w", err) diff --git a/hypershift-operator/controllers/hostedcluster/internal/platform/kubevirt/kubevirt.go b/hypershift-operator/controllers/hostedcluster/internal/platform/kubevirt/kubevirt.go index b8a00a3daa5..557877a3875 100644 --- a/hypershift-operator/controllers/hostedcluster/internal/platform/kubevirt/kubevirt.go +++ b/hypershift-operator/controllers/hostedcluster/internal/platform/kubevirt/kubevirt.go @@ -3,10 +3,8 @@ package kubevirt import ( "context" "fmt" - "os" hyperv1 "github.com/openshift/hypershift/api/hypershift/v1beta1" - "github.com/openshift/hypershift/support/images" "github.com/openshift/hypershift/support/upsert" appsv1 "k8s.io/api/apps/v1" @@ -69,16 +67,16 @@ func reconcileKubevirtCluster(kubevirtCluster *capikubevirt.KubevirtCluster, hcl } func (p Kubevirt) CAPIProviderDeploymentSpec(hcluster *hyperv1.HostedCluster, _ *hyperv1.HostedControlPlane) (*appsv1.DeploymentSpec, error) { - providerImage := "" - if envImage := os.Getenv(images.KubevirtCAPIProviderEnvVar); len(envImage) > 0 { - providerImage = envImage - } - if override, ok := hcluster.Annotations[hyperv1.ClusterAPIKubeVirtProviderImage]; ok { - providerImage = override - } - if providerImage == "" { - return nil, fmt.Errorf("kubevirt CAPI provider image not specified by environment variable %s or annotation %s", images.KubevirtCAPIProviderEnvVar, hyperv1.ClusterAPIKubeVirtProviderImage) - } + const providerImage = "registry.ci.openshift.org/ocp/4.18:cluster-api-provider-kubevirt" + //if envImage := os.Getenv(images.KubevirtCAPIProviderEnvVar); len(envImage) > 0 { + // providerImage = envImage + //} + //if override, ok := hcluster.Annotations[hyperv1.ClusterAPIKubeVirtProviderImage]; ok { + // providerImage = override + //} + //if providerImage == "" { + // providerImage = "registry.ci.openshift.org/ocp/4.18:cluster-api-provider-kubevirt" + //} defaultMode := int32(0640) return &appsv1.DeploymentSpec{ Replicas: ptr.To[int32](1), diff --git a/hypershift-operator/controllers/hostedcluster/network_policies.go b/hypershift-operator/controllers/hostedcluster/network_policies.go index 9c645fbc51a..b7fa79cf70a 100644 --- a/hypershift-operator/controllers/hostedcluster/network_policies.go +++ b/hypershift-operator/controllers/hostedcluster/network_policies.go @@ -142,9 +142,16 @@ func (r *HostedClusterReconciler) reconcileNetworkPolicies(ctx context.Context, case hyperv1.KubevirtPlatform: if hcluster.Spec.Platform.Kubevirt.Credentials == nil { // network policy is being set on centralized infra only, not on external infra + + // Get the control plane namespace to check for Primary UDN label + controlPlaneNamespace := &corev1.Namespace{} + if err := r.Get(ctx, client.ObjectKey{Name: controlPlaneNamespaceName}, controlPlaneNamespace); err != nil { + return fmt.Errorf("failed to get control plane namespace: %w", err) + } + policy = networkpolicy.VirtLauncherNetworkPolicy(controlPlaneNamespaceName) if _, err := createOrUpdate(ctx, r.Client, policy, func() error { - return reconcileVirtLauncherNetworkPolicy(log, policy, hcluster, managementClusterNetwork) + return reconcileVirtLauncherNetworkPolicy(log, policy, hcluster, managementClusterNetwork, controlPlaneNamespace) }); err != nil { return fmt.Errorf("failed to reconcile virt launcher policy: %w", err) } @@ -534,19 +541,28 @@ func addToBlockedNetworks(network string, blockedIPv4Networks []string, blockedI return blockedIPv4Networks, blockedIPv6Networks } -func reconcileVirtLauncherNetworkPolicy(log logr.Logger, policy *networkingv1.NetworkPolicy, hcluster *hyperv1.HostedCluster, managementClusterNetwork *configv1.Network) error { +func reconcileVirtLauncherNetworkPolicy(log logr.Logger, policy *networkingv1.NetworkPolicy, hcluster *hyperv1.HostedCluster, managementClusterNetwork *configv1.Network, controlPlaneNamespace *corev1.Namespace) error { protocolTCP := corev1.ProtocolTCP protocolUDP := corev1.ProtocolUDP protocolSCTP := corev1.ProtocolSCTP + // Check if namespace has Primary UDN enabled + _, hasPrimaryUDN := controlPlaneNamespace.Labels["k8s.ovn.org/primary-user-defined-network"] + blockedIPv4Networks := []string{} blockedIPv6Networks := []string{} for _, network := range managementClusterNetwork.Spec.ClusterNetwork { blockedIPv4Networks, blockedIPv6Networks = addToBlockedNetworks(network.CIDR, blockedIPv4Networks, blockedIPv6Networks) } - for _, network := range managementClusterNetwork.Spec.ServiceNetwork { - blockedIPv4Networks, blockedIPv6Networks = addToBlockedNetworks(network, blockedIPv4Networks, blockedIPv6Networks) + // For Primary UDN, allow access to service CIDR (for DNS and ClusterIP services) + // For non-Primary UDN, block service CIDR as before + if !hasPrimaryUDN { + for _, network := range managementClusterNetwork.Spec.ServiceNetwork { + blockedIPv4Networks, blockedIPv6Networks = addToBlockedNetworks(network, blockedIPv4Networks, blockedIPv6Networks) + } + } else { + log.Info("Primary UDN detected, allowing service CIDR access for DNS and ClusterIP services") } policy.Spec.PolicyTypes = []networkingv1.PolicyType{networkingv1.PolicyTypeIngress, networkingv1.PolicyTypeEgress} diff --git a/hypershift-operator/controllers/hostedcluster/primaryudn/management_udn.go b/hypershift-operator/controllers/hostedcluster/primaryudn/management_udn.go new file mode 100644 index 00000000000..2dab21b4614 --- /dev/null +++ b/hypershift-operator/controllers/hostedcluster/primaryudn/management_udn.go @@ -0,0 +1,43 @@ +package primaryudn + +import ( + "context" + "fmt" + + "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured" + "sigs.k8s.io/controller-runtime/pkg/client" + "sigs.k8s.io/controller-runtime/pkg/controller/controllerutil" +) + +// EnsureUserDefinedNetwork ensures a UserDefinedNetwork CR exists in the given namespace. +// This is used for Primary UDN hosted clusters to avoid relying on imperative setup scripts. +func EnsureUserDefinedNetwork(ctx context.Context, c client.Client, namespace, name, subnetCIDR string) error { + if name == "" { + return nil + } + if subnetCIDR == "" { + return fmt.Errorf("primary UDN subnet is required") + } + + udn := &unstructured.Unstructured{} + udn.SetAPIVersion("k8s.ovn.org/v1") + udn.SetKind("UserDefinedNetwork") + udn.SetNamespace(namespace) + udn.SetName(name) + + _, err := controllerutil.CreateOrUpdate(ctx, c, udn, func() error { + udn.Object["spec"] = map[string]any{ + "topology": "Layer2", + "layer2": map[string]any{ + "role": "Primary", + "subnets": []any{subnetCIDR}, + "ipam": map[string]any{ + "mode": "Enabled", + "lifecycle": "Persistent", + }, + }, + } + return nil + }) + return err +} diff --git a/hypershift-operator/controllers/nodepool/kubevirt/kubevirt.go b/hypershift-operator/controllers/nodepool/kubevirt/kubevirt.go index 7221fedf1be..e81b26aa0ad 100644 --- a/hypershift-operator/controllers/nodepool/kubevirt/kubevirt.go +++ b/hypershift-operator/controllers/nodepool/kubevirt/kubevirt.go @@ -445,6 +445,8 @@ func MachineTemplateSpec(nodePool *hyperv1.NodePool, hcluster *hyperv1.HostedClu return nil, err } + applyPrimaryUDNVMTemplateOverrides(hcluster, vmTemplate) + return &capikubevirt.KubevirtMachineTemplateSpec{ Template: capikubevirt.KubevirtMachineTemplateResource{ Spec: capikubevirt.KubevirtMachineSpec{ @@ -455,6 +457,32 @@ func MachineTemplateSpec(nodePool *hyperv1.NodePool, hcluster *hyperv1.HostedClu }, nil } +func applyPrimaryUDNVMTemplateOverrides(hcluster *hyperv1.HostedCluster, tmplt *capikubevirt.VirtualMachineTemplateSpec) { + if hcluster == nil || tmplt == nil || hcluster.Annotations == nil { + return + } + // Primary UDN enablement is atomic: either both name+subnet are set, or neither. + if hcluster.Annotations[hyperv1.PrimaryUDNNameAnnotation] == "" || hcluster.Annotations[hyperv1.PrimaryUDNSubnetAnnotation] == "" { + return + } + + // Primary UDN uses l2bridge binding and must not set allow-pod-bridge-network-live-migration. + if tmplt.Spec.Template != nil { + if tmplt.Spec.Template.ObjectMeta.Annotations != nil { + delete(tmplt.Spec.Template.ObjectMeta.Annotations, "kubevirt.io/allow-pod-bridge-network-live-migration") + } + ifaces := tmplt.Spec.Template.Spec.Domain.Devices.Interfaces + for i := range ifaces { + if ifaces[i].Name != "default" { + continue + } + ifaces[i].InterfaceBindingMethod = kubevirtv1.InterfaceBindingMethod{} + ifaces[i].Binding = &kubevirtv1.PluginBinding{Name: "l2bridge"} + } + tmplt.Spec.Template.Spec.Domain.Devices.Interfaces = ifaces + } +} + func applyJsonPatches(nodePool *hyperv1.NodePool, hcluster *hyperv1.HostedCluster, tmplt *capikubevirt.VirtualMachineTemplateSpec) error { hcAnn, hcOK := hcluster.Annotations[hyperv1.JSONPatchAnnotation] npAnn, npOK := nodePool.Annotations[hyperv1.JSONPatchAnnotation] diff --git a/hypershift-operator/controllers/nodepool/token.go b/hypershift-operator/controllers/nodepool/token.go index 31c9e71a803..c3ae9dee540 100644 --- a/hypershift-operator/controllers/nodepool/token.go +++ b/hypershift-operator/controllers/nodepool/token.go @@ -8,7 +8,6 @@ import ( "time" hyperv1 "github.com/openshift/hypershift/api/hypershift/v1beta1" - "github.com/openshift/hypershift/hypershift-operator/controllers/manifests/ignitionserver" "github.com/openshift/hypershift/support/backwardcompat" "github.com/openshift/hypershift/support/globalconfig" karpenterutil "github.com/openshift/hypershift/support/karpenter" @@ -146,8 +145,14 @@ func NewToken(ctx context.Context, configGenerator *ConfigGenerator, cpoCapabili // getInitionCACert gets the ignition CA cert from a secret. // It's needed to generate a valid ignition config within the user data secret. func (t *Token) getIgnitionCACert(ctx context.Context) ([]byte, error) { - // Validate Ignition CA Secret. - caSecret := ignitionserver.IgnitionCACertSecret(t.controlplaneNamespace) + // Use root-ca secret which is the CA that signs the ignition-server certificate. + // This is needed for Primary UDN clusters where VMs access ignition-server via internal service DNS. + caSecret := &corev1.Secret{ + ObjectMeta: metav1.ObjectMeta{ + Namespace: t.controlplaneNamespace, + Name: "root-ca", + }, + } if err := t.Get(ctx, client.ObjectKeyFromObject(caSecret), caSecret); err != nil { return nil, err } diff --git a/test/e2e/e2e_test.go b/test/e2e/e2e_test.go index 177c59c43a9..ecd48716833 100644 --- a/test/e2e/e2e_test.go +++ b/test/e2e/e2e_test.go @@ -136,6 +136,8 @@ func TestMain(m *testing.M) { flag.StringVar(&globalOpts.ConfigurableClusterOptions.KubeVirtRootVolumeVolumeMode, "e2e.kubevirt-root-volume-volume-mode", "Filesystem", "The root pvc volume mode") flag.UintVar(&globalOpts.ConfigurableClusterOptions.KubeVirtNodeCores, "e2e.kubevirt-node-cores", 2, "The number of cores provided to each workload node") flag.UintVar(&globalOpts.ConfigurableClusterOptions.KubeVirtRootVolumeSize, "e2e.kubevirt-root-volume-size", 32, "The root volume size in Gi") + flag.StringVar(&globalOpts.ConfigurableClusterOptions.KubeVirtPrimaryUDNName, "e2e.kubevirt-primary-udn-name", "", "Enable Primary UDN by specifying the UserDefinedNetwork name to create/ensure in the hosted control plane namespace") + flag.StringVar(&globalOpts.ConfigurableClusterOptions.KubeVirtPrimaryUDNSubnet, "e2e.kubevirt-primary-udn-subnet", "", "Subnet CIDR for the Primary UDN to create/ensure (e.g. 10.150.0.0/16). Required when e2e.kubevirt-primary-udn-name is set") // OpenStack specific flags flag.StringVar(&globalOpts.ConfigurableClusterOptions.OpenStackCACertFile, "e2e.openstack-ca-cert-file", "", "Path to the OpenStack CA certificate file") diff --git a/test/e2e/util/options.go b/test/e2e/util/options.go index b02d1c0d290..51b6bab154e 100644 --- a/test/e2e/util/options.go +++ b/test/e2e/util/options.go @@ -154,6 +154,8 @@ type ConfigurableClusterOptions struct { KubeVirtNodeMemory string KubeVirtRootVolumeSize uint KubeVirtRootVolumeVolumeMode string + KubeVirtPrimaryUDNName string + KubeVirtPrimaryUDNSubnet string NetworkType string NodePoolReplicas int OpenStackExternalNetworkID string @@ -363,6 +365,8 @@ func (o *Options) DefaultKubeVirtOptions() kubevirt.RawCreateOptions { ServicePublishingStrategy: kubevirt.IngressServicePublishingStrategy, InfraKubeConfigFile: o.ConfigurableClusterOptions.KubeVirtInfraKubeconfigFile, InfraNamespace: o.ConfigurableClusterOptions.KubeVirtInfraNamespace, + PrimaryUDNName: o.ConfigurableClusterOptions.KubeVirtPrimaryUDNName, + PrimaryUDNSubnet: o.ConfigurableClusterOptions.KubeVirtPrimaryUDNSubnet, NodePoolOpts: &kubevirtnodepool.RawKubevirtPlatformCreateOptions{ KubevirtPlatformOptions: &kubevirtnodepool.KubevirtPlatformOptions{ Cores: uint32(o.ConfigurableClusterOptions.KubeVirtNodeCores), diff --git a/vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/hostedcluster_types.go b/vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/hostedcluster_types.go index 92e1852fb2c..be74b9485d6 100644 --- a/vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/hostedcluster_types.go +++ b/vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/hostedcluster_types.go @@ -278,6 +278,13 @@ const ( // JSONPatchAnnotation allow modifying the kubevirt VM template using jsonpatch JSONPatchAnnotation = "hypershift.openshift.io/kubevirt-vm-jsonpatch" + // PrimaryUDNNameAnnotation enables Primary UDN for KubeVirt hosted clusters by specifying the + // UserDefinedNetwork name to be used as the primary network for the hosted control plane namespace. + PrimaryUDNNameAnnotation = "hypershift.openshift.io/primary-udn-name" + // PrimaryUDNSubnetAnnotation specifies the subnet CIDR for the Primary UDN to be created/ensured + // in the hosted control plane namespace (e.g. "10.150.0.0/16"). + PrimaryUDNSubnetAnnotation = "hypershift.openshift.io/primary-udn-subnet" + // KubeAPIServerGOGCAnnotation allows modifying the kube-apiserver GOGC environment variable to impact how often // the GO garbage collector runs. This can be used to reduce the memory footprint of the kube-apiserver. KubeAPIServerGOGCAnnotation = "hypershift.openshift.io/kube-apiserver-gogc"