From 009f6d7e6501ff3cd9950086ab79a9efdebde434 Mon Sep 17 00:00:00 2001
From: "W. Trevor King"
Date: Mon, 13 May 2019 21:38:47 -0700
Subject: [PATCH] ci-operator/templates/openshift/installer/cluster-launch-installer-upi-e2e: Add third AWS compute node

This should help avoid persistent failures like [1]:

  STEP: Getting zone name for pod ubelite-spread-rc-f3c96168-7346-11e9-9c48-0a58ac10c848-527zc, on node ip-10-0-57-149.ec2.internal
  STEP: Getting zone name for pod ubelite-spread-rc-f3c96168-7346-11e9-9c48-0a58ac10c848-5749s, on node ip-10-0-57-149.ec2.internal
  STEP: Getting zone name for pod ubelite-spread-rc-f3c96168-7346-11e9-9c48-0a58ac10c848-fhf2k, on node ip-10-0-57-149.ec2.internal
  STEP: Getting zone name for pod ubelite-spread-rc-f3c96168-7346-11e9-9c48-0a58ac10c848-gb9kw, on node ip-10-0-64-141.ec2.internal
  STEP: Getting zone name for pod ubelite-spread-rc-f3c96168-7346-11e9-9c48-0a58ac10c848-lss79, on node ip-10-0-64-141.ec2.internal
  STEP: Getting zone name for pod ubelite-spread-rc-f3c96168-7346-11e9-9c48-0a58ac10c848-trq9w, on node ip-10-0-64-141.ec2.internal
  STEP: Getting zone name for pod ubelite-spread-rc-f3c96168-7346-11e9-9c48-0a58ac10c848-w997t, on node ip-10-0-57-149.ec2.internal
  ...
  fail [k8s.io/kubernetes/test/e2e/scheduling/ubernetes_lite.go:170]: Pods were not evenly spread across zones.
  0 in one zone and 4 in another zone
  Expected
      : 0
  to be ~
      : 4

In that case, the nodes were:

  $ curl -s 'https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_release/3440/rehearse-3440-pull-ci-openshift-installer-master-e2e-aws-upi/30/artifacts/e2e-aws-upi/nodes.json' | jq -r '.items[] | .metadata.name + "\t" + .metadata.labels["failure-domain.beta.kubernetes.io/zone"]'
  ip-10-0-57-149.ec2.internal  us-east-1a
  ip-10-0-62-7.ec2.internal    us-east-1a
  ip-10-0-64-141.ec2.internal  us-east-1b
  ip-10-0-70-141.ec2.internal  us-east-1b
  ip-10-0-85-210.ec2.internal  us-east-1c

My guess is that the test logic assumes the pods have three zones available because we have control-plane nodes in three zones.  But before this commit, we only had two compute nodes, so we had:

  us-east-1a  ip-10-0-57-149.ec2.internal        4 pods
  us-east-1b  ip-10-0-64-141.ec2.internal        3 pods
  us-east-1c  control-plane node but no compute  0 pods

With this commit, we will have compute in each zone, so the test should pass.
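The zone tally behind that table can be reproduced offline from the nodes.json artifact.  A minimal Python sketch (the node/zone data is copied from the jq output above rather than fetched live, and the label key is the pre-1.17 failure-domain label the template uses):

```python
import json
from collections import Counter

ZONE_LABEL = "failure-domain.beta.kubernetes.io/zone"

# Inline stand-in for the nodes.json artifact; node names and zones
# are taken from the jq output above.
nodes_json = json.dumps({"items": [
    {"metadata": {"name": "ip-10-0-57-149.ec2.internal",
                  "labels": {ZONE_LABEL: "us-east-1a"}}},
    {"metadata": {"name": "ip-10-0-62-7.ec2.internal",
                  "labels": {ZONE_LABEL: "us-east-1a"}}},
    {"metadata": {"name": "ip-10-0-64-141.ec2.internal",
                  "labels": {ZONE_LABEL: "us-east-1b"}}},
    {"metadata": {"name": "ip-10-0-70-141.ec2.internal",
                  "labels": {ZONE_LABEL: "us-east-1b"}}},
    {"metadata": {"name": "ip-10-0-85-210.ec2.internal",
                  "labels": {ZONE_LABEL: "us-east-1c"}}},
]})

# Count nodes per zone; us-east-1c shows up only because of its
# control-plane node, which the compute pods cannot land on.
zones = Counter(
    item["metadata"]["labels"][ZONE_LABEL]
    for item in json.loads(nodes_json)["items"]
)
print(dict(zones))
```

The test sees three zones in the node list but can only schedule compute pods into two of them, hence the 0-versus-4 mismatch.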
[1]: https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_release/3440/rehearse-3440-pull-ci-openshift-installer-master-e2e-aws-upi/30
---
 .../installer/cluster-launch-installer-upi-e2e.yaml | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/ci-operator/templates/openshift/installer/cluster-launch-installer-upi-e2e.yaml b/ci-operator/templates/openshift/installer/cluster-launch-installer-upi-e2e.yaml
index e73816d79dd0d..209efd934c05e 100644
--- a/ci-operator/templates/openshift/installer/cluster-launch-installer-upi-e2e.yaml
+++ b/ci-operator/templates/openshift/installer/cluster-launch-installer-upi-e2e.yaml
@@ -545,7 +545,7 @@ objects:
       CONTROL_PLANE_1_IP="$(echo "${CONTROL_PLANE_IPS}" | cut -d, -f2)"
       CONTROL_PLANE_2_IP="$(echo "${CONTROL_PLANE_IPS}" | cut -d, -f3)"
-      for INDEX in 0 1
+      for INDEX in 0 1 2
       do
         SUBNET="PRIVATE_SUBNET_${INDEX}"
         aws cloudformation create-stack \
@@ -876,14 +876,14 @@ objects:

       export AWS_DEFAULT_REGION="${AWS_REGION}"  # CLI prefers the former

-      for STACK_SUFFIX in compute-1 compute-0 control-plane bootstrap security infra vpc
+      for STACK_SUFFIX in compute-2 compute-1 compute-0 control-plane bootstrap security infra vpc
       do
         aws cloudformation delete-stack --stack-name "${CLUSTER_NAME}-${STACK_SUFFIX}"
       done
       openshift-install --dir /tmp/artifacts/installer destroy cluster
-      for STACK_SUFFIX in compute-1 compute-0 control-plane bootstrap security infra vpc
+      for STACK_SUFFIX in compute-2 compute-1 compute-0 control-plane bootstrap security infra vpc
       do
         aws cloudformation wait stack-delete-complete --stack-name "${CLUSTER_NAME}-${STACK_SUFFIX}"
       done