
Conversation

@bd233 commented Jan 13, 2022

Support internal publish strategy for platform Alibaba Cloud

openshift-ci bot commented Jan 13, 2022

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 13, 2022
@bd233 bd233 changed the title from "Alibaba: support internal publish strategy" to "Bug 2035720: [Alibaba] support internal publish strategy" Jan 13, 2022
@openshift-ci openshift-ci bot added bugzilla/severity-medium Referenced Bugzilla bug's severity is medium for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. labels Jan 13, 2022
openshift-ci bot commented Jan 13, 2022

@bd233: This pull request references Bugzilla bug 2035720, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validations were run on this bug:
  • bug is open, matching expected state (open)
  • bug target release (4.10.0) matches configured target release for branch (4.10.0)
  • bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST)

Requesting review from QA contact:
/cc @jianli-wei


In response to this:

Bug 2035720: [Alibaba] support internal publish strategy

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@bd233 bd233 marked this pull request as ready for review January 14, 2022 10:18
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 14, 2022
@staebler left a comment

Conceptually, this looks good to me. I just have a minor nit about keeping the consolidated slb_ids rather than exploding it into two separate variables.

Since we don't have e2e testing for this, I will need someone to show me a successful install:

  1. using external publishing to make sure that this does not break existing functionality and
  2. using internal publishing.

/approve

@@ -22,12 +22,16 @@ output "eip_ip" {
value = alicloud_eip_address.eip.ip_address
}

output "slb_ids" {
value = [alicloud_slb_load_balancer.slb_external.id, alicloud_slb_load_balancer.slb_internal.id]
@staebler commented:

I think we can keep the same slb_ids output.

output "slb_ids" {
  value = concat(alicloud_slb_load_balancer.slb_external[*].id, [alicloud_slb_load_balancer.slb_internal.id])
}
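For context, a minimal standalone sketch (with hypothetical placeholder IDs, not the installer's real values) of why the concat form keeps a single output working for both publish strategies: the splat over a count-gated resource collapses to an empty list when the external SLB is not created, and concat simply skips it.

variable "publish_strategy" {
  type    = string
  default = "External" // or "Internal"
}

locals {
  is_external = var.publish_strategy == "External"

  // Stand-ins for alicloud_slb_load_balancer.slb_external[*].id and
  // alicloud_slb_load_balancer.slb_internal.id; the IDs are made up.
  external_ids = local.is_external ? ["lb-external-id"] : []
  internal_ids = ["lb-internal-id"]
}

output "slb_ids" {
  // Yields ["lb-external-id", "lb-internal-id"] or just ["lb-internal-id"].
  value = concat(local.external_ids, local.internal_ids)
}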

@bd233 (author) replied:

I have updated it.

@bd233 (author) commented:

@jianli-wei I may need your help to test the updated code.

openshift-ci bot commented Jan 17, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: staebler

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 17, 2022
@jianli-wei commented:

FYI: using the build from https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-launch-gcp-modern/1483764171014148096, the original error no longer occurs, i.e. with 'publish: Internal' the installer can proceed.

$ yq e '.publish' work2/install-config.yaml
Internal
$ echo 'credentialsMode: Manual' >> work2/install-config.yaml
$ openshift-install create manifests --dir work2
INFO Consuming Install Config from target directory
INFO Manifests created in: work2/manifests and work2/openshift
$ openshift-install create cluster --dir work2 --log-level info
INFO Consuming Common Manifests from target directory
INFO Consuming Worker Machines from target directory
INFO Consuming OpenShift Install (Manifests) from target directory
INFO Consuming Master Machines from target directory
INFO Consuming Openshift Manifests from target directory
INFO Creating infrastructure resources...
INFO Waiting up to 20m0s (until 12:37PM) for the Kubernetes API at https://api.jiwei-303.alicloud-qe.devcluster.openshift.com:6443...

@bd233 bd233 force-pushed the fix-publish-internal branch from 14c6690 to 208903a Compare January 19, 2022 23:52
@jianli-wei commented Jan 21, 2022

FYI: tested with a build from https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-launch-gcp-modern/1484413155458158592; deploying a private cluster (with CCO in manual mode) succeeds.

$ openshift-install version
openshift-install 4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest
built from commit e122e158816a3915bd1a10b5691d17f349e2b438
release image registry.build01.ci.openshift.org/ci-ln-24mf4fb/release@sha256:4dec0d21a63aab0fbbaa60e2fa9b0725ea3d314b12a7e8d9526982d316ad7c6c
release architecture amd64
$ openshift-install create install-config --dir work
? SSH Public Key /home/fedora/.ssh/openshift-qe.pub
? Platform alibabacloud
? Region eu-central-1
? Base Domain alicloud-qe.devcluster.openshift.com
? Cluster Name jiwei-518
? Pull Secret [? for help] ***********
$ echo 'credentialsMode: Manual' >> work/install-config.yaml
$ vim work/install-config.yaml
$ yq e '.platform' work/install-config.yaml 
alibabacloud:
  region: eu-central-1
  resourceGroupID: rg-aek2c4huej7f3ni
  vpcID: vpc-gw8lmracwyk1d0gru8d45
  vswitchIDs:
    - vsw-gw8f5qoxtj4g9s71wxqk4
    - vsw-gw8nn2822z75cl3kwrrot
$ yq e '.publish' work/install-config.yaml 
Internal
$ yq e '.credentialsMode' work/install-config.yaml 
Manual
$ 
$ openshift-install create manifests --dir work
INFO Consuming Install Config from target directory 
INFO Manifests created in: work/manifests and work/openshift 
$ 
$ export http_proxy=http://proxy-user1:[email protected]:3128
$ export https_proxy=http://proxy-user1:[email protected]:3128
$ 
$ openshift-install create cluster --dir work --log-level info
INFO Consuming Worker Machines from target directory
INFO Consuming Master Machines from target directory
INFO Consuming Openshift Manifests from target directory
INFO Consuming Common Manifests from target directory
INFO Consuming OpenShift Install (Manifests) from target directory
INFO Creating infrastructure resources...
INFO Waiting up to 20m0s (until 12:40PM) for the Kubernetes API at https://api.jiwei-518.alicloud-qe.devcluster.openshift.com:6443...
INFO API v1.22.1-4611+112af524d7219b-dirty up
INFO Waiting up to 30m0s (until 12:53PM) for bootstrapping to complete...
INFO Destroying the bootstrap resources...
INFO Waiting up to 40m0s (until 1:24PM) for the cluster at https://api.jiwei-518.alicloud-qe.devcluster.openshift.com:6443 to initialize...
I0121 12:45:03.992457  421134 trace.go:205] Trace[7328973]: "Reflector ListAndWatch" name:k8s.io/client-go/tools/watch/informerwatcher.go:146 (21-Jan-2022 12:44:45.913) (total time: 18078ms):
Trace[7328973]: [18.07866465s] [18.07866465s] END
E0121 12:45:03.992495  421134 reflector.go:138] k8s.io/client-go/tools/watch/informerwatcher.go:146: Failed to watch *v1.ClusterVersion: failed to list *v1.ClusterVersion: Get "https://api.jiwei-518.alicloud-qe.devcluster.openshift.com:6443/apis/config.openshift.io/v1/clusterversions?fieldSelector=metadata.name%3Dversion&limit=500&resourceVersion=0": http2: client connection lost
INFO Waiting up to 10m0s (until 1:01PM) for the openshift-console route to be created...
INFO Install complete!
INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/home/fedora/work/auth/kubeconfig'
INFO Access the OpenShift web-console here: https://console-openshift-console.apps.jiwei-518.alicloud-qe.devcluster.openshift.com
INFO Login to the console with user: "kubeadmin", and password: "qEduy-3mU9P-rBqcj-GSPWD"
INFO Time elapsed: 33m42s  
$ 
$ export KUBECONFIG=/home/fedora/work/auth/kubeconfig
$ oc get clusterversion
NAME      VERSION                                                   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         4m23s   Cluster version is 4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest
$ oc get nodes
NAME                                         STATUS   ROLES    AGE     VERSION
jiwei-518-2ch2n-master-0                     Ready    master   30m     v1.23.0+112af52
jiwei-518-2ch2n-master-1                     Ready    master   30m     v1.23.0+112af52
jiwei-518-2ch2n-master-2                     Ready    master   28m     v1.23.0+112af52
jiwei-518-2ch2n-worker-eu-central-1a-9zpnn   Ready    worker   7m33s   v1.23.0+112af52
jiwei-518-2ch2n-worker-eu-central-1a-z84d2   Ready    worker   19m     v1.23.0+112af52
jiwei-518-2ch2n-worker-eu-central-1b-m6md7   Ready    worker   17m     v1.23.0+112af52
$ 
$ oc get co
NAME                                       VERSION                                                   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      4m56s
baremetal                                  4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      26m
cloud-controller-manager                   4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      30m
cloud-credential                           4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      24m
cluster-autoscaler                         4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      26m
config-operator                            4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      27m
console                                    4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      8m59s
csi-snapshot-controller                    4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      26m
dns                                        4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      25m
etcd                                       4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      25m
image-registry                             4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      10m
ingress                                    4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      16m
insights                                   4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      20m
kube-apiserver                             4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      23m
kube-controller-manager                    4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      24m
kube-scheduler                             4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      24m
kube-storage-version-migrator              4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      24m
machine-api                                4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      21m
machine-approver                           4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      26m
machine-config                             4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      25m
marketplace                                4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      26m
monitoring                                 4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      8m21s
network                                    4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      26m
node-tuning                                4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      26m
openshift-apiserver                        4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      11m
openshift-controller-manager               4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      24m
openshift-samples                          4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      10m
operator-lifecycle-manager                 4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      26m
operator-lifecycle-manager-catalog         4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      26m
operator-lifecycle-manager-packageserver   4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      11m
service-ca                                 4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      27m
storage                                    4.10.0-0.ci.test-2022-01-21-065249-ci-ln-24mf4fb-latest   True        False         False      21m
$ 
$ oc get mc
NAME                                               GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
00-master                                          db690a294cc9a7ad07f3646591591f9cf6e5777a   3.2.0             27m
00-worker                                          db690a294cc9a7ad07f3646591591f9cf6e5777a   3.2.0             27m
01-master-container-runtime                        db690a294cc9a7ad07f3646591591f9cf6e5777a   3.2.0             27m
01-master-kubelet                                  db690a294cc9a7ad07f3646591591f9cf6e5777a   3.2.0             27m
01-worker-container-runtime                        db690a294cc9a7ad07f3646591591f9cf6e5777a   3.2.0             27m
01-worker-kubelet                                  db690a294cc9a7ad07f3646591591f9cf6e5777a   3.2.0             27m
99-master-generated-registries                     db690a294cc9a7ad07f3646591591f9cf6e5777a   3.2.0             27m
99-master-ssh                                                                                 3.2.0             32m
99-worker-generated-registries                     db690a294cc9a7ad07f3646591591f9cf6e5777a   3.2.0             27m
99-worker-ssh                                                                                 3.2.0             32m
rendered-master-b10167999daeea8350a7292b981fb023   db690a294cc9a7ad07f3646591591f9cf6e5777a   3.2.0             27m
rendered-worker-0674088e91c5b40b8ccecf5f5241d024   db690a294cc9a7ad07f3646591591f9cf6e5777a   3.2.0             27m
$ oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-b10167999daeea8350a7292b981fb023   True      False      False      3              3                   3                     0                      28m
worker   rendered-worker-0674088e91c5b40b8ccecf5f5241d024   True      False      False      3              3                   3                     0                      28m
$ 

@@ -11,6 +11,9 @@ locals {
  )
  system_disk_size     = 120
  system_disk_category = "cloud_essd"
  // Because of the issue https://github.com/hashicorp/terraform/issues/12570, the consumers cannot use a dynamic list for count
@staebler commented:

I don't think that this issue applies here. The list is coming in as input to the stage, so it is not a dynamic list. Nevertheless, we can leave it as is for now, since it has been tested.

@bd233 (author) replied Jan 22, 2022:

I don't think that this issue applies here.

Yes, you are right.
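To illustrate the limitation being discussed (a standalone sketch, not this PR's code): Terraform needs count to be known at plan time, so it cannot reliably be derived from a list computed from other resources; a common workaround is to accept an explicit count variable alongside the list.

variable "vswitch_ids" {
  type = list(string)
}

// The caller passes the length explicitly so count is known at plan time.
variable "vswitch_id_count" {
  type = number
}

resource "null_resource" "per_vswitch" {
  // count = length(var.vswitch_ids) is fine for a plain input variable, but
  // fails when the list is computed from other resources
  // (hashicorp/terraform#12570); a pre-computed count sidesteps that.
  count = var.vswitch_id_count

  triggers = {
    vswitch_id = var.vswitch_ids[count.index]
  }
}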

@@ -15,7 +16,7 @@ data "alicloud_alidns_domains" "dns_public" {
 }
 
 resource "alicloud_alidns_record" "dns_public_record" {
-  count = length(data.alicloud_alidns_domains.dns_public.domains) == 0 ? 0 : 1
+  count = local.is_external && length(data.alicloud_alidns_domains.dns_public.domains) != 0 ? 1 : 0
@staebler commented:

As per https://bugzilla.redhat.com/show_bug.cgi?id=2041926, we actually want this to fail when local.is_external is true and length(data.alicloud_alidns_domains.dns_public.domains) is 0. In that case, the user requested external publishing but supplied a base domain for which there is no zone. Nevertheless, we can keep this as is here and address the BZ in a separate issue.
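To make the new count expression concrete, a small standalone truth-table sketch (placeholder values; in the real module is_external reflects the publish strategy and the domain count comes from data.alicloud_alidns_domains.dns_public):

locals {
  is_external  = true // placeholder
  domain_count = 1    // placeholder

  // Mirrors: count = local.is_external && length(...) != 0 ? 1 : 0
  record_count = local.is_external && local.domain_count != 0 ? 1 : 0
  // is_external = true,  domain_count >= 1 -> 1 (public record created)
  // is_external = true,  domain_count == 0 -> 0 (today; per BZ 2041926 this arguably should fail instead)
  // is_external = false, any domain_count  -> 0 (internal publishing skips the public record)
}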


output "slb_group_length" {
// 1 for private endpoints and 1 for public endpoints
value = "2"
@staebler commented:

Why is this hard-coded to 2 rather than using the actual length of slb_ids?

@bd233 (author) replied:

Refer to AWS's usage of aws_lb_target_group_arns_length in cluster/vpc/output.tf. I have updated this part, and the test looks OK.
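For reference, one plausible way to derive the length rather than hard-code it, assuming local.is_external gates the external SLB as elsewhere in this PR (a sketch, not necessarily the code that was pushed):

output "slb_group_length" {
  // Always 1 private endpoint, plus 1 public endpoint when publishing externally.
  // Written as a plan-time-known expression so consumers can still use it for count.
  value = local.is_external ? 2 : 1
}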

@bd233 bd233 force-pushed the fix-publish-internal branch 2 times, most recently from 5e61b8b to dea49b6 Compare January 25, 2022 05:47
@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 26, 2022
@patrickdillon commented:

This needs a rebase

@bd233 bd233 force-pushed the fix-publish-internal branch from dea49b6 to efed2e5 Compare January 27, 2022 02:04
@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 27, 2022
@bd233 commented Jan 27, 2022

This needs a rebase

Done.

@kirankt commented Jan 27, 2022

The tf-fmt failure is real. Please fix. Thanks.
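A tf-fmt failure like this is typically resolved by running Terraform's formatter over the offending files and re-pushing; for example (directory path illustrative):

$ terraform fmt -recursive data/data/alibabacloud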

Support internal publish strategy for platform Alibaba Cloud

Signed-off-by: sunhui <[email protected]>

@bd233 bd233 force-pushed the fix-publish-internal branch from efed2e5 to d91ecac Compare January 27, 2022 02:59
@bd233 commented Jan 27, 2022

Okay, I have formatted it.

@patrickdillon commented:

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jan 27, 2022
@openshift-bot commented:

/retest-required

Please review the full test history for this PR and help us cut down flakes.

11 similar comments

openshift-ci bot commented Jan 28, 2022

@bd233: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aws-single-node d91ecac link false /test e2e-aws-single-node
ci/prow/e2e-metal-single-node-live-iso d91ecac link false /test e2e-metal-single-node-live-iso
ci/prow/e2e-aws-workers-rhel8 d91ecac link false /test e2e-aws-workers-rhel8
ci/prow/okd-e2e-aws d91ecac link false /test okd-e2e-aws
ci/prow/okd-e2e-aws-upgrade d91ecac link false /test okd-e2e-aws-upgrade
ci/prow/e2e-ibmcloud d91ecac link false /test e2e-ibmcloud
ci/prow/e2e-ovirt d91ecac link false /test e2e-ovirt
ci/prow/e2e-alibaba d91ecac link true /test e2e-alibaba
ci/prow/e2e-azure-upi d91ecac link false /test e2e-azure-upi
ci/prow/e2e-crc d91ecac link false /test e2e-crc
ci/prow/e2e-aws-workers-rhel7 d91ecac link false /test e2e-aws-workers-rhel7

Full PR test history. Your PR dashboard.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-bot commented:

/retest-required

Please review the full test history for this PR and help us cut down flakes.

4 similar comments

@openshift-merge-robot openshift-merge-robot merged commit d5b69dd into openshift:master Jan 28, 2022
openshift-ci bot commented Jan 28, 2022

@bd233: All pull requests linked via external trackers have merged:

Bugzilla bug 2035720 has been moved to the MODIFIED state.


In response to this:

Bug 2035720: [Alibaba] support internal publish strategy

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
