-
Notifications
You must be signed in to change notification settings - Fork 1.5k
[WIP] Add Alibaba Cloud platform #5018
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Hi @bd233. Thanks for your PR. I'm waiting for a openshift member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
@bd233 can you rebase to fix the conflict |
|
/ok-to-test |
3af7987 to
8cf6c99
Compare
d21ccc0 to
b97cae9
Compare
f24a725 to
5cf8748
Compare
I disagree. The image should be determined at the machine-pool level and not the platform level. |
I trust your opinion as this is your realm of expertise. I was basing my recommendations on the other providers who store the image on the platform itself (vsphere, AWS, baremetal, ovirt, kubevirt, openstack). The recent changes have broken the installer by not obeying the If I attempt to add a check for the existing image in the Alibaba section (in How do you recommend we solve the issue with obeying the environment variable override? |
I don't think it's a bug in the recent changes but rather the the bootstrap image asset. To me it seems like the bootstrap image should have a dependency on the rhcos image asset, rather than calling
I'll take a look at whether we have a bug with this. What is the status with publishing the rhcos images? |
|
@patrickdillon I spoke with @miabbott and he mentioned we are waiting on getting some accounts worked out and a bucket to store the images. We are or have been close on this for a while. Hopefully the trt team can make progress. |
The other platforms--with the exception of AWS--define the image once at the platform level for the entire cluster and not for each machine pool. AWS is the only plaform where the user can set the OS image in either the platform or the machine pool. This was done in #3308 with little context for why the option of setting it in the machine pool was added. My contention is that it is incorrect to provide the option to set it at the platform level and at the machine-pool level. It just makes the code more complicated. If the user wants to set it once for the entire cluster, then the user should use the default machine pool. It looks to me like there is actually a bug in the Bootstrap Image asset. The only platform that actually uses that asset is the baremetal platform. The other platforms should either (1) generate nothing for the bootstrap image or (2) generate the actual image that is going to be used for bootstrapping, even though the asset is not used. If it were the latter, then the asset should respect the environment variable override. |
I completely agree with you. I don't think it is good to store it on the platform level and storing the image in the machinepool makes much more sense.
The following code calls @patrickdillon @staebler |
@kwoodson would you take a look at #5267 and see if that fixes the problem for you? I believe it should and is one way of fixing the issue. |
|
I discovered today that the instance type for the master does not meet the required resources. This will need to be updated. I was able to achieve this with the following code. Please update the code so that masters default to have the I also updated the instance names to match the generated names with an extra This change will require an update to the private zone terraform module here https://github.com/openshift/installer/blob/2226d559a2010d8564f3697af8cf741e029d453e/data/data/alibabacloud/cluster/privatezone/privatezone.tf#L65 |
|
@bd233 @dongchen126 |
|
@patrickdillon @staebler I rebased today and fixed the go.mod dependencies and was able to install a cluster successfully (https://github.com/kwoodson/installer/tree/alibaba_v21). There will be a couple of more items but I want to set expectations for merge. Thanks! |
|
@bd233 @dongchen126 Feel free to rebase and resolve any conflicts. The correct dependencies can be found in the master branch's go.mod file here https://github.com/openshift/installer/blob/master/go.mod or I resolved them in a branch here https://github.com/kwoodson/installer/tree/alibaba_v21. This will simplify the process to build and test. Let me know if you have any questions. |
Thanks, I have updated and squash all commits. |
Has been fixed. |
I have reorganized the commits from this PR in #5291. I outlined a plan in that PR for how to best proceed with the code review of this very large PR. I believe once we reorganize the commits, it will make the PR easier to review, the installer team can then do a review, Alibaba can address immediate concerns and then we can merge. |
Support the new Alibaba Cloud platform, adding required types and initial assets:
- cluster
- installconfig
- machine
Alibaba: move Alibaba cloud to the 'HiddenPlatformNames' slice
That'll make it hidden for end users while openshift installer are not yet GA
Alibaba: add Terraform Templates
Add Terraform templates to create cluster resource on Alibaba Cloud
Alibaba: not proxy the ECS metadata servies
Should not proxy the instance metadata services
The address '100.100.100.200' is used to metadata services in Alibaba Cloud ECS instance
Alibaba: support to generate cloud-creds secret
Add the 'AlibabaCloudCredsSecretData' struct into 'pkg/asset/manifests/template.go', it is used to generate cloud-creds secret
Alibaba: add provider config
Add the Alibaba cloud provider config in manifests
Alibaba: add machine spec
Add the spec of Alibaba Cloud virtual machine
Alibaba: support to generate the 'terraform.tfvars' file
Support to generate the terraform.tfvars file by install-config and machine info
Alibaba: support to generate DNS configuration
Add privatezone ID to generate DNS configuration
Alibaba: update machines asset
update Alibaba Cloud machines asset
Alibaba: support to check creds
use NewClient function to check creds
Alibaba: update vendor
use command 'go mod vendor' to update vendor
Alibaba: fix: rename "ResourceGroupName" to "ResourceGroupID"
On Alibaba Cloud,the resource group ID is used to specify the resource group to which they belong
Alibaba: add SLB listeners health checks
Add SLB listeners ports(80, 443) health check
Alibaba: enable OSS service automatically
Add 'alicloud_oss_service' data source, and this can enable OSS service automatically
Alibaba: fix to generate bootstrap user-data
Add terraform variable 'bootstrap_stub_ignition' as user-data of bootstrap instance, and generate it by function 'generateIgnitionShim'.
Alibaba: update the image section
Update the image section, and added an 'ImageID' for Alibaba 'MachinePool'
Signed-off-by: sunhui <[email protected]>
Alibaba: split terraform templates into stages
Split terrafrom into bootstrap and cluster stages.
Signed-off-by: dongchen126 <[email protected]>
Alibaba: fix: Delete unused outputs
Delete some unused outputs and add newlines at the end of some files
Signed-off-by: sunhui <[email protected]>
Alibaba: add validation for provisioning
Add validation for provisioning. Now just add check instance type.
Signed-off-by: sunhui <[email protected]>
Alibaba: add destroy code
Add destroy code to delete cloud resource
Signed-off-by: sunhui <[email protected]>
Alibaba: move alibabacloud platform out of hiddenplatformnames
Move alibabacloud out of hiddenplatformnames to the supported platforms since
Signed-off-by: sunhui <[email protected]>
Alibaba: update to use cluster-api-provider-alibabacloud package
The cluster-api-provider-alibabacloud repo has been updated and supported, using the library’s AlibabaCloudMachineProviderConfig component.
In the future, the code will be synchronized to github.com/openshift/cluster-api-provider-alibaba repo.
Using github.com/openshift/cluster-api-provider-alibaba is the more recommended way.
Signed-off-by: sunhui <[email protected]>
Alibaba: update the scheme and version
Update the 'AddToScheme' and 'SchemeGroupVersion' for machines of master and worker
Signed-off-by: sunhui <[email protected]>
Alibaba: fix APIVersion to alibabacloudmachineproviderconfig.openshift.io
The APIVersion needs to be 'alibabacloudmachineproviderconfig.openshift.io' in 'pkg/asset/machines/alibabacloud/machinesets.go' file
Signed-off-by: sunhui <[email protected]>
Alibaba: specify the SystemDiskCategory and SystemDiskSize for AlibabaCloudMachineProviderConfig
Fix the problem: the system disk information is missing in the generated terraform variables file.
Signed-off-by: sunhui <[email protected]>
Alibaba: store credentials on disk in a config file
After user input credentials, stores them on disk in a config file
Signed-off-by: sunhui <[email protected]>
Alibaba: update terraform provider plugins
Update the Alibaba Cloud terraform provider plugin
Signed-off-by: sunhui <[email protected]>
Alibaba: replace resource alicloud_eip with alicloud_eip_address
The resource alicloud_eip has been deprecated from version 1.126.0. And use new resource alicloud_eip_address.
Signed-off-by: sunhui <[email protected]>
Alibaba: assign a public IP to bootstrap instance
Assign a public IP to bootstrap instance by specifying the internet_max_bandwidth_out parameter
Signed-off-by: sunhui <[email protected]>
Alibaba: add bootstrap instance to the security group of master
The bootstrap instance should be in master instance security group.
Signed-off-by: sunhui <[email protected]>
Alibaba: unify cluster outputs.tf and bootstrap variables.tf file
Unify the output.tf of the cluster and the variables.tf of the bootstrap
Signed-off-by: sunhui <[email protected]>
Alibaba: fix: alibabacloud destroyer creator is not initialized
Fix the error that the destroyer creator of alibabacloud is not initialized
Signed-off-by: sunhui <[email protected]>
Alibaba: remove redundant NatGatway and EIP resource
One NatGatway can fully meet the needs of the network
Signed-off-by: sunhui <[email protected]>
Alibaba: fix to 'Config.BaseDomain' is nil
Use ClusterDomain instead of BaseDomain to query DNS resources
Signed-off-by: sunhui <[email protected]>
Alibaba: fix the problem RAM must use https request
1. RAM request scheme must be https
2. Add automatic retry function
Signed-off-by: sunhui <[email protected]>
Alibaba: add tags for bootstrap security group
Add tags for bootstrap security group
Signed-off-by: sunhui <[email protected]>
Alibaba: extend the expiration time of the bucket URL
Extend the expiration time of the bucket URL to 2 hours
Signed-off-by: sunhui <[email protected]>
Alibaba: update destroy code
1. Updated the function call method,reducing the parameters passed
2. Added some log output to facilitate debugging
3. Detach policys with roles
Signed-off-by: sunhui <[email protected]>
Alibaba: fix manifests dns DNS config
1. fix public zone not be specified
2. fconfig dns private zone by name
Signed-off-by: sunhui <[email protected]>
Alibaba: update terraform provider version to 1.132.0
This version fixes the NotApplicable error that the user of the international station creates alicloud_pvtz_service resource
Signed-off-by: sunhui <[email protected]>
Alibaba: add hostname to instances
Add hostname to bootstrap and master instances
Signed-off-by: sunhui <[email protected]>
Alibaba: fix the error that DescribeAvailableInstanceType response is empty
Adjust parameter order and add instanceType parameter for request, to fix the error that DescribeAvailableInstanceType response is empty.
Signed-off-by: sunhui <[email protected]>
Alibaba: fix the error that the custom image is unavailable
Sets the imageID from `required` to `a`, to fix the error that the custom image is unavailable
Signed-off-by: sunhui <[email protected]>
Alibaba: add ali_nat_gatway_zone_id variable
Add a variable ali_nat_gatway_zone_id to createNAT Gatway to solve the problem that some availability zones do not support NAT gatway creation
Signed-off-by: sunhui <[email protected]>
Alibaba: fix DependencyViolation error
Fix the DependencyViolation error that occasionally appeared when the cluster was destroyed.
Signed-off-by: sunhui <[email protected]>
Alibaba: Add bootstrap instance to internal SLB
1. Add bootstrap instance to internal SLB.
2. The alicloud_slb_attachment has been deprecated, replace with alicloud_slb_backend_server
In the future, plan to use alicloud_slb_server_group
Signed-off-by: sunhui <[email protected]>
Alibaba: add snat to the master node VSwitchs
The masters require internet access so the instances can pull images during startup, add snat to the master node VSwitchs.
Signed-off-by: sunhui <[email protected]>
Alibaba: fix the issue that checking VPC release failed
Check resources multiple times through 'Poll' function
Signed-off-by: sunhui <[email protected]>
Alibaba: fix gatway spelling errors
Fix spelling errors gatway, the correct wording is gateway
Signed-off-by: sunhui <[email protected]>
Alibaba: delete key pair variables
The key pair only supports the configuration of the root user, not valid for core user
Signed-off-by: sunhui <[email protected]>
Alibaba: remove extra spaces
Remove extra spaces to fix CI
Signed-off-by: sunhui <[email protected]>
Alibaba: update infrastructure asset
Update infrastructure asset with 'AlibabaCloudPlatformType' and 'AlibabaCloudPlatformStatus'
Signed-off-by: sunhui <[email protected]>
Alibaba: delete AccessKeyID and AccessKeySecret item in provider config
Fix: the instance has been bound to RAM Role, no need to specify AK and SK.
Signed-off-by: sunhui <[email protected]>
Alibaba: fix: match the hostname
Tag and instance name match hostname
Signed-off-by: sunhui <[email protected]>
Alibaba: update RAM role policy
Update policy document of master and worker node RAM role
Signed-off-by: sunhui <[email protected]>
Alibaba: add static validation for resourceGroupID
1. Add static validation for resourceGroupID
2. Fix incorrect 'instanceTypePath'
Signed-off-by: sunhui <[email protected]>
Alibaba: fix: 'Region' should not be obtained from the client
Obtain the 'Region' from the 'Config', not the client
Signed-off-by: sunhui <[email protected]>
Alibaba: fix: delete unnecessary changes and comments
Delete some unnecessary changes and comments
Signed-off-by: sunhui <[email protected]>
Alibaba: fix: add some 'optional' and 'omitempty' tag
Add 'optional' and 'omitempty' tags for some optional attribute
Signed-off-by: sunhui <[email protected]>
Alibaba: fix some variable name problems
Fix some variable name problem
Signed-off-by: sunhui <[email protected]>
Alibaba: fix: add DiskCategory type
Add 'DiskCategory' type to be used for 'SystemDiskCategory'
Alibaba: separate 'GenerateIgnitionShim' as a common part
Generate an ignition file logic on Alibaba is similar to AWS, separate 'GenerateIgnitionShim' from AWS as a common part, so that Alibaba platform can also be used.
Signed-off-by: sunhui <[email protected]>
Alibaba: add MachinePool verification
Update 'InatanceType' verification, add 'ClusterName' verification
Alibaba: fix default master machine pool instance type
Update default machine pool instance type to 'xlarge' for the master
Signed-off-by: sunhui <[email protected]>
Alibaba: update format of the master instance name
Change the format of the master instance name, separated by '-'
Signed-off-by: sunhui <[email protected]>
Alibaba: update DNS config
Update the privatzone ID configuration of DNS, replace zone name with zone ID and add a zone type tag
Signed-off-by: sunhui <[email protected]>
Alibaba: refactor destroy code
Now when deleting a resource fails, it will always try again, instead of returning an Error.
But now it is used to delete resources synchronously, consider using asynchronous mode later.
Signed-off-by: sunhui <[email protected]>
Alibaba: update cloud-creds-secret.yaml.template
Add Alibaba Cloud creds in cloud-creds-secret.yaml.template file
Signed-off-by: sunhui <[email protected]>
This commit was produced by running , , and all modules verified. Signed-off-by: sunhui <[email protected]>
|
@bd233: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
|
@bd233: PR needs rebase. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
/uncc |

No description provided.