Enhancement proposal for OpenShift IPI on IBM Cloud #773
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,181 @@ | ||
| --- | ||
| title: openshift-ipi-on-ibmcloud | ||
| authors: | ||
| - "@jeffnowicki" | ||
| - "@BobbyRadford" | ||
| reviewers: | ||
| - "@staebler" | ||
| approvers: | ||
| - "@staebler" | ||
| creation-date: 2021-05-03 | ||
| last-updated: yyyy-mm-dd | ||
| status: implementable | ||
| --- | ||
|
|
||
| # OpenShift Installer Provisioned Infrastructure (IPI) on IBM Cloud | ||
|
|
||
| ## Release Signoff Checklist | ||
|
|
||
| - [x] Enhancement is `implementable` | ||
| - [ ] Design details are appropriately documented from clear requirements | ||
| - [ ] Test plan is defined | ||
| - [ ] Operational readiness criteria is defined | ||
| - [ ] Graduation criteria for dev preview, tech preview, GA | ||
| - [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/) | ||
|
|
||
| ## Summary | ||
|
|
||
| This enhancement proposes adding support for OpenShift 4 Installer Provisioned Infrastructure (IPI) on IBM Cloud VPC (Gen 2) infrastructure. It describes the necessary enhancements, tooling and documentation to enable this capability. | ||
|
|
||
| ## Motivation | ||
|
|
||
| Users expect OpenShift to be available through multiple cloud providers. Customers with high-availability and enhanced security requirements may wish to take advantage of [IBM Cloud VPC (Gen 2) infrastructure](https://www.ibm.com/cloud/vpc). Leveraging IPI, the process of installing OpenShift on IBM Cloud can be simplified. | ||
|
|
||
|
|
||
| ### Goals | ||
|
|
||
| The primary goal of IBM Cloud IPI is to provide users with an easier path to running OpenShift 4 on VPC infrastructure in IBM Cloud data centers. | ||
|
|
||
| To achieve that goal, we will follow the pattern of other IPI supported platforms: | ||
| - Enhance the installer to survey the customer for IBM Cloud options. | ||
| - Prepare an IBM Cloud Terraform (TF) module for control plane provisioning. The [IBM Cloud Terraform Provider](https://github.com/IBM-Cloud/terraform-provider-ibm) will be utilized. | ||
| - Integrate with IBM Cloud VPC machine API provider. The [IBM Cloud Cluster API Provider](https://github.com/kubernetes-sigs/cluster-api-provider-ibmcloud) will be utilized. | ||
| - Integrate with IBM Cloud Controller Manager (CCM) and enhance requisite operators to enable cluster functionality on IBM Cloud. | ||
| - Provide IBM Cloud IPI user documentation [here](https://github.com/openshift/installer/tree/master/docs/user/ibmcloud/). | ||
| - Provide CI artifacts required to test the IBM Cloud IPI installer. | ||
|
|
||
|
|
||
| ### Non-Goals | ||
|
|
||
| None at this time. | ||
|
|
||
| ## Proposal | ||
|
|
||
| - Implement installer CLI prompts in order to build a default/minimal `install-config.yaml` for IBM Cloud: | ||
| ```shell | ||
| ? SSH Public Key /Users/someone/.ssh/id_rsa.pub | ||
| ? Platform ibmcloud | ||
| ? Resource Group ID default (34ffb674f7c4466398dcd257a0dac58e) | ||
| ? Region us-south | ||
| ? RHCOS Custom Image rhcos-ibmcloud-470 | ||
| ? Base Domain ibm.foo.com (Internet Services-9h) | ||
| ? Cluster Name test | ||
| ? Pull Secret [? for help] **** | ||
| ``` | ||
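As a rough illustration of what the survey answers above might produce, the following Python sketch assembles a minimal `install-config.yaml` document. The top-level keys (`apiVersion`, `baseDomain`, `metadata`, `platform`, `pullSecret`, `sshKey`) follow the existing install-config layout, and the `ibmcloud` field names are taken from the Platform list in this proposal. The function names, the `answers` keys, and the sample values are hypothetical, and the sketch assumes the CIS instance CRN has already been looked up from the selected base domain. The output is JSON, which is also valid YAML.

```python
import json
from typing import Any, Dict


def build_install_config(answers: Dict[str, str]) -> Dict[str, Any]:
    """Map survey answers (hypothetical keys) onto an install-config structure."""
    return {
        "apiVersion": "v1",
        "baseDomain": answers["base_domain"],
        "metadata": {"name": answers["cluster_name"]},
        "platform": {
            "ibmcloud": {
                "region": answers["region"],
                # Assumed to be resolved from the chosen base domain beforehand.
                "cisInstanceCRN": answers["cis_instance_crn"],
                "clusterOSImage": answers["rhcos_image"],
                "resourceGroup": answers["resource_group"],
            }
        },
        "pullSecret": answers["pull_secret"],
        "sshKey": answers["ssh_public_key"],
    }


def render_install_config(config: Dict[str, Any]) -> str:
    """Serialize as indented JSON, which any YAML parser can also read."""
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Sample values only; the CRN, pull secret, and key are placeholders.
    sample = {
        "base_domain": "ibm.foo.com",
        "cluster_name": "test",
        "region": "us-south",
        "cis_instance_crn": "crn:placeholder",
        "rhcos_image": "rhcos-ibmcloud-470",
        "resource_group": "default",
        "pull_secret": "<pull secret>",
        "ssh_public_key": "<ssh public key>",
    }
    print(render_install_config(build_install_config(sample)))
```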
|
|
||
| - Implement IBM Cloud specific platform and machine pool installer customizations | ||
|
|
||
| **Platform** | ||
|
|
||
| - region (required string): The IBM Cloud region where the cluster will be created. | ||
| - cisInstanceCRN (required string): The Cloud Internet Services CRN managing the base domain DNS zone. | ||
| - clusterOSImage (required string): The name of the RHCOS custom image to use for machines. | ||
| - resourceGroup (optional string): The name of an existing resource group where the cluster and all required resources will be created. | ||
| - defaultMachinePlatform (optional object): Default machine pool properties that apply to machine pools that do not define their own IBM Cloud specific properties. | ||
|
|
||
| If any of the following is specified, they ALL must be specified: | ||
| - vpc (optional string): The name of an existing VPC network. | ||
| - vpcResourceGroup (optional string): The name of the existing VPC's resource group. | ||
| - subnets (optional array of strings): A list of existing subnet IDs. Leave unset and the installer will create new subnets in the VPC network on your behalf. | ||
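The field rules above can be sketched as a small validation routine. This is a minimal Python illustration, not the installer's implementation; the function name and error strings are hypothetical. It reads the all-or-nothing rule for `vpc`, `vpcResourceGroup`, and `subnets` literally, which is an assumption given that the `subnets` description also allows the field to be left unset.

```python
from typing import Any, Dict, List

# Fields marked "required" in the Platform list.
REQUIRED_FIELDS = ("region", "cisInstanceCRN", "clusterOSImage")
# Fields that, read literally, must be set together or not at all.
EXISTING_VPC_FIELDS = ("vpc", "vpcResourceGroup", "subnets")


def validate_platform(platform: Dict[str, Any]) -> List[str]:
    """Return human-readable validation errors; an empty list means valid."""
    errors = []
    for name in REQUIRED_FIELDS:
        if not platform.get(name):
            errors.append(f"{name}: required")

    present = [name for name in EXISTING_VPC_FIELDS if platform.get(name)]
    if present and len(present) != len(EXISTING_VPC_FIELDS):
        missing = [name for name in EXISTING_VPC_FIELDS if name not in present]
        errors.append(
            "existing VPC settings are all-or-nothing; missing: " + ", ".join(missing)
        )
    return errors
```

Returning a list rather than raising on the first problem lets a caller report every issue in one pass.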
|
|
||
| **Machine Pool** | ||
| - type (optional string): The VSI machine profile. | ||
|
Contributor
What is "VSI"? |
||
| - zones (optional array of strings): The availability zones used for machines in the pool. | ||
| - bootVolume (optional object): | ||
| - encryptionKeyCRN (optional string): The CRN referencing a Key Protect or Hyper Protect Crypto Services key to use for volume encryption. If not specified, a provider-managed encryption key will be used. | ||
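How `defaultMachinePlatform` and a pool's own settings combine can be sketched as follows. This assumes a field-by-field overlay in which a pool's explicitly set properties win over the platform defaults; the proposal does not spell out the merge semantics, so treat that as an assumption. The function name and the sample profile and zone values are illustrative only.

```python
from typing import Any, Dict, Optional


def effective_machine_pool(
    default_platform: Optional[Dict[str, Any]],
    pool: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Overlay a pool's explicitly set properties on the platform defaults."""
    merged = dict(default_platform or {})
    for key, value in (pool or {}).items():
        # A value of None is treated as "not set" and keeps the default.
        if value is not None:
            merged[key] = value
    return merged


if __name__ == "__main__":
    defaults = {"type": "bx2-4x16", "zones": ["us-south-1"]}
    workers = {"zones": ["us-south-2", "us-south-3"]}
    # The pool overrides zones but inherits the default machine profile.
    print(effective_machine_pool(defaults, workers))
```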
|
|
||
| - Provide documentation (in similar format to other supported IPI platforms) to help users use and customize the IPI installer on IBM Cloud: | ||
| - Prerequisite instructions prior to invoking installer | ||
| - Description of installer options and customizations as they apply to IBM Cloud | ||
| - Post installation instructions on further customizations a user may wish to apply | ||
|
|
||
| ### User Stories | ||
|
|
||
| Story 1 | ||
| As an OpenShift consumer, I want to quickly, with minimal input and default options, be able to use the OpenShift installer to create and destroy an OpenShift 4 cluster on IBM Cloud. | ||
|
|
||
| Story 2 | ||
| As an OpenShift consumer, I want to be able to use the OpenShift installer with customizations (e.g., enhanced security) to create and destroy an OpenShift 4 cluster on IBM Cloud. | ||
|
|
||
| Story 3 | ||
| As an OpenShift platform, I want OpenShift on IBM Cloud to be maintained and released like other supported OpenShift platforms and covered by CI tests. | ||
|
Contributor
This may be a step too far in anthropomorphizing. Who actually wants this? The OpenShift consumer, maybe? |
||
|
|
||
|
|
||
| ### Implementation Details/Notes/Constraints | ||
|
|
||
| - The default IPI-provisioned cluster on IBM Cloud will be a [single region, multizone (3 zones) cluster](https://cloud.ibm.com/docs/containers?topic=containers-ha_clusters#multizone). | ||
|
Contributor
Will the user be able to change that default in this initial version? |
||
|
|
||
| - An IBM Cloud RHCOS image must be made available and maintained - ([current 4.7 image location](https://mirror.openshift.com/pub/openshift-v4/x86_64/dependencies/rhcos/4.7/latest/)). The images will be imported as custom images for IBM Cloud. | ||
|
Contributor
Following from my comment above about the custom images, I guess this step could be performed by the installer rather than the user? And then we wouldn't need that question during the setup?
That is technically possible, but due to the size of the custom image upload, it would take quite a long time. We weighed making this a prereq vs installer provided and went with prereq for that reason.
Contributor
I expect the image is probably a similar size as on other platforms, no? Maybe worth double checking with the Installer team what we do on other platforms and see what their recommendation is here. IIUC we do something like this on platforms like vSphere as part of the installer, so maybe it's not unprecedented?
There's a sequence of steps involved in creating a custom VPC image.
Contributor
It would be good to have this justification in the body of the enhancement, for readers who come along after the PR merges. It sounds like the user will have some setup to do before they can use the IPI installer with their IBM Cloud account. Is it just the 4 steps in your comment above, or is there more to it? We don't have a separate user experience section, so maybe you could add a short description of all of the steps to this section of the doc. Something along the lines of https://github.com/openshift/enhancements/blob/master/enhancements/workload-partitioning/management-workload-partitioning.md#high-level-end-to-end-workflow (although probably shorter in your case). |
||
|
|
||
| - A small custom-bootstrap.ign file is used to reference the canonical bootstrap.ign file hosted in object storage, to work around the 64 KB user data size limit. | ||
|
Contributor
Might be worth clarifying that this is for the bootstrap and control plane hosts and not for workers. (I assume workers will use the Machine Config Server as on other platforms)
@JoelSpeed are you referencing that top bullet item where rhcos images are imported as custom images?
Contributor
Who creates the
that would be the installer in the same fashion that is used in aws ipi @dhellmann |
||
|
|
||
| - The customer will need to provide and prepare a [Cloud Internet Services (CIS)](https://www.ibm.com/cloud/cloud-internet-services/details) instance, which will provide the required DNS capabilities. | ||
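The pointer-style `custom-bootstrap.ign` described above can be sketched as follows. The sketch assumes Ignition spec 3.x, where `ignition.config.replace.source` tells Ignition to fetch a remote config and use it in place of the local one. The spec version string, the function name, and the object storage URL are placeholders, not values taken from the installer.

```python
import json

# User data size limit cited in this proposal.
USER_DATA_LIMIT_BYTES = 64 * 1024


def pointer_ignition(bootstrap_url: str, spec_version: str = "3.1.0") -> str:
    """Build a small Ignition config that defers to a remotely hosted config."""
    config = {
        "ignition": {
            "version": spec_version,
            "config": {"replace": {"source": bootstrap_url}},
        }
    }
    payload = json.dumps(config, separators=(",", ":"))
    size = len(payload.encode("utf-8"))
    if size > USER_DATA_LIMIT_BYTES:
        raise ValueError(f"pointer config is {size} bytes, over the user data limit")
    return payload


if __name__ == "__main__":
    # Placeholder URL; a real one would point at the uploaded bootstrap.ign.
    print(pointer_ignition("https://object-storage.example.com/bucket/bootstrap.ign"))
```

Because the pointer only carries a URL, its size stays far below the limit regardless of how large the canonical bootstrap.ign grows.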
|
|
||
|
|
||
| ### Risks and Mitigations | ||
|
|
||
| - The current RHCOS image has a minimum storage requirement of 120 GB, but IBM Cloud VSIs are currently provisioned with 100 GB boot volumes. Workarounds have been explored, and discussions have started on supporting larger boot volume sizes to accommodate the image storage requirement. | ||
|
|
||
| - The IBM Cloud Provider (Kubernetes Cloud Provider) is a work-in-progress. It is being tracked by a [Kubernetes community enhancement](https://github.com/kubernetes/enhancements/issues/671). | ||
|
Contributor
Do you have a link for the repository for the provider that could be included here?
Author
The external repo hasn't been created yet. It will be created in the IBM org. We still have some refactoring to do on the internal repo and then flesh out how we will push to upstream repo. Will update once external repo is established. |
||
|
|
||
| - The [IBM Cloud Cluster API Provider](https://github.com/kubernetes-sigs/cluster-api-provider-ibmcloud) is a relatively new project. It will need to be reasonably hardened, with a level of CI enabled. | ||
|
|
||
| ## Design Details | ||
|
|
||
|
|
||
| ### Open Questions | ||
|
|
||
| 1. Is the IBM Cloud VPC CCM required for the IPI deliverable? Load balancer support is the obvious functionality provided by the CCM (optionally enabled). Is there any other CCM functionality that is considered MVP (minimum viable product)? | ||
|
Contributor
Without the CCM you would have to install the cluster as a no-platform cluster, which would mean no cloud integrations at all. Switching from a no-platform cluster to a platformed cluster is theoretically possible but I'm not sure it's something we have tested or claim to support. That, or initially we have a special case for IBM that configures the components as no-platform and then moves them to external at a later date. That could work initially, but seems complex.
One of the other things that CCM provides is the Node cloud provider annotations, which are typically used for scheduling workloads, e.g. I want to spread my pod across multiple failure domains. This seems like a pretty fundamental feature that, without it, would limit users as they wouldn't be able to use any pod affinity/anti-affinity that relied on Node properties.
My vote here would be to require the CCM as part of the MVP
Author
Thank you for your perspective on the question/comment.
Currently CCM hosts logic for populating new Node resource |
||
|
|
||
| 1. The Cloud Credential Operator supports multiple modes of operation. For the initial IPI deliverable, what is the MVP level of support that needs to be implemented for IBM Cloud? | ||
| > We'll plan to implement "Passthrough Mode" initially and work towards a strategic implementation around "Compute Resource Identity". Ref: https://github.com/openshift/cloud-credential-operator#2-passthrough-mode | ||
|
|
||
| 1. Is storage operator support a hard requirement for IPI? Could it be provided in a future release? | ||
| > Omitting it (making it a Day 2 post-install operation) would diminish the UX for the customer. | ||
|
Comment on lines +133 to +134
Contributor
AFAIK the only hard dependency on storage for the core OCP operators is within the monitoring stack. That said, you may be able to initially disable the monitoring stack to get around this. |
||
|
|
||
| 1. Are unit-level tests recommended (or required)? If so, is there preferred tooling or test structure? This is in addition to the required CI E2E testing. | ||
| > We will follow what other supported platforms have done. Please advise with any suggestions/tips. | ||
|
|
||
| ### Test Plan | ||
|
|
||
| We will follow the test plan design implemented by existing supported IPI platforms: | ||
| - A new CI test suite with E2E test jobs will be established. | ||
|
|
||
| ### Graduation Criteria | ||
|
|
||
| The proposal is to follow a graduation process based on the existence of a continuous integration (CI) | ||
| suite running end-to-end (E2E) jobs. The CI suite results will be evaluated and acted on as needed. | ||
|
|
||
| The following list describes the key elements of the criteria: | ||
|
|
||
| - CI jobs are enabled and regularly scheduled. | ||
| - Current IPI documentation is published in the OpenShift repo. | ||
| - E2E jobs are stable and passing. Results are evaluated with the same criteria as comparable supported IPI platform providers. | ||
| - Test engineers have successfully followed the documented IPI instructions to deploy OpenShift 4 on IBM Cloud. | ||
|
|
||
| #### Dev Preview -> Tech Preview | ||
|
|
||
| #### Tech Preview -> GA | ||
|
|
||
| #### Removing a deprecated feature | ||
|
|
||
| ### Upgrade / Downgrade Strategy | ||
|
|
||
| ### Version Skew Strategy | ||
|
|
||
| ## Implementation History | ||
|
|
||
| Major milestones in the life cycle of a proposal should be tracked in `Implementation | ||
| History`. | ||
|
|
||
| ## Drawbacks | ||
|
|
||
| None at this time. | ||
|
|
||
| ## Alternatives | ||
|
|
||
| Currently, there is no alternative. We intend to formalize and present a UPI proposal in the near future (which would be considered an alternative to IPI). Proof-of-concept work was performed following the [UPI platform-agnostic instructions](https://docs.openshift.com/container-platform/4.7/installing/installing_platform_agnostic/installing-platform-agnostic.html). | ||
|
|
||
| ## Infrastructure Needed | ||
|
|
||
| IBM Cloud VPC infrastructure will be made available to support CI E2E testing. | ||
Out of curiosity, is there a way on IBM for this to be some public shared default, or something the installer sets up if not? AFAIK we don't ask for an image like this on other platforms so to provide the easiest path forward, would be ideal if users didn't have to provide a custom image
Unfortunately, not at the moment. There is the concept of shared default images that all customers have access to, but currently RHCOS is not one of those. We are hoping that it will be in the near future, but that's proving to be more difficult than it seems it would be.
You could add that as a non-goal for this enhancement to make it clear that it doesn't block this work but is related.
We found a means of automating the creation of custom rhcos images on behalf of users, so the rhcos image name is no longer a prompted survey question requiring input from users.
Excellent. It would be good to capture that history in the "alternatives" section of this doc.
Is the image going to be an optional input? I don't know what we do about images on other platforms, so I'm not sure what would be consistent.