Enhancement proposal for OpenShift IPI on IBM Cloud #773
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Hi @jeffnowicki. Thanks for your PR. I'm waiting for a openshift member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
/ok-to-test |
|
/cc @aravindhp |
Force-pushed from 0812051 to 962b064.
| ? Platform ibmcloud | ||
| ? Resource Group ID default (34ffb674f7c4466398dcd257a0dac58e) | ||
| ? Region us-south | ||
| ? RHCOS Custom Image rhcos-ibmcloud-470 |
Out of curiosity, is there a way on IBM Cloud for this to be some public shared default, or something the installer sets up if it's missing? AFAIK we don't ask for an image like this on other platforms, so to provide the easiest path forward it would be ideal if users didn't have to provide a custom image.
Unfortunately, not at the moment. There is the concept of shared default images that all customers have access to, but currently RHCOS is not one of those. We are hoping that it will be in the near future, but that's proving to be more difficult than it seems it would be.
You could add that as a non-goal for this enhancement to make it clear that it doesn't block this work but is related.
We found a means of automating the creation of custom rhcos images on behalf of users, so the rhcos image name is no longer a prompted survey question requiring input from users.
Excellent. It would be good to capture that history in the "alternatives" section of this doc.
Is the image going to be an optional input? I don't know what we do about images on other platforms, so I'm not sure what would be consistent.
| - The default IPI-provisioned cluster on IBM Cloud will be a [single region, multizone (3 zones) cluster](https://cloud.ibm.com/docs/containers?topic=containers-ha_clusters#multizone). | ||
| - An IBM Cloud RHCOS image must be made available and maintained - ([current 4.7 image location](https://mirror.openshift.com/pub/openshift-v4/x86_64/dependencies/rhcos/4.7/latest/)). The images will be imported as custom images for IBM Cloud. |
Following from my comment above about the custom images, I guess this step could be performed by the installer rather than the user? And then we wouldn't need that question during the setup?
That is technically possible, but due to the size of the custom image upload, it would take quite a long time. We weighed making this a prereq vs installer provided and went with prereq for that reason.
I expect the image is probably a similar size to the ones on other platforms, no? Maybe worth double-checking with the Installer team what we do on other platforms and seeing what their recommendation is here. IIUC we do something like this on platforms like vSphere as part of the installer, so maybe it's not unprecedented?
There's a sequence of steps involved in creating a custom VPC image.
- RHCOS image downloaded, or otherwise accessible from somewhere.
- The image needs to be uploaded to COS (COS instance itself and/or bucket created to house the image).
- VPC needs to be granted access to above COS bucket containing rhcos image.
- VPC custom image then needs to be created based on the rhcos image stored in COS.
The lead time for this process and the amount of other work we have before us led us to conclude that this manual prerequisite was sufficient for an initial IPI MVP.
@staebler thoughts? I'm not familiar with how rhcos images are managed in vSphere.
It would be good to have this justification in the body of the enhancement, for readers who come along after the PR merges.
It sounds like the user will have some setup to do before they can use the IPI installer with their IBM Cloud account. Is it just the 4 steps in your comment above, or is there more to it? We don't have a separate user experience section, so maybe you could add a short description of all of the steps to this section of the doc. Something along the lines of https://github.com/openshift/enhancements/blob/master/enhancements/workload-partitioning/management-workload-partitioning.md#high-level-end-to-end-workflow (although probably shorter in your case).
| - A small custom-bootstrap.ign file is used to reference the canonical bootstrap.ign file hosted in object storage, to work around the 64 KB user data size limit. |
Might be worth clarifying that this is for the bootstrap and control plane hosts and not for workers. (I assume workers will use the Machine Config Server as on other platforms)
@JoelSpeed are you referencing that top bullet item where rhcos images are imported as custom images?
We will be using the Machine Config Server to provision worker nodes, but the ibm capi provider will leverage these same rhcos images when provisioning the worker nodes.
Who creates the custom-bootstrap.ign and where is it hosted?
that would be the installer in the same fashion that is used in aws ipi @dhellmann
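To make the pointer-file idea in this thread concrete, here is a minimal sketch of what a small custom-bootstrap.ign could look like. It assumes the Ignition v3 `ignition.config.replace.source` mechanism; the spec version `3.2.0`, the function names, and the object storage URL are assumptions for illustration, not what the installer actually emits.

```python
import json

# The 64 KB user data size limit mentioned in the proposal.
USER_DATA_LIMIT_BYTES = 64 * 1024


def make_pointer_ignition(bootstrap_url: str, version: str = "3.2.0") -> str:
    """Return a small Ignition config that replaces itself with the config at bootstrap_url.

    The spec version default is an assumption; use whatever the target RHCOS expects.
    """
    pointer = {
        "ignition": {
            "version": version,
            "config": {"replace": {"source": bootstrap_url}},
        }
    }
    return json.dumps(pointer)


def fits_in_user_data(ignition_json: str) -> bool:
    """Check that the serialized config is within the user data size limit."""
    return len(ignition_json.encode("utf-8")) <= USER_DATA_LIMIT_BYTES


# Hypothetical object storage URL, for illustration only.
pointer_json = make_pointer_ignition("https://example.invalid/bucket/bootstrap.ign")
print(pointer_json)
print("fits in user data:", fits_in_user_data(pointer_json))
```

The pointer stays a few hundred bytes regardless of how large the canonical bootstrap.ign grows, which is the point of the workaround.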
| - The current RHCOS image has a minimum storage requirement of 120 GB. Currently, IBM Cloud VSIs are provisioned with 100 GB boot volumes. Workarounds have been explored, and discussions have started on supporting larger boot volume sizes to accommodate the image storage requirement. | ||
| - The IBM Cloud Provider (Kubernetes Cloud Provider) is a work-in-progress. It is being tracked by a [Kubernetes community enhancement](https://github.com/kubernetes/enhancements/issues/671). |
Do you have a link for the repository for the provider that could be included here?
The external repo hasn't been created yet. It will be created in the IBM org. We still have some refactoring to do on the internal repo, and then we need to flesh out how we will push to the upstream repo. Will update once the external repo is established.
| ### Open Questions | ||
| 1. Is IBM Cloud VPC CCM required for IPI deliverable? Load balancer support being the obvious functionality provided by CCM (optionally enabled). Is there any other CCM functionality that is considered MVP (minimum viable product)? |
Without the CCM you would have to install the cluster as a no-platform cluster, which would mean no cloud integrations at all. Switching from a no-platform cluster to a platformed cluster is theoretically possible but I'm not sure it's something we have tested or claim to support.
Alternatively, we could initially special-case IBM so that the components are configured as no-platform and then moved to external at a later date. That could work initially, but it seems complex.
One of the other things that the CCM provides is the Node cloud provider labels and annotations, which are typically used for scheduling workloads. For example, I may want to spread my pods across multiple failure domains. This seems like a pretty fundamental feature; without it, users would be limited because they couldn't use any pod affinity/anti-affinity that relies on Node properties.
My vote here would be to require the CCM as part of the MVP
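As a rough illustration of the scheduling point above, the sketch below builds a `topologySpreadConstraints` entry keyed on the well-known `topology.kubernetes.io/zone` label and shows which zones are visible from node labels. The node dicts, zone values, and helper names are hypothetical simplifications, not real API objects.

```python
from typing import Dict, List, Set

# Well-known Kubernetes zone label, normally populated by the cloud provider integration.
ZONE_LABEL = "topology.kubernetes.io/zone"


def zone_spread_constraint(app: str, max_skew: int = 1) -> Dict:
    """Build a topologySpreadConstraints entry spreading pods labelled app=<app> across zones."""
    return {
        "maxSkew": max_skew,
        "topologyKey": ZONE_LABEL,
        "whenUnsatisfiable": "DoNotSchedule",
        "labelSelector": {"matchLabels": {"app": app}},
    }


def known_zones(nodes: List[Dict]) -> Set[str]:
    """Collect the zones that can be derived from node labels."""
    return {n["labels"][ZONE_LABEL] for n in nodes if ZONE_LABEL in n.get("labels", {})}


# Hypothetical nodes: one set with zone labels, one set without them.
labelled = [
    {"name": "w1", "labels": {ZONE_LABEL: "us-south-1"}},
    {"name": "w2", "labels": {ZONE_LABEL: "us-south-2"}},
]
unlabelled = [{"name": "w1", "labels": {}}, {"name": "w2", "labels": {}}]

print(zone_spread_constraint("web"))
print("zones with labels:", sorted(known_zones(labelled)))
print("zones without labels:", sorted(known_zones(unlabelled)))
```

With no zone labels on the nodes, the constraint has no failure domains to work with, which is the limitation the comment describes.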
Thank you for your perspective on the question/comment.
Currently the CCM hosts the logic for populating a new Node resource's spec.providerID field, node addresses, and labels. Either spec.providerID or the node's NodeInternalIP is required for the Machine API code to match a Machine with the provisioned Node. I'm not sure how Machine API integration would work without this feature; the Node is just not going to join the cluster. I also vote for CCM support in the MVP.
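The matching described above can be sketched roughly as follows. This is a simplified illustration, not the actual Machine API code: the dict shapes, the `internalIPs` key, the provider ID strings, and the function name are all assumptions.

```python
from typing import Dict, List, Optional


def find_node_for_machine(machine: Dict, nodes: List[Dict]) -> Optional[Dict]:
    """Match a Machine to a Node by providerID first, then by a shared internal IP."""
    provider_id = machine.get("providerID")
    if provider_id:
        for node in nodes:
            if node.get("providerID") == provider_id:
                return node

    machine_ips = set(machine.get("internalIPs", []))
    for node in nodes:
        if machine_ips & set(node.get("internalIPs", [])):
            return node

    # Neither field is populated on any node, so no match is possible.
    return None


# Hypothetical identifiers and addresses, for illustration only.
machine = {"providerID": "ibmcloud://example/instance-1", "internalIPs": ["10.0.0.5"]}
node_with_ccm = {"name": "n1", "providerID": "ibmcloud://example/instance-1", "internalIPs": ["10.0.0.5"]}
node_without_ccm = {"name": "n1"}  # providerID and addresses never populated

print(find_node_for_machine(machine, [node_with_ccm]))
print(find_node_for_machine(machine, [node_without_ccm]))
```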
| 1. Is storage operator support a hard requirement for IPI? Could it be provided in a future release? | ||
| > Without it (making it a Day 2 post install operation), diminishes the UX for the customer. |
AFAIK the only hard dependency on storage for the core OCP operators is within the monitoring stack.
Without a storage implementation and default storageclass, I think you would prevent the monitoring stack from initialising which would cause the cluster to fail bootstrap.
That said, you may be able to initially disable the monitoring stack to get around this.
| - subnets (optional array of strings): A list of existing subnet IDs. Leave unset and the installer will create new subnets in the VPC network on your behalf. | ||
| **Machine Pool** | ||
| - type (optional string): The VSI machine profile. |
What is "VSI"?
| As an OpenShift consumer, I want to be able to use the OpenShift installer with customizations (i.e. enhanced security) to create and destroy an OpenShift 4 cluster on IBM Cloud. | ||
| Story 3 | ||
| As an OpenShift platform, I want OpenShift on IBM Cloud to be maintained and released like other supported OpenShift platforms and covered by CI tests. |
This may be a step too far in anthropomorphizing. Who actually wants this? The OpenShift consumer, maybe?
| ### Implementation Details/Notes/Constraints | ||
| - The default IPI-provisioned cluster on IBM Cloud will be a [single region, multizone (3 zones) cluster](https://cloud.ibm.com/docs/containers?topic=containers-ha_clusters#multizone). |
Will the user be able to change that default in this initial version?
|
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle stale |
|
/remove-lifecycle stale |
|
Inactive enhancement proposals go stale after 28d of inactivity. See https://github.com/openshift/enhancements#life-cycle for details. Mark the proposal as fresh by commenting If this proposal is safe to close now please do so with /lifecycle stale |
|
Stale enhancement proposals rot after 7d of inactivity. See https://github.com/openshift/enhancements#life-cycle for details. Mark the proposal as fresh by commenting If this proposal is safe to close now please do so with /lifecycle rotten |
|
Rotten enhancement proposals close after 7d of inactivity. See https://github.com/openshift/enhancements#life-cycle for details. Reopen the proposal by commenting /close |
|
@openshift-bot: Closed this PR. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
No description provided.