
Conversation

@jeffnowicki

No description provided.

@openshift-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign spadgett after the PR has been reviewed.
You can assign the PR to them by writing /assign @spadgett in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot

Hi @jeffnowicki. Thanks for your PR.

I'm waiting for an openshift member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the needs-ok-to-test label (Indicates a PR that requires an org member to verify it is safe to test.) May 5, 2021
@fabianofranz
Member

/ok-to-test

@openshift-ci openshift-ci bot added the ok-to-test label (Indicates a non-member PR verified by an org member that is safe to test.) and removed the needs-ok-to-test label (Indicates a PR that requires an org member to verify it is safe to test.) May 7, 2021
@fabianofranz
Member

@openshift-ci openshift-ci bot requested review from eparis, russellb and staebler May 7, 2021 18:01
@fabianofranz
Member

/cc @aravindhp

@openshift-ci openshift-ci bot requested a review from aravindhp May 7, 2021 18:03
@jeffnowicki jeffnowicki force-pushed the master branch 4 times, most recently from 0812051 to 962b064 on May 10, 2021 at 12:22
? Platform ibmcloud
? Resource Group ID default (34ffb674f7c4466398dcd257a0dac58e)
? Region us-south
? RHCOS Custom Image rhcos-ibmcloud-470
Contributor

Out of curiosity, is there a way on IBM for this to be some public shared default, or something the installer sets up if not? AFAIK we don't ask for an image like this on other platforms, so to provide the easiest path forward it would be ideal if users didn't have to provide a custom image.


Unfortunately, not at the moment. There is the concept of shared default images that all customers have access to, but currently RHCOS is not one of those. We are hoping that it will be in the near future, but that's proving to be more difficult than it seems it would be.

Contributor

You could add that as a non-goal for this enhancement to make it clear that it doesn't block this work but is related.


We found a means of automating the creation of custom RHCOS images on behalf of users, so the RHCOS image name is no longer a prompted survey question requiring input from users.

Contributor

Excellent. It would be good to capture that history in the "alternatives" section of this doc.

Is the image going to be an optional input? I don't know what we do about images on other platforms, so I'm not sure what would be consistent.


- The default IPI-provisioned cluster on IBM Cloud will be a [single region, multizone (3 zones) cluster](https://cloud.ibm.com/docs/containers?topic=containers-ha_clusters#multizone).

- An IBM Cloud RHCOS image must be made available and maintained - ([current 4.7 image location](https://mirror.openshift.com/pub/openshift-v4/x86_64/dependencies/rhcos/4.7/latest/)). The images will be imported as custom images for IBM Cloud.
Contributor

Following from my comment above about the custom images, I guess this step could be performed by the installer rather than the user? And then we wouldn't need that question during the setup?


That is technically possible, but due to the size of the custom image upload, it would take quite a long time. We weighed making this a prereq vs installer provided and went with prereq for that reason.

Contributor

I expect the image is probably a similar size to what we use on other platforms, no? Maybe worth double-checking with the Installer team what we do on other platforms and seeing what their recommendation is here. IIUC we do something like this on platforms like vSphere as part of the installer, so maybe it's not unprecedented?


There's a sequence of steps involved in creating a custom VPC image:

  1. The RHCOS image is downloaded, or otherwise made accessible from somewhere.
  2. The image is uploaded to COS (with the COS instance itself and/or a bucket created to house the image).
  3. The VPC image service is granted access to the COS bucket containing the RHCOS image.
  4. A VPC custom image is then created from the RHCOS image stored in COS.

The lead time for this process and the amount of other work we have before us led us to conclude this manual prerequisite was sufficient for an initial IPI MVP. A rough CLI sketch of these steps is included below.
@staebler thoughts? I'm not familiar with how RHCOS images are managed in vSphere.
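
For readers less familiar with IBM Cloud tooling, here is a rough, hedged sketch of those four steps, assuming the ibmcloud CLI with the cloud-object-storage and vpc-infrastructure plugins. The bucket, file, and image names are placeholders, the mirror filename varies by release, and the exact flags should be verified against current IBM Cloud documentation rather than read as the installer's actual behavior.

```sh
# Placeholders; substitute values for your own account, region, and bucket.
BUCKET=rhcos-images
REGION=us-south
IMAGE_NAME=rhcos-ibmcloud-470
IMAGE_FILE=rhcos-ibmcloud.x86_64.qcow2   # illustrative filename

# 1. Download the RHCOS image for IBM Cloud from the OpenShift mirror
#    (exact filename differs per release; decompress first if it ships as .gz).
curl -L -o "${IMAGE_FILE}" \
  "https://mirror.openshift.com/pub/openshift-v4/x86_64/dependencies/rhcos/4.7/latest/${IMAGE_FILE}"

# 2. Upload the image to a COS bucket.
ibmcloud cos upload --bucket "${BUCKET}" --key "${IMAGE_FILE}" --file "${IMAGE_FILE}"

# 3. Grant the VPC image service read access to COS via a service-to-service
#    IAM authorization.
ibmcloud iam authorization-policy-create is cloud-object-storage Reader \
  --source-resource-type image

# 4. Create the VPC custom image from the object stored in COS
#    (the --os-name value must be one of the names IBM VPC accepts).
ibmcloud is image-create "${IMAGE_NAME}" \
  --file "cos://${REGION}/${BUCKET}/${IMAGE_FILE}" \
  --os-name red-8-amd64
```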

Contributor

It would be good to have this justification in the body of the enhancement, for readers who come along after the PR merges.

It sounds like the user will have some setup to do before they can use the IPI installer with their IBM Cloud account. Is it just the 4 steps in your comment above, or is there more to it? We don't have a separate user experience section, so maybe you could add a short description of all of the steps to this section of the doc. Something along the lines of https://github.com/openshift/enhancements/blob/master/enhancements/workload-partitioning/management-workload-partitioning.md#high-level-end-to-end-workflow (although probably shorter in your case).


- An IBM Cloud RHCOS image must be made available and maintained - ([current 4.7 image location](https://mirror.openshift.com/pub/openshift-v4/x86_64/dependencies/rhcos/4.7/latest/)). The images will be imported as custom images for IBM Cloud.

- A small custom-bootstrap.ign file is used to reference the canonical bootstrap.ign file hosted in object storage, to work around the 64 KB user data size limit.
Contributor

Might be worth clarifying that this is for the bootstrap and control plane hosts and not for workers. (I assume workers will use the Machine Config Server as on other platforms)


@JoelSpeed are you referencing that top bullet item where RHCOS images are imported as custom images?
We will be using the Machine Config Server to provision worker nodes, but the IBM CAPI provider will leverage these same RHCOS images when provisioning the worker nodes.

Contributor

Who creates the custom-bootstrap.ign and where is it hosted?


That would be the installer, in the same fashion as is used in AWS IPI. @dhellmann
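
For illustration only, a minimal sketch of what such a pointer ignition file could look like, assuming Ignition spec 3.x and an HTTPS COS endpoint; the endpoint, bucket, and object names are placeholders, not the installer's actual output:

```sh
# Hypothetical stub ignition passed as VSI user data (well under 64 KB).
# At boot, Ignition replaces this config with the full bootstrap.ign it
# fetches from object storage.
cat > custom-bootstrap.ign <<'EOF'
{
  "ignition": {
    "version": "3.1.0",
    "config": {
      "replace": {
        "source": "https://s3.us-south.cloud-object-storage.appdomain.cloud/my-bootstrap-bucket/bootstrap.ign"
      }
    }
  }
}
EOF
```

This mirrors the AWS IPI pattern, where a small stub ignition config in user data points at the full bootstrap config stored in object storage.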


- The current RHCOS image has a minimum storage requirement of 120 GB, while IBM Cloud VSIs are currently provisioned with 100 GB boot volumes. Workarounds have been explored and discussions have started about getting support for larger boot volume size options to accommodate the image storage requirement.

- The IBM Cloud Provider (Kubernetes Cloud Provider) is a work-in-progress. It is being tracked by a [Kubernetes community enhancement](https://github.com/kubernetes/enhancements/issues/671).
Contributor

Do you have a link for the repository for the provider that could be included here?

Author

The external repo hasn't been created yet. It will be created in the IBM org. We still have some refactoring to do on the internal repo, and then we need to flesh out how we will push to the upstream repo. Will update once the external repo is established.


### Open Questions

1. Is the IBM Cloud VPC CCM required for the IPI deliverable? Load balancer support is the obvious functionality provided by the CCM (optionally enabled). Is there any other CCM functionality that is considered MVP (minimum viable product)?
Contributor

Without the CCM you would have to install the cluster as a no-platform cluster, which would mean no cloud integrations at all. Switching from a no-platform cluster to a platformed cluster is theoretically possible but I'm not sure it's something we have tested or claim to support.

Alternatively, we could initially have a special case for IBM that configures the components as no-platform and then moves them to external at a later date. That could work initially, but it seems complex.

One of the other things that the CCM provides is the Node cloud provider annotations, which are typically used for scheduling workloads, e.g. spreading pods across multiple failure domains. This seems like a pretty fundamental feature; without it, users would be limited because they wouldn't be able to use any pod affinity/anti-affinity that relies on Node properties.

My vote here would be to require the CCM as part of the MVP

Author

Thank you for your perspective on the question/comment.


Currently the CCM hosts the logic for populating a new Node resource's spec.providerID field, node addresses, and labels. Either the spec.providerID or the node's NodeInternalIP address is required for the Machine API code to match a Machine with the Node it provisioned. I'm not sure how Machine API integration will work without this feature - the Node is just not going to join the cluster. I also vote for CCM support for MVP.
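
As a quick way to see what the cloud provider integration is expected to populate, a small sketch using oc (the command itself is standard; the providerID format IBM Cloud will use is not defined here):

```sh
# Print each Node's name and spec.providerID. On a cloud-integrated cluster
# the CCM (or in-tree cloud provider) fills providerID in, and the Machine API
# uses it (or the node's InternalIP) to link Machines to Nodes.
oc get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.providerID}{"\n"}{end}'
```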

Comment on lines +133 to +134
1. Is storage operator support a hard requirement for IPI? Could it be provided in a future release?
> Without it (making it a Day 2 post-install operation), the customer UX is diminished.
Contributor

AFAIK the only hard dependency on storage for the core OCP operators is within the monitoring stack.
Without a storage implementation and a default StorageClass, I think you would prevent the monitoring stack from initialising, which would cause the cluster to fail bootstrap.

That said, you may be able to initially disable the monitoring stack to get around this.


- subnets (optional array of strings): A list of existing subnet IDs. Leave unset and the installer will create new subnets in the VPC network on your behalf.

**Machine Pool**
- type (optional string): The VSI machine profile.
Contributor

What is "VSI"?
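
("VSI" is IBM Cloud's Virtual Server Instance.) To make the quoted fields concrete, here is a hedged sketch of an install-config.yaml using them; the profile name, subnet IDs, and field layout follow this enhancement's description and are not necessarily the final installer schema:

```sh
# Hypothetical install-config.yaml; all values are placeholders.
cat > install-config.yaml <<'EOF'
apiVersion: v1
baseDomain: example.com
metadata:
  name: my-ibmcloud-cluster
platform:
  ibmcloud:
    region: us-south
    subnets:                 # optional; omit and the installer creates subnets
      - subnet-zone-1-id
      - subnet-zone-2-id
      - subnet-zone-3-id
controlPlane:
  name: master
  replicas: 3
  platform:
    ibmcloud:
      type: bx2-4x16         # VSI machine profile
compute:
- name: worker
  replicas: 3
  platform:
    ibmcloud:
      type: bx2-4x16
pullSecret: '...'
EOF
```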

As an OpenShift consumer, I want to be able to use the OpenShift installer with customizations (i.e. enhanced security) to create and destroy an OpenShift 4 cluster on IBM Cloud.

Story 3
As an OpenShift platform, I want OpenShift on IBM Cloud to be maintained and released like other supported OpenShift platforms and covered by CI tests.
Contributor

This may be a step too far in anthropomorphizing. Who actually wants this? The OpenShift consumer, maybe?




### Implementation Details/Notes/Constraints

- The default IPI-provisioned cluster on IBM Cloud will be a [single region, multizone (3 zones) cluster](https://cloud.ibm.com/docs/containers?topic=containers-ha_clusters#multizone).
Contributor

Will the user be able to change that default in this initial version?

@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@jeffnowicki
Author

/remove-lifecycle stale

@openshift-ci
Contributor

openshift-ci bot commented Oct 21, 2021

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign sjenning after the PR has been reviewed.
You can assign the PR to them by writing /assign @sjenning in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-bot

Inactive enhancement proposals go stale after 28d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle stale.
Stale proposals rot after an additional 7d of inactivity and eventually close.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale label (Denotes an issue or PR has remained open with no activity and has become stale.) Nov 18, 2021
@openshift-bot

Stale enhancement proposals rot after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Rotten proposals close after an additional 7d of inactivity.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/rotten label (Denotes an issue or PR that has aged beyond stale and will be auto-closed.) and removed the lifecycle/stale label (Denotes an issue or PR has remained open with no activity and has become stale.) Nov 25, 2021
@openshift-bot

Rotten enhancement proposals close after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Reopen the proposal by commenting /reopen.
Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Exclude this proposal from closing again by commenting /lifecycle frozen.

/close

@openshift-ci openshift-ci bot closed this Dec 2, 2021
@openshift-ci
Contributor

openshift-ci bot commented Dec 2, 2021

@openshift-bot: Closed this PR.


In response to this:

Rotten enhancement proposals close after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Reopen the proposal by commenting /reopen.
Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Exclude this proposal from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
