Add Enhancement for Installing to Azure Stack Hub #689
Conversation
Force-pushed c3261d8 to 2f63268
enhancements/installer/azurestack.md (Outdated)
> The Installer will need to construct this json configuration file from user input and include the file as part of the cloud-provider-config configmap. The file will also need to be present on nodes for the kubelet. `AZURE_ENVIRONMENT_FILEPATH` will need to be set programmatically for ASH on the kubelet, presumably through the kubelet's systemd unit [(see open questions)](#open-questions).
>
> Operators will set the `AZURE_ENVIRONMENT_FILEPATH` and mount the endpoints JSON file from the cloud-provider-config configmap. All operators using the Azure SDK will need to do this.
Is there a way for operators to configure the Azure client directly with the custom endpoints rather than relying on a json file? It would be cumbersome for an operator with a static manifest to do this with mounts, given that the mounted ConfigMap must be copied over to the namespace. The mounting option is feasible for controllers that the operator starts but not easily for operators directly.
Following up on this: you can definitely set the endpoint programmatically when creating the client. For example, see the following:
https://github.com/staebler/cluster-ingress-operator/blob/3bb8ed560892ae648806e6133d18803da9ec341b/pkg/dns/azure/client/client.go#L88
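To make that alternative concrete, here is a minimal sketch of constructing a client against a user-provided ARM endpoint. The DNS management package, its API version, and the endpoint/subscription values are illustrative assumptions, not choices made by this enhancement:

```go
// A minimal sketch (not the enhancement's prescribed implementation) of
// configuring an Azure SDK client with a custom ARM endpoint instead of
// relying on AZURE_ENVIRONMENT_FILEPATH.
package main

import (
	"fmt"

	"github.com/Azure/azure-sdk-for-go/services/dns/mgmt/2018-05-01/dns"
	"github.com/Azure/go-autorest/autorest/azure/auth"
)

func newZonesClient(resourceManagerEndpoint, subscriptionID string) (*dns.ZonesClient, error) {
	// NewZonesClientWithBaseURI lets the caller point the client at an
	// Azure Stack Hub ARM endpoint rather than the public-cloud default.
	client := dns.NewZonesClientWithBaseURI(resourceManagerEndpoint, subscriptionID)

	// Build an authorizer from AZURE_* environment variables; an operator
	// would instead use the credentials it is granted in-cluster.
	authorizer, err := auth.NewAuthorizerFromEnvironment()
	if err != nil {
		return nil, fmt.Errorf("failed to create authorizer: %w", err)
	}
	client.Authorizer = authorizer
	return &client, nil
}

func main() {
	// Hypothetical Azure Stack Hub values, purely for illustration.
	c, err := newZonesClient("https://management.local.azurestack.external/", "00000000-0000-0000-0000-000000000000")
	if err != nil {
		panic(err)
	}
	fmt.Println("DNS zones client base URI:", c.BaseURI)
}
```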
@staebler I have updated this and I think these revisions make more sense.
enhancements/installer/azurestack.md (Outdated)
> ASH endpoints are user-provided and, therefore, the Azure SDK treats ASH endpoints differently than the already-known endpoints of other Azure environments. When the cloud environment is set to `AZURESTACKCLOUD` the SDK expects the environment variable `AZURE_ENVIRONMENT_FILEPATH` to point to a [json configuration file](https://kubernetes-sigs.github.io/cloud-provider-azure/install/configs/#azure-stack-configuration), which is [typically located at `/etc/kubernetes/azurestackcloud.json`](https://github.com/kubernetes-sigs/cloud-provider-azure/issues/151).
>
> The Installer will need to construct this json configuration file from user input and include the file as part of the cloud-provider-config configmap. The file will also need to be present on nodes for the kubelet. `AZURE_ENVIRONMENT_FILEPATH` will need to be set programmatically for ASH on the kubelet, presumably through the kubelet's systemd unit [(see open questions)](#open-questions).
Where will this user input come from? Is the expectation that they will have already set up an AZURE_ENVIRONMENT_FILEPATH on the install machine, given that the installer will presumably need to know the endpoints for its own access to the Azure API?
I think the update should clarify this.
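For reference, here is a minimal sketch of what the SDK does with that variable, assuming the conventional `/etc/kubernetes/azurestackcloud.json` location; this is illustrative, not installer code:

```go
// A minimal sketch of how a component resolves ASH endpoints with the
// go-autorest SDK once AZURE_ENVIRONMENT_FILEPATH points at the
// installer-generated endpoints file. The file path below is the
// conventional location, assumed here for illustration.
package main

import (
	"fmt"
	"os"

	"github.com/Azure/go-autorest/autorest/azure"
)

func main() {
	// The installer (or MCO) would lay this file down on the node; here we
	// just point the SDK at the conventional location.
	os.Setenv(azure.EnvironmentFilepathName, "/etc/kubernetes/azurestackcloud.json")

	// For "AZURESTACKCLOUD", EnvironmentFromName reads the JSON file named
	// by AZURE_ENVIRONMENT_FILEPATH instead of using built-in endpoints.
	env, err := azure.EnvironmentFromName("AZURESTACKCLOUD")
	if err != nil {
		panic(err)
	}
	fmt.Println("ARM endpoint:", env.ResourceManagerEndpoint)
}
```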
enhancements/installer/azurestack.md (Outdated)
> ### Open Questions
>
> 1. We need to explore how to pass the JSON endpoints file to the Kubelet:
>    1. Can the endpoints file be simply passed from the installer through ignition? Does this
Could this be MachineConfigs that the installer adds for master and workers?
Does the bootstrap machine need this as well? I am assuming not since you have been able to run kubelet successfully on the bootstrap machine with the Azure platform.
>    unit's `EnvironmentFile`s](https://github.com/openshift/machine-config-operator/blob/master/templates/master/01-master-kubelet/_base/units/kubelet.service.yaml#L19). If so, which & how
>    (ignition or MCO)?
>    1. Should ASH produce a cluster DNS manifest or follow the baremetal approach and not create one?
>    1. Suggestions for determining all operators that need to adapt for the ASH config.
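One way the ignition/MachineConfig route raised above could look, sketched under the assumption that the installer (or MCO) renders a per-role MachineConfig carrying the endpoints file; the file path, content, and naming are hypothetical:

```go
// A rough sketch (not a committed design) of how the endpoints file could be
// laid down on nodes via a MachineConfig carrying an Ignition file entry.
package main

import (
	"encoding/json"
	"fmt"

	igntypes "github.com/coreos/ignition/v2/config/v3_1/types"
	"github.com/vincent-petithory/dataurl"
)

// endpointsFileIgnition builds an Ignition config that writes the ASH
// endpoints JSON to the conventional kubelet location.
func endpointsFileIgnition(endpointsJSON []byte) igntypes.Config {
	source := dataurl.EncodeBytes(endpointsJSON)
	mode := 0644
	overwrite := true
	return igntypes.Config{
		Ignition: igntypes.Ignition{Version: "3.1.0"},
		Storage: igntypes.Storage{
			Files: []igntypes.File{{
				Node: igntypes.Node{
					Path:      "/etc/kubernetes/azurestackcloud.json",
					Overwrite: &overwrite,
				},
				FileEmbedded1: igntypes.FileEmbedded1{
					Mode:     &mode,
					Contents: igntypes.Resource{Source: &source},
				},
			}},
		},
	}
}

func main() {
	// Hypothetical endpoints content; the real file would be rendered from
	// user-supplied install-config values.
	cfg := endpointsFileIgnition([]byte(`{"name":"AzureStackCloud"}`))
	raw, _ := json.Marshal(cfg)
	// The raw Ignition would be embedded in a MachineConfig (one per role,
	// e.g. a hypothetical 99-master-azurestack-endpoints) rendered by the
	// installer or the MCO, per the suggestion in this thread.
	fmt.Println(string(raw))
}
```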
Force-pushed 2cea816 to 8f1b8b4
/retest
@gnufied brought up CSI support for Azure Stack in openshift/kubernetes#643 (comment). We do not have a good background on storage, so I'm glad you reached out. Some quick googling led me to this: https://github.com/kubernetes-sigs/azuredisk-csi-driver. Can we discuss what more needs to be done to add storage support? We can move it over to JIRA too, which might be more appropriate.
> - requires different rhcos image
> - limited instance metadata service (IMDS) for VMs
> - does not support private DNS zones
> - limited subset of Azure infrastructure available (ergo different Terraform provider)
One other aspect that may fall under this category of limited subset of infrastructure is the limited set of API versions available. We need to be careful that, in the future, we are cognizant of API version selection and are specific about which API versions an Azure Stack deployment must support in order to run OpenShift.
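As an illustration of how API versions are pinned with the Azure SDK for Go: the management-plane packages encode the API version in the import path. The specific compute API version and endpoint below are assumptions for illustration, not a statement of what ASH supports:

```go
// A minimal sketch of being explicit about ARM API versions with the Azure
// SDK for Go. Whether 2017-12-01 is an appropriate compute API version for a
// given Azure Stack Hub deployment is an assumption here.
package main

import (
	"fmt"

	// Version-pinned package: every request uses the 2017-12-01 compute API.
	"github.com/Azure/azure-sdk-for-go/services/compute/mgmt/2017-12-01/compute"
)

func main() {
	// Pointing the client at an ASH ARM endpoint (illustrative values).
	client := compute.NewVirtualMachinesClientWithBaseURI(
		"https://management.local.azurestack.external/",
		"00000000-0000-0000-0000-000000000000",
	)
	fmt.Println("compute client base URI:", client.BaseURI)
}
```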
enhancements/installer/azurestack.md (Outdated)
> [cloud provider configmap](https://github.com/openshift/installer/blob/master/pkg/asset/manifests/cloudproviderconfig.go#L126-L141):
>
> ```go
> cloudProviderEndpointsKey = "endpoints"
> ```
Suggested change: `cloudProviderEndpointsKey = "endpoints"` → `cloudProviderEndpointsKey = "endpoints.json"`
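A rough sketch of what the resulting cloud-provider-config ConfigMap could look like with a separate `endpoints.json` key, per the suggestion above; the data contents are placeholders, not real configuration:

```go
// An illustrative sketch of the cloud-provider-config ConfigMap carrying
// both the cloud provider config and the ASH endpoints JSON under a
// separate key. Key names and file contents are assumptions.
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/yaml"
)

func main() {
	cm := corev1.ConfigMap{
		TypeMeta:   metav1.TypeMeta{APIVersion: "v1", Kind: "ConfigMap"},
		ObjectMeta: metav1.ObjectMeta{Namespace: "openshift-config", Name: "cloud-provider-config"},
		Data: map[string]string{
			// Existing key used for the Azure cloud provider config.
			"config": `{"cloud":"AZURESTACKCLOUD"}`,
			// Proposed key for the endpoints file; "endpoints.json" follows
			// the suggestion in this review thread.
			"endpoints.json": `{"name":"AzureStackCloud"}`,
		},
	}
	out, err := yaml.Marshal(cm)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```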
enhancements/installer/azurestack.md (Outdated)
> This provider lacks the ability to create a service principal for the resource group and assign a contributor role, which is required by the Ingress controller.
>
> These actions are achievable through the CLI; if necessary, these commands could be run as a Terraform post hook.
Is this conflating the service principal used by the ingress operator with the user-assigned identity attached to the VMs, which is used by the kube-controller-manager? For public Azure, the terraform creates a user-assigned identity but does not create any service principals. Azure Stack does not support user-assigned identities, even through the CLI.
My statement in the enhancement is confused. The issue is, as you say, the lack of user-assigned identities--not service principals, but it looks like the "main" identity is scoped to the resource group; so I don't think it's the VM identities you mention. Here is the code from IPI. The UPI doc states:
> Grant the Contributor role to the Azure identity so that the Ingress Operator can create a public IP and its load balancer.
I'm not sure I have a firm grasp on the problem. From reading the docs it seems we need to create this identity, to give operators access to create resources in the resource group. The machine API operator would probably need this as well, but perhaps it is not mentioned as these are UPI docs.
Really what I am missing is why/how isn't this handled by the CCO? If we're using passthrough I assume this isn't a problem at all because those creds clearly have perms to create resources in the resource group.
Let me know what you think and I'll update the doc accordingly.
This is handled by the CCO. All of the in-cluster operators use credentials that are granted by the CCO (assuming mint mode). The UPI docs are a bit misleading when they say that the Contributor role is needed by the Ingress operator. The Ingress operator is creating loadbalancer Services, but it is the kube-controller-manager that is creating the cloud resources for those Services.
In public Azure, the kube-controller-manager uses the user-assigned identity assigned to the VM. Azure Stack does not support user-assigned identities. Presumably, we could allow Azure Stack to create system-assigned identities for each VM. The installer would then need to assign the Contributor role to each system-assigned identity after each VM was created. The kubelet also uses the managed identity assigned to the VM, so the managed identity is needed for worker machines too. However, the installer does not create the worker VMs, so it cannot assign the Contributor role to the system-assigned identities for the worker VMs. As far as I am aware, the machine-api does not have a feature for assigning roles to the system-assigned identities of the VMs that the machine-api creates.
A possible solution is to add that feature to the machine-api. Another solution is to supply a Service Principal in the cloud config that the kubelet and kube-controller-manager would use when accessing the Azure API. The latter solution is what I think you are using currently in your UPI work. There may be problems with utilizing that solution long-term due to how the user would replace that Service Principal. I don't think that simply changing the Service Principal in the cloud-provider-config ConfigMap would cause the replacement Service Principal to roll out to all of the kubelets and kube-controller-managers.
I get it now. The piece I was missing is that the main identity is assigned to the VMs in TF and the machinesets. Then you explained the rest; thank you for that!

If we want to manage the roles after creation, then VM extensions may provide this: https://registry.terraform.io/providers/hashicorp/azurestack/latest/docs/resources/virtual_machine_extension

But it seems the MAO may be more involved. Yes, the UPI solution writes the cloud provider config with the service principal ID & secret. So I guess the question is: does updating the cloud provider config configmap in cluster cause a new machine config to be written? Does a new machine config cause a reboot, in which case the new creds would be picked up by the kubelet?
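To make the two auth modes in this thread concrete, here is an illustrative sketch of the relevant cloud provider config fields. The struct is a hand-rolled subset for demonstration, not the real cloud provider type, and whether these fields behave the same on ASH is an open assumption:

```go
// A minimal sketch contrasting VM-identity auth with a Service Principal
// embedded in the cloud config, as discussed above. Field names follow the
// Azure cloud provider's cloud.conf JSON keys; values are placeholders.
package main

import (
	"encoding/json"
	"fmt"
)

// azureCloudConfig is a trimmed, illustrative subset of the cloud.conf keys.
type azureCloudConfig struct {
	Cloud                       string `json:"cloud"`
	TenantID                    string `json:"tenantId,omitempty"`
	SubscriptionID              string `json:"subscriptionId,omitempty"`
	ResourceGroup               string `json:"resourceGroup,omitempty"`
	UseManagedIdentityExtension bool   `json:"useManagedIdentityExtension"`
	// Set when a Service Principal is supplied instead of a VM identity.
	AADClientID     string `json:"aadClientId,omitempty"`
	AADClientSecret string `json:"aadClientSecret,omitempty"`
}

func main() {
	// Public Azure today: kubelet/KCM use the identity attached to the VM.
	managedIdentity := azureCloudConfig{Cloud: "AZUREPUBLICCLOUD", UseManagedIdentityExtension: true}

	// Possible ASH workaround discussed above: embed a Service Principal.
	servicePrincipal := azureCloudConfig{
		Cloud:           "AZURESTACKCLOUD",
		AADClientID:     "00000000-0000-0000-0000-000000000000", // hypothetical
		AADClientSecret: "<secret>",                             // hypothetical
	}

	for _, c := range []azureCloudConfig{managedIdentity, servicePrincipal} {
		out, _ := json.MarshalIndent(c, "", "  ")
		fmt.Println(string(out))
	}
}
```

As the thread notes, the Service Principal variant raises the rotation question: swapping the values in the cloud-provider-config ConfigMap does not obviously roll the new credentials out to every kubelet and kube-controller-manager.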
Force-pushed 8f1b8b4 to fba6a5e
Force-pushed fba6a5e to 4da334b
@staebler This has been revised. PTAL
staebler left a comment:
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: staebler
Create an enhancement to outline work for installing and supporting OpenShift on Azure Stack Hub.
cc @openshift/openshift-team-installer
cc @staebler
/label priority/important-soon