
Multi-AZs clusters #22

Open
mhurtrel opened this issue Oct 22, 2020 · 28 comments
Labels
Geo: New availability zones for existing services

Comments

@mhurtrel
Collaborator

mhurtrel commented Oct 22, 2020

As an MKS administrator
I want to spawn Kubernetes clusters distributed across multiple low-latency availability zones
So that I can spread worker nodes across regions and benefit from even better HA of my K8s control plane, with a contractual SLA

Note:
We currently target this in France first.

@mhurtrel mhurtrel added the Geo (New availability zones for existing services) and Managed Private Registry labels and removed the Managed Private Registry label Oct 22, 2020
@mhurtrel mhurtrel changed the title from "MKS - Multi region cluster" to "Multi region cluster" Oct 26, 2020
@cambierr

We already forked the OpenStack CCM to implement multi-region support at OVH and are running it in prod. Feel free to ask if interested.

@mr-ssd

mr-ssd commented Feb 24, 2021

We already forked the OpenStack CCM to implement multi-region support at OVH and are running it in prod. Feel free to ask if interested.

@cambierr I'm looking for that solution. I would appreciate it if you could share.

@cambierr

It's available at https://hub.docker.com/repository/docker/alphanetworkstv/openstack-cloud-controller-manager-amd64
The only change compared to the upstream version is that you need to provide the allowed regions in the config:

[Global]
username=...
password=...
auth-url=https://auth.cloud.ovh.net/v3
tenant-id=...
domain-id=default
region=GRA5
# "regions" is the option added by this fork: list every region the cluster spans
regions=GRA3,GRA5,GRA7

[Networking]
internal-network-name=...
ipv6-support-disabled=true
public-network-name=Ext-Net

[BlockStorage]
bs-version = v3

I still need to push the code somewhere to share the sources, by the way.

CSI is also available: https://hub.docker.com/repository/docker/alphanetworkstv/cinder-csi-plugin-amd64
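
For anyone wanting to try this, here is a minimal sketch of how such a config is usually handed to the CCM/CSI, assuming the upstream cloud-provider-openstack conventions (a Secret named cloud-config with a cloud.conf key in kube-system); verify against the manifests you actually deploy, and point their image fields at the forked images above:

apiVersion: v1
kind: Secret
metadata:
  name: cloud-config            # name and key expected by the upstream manifests
  namespace: kube-system
stringData:
  cloud.conf: |
    # the [Global]/[Networking]/[BlockStorage] config from above goes here,
    # including the extra regions= line

The upstream CCM and CSI manifests can then be applied largely unchanged; only the container images need to reference the forked builds.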

@tanandy

tanandy commented Jul 10, 2021

Can we have something integrated in the console to easily deploy node pools in different regions?

@zcourts

zcourts commented Jul 16, 2021

@mhurtrel has there been any movement on this?
We're desperately in need of it because we're being affected by http://travaux.ovh.net/?do=details&id=50121& which is dependent on an upstream OpenStack fix - we recently had a 14-hour outage because the OVH volume wouldn't re-attach to any of our pods after a deployment. A multi-region cluster would avoid this.

@cambierr this looks interesting. I am not sure how to use it, but I assume we will need to use the OpenStack client to set it up? I'll see if I can get some help from an ops engineer. In the meantime, can you provide any resources/guides on how to use these? I'd like to set up two test clusters to play with it.

@cambierr

@zcourts what I built is a version of https://github.com/kubernetes/cloud-provider-openstack that supports multi-region clusters. This is not an extension of OVH's managed clusters.

If you are still interested, you can use your own cluster created with kubeadm, RKE, or whatever tool you want, and then deploy the OpenStack cloud controller manager in it.

Basically, you can do the exact same thing as per https://github.com/kubernetes/cloud-provider-openstack, except that you provide the list of allowed regions in the config (as shown above).

Based on that, the CCM will query all the provided regions for instance data instead of only the default one. The CSI (the Kubernetes/OpenStack volume translator) will also work the same way and be able to deal with volumes in the "right" region for the instances.

Please be aware that volumes from region A cannot be mounted in region B!

Feel free to ask if you need any help.
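
One practical consequence of that region A / region B limitation is that a workload using a volume has to stay in the volume's region. Below is a hypothetical sketch that pins a pod to one region via the standard topology label, assuming the CCM sets topology.kubernetes.io/region on nodes as upstream does (the names, region, and image are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: app-gra5                     # hypothetical pod pinned to the GRA5 region
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: topology.kubernetes.io/region   # node label set by the CCM
                operator: In
                values: ["GRA5"]
  containers:
    - name: app
      image: nginx                   # placeholder image
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data-gra5         # PVC whose Cinder volume lives in GRA5

This keeps the pod schedulable only on nodes in the same region as its volume, so a reschedule never tries an impossible cross-region attach.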

@tanandy

tanandy commented Jul 16, 2021

@mhurtrel has there been any movement on this?
We're desperately in need of it because we're being affected by http://travaux.ovh.net/?do=details&id=50121& which is dependent on an upstream OpenStack fix - we recently had a 14-hour outage because the OVH volume wouldn't re-attach to any of our pods after a deployment. A multi-region cluster would avoid this.

@cambierr this looks interesting. I am not sure how to use it, but I assume we will need to use the OpenStack client to set it up? I'll see if I can get some help from an ops engineer. In the meantime, can you provide any resources/guides on how to use these? I'd like to set up two test clusters to play with it.

I saw that Scaleway Kosmos provides easy multi-cluster integration. You can create a cluster there and use OVH nodes until we have something in the OVH console.

@zcourts

zcourts commented Jul 17, 2021

@tanandy thanks - I didn't know about Scaleway's Kosmos - https://www.scaleway.com/fr/betas/#kuberneteskosmos it is in private beta though, so it won't be an option for our production environments right now (it also requires an invite to access).

@cambierr ahhhh, that's clearer now. A couple of questions come to mind:

  1. How do OVH resources (volumes, etc.) get provisioned? Are they requested by the CCM or some sub-component and added automatically, or do you need to attach what you need outside Kubernetes and then use it?
  2. How many control planes do you end up having? A single one or one per region?

We've been working on a design that uses Istio multi-cluster. In this setup we would have a control plane per region. The obvious benefit is that we can lose an entire region and continue operating; the challenge is the increased complexity of managing and controlling access to multiple control planes/clusters.

We haven't gotten as far as building test clusters with this yet, as the issue only affected us two weeks ago, but we're progressing along this route. I'll bring your links to our team's attention for them to consider as well.

How have you found doing it this way so far? Any common/obvious issues?

@cambierr

How do OVH resources (volumes, etc.) get provisioned? Are they requested by the CCM or some sub-component and added automatically, or do you need to attach what you need outside Kubernetes and then use it?
The CCM and CSI are responsible for provisioning, as with the "official" providers. The scheduler will allocate resources on a node, and the CSI will talk to that node's OpenStack region to provision volumes as needed, for instance.

How many control planes do you end up having? A single one or one per region?
A single one; this is not a multi-Kubernetes-cluster setup but a single cluster on top of multiple OpenStack regions.

So, in our setup we use regions from GRA, UK, and SBG in our cluster. It has since been a multi-region Kubernetes cluster with "ultra high HA", given three regions, each with its own infrastructure (power, network). This brings us the benefit of HA without the complexity of federation.
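
For reference, the pattern described above (the scheduler places the pod, then the CSI provisions a volume in that node's region) corresponds to Kubernetes topology-aware provisioning. A minimal StorageClass sketch, assuming the standard Cinder CSI provisioner name and that the fork needs no extra parameters:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-cinder-wait                  # illustrative name
provisioner: cinder.csi.openstack.org    # standard Cinder CSI provisioner name
# Delay volume creation until a pod is scheduled, so the CSI provisions the
# Cinder volume in the region/zone of the node that was chosen.
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true

With WaitForFirstConsumer, PVCs stay Pending until a consuming pod is scheduled, so the volume ends up alongside the pod rather than in whichever region the provisioner would pick by default.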

@tanandy

tanandy commented Jan 17, 2022

Why was this issue closed?

@mhurtrel mhurtrel reopened this Jan 17, 2022
@mhurtrel
Collaborator Author

Hi @tanandy, this was a mistake. I confirm we will work on this at a later stage.

@Grounz

Grounz commented Feb 15, 2022

Hi,

What's the status of this issue?

@qualitesys

Hi OVH
Is there any news on the multi-region cluster? As said by @zcourts, this option exists with Scaleway Kosmos and works very well.
Do you have a schedule on the roadmap?

@mhurtrel
Collaborator Author

mhurtrel commented Jun 9, 2022

Hi @Grounz and @qualitesys
I confirm that we will develop a solution for this, but I can't share a public ETA yet.
We are exploring options for a very rich multi-region, multi-cloud, and multi-cluster experience. I will update this issue when possible.

@lenglet-k

Hi @mhurtrel

Do you have any news?

@mhurtrel
Collaborator Author

mhurtrel commented Sep 6, 2022

Our current ETA is early 2023

@mhurtrel mhurtrel changed the title from "Multi region cluster" to "Multi-AZs or Multiregion cluster region cluster" Jan 4, 2023
@botylev

botylev commented Mar 13, 2023

Hi @mhurtrel, any news on this feature?

@mhurtrel
Collaborator Author

Hello @Spark3757, there has been a small delay in the availability of our IaaS pillars in RBX, which is needed to fully validate our plans. But I should be able to give a new ETA soon. Sorry for the delay.

@yctn

yctn commented May 23, 2023

Hi @mhurtrel, any news on this feature?

@mhurtrel
Collaborator Author

Unfortunately, we are not yet able to share an ETA, though it remains a priority. As soon as we have an ETA for our IaaS colleagues' dependencies, I will update this issue.

@mhurtrel mhurtrel changed the title from "Multi-AZs or Multiregion cluster region cluster" to "Multi-AZs clusters" Sep 21, 2023
@mhurtrel
Collaborator Author

A small update on the matter: multi-region clusters will not be provided in the foreseeable future in Managed Kubernetes Service, but rather through a new product offering the capability to manage self-managed Kubernetes control planes by bringing your own nodes.

I have refocused this issue on multi-AZ clusters, which will be offered in our multi-AZ regions, the first of which is planned in France. We cannot give you an ETA yet, but be assured it is identified as a priority.

@lenglet-k

Hello @mhurtrel

Could we manage a multi-AZ cluster in different infrastructures like a PCI / HPC / HPC Secnumcloud mix?

What do you mean by "the ability to manage control planes ourselves"? Does that mean we will be able to add control planes and manage their configuration and updates? Will it be SecNumCloud compatible?

@mhurtrel
Collaborator Author

Hi @lenglet-k

This issue (#22) will focus on the multi-AZ (single-region) Managed Kubernetes Service (leveraging Public Cloud instances only).
We will, however, also offer a multi-cloud/multi-cluster solution (in private beta in the next few months) to build and manage self-managed clusters on any infrastructure: #467. At first it will require the infrastructure to offer internet connectivity, but it will at a later stage support vRack-only connectivity. Yes, you will be able to manage the control plane, using a supported distribution (more details soon). It will not be SecNumCloud compatible at launch.

@mhurtrel
Collaborator Author

mhurtrel commented Oct 17, 2023

Hi everyone! Though we of course still plan to support 3-AZ-region-based managed Kubernetes clusters, I also wanted to let you know that we just released Managed Rancher Service in alpha (aka private beta). Amongst many other features, this product enables you to create and self-manage clusters based on any infrastructure. You could, for example, spawn bare-metal machines or VMs in multiple regions (from OVHcloud, other cloud providers, or even on-prem, provided the machines have internet access) to build an extremely highly available cluster.

Do not hesitate to consult this page to learn more and fill in the short form to be one of the first users of this new managed service: https://labs.ovhcloud.com/en/managed-rancher-service/

@cambierr

Rancher? Ouch!

Will the multi-region MKS be based on it?

@mhurtrel
Collaborator Author

@cambierr nope, MKS and Managed Rancher Service are two different products. Multi-zone MKS will be made available quickly after the first multi-AZ OVHcloud Public Cloud region is made available, and will not require Managed Rancher Service.

@salimidruide

@mhurtrel thank you for the update. I would like to give you some honest feedback: if you want to keep your clients, you need to move from "we of course still plan to support 3-AZ-region-based managed Kubernetes clusters" to "this is the delivery deadline and we are meeting it".

@MatthieuFin

Hello there,

I discovered this thread; I ran into this issue a couple of months earlier. I followed the same approach as @cambierr and first made a PR on the cinder-csi-plugin part of the OpenStack cloud provider.

This was merged this summer and should be released soon, I guess, but you can build your own Docker image to try it before the release.

I'd welcome any feedback on this implementation.

Technically it is a multi-cloud implementation, not only multi-region. You can spread nodes across multiple OpenStack clusters.

This lets us spread a single Kubernetes cluster across 3 OVH regions and an on-premise OpenStack cluster, and consume PVCs on any node.

Obviously the limitation is that PVCs created in one region are not consumable from another region, and you have to manage one StorageClass per region.

Personally, to spread a single StatefulSet across multiple regions, I pre-provision the PVCs before creating the STS (with one PVC per StorageClass, I'm able to spread my pods across my different regions).
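
To illustrate that pre-provisioning trick with hypothetical names: StatefulSet PVCs follow the <claim-template>-<sts-name>-<ordinal> naming convention, and the controller adopts an existing PVC with the expected name instead of creating a new one. So, assuming an STS named db with a volumeClaimTemplate called data and one StorageClass per region (e.g. csi-cinder-gra7), pre-creating the PVCs decides which replica lands in which region:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  # name the StatefulSet "db" will look for when creating replica 0:
  # <volumeClaimTemplate>-<statefulset>-<ordinal>
  name: data-db-0
spec:
  storageClassName: csi-cinder-gra7    # hypothetical region-specific StorageClass
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi

Repeat for data-db-1, data-db-2, and so on against the other regions' StorageClasses before creating the StatefulSet.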

I don't use the MKS product (for a number of reasons). I don't know how compatible it is with this product; technically it should be doable.

The next step on my side is to test the current implementation of the CCM in this multi-cloud environment, to be able to keep the CCM with a multi-cloud cluster.

Projects
Status: Planned