Skip to content

Add support for external etcd cluster#1165

Merged
ncopa merged 17 commits intok0sproject:mainfrom
jewertow:feat/external-etcd-cluster-support
Jan 3, 2022
Merged

Add support for external etcd cluster#1165
ncopa merged 17 commits intok0sproject:mainfrom
jewertow:feat/external-etcd-cluster-support

Conversation

@jewertow
Copy link
Contributor

@jewertow jewertow commented Oct 3, 2021

Signed-off-by: Jacek Ewertowski jacek.ewertowski1@gmail.com

Issue
Fixes #1010

What this PR Includes
This PR aims to enable users to use external etcd cluster as a storage for k0s. To do that a user has to define the following configuration:

  storage:
    etcd:
      externalCluster:
        endpoints:
        - http://192.168.10.1:2379
        - http://192.168.10.2:2379
        - http://192.168.10.3:2379
        etcdPrefix: k0s-tenant
    type: etcd

Field endpoints contains list of URLs that listen on for client requests. etcdPrefix is used to enable multi-tenancy in etcd and specifies the name of the root path which by default is /registry; this value will be used to pass as --etcd-prefix argument in kube-apiserver.

Changes

  • Extend ClusterConfig CRD with etcd.externalCluster option.

TODO:

  • Skip running internal etcd if etcd.externalCluster is defined.
  • Configure external endpoints in EtcdClient.
  • Run kube-apiserver with configured external endpoints and --etcd-prefix argument.
  • Validate external cluster configuration.
  • Add option to specify paths to TLS key and certificate.

@kke
Copy link
Contributor

kke commented Oct 6, 2021

Good stuff. I'm thinking if it would be neater if instead of the externalCluster you would have storage.etcd.endpoints and the etcd.IsExternal() would be return len(e.Endpoints) > 0. I think prefix and clientCert can be there for the internal etcd too.

@jewertow
Copy link
Contributor Author

jewertow commented Oct 6, 2021

Thanks for your feedback. I decided to wrap endpoints and etcdPrefix in externalCluster object to make the default config as simple as possible:

  storage:
    type: etcd
    etcd:
      externalCluster: null
      peerAddress: 192.168.68.104

Additionally, wrapper object externalCluster gives more context about the purpose of the fields inside.

On the other hand, why a user would want to set custom prefix or client cert for the internal cluster? I think that internal etcd should be managed only by k0s to keep it as simple as possible.

@jnummelin
Copy link
Member

I think that internal etcd should be managed only by k0s to keep it as simple as possible.

I agree with this, there's no need to complicate things for the k0s managed etcd case.

I think prefix and clientCert can be there for the internal etcd too.

prefix maybe. client cert really by itself not, as k0s managed etcd will need lot more certs than only the client cert

@jewertow
Copy link
Contributor Author

jewertow commented Oct 8, 2021

@jnummelin @kke, we have to find a common approach to the new API. Please share your opinion and vote for preferred option.

  1. Option with externalCluster wrapper:
  storage:
    type: etcd
    etcd:
      externalCluster:
        endpoints:
        - http://192.168.68.104:2379
        - http://192.168.68.105:2379
        etcdPrefix: k0s-tenant
        caFile: /etc/pki/CA/ca.crt
        clientCertFile: /etc/pki/tls/certs/etcd-client.crt
        clientKeyFile: /etc/pki/tls/private/etcd-client.key
      peerAddress: 192.168.68.104
  1. Flat config:
  storage:
    type: etcd
    etcd:
      endpoints:
      - http://192.168.68.104:2379
      - http://192.168.68.105:2379
      etcdPrefix: k0s-tenant
      caFile: /etc/pki/CA/ca.crt
      clientCertFile: /etc/pki/tls/certs/etcd-client.crt
      clientKeyFile: /etc/pki/tls/private/etcd-client.key
      peerAddress: 192.168.68.104

In my opinion the first option gives more context and is more intuitive, so I vote for the first option.

Other questions:

  1. Can TLS be optional in case of external cluster or always required?
  2. I'm going to not support k0s etcd member-list and k0s etcd leave when external cluster is used, because I think that if a user uses an external cluster, we can assume that he knows how to manage this cluster and he has tools to list members. When it comes to leave command, k0s should not support this, because it simply doesn't make sense. Do you agree with me?
  3. We have to decide if etcdPrefix should be available also for internal cluster. I don't see a need to use a prefix in this case, but maybe you know some use case?
  4. What do you think about the name of etcdPrefix property? I named it so, to point out it's used as --etcd-prefix, but maybe it could be just prefix or pathPrefix?

@jnummelin
Copy link
Member

In my opinion the first option gives more context and is more intuitive, so I vote for the first option.

I agree, feels more intuitive IMO too.

Can TLS be optional in case of external cluster or always required?

Theoretically one can configure etcd not to use any TLS, I don't think many people do it though. So I'd say they can be optional.

I'm going to not support k0s etcd member-list and k0s etcd leave when external cluster is used, because I think that if a user uses an external cluster, we can assume that he knows how to manage this cluster and he has tools to list members. When it comes to leave command, k0s should not support this, because it simply doesn't make sense. Do you agree with me?

Yes, I agree.

We have to decide if etcdPrefix should be available also for internal cluster. I don't see a need to use a prefix in this case, but maybe you know some use case?

I do not know any use case where this would be required. So IMO ok to have it only on the externalCluster.

What do you think about the name of etcdPrefix property? I named it so, to point out it's used as --etcd-prefix, but maybe it could be just prefix or pathPrefix?

Both etcdPrefix and pathPrefix sound ok to me.

@jewertow
Copy link
Contributor Author

This PR is almost ready. I tested k0s with external cluster with and without TLS and everything works fine. I’m going to add some additional unit tests and to check error messages from k0s etcd and k0s backup when external cluster is configured.

@jewertow jewertow force-pushed the feat/external-etcd-cluster-support branch from 6a69731 to 546501b Compare October 26, 2021 22:09
@jewertow jewertow marked this pull request as ready for review October 26, 2021 22:21
@jewertow jewertow requested a review from a team as a code owner October 26, 2021 22:21
@jewertow jewertow requested review from ncopa and xinfengliu October 26, 2021 22:21
@jewertow
Copy link
Contributor Author

This pull request does not include documentation and smoke tests, but I'm going to add both these things in subsequent pull requests.

If you want to test this feature, you can use my k0s-sandbox repository that contains vagrant-based setups of k0s.

@jewertow jewertow requested a review from a team as a code owner November 21, 2021 15:41
@jewertow jewertow requested a review from trawler November 21, 2021 15:41
@mikhail-sakhnov
Copy link
Contributor

@jewertow can I ask you to rebase over upstream? I am afraid GHA won't start on the outdated branch

@jewertow jewertow force-pushed the feat/external-etcd-cluster-support branch from 1444f31 to 4be9a06 Compare November 29, 2021 21:22
@jewertow
Copy link
Contributor Author

Rebased

@mikhail-sakhnov
Copy link
Contributor

This pull request does not include documentation and smoke tests, but I'm going to add both these things in subsequent pull requests.

Would be great to have e2e test suite as a part of this PR, it's usual practice in the project to cover new features with tests originally to not let techdebt growing.

@jewertow
Copy link
Contributor Author

jewertow commented Dec 7, 2021

Ok, I will add some e2e tests.

Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
@mikhail-sakhnov
Copy link
Contributor

@jewertow hello! Do you have any updates on this?

@jewertow jewertow force-pushed the feat/external-etcd-cluster-support branch from 4be9a06 to fc6c566 Compare December 29, 2021 13:08
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
@jewertow jewertow force-pushed the feat/external-etcd-cluster-support branch from 348d51c to 60515a0 Compare December 30, 2021 16:35
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
ncopa
ncopa previously approved these changes Dec 31, 2021
Copy link
Collaborator

@ncopa ncopa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job!

Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
@jewertow
Copy link
Contributor Author

jewertow commented Jan 3, 2022

This pull request is still missing a test case for external etcd with enabled TLS. I tested such a case manually on VMs and it works fine, but unfortunately I have no time this week to implement this test. I can implement this test next week. What do you think? Is it acceptable for you?

@jewertow jewertow force-pushed the feat/external-etcd-cluster-support branch from 211174b to 39afd19 Compare January 3, 2022 09:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add external etcd cluster as an option

6 participants