RFD 5: Kubernetes Service #4455
Conversation
rfd/0005-kubernetes-service.md
Outdated
```yaml
# New format:
kubernetes_service:
  enabled: yes # default "no"
```
With 5.0, if you recommend that admins start a Teleport agent within each Kubernetes cluster, would this mean that the Auth/Proxy on the root could have this set to `false`? (This would be similar to `ssh_service` being `false`, so it would follow the same pattern.)
Admin can set it to `false`. The Proxy still needs to listen on a k8s port to forward requests, though.
This is somewhat confusing - the idea that you could start a `kubernetes_service` and give it a `public_addr`, but it still isn't actually the thing that's responsible for doing the listening. It doesn't work without a corresponding proxy, right?
Ping @awly for comment here
Sorry, missed this.
A `proxy_service` with `kube_listen_addr` set is a user-facing endpoint. A `kubernetes_service` is a gateway to a single k8s cluster within a Teleport cluster, and is only reachable through a proxy.
Some scenarios (the multi-cluster one is sketched below):
- single Teleport cluster, single k8s cluster - use both `kubernetes_service` and `proxy_service` with `kube_listen_addr`
- single Teleport cluster, multiple k8s clusters - `auth_service` and `proxy_service` on one box, separate pods with only `kubernetes_service` in each cluster
- root Teleport cluster without a local k8s cluster, leaf Teleport clusters with k8s clusters - the root proxy needs `kube_listen_addr` but not `kubernetes_service`; leaf proxies need `kube_listen_addr`, with `kubernetes_service` in the same process or in separate pods running just `kubernetes_service`

There's a consistent requirement - if you want k8s integration, your proxies must set `kube_listen_addr`, and clients only talk to `kubernetes_service` through a proxy.
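A minimal sketch of the second scenario (single Teleport cluster, multiple k8s clusters), with hypothetical addresses and cluster names. On the central box:

```yaml
# auth_service and proxy_service together; the proxy exposes the
# user-facing k8s endpoint.
auth_service:
  enabled: yes
proxy_service:
  enabled: yes
  kube_listen_addr: "0.0.0.0:3026"
```

And in a pod inside each k8s cluster, only the new service:

```yaml
# Reachable only through the proxy above; never dialed by clients directly.
teleport:
  auth_servers: ["teleport.example.com:3025"] # hypothetical auth address
kubernetes_service:
  enabled: yes
  k8s_cluster_name: "prod" # hypothetical; one name per k8s cluster
```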
rfd/0005-kubernetes-service.md
Outdated
The Kubernetes service implements all the common features:
- connecting to an Auth server directly or via a Proxy tunnel
- registration using join tokens
Would we want to split out the node token, or keep it generic for Kubernetes?

```sh
$ tctl nodes add
The invite token: 3abaf5364b23483b29b18d23091e2397
This token will expire in 30 minutes

Run this on the new node to join the cluster:

> teleport start \
   --roles=node \
   --token=3abaf5364b23483b29b18d23091e2397 \
   --ca-pin=sha256:47e50cb6138cfa11587508e334d99fa26b9832a030a7b20c9ab7b2b9e77f4206 \
   --auth-server=172.31.1.91:3025
```
Yeah, it would use a different role for registration: `tctl tokens add --type=kubernetes`
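Presumably the resulting join config on the agent side would look something like this sketch (the token and pin values are just reused from the `tctl nodes add` example above, for illustration):

```yaml
teleport:
  # Token minted with the kubernetes type, e.g. via
  # `tctl tokens add --type=kubernetes` as suggested above.
  auth_token: 3abaf5364b23483b29b18d23091e2397
  ca_pin: sha256:47e50cb6138cfa11587508e334d99fa26b9832a030a7b20c9ab7b2b9e77f4206
kubernetes_service:
  enabled: yes
```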
This is a reasonable suggestion. I have some concerns about migrations and how we can persuade people to do the right thing, particularly when many have already built complicated setups to work around the fact that we've always had the hard "one k8s cluster = one proxy" requirement in the past. This is good news for them in theory, but it will require work to migrate. Some people have designed their entire architecture around the knowledge that you need an auth/proxy pair (and thus stateful storage) in every k8s leaf cluster.
Possible mitigations:
- Existing `proxy_service` setups inside pods which link back as leaf clusters could maybe add themselves as `kubernetes_service` instead with the same credentials once upgraded to v5.0.0, or at least use a transparent rotation-type mechanism to get new credentials with a `Kubernetes` type instead of a `Proxy` type. It's going to be hard to automatically do the right thing for many, though.
- Another mitigation is to put a lot of work into our Helm chart and make sure that our Helm flow for starting a lone `kubernetes_service` inside a pod and linking it back to an existing Teleport cluster is really, really simple - I think this would be a fairly common use case (a hypothetical values sketch follows this list).
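For illustration, such a Helm flow could boil down to a handful of values. Everything below, including the value names, is hypothetical - this is not an existing chart API:

```yaml
# values.yaml for a hypothetical "lone kubernetes_service" chart:
# one Deployment running teleport with only kubernetes_service enabled,
# joined back to an existing cluster through its proxy.
proxyAddr: "teleport.example.com:3080" # hypothetical: existing cluster's proxy
authToken: "k8s-join-token"            # hypothetical join token
kubeClusterName: "prod"                # hypothetical name reported in heartbeats
```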
Other concerns:
- At the moment, the majority of people don't just deploy Teleport running `proxy_service` inside pods - they have to deploy both auth and proxy (with stateful storage) because this is how you link a leaf cluster back.
  - Is the idea that `kubernetes_service` will encapsulate the entire reverse tunnel authz/n part as well, so that you could go from running `auth_service` and `proxy_service` to just running `kubernetes_service` with exactly the same experience?
  - If we can encourage people to migrate early without any downtime (and remove the need for them to provide stateful storage for every leaf), then the whole idea becomes more compelling.
  - If you still wanted SSH access as well as k8s, would you also need to deploy both `auth_service` and `proxy_service` along with `kubernetes_service` to register as a leaf cluster and get this?
  - How would this integrate with the existing leaf cluster model?
- We are releasing AAP with v5.0.0, which is also going to add `app_service` to `/etc/teleport.yaml` - this would potentially have both happen in the same release, and make the minimal config more complicated:
From:
```yaml
teleport:
  auth_token: blah
  ca_pin: sha256:blah
  log:
    severity: INFO
    output: stderr
auth_service:
  enabled: no
proxy_service:
  enabled: no
ssh_service:
  enabled: yes
```
To:
```yaml
teleport:
  auth_token: blah
  ca_pin: sha256:blah
  log:
    severity: INFO
    output: stderr
auth_service:
  enabled: no
app_service:
  enabled: no
kubernetes_service:
  enabled: no
proxy_service:
  enabled: no
ssh_service:
  enabled: yes
```
This isn't necessarily a deal-breaker, but it's worth noting that historically, the default behaviour of all `*_service` flags in the config is to set `enabled: yes` unless you explicitly specify `enabled: no`. This could lead to a situation where people on v4.4.x config files have a couple of new services enabled on them after upgrading to v5.0.0, without any notification.
- I don't know what the mitigation for this is - I'm just saying that there's a lot of cruft linked into decoupling the k8s part from the proxy part. I think it's worth exploring, but we should be careful not to add legacy config migrations/paths that we'll need to support until the end of time.
rfd/0005-kubernetes-service.md
Outdated
To encourage users to migrate, all new config fields (`k8s_cluster_name` and
`labels`) will be added to the new service definition only.
`kubconfig_file` in the old section will behave as before - only extracting
Suggested change:
```diff
-`kubconfig_file` in the old section will behave as before - only extracting
+`kubeconfig_file` in the old section will behave as before - only extracting
```
rfd/0005-kubernetes-service.md
Outdated
`kubconfig_file` in the old section will behave as before - only extracting
`current-context` and registering that as `k8s_cluster_name`. In the new
I think we'd have to come up with some logic for what happens if two services (one `proxy_service`, one `kubernetes_service`) try to register with the same name - to handle potential migration scenarios where people have an existing working setup but want to start switching over to the new style.
There can be multiple `teleport` binaries reporting the same k8s cluster in heartbeats (e.g. an HA proxy setup inside the cluster).
When routing requests, we can send the request to any endpoint that claims to support a given k8s cluster name.
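A sketch of that HA case, with a hypothetical cluster name: the same config is deployed in two pods, both heartbeat the same name, and a proxy may route a request for that name to either agent.

```yaml
# Identical config in both pods of the same k8s cluster.
# Both agents report k8s_cluster_name "prod" in heartbeats, so a proxy
# can forward any request for "prod" to whichever endpoint it picks.
kubernetes_service:
  enabled: yes
  k8s_cluster_name: "prod" # hypothetical cluster name
```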
rfd/0005-kubernetes-service.md
Outdated
there's still a lot of authn/z and audit complexity).

This RFD complements the [Kubernetes 5.0 enhancements
design](https://docs.google.com/document/d/1cS6J2d_xBcJMWPewWPjdOZyrDHLKdqmgF1QLLo4E1YI).
Can you please port the rest of the doc into this RFD too? I was re-reading today and found new commands `tsh kube clusters` and `tsh kube login` that I wanted to discuss.
This will help to have a full picture of what you are proposing here.
@klizhentas moved most of the doc contents into this RFD.
rfd/0005-kubernetes-service.md
Outdated
#### Non-k8s proxy

A separate "gateway" proxy can run with `proxy_service.kubernetes.enabled: yes`
Why use this instead of the new `kubernetes_service`?
TODO(me):
Updated and added more config examples.
This plumbs config fields only, they have no effect yet. Also, remove `cluster_name` from `proxy_config.kubernetes`. This field will only exist under `kubernetes_service` per #4455
Ping @klizhentas @russjones @benarent, still need approval here
* Fix local etcd test failures when etcd is not running
* Add kubernetes_service to teleport.yaml
  This plumbs config fields only, they have no effect yet. Also, remove `cluster_name` from `proxy_config.kubernetes`. This field will only exist under `kubernetes_service` per #4455
* Handle IPv6 in kubernetes_service and rename label fields
* Disable k8s cluster name defaulting in user TLS certs
  Need to implement service registration first.
```yaml
# New format:
kubernetes_service:
  enabled: yes
  public_addr: [k8s.example.com:3026]
```
Should we still be able to specify a `public_addr` here? Is it needed when the proxy is intended to be responsible for providing the endpoint that `kubectl` connects to and routing traffic to the appropriate cluster?
Just trying to figure out what the use case is for having a separate `public_addr` when the proxy is the inbound point for the traffic.
Edit: Finished the document, and I now understand that there's a case for having `public_addr` and `listen_addr` set when the `kubernetes_service` is connected directly to the auth server (as they're needed to route inbound traffic).
Correct, `public_addr` is for a proxy to figure out where this service lives.
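For concreteness, a sketch of that direct-to-auth case, with hypothetical addresses and ports:

```yaml
teleport:
  auth_servers: ["auth.example.com:3025"] # hypothetical: pointing at auth, not a proxy
kubernetes_service:
  enabled: yes
  listen_addr: "0.0.0.0:3027"                 # hypothetical local port for proxy-dialed traffic
  public_addr: ["k8s-agent.example.com:3027"] # hypothetical: where proxies can reach this service
```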
in the `auth_servers` field of `teleport.yaml`.

When connecting over a reverse tunnel, `kubernetes_service` will not listen on
any local port, unless its `listen_addr` is set.
I think this is different to the way that `ssh_service` works currently - I don't believe `ssh_service` ever listens when connected over a reverse tunnel, meaning there's no way to make an inbound connection to port 3022. I feel like the behaviour of `kubernetes_service` when connected over a reverse tunnel should be the same, unless there's a really compelling use case to make it listen locally?
So you mean that `listen_addr` should be disallowed when connecting over a tunnel?
I can see 2 small (but not deal-breaker) problems:
- `listen_addr` is validated at startup, before the process knows whether it's talking to an auth server or a proxy
- an admin may choose to set `listen_addr` on all instances, but only set `auth_server` to a proxy on some of them; on tunneled instances, the listener will remain active but unused; on regular instances it will be dialed by the proxies; no need to template `listen_addr` in your config management
As I start writing the relevant code, I understand what you mean better.
I'll disallow connecting via a proxy tunnel and using a local `listen_addr` at the same time, because:
- it's simpler to implement (no need to merge connections from multiple listeners)
- it behaves more like the `ssh_service`
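Under that rule, the tunnel-mode config would simply omit `listen_addr`; a sketch with a hypothetical proxy address:

```yaml
teleport:
  auth_servers: ["proxy.example.com:3080"] # hypothetical: dialing a proxy => reverse tunnel
kubernetes_service:
  enabled: yes
  # No listen_addr here: setting it together with a proxy tunnel
  # would be rejected, matching ssh_service behaviour.
```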
```sh
$ tsh kube clusters
```
Do end users need to do anything else with `kubectl` for multiple clusters, or does it just work out of the box? https://kubernetes.io/docs/tasks/access-application-cluster/configure-access-multiple-clusters/
`tsh login` will configure all known clusters as `kubectl` contexts.
To switch between them, end users have to call either `tsh kube login $cluster_name` or `kubectl config use-context $cluster_name`.
This is a shorthand for the larger kubernetes section:
```yaml
proxy_service:
  kube_listen_addr: "0.0.0.0:3026"
```
is equivalent to:
```yaml
proxy_service:
  kubernetes:
    enabled: yes
    listen_addr: "0.0.0.0:3026"
```
This shorthand is meant to be used with the new `kubernetes_service`: #4455
It reduces confusion when both `proxy_service` and `kubernetes_service` are configured in the same process.
Proposal for a standalone `kubernetes_service`, separate from `proxy_service`.

Updates #3952