Introduce ExternalWorkload CRD #11805

Merged on Jan 5, 2024 (6 commits into main)

@mateiidavid mateiidavid commented Dec 20, 2023

To support mesh expansion, the control plane needs to read configuration associated with an external instance (i.e. a VM) for the purpose of service and inbound authorization policy discovery.

This change introduces a new CRD that supports the required configuration options. The resource supports:

  • a list of workload IPs (with a generic format to support IPv4 and IPv6 in the future)
  • a set of mesh TLS settings (SNI and identity)
  • a set of ports exposed by the workload
  • a set of status conditions

Some background on choices

OpenAPI supports validation; where applicable, we add validation rules.

  • We validate that SNI and identity are at most 253 characters (the maximum allowed for a DNS name). SNI is based on DNS, so the limit fits; for identity it might make less sense, since we will support URIs.
  • We validate conditions, but a resource may be created without a status. Condition validation is taken from the httproute resource since it follows a very similar format.
  • We do not require meshTls settings, but we require workloadIPs and ports.
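The rules in the list above can be restated as a small check. This is only an illustrative sketch in Python, not the CRD schema itself (which expresses these rules in OpenAPI); the function name `validate_external_workload` and the error strings are hypothetical.

```python
from __future__ import annotations

MAX_DNS_NAME_LEN = 253  # limit quoted above for SNI and identity


def validate_external_workload(resource: dict) -> list[str]:
    """Return a list of violations; an empty list means the sketch's rules pass."""
    errors = []
    spec = resource.get("spec", {})

    # workloadIPs and ports are required; meshTls and status are not.
    for field in ("workloadIPs", "ports"):
        if field not in spec:
            errors.append(f"spec.{field} is required")

    # meshTls is optional, but its values are length-checked when present.
    mesh_tls = spec.get("meshTls", {})
    for field in ("identity", "serverName"):
        value = mesh_tls.get(field)
        if value is not None and len(value) > MAX_DNS_NAME_LEN:
            errors.append(f"spec.meshTls.{field} exceeds {MAX_DNS_NAME_LEN} characters")

    return errors
```

A resource with no `status` and no `meshTls` passes, which matches the looser requirements described above.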

Unfortunately, we cannot add any server-side validation for IPs through the schema, since we plan on supporting both IPv4 and IPv6. All IPs will have to be parsed and validated in the controllers that read the resource.
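As a rough illustration of what controller-side parsing could look like, here is a minimal sketch using Python's standard `ipaddress` module. It is not the controllers' actual code, and the helper name `parse_workload_ips` is hypothetical; it only shows that one parser can accept both address families and reject anything else.

```python
import ipaddress


def parse_workload_ips(workload_ips):
    """Split workloadIPs entries into parsed addresses and rejected raw values."""
    parsed, rejected = [], []
    for entry in workload_ips:
        raw = entry.get("ip", "")
        try:
            # ip_address accepts both IPv4 and IPv6 textual forms.
            parsed.append(ipaddress.ip_address(raw))
        except ValueError:
            # Anything that is not a valid address is reported, not dropped silently.
            rejected.append(raw)
    return parsed, rejected
```

Each parsed address exposes a `version` attribute (4 or 6), which a reader of the resource could use to pick the address family.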

For printer columns, we list the age and the identity. IPs and ports will not render properly since printer columns only support primitive data types (record types can still be printed, but in debug format). One option would be to eliminate the need for an ip key and have the IPs printed as `["192.0.2.0", "192.0.3.0"]`.

Output:

NAME              IDENTITY                                           AGE
nginx-vm-test-3   foo.default.serviceaccount.linkerd.cluster.local   2m56s
nginx-vm-test-4   bar.default.serviceaccount.linkerd.cluster.local   2s

and with IPs as a printer column:

NAME              IPS                    PORTS
nginx-vm-test-3   [{"ip":"192.0.2.2"}]   [{"port":8080,"protocol":"TCP"},{"port":7777,"protocol":"TCP"}]

Example resource:

apiVersion: workload.linkerd.io/v1alpha1
kind: ExternalWorkload
metadata:
  annotations:
    config.linkerd.io/default-inbound-policy: deny
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"workload.linkerd.io/v1alpha1","kind":"ExternalWorkload","metadata":{"annotations":{"config.linkerd.io/default-inbound-policy":"deny"},"labels":{"app":"echo-service"},"name":"nginx-vm-test-3","namespace":"default"},"spec":{"meshTls":{"identity":"foo.default.serviceaccount.linkerd.cluster.local","serverName":"foo.default.serviceaccount.linkerd.cluster.local"},"ports":[{"port":8080,"protocol":"TCP"},{"port":7777,"protocol":"TCP"}],"workloadIPs":[{"ip":"192.0.2.2"}]},"status":{"conditions":[{"status":"True","type":"Ready"}]}}
  creationTimestamp: "2023-12-20T14:48:22Z"
  generation: 1
  labels:
    app: echo-service
  name: nginx-vm-test-3
  namespace: default
  resourceVersion: "31812"
  uid: 21a22e4e-36c7-4315-973c-90cc39ad83af
spec:
  meshTls:
    identity: foo.default.serviceaccount.linkerd.cluster.local
    serverName: foo.default.serviceaccount.linkerd.cluster.local
  ports:
  - port: 8080
    protocol: TCP
  - port: 7777
    protocol: TCP
  workloadIPs:
  - ip: 192.0.2.2
status:
  conditions:
  - status: "True"
    type: Ready

Signed-off-by: Matei David <[email protected]>
@mateiidavid mateiidavid requested a review from a team as a code owner December 20, 2023 14:56
@alpeb alpeb left a comment

This is exciting! 🥳

Comment on lines 99 to 101
required:
- workloadIPs
- ports

Why isn't identity required?

@mateiidavid (author) replied:

Good question. Ports and workload IPs are necessary to fill out an EndpointSlice, so they need to be present on the resource when we start indexing. But identity / TLS settings aren't strictly a necessity. I thought maybe it makes sense to have looser server-side validation requirements here to start with. Maybe it does make sense to also require these up front, though. We could always do that as a follow-up; could there be external endpoints expressed through this resource that do not require TLS settings? I might be thinking a bit too far ahead heh.
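To make the point about EndpointSlices concrete, here is a hedged Python sketch of how the two required fields map onto EndpointSlice-shaped data. It is not the indexing code from this PR; the function name `endpoint_slice_fields` is hypothetical, and the output only mirrors the `endpoints[].addresses` and `ports[]` fields of the Kubernetes EndpointSlice API (other fields such as `addressType` are omitted).

```python
def endpoint_slice_fields(spec: dict) -> dict:
    """Project an ExternalWorkload spec onto EndpointSlice-shaped fields."""
    return {
        # One endpoint carrying every workload IP; a missing workloadIPs key
        # raises KeyError, since there would be nothing to populate.
        "endpoints": [{"addresses": [entry["ip"] for entry in spec["workloadIPs"]]}],
        # TCP is the Kubernetes default protocol for EndpointSlice ports.
        "ports": [
            {"port": p["port"], "protocol": p.get("protocol", "TCP")}
            for p in spec["ports"]
        ],
    }
```

Nothing under `meshTls` is read here, which is consistent with the reasoning above that the TLS settings are not strictly needed to build the slice.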

@mateiidavid mateiidavid merged commit 31e1334 into main Jan 5, 2024
34 checks passed
@mateiidavid mateiidavid deleted the matei/expand-the-mesh-with-crds branch January 5, 2024 11:35
mateiidavid added a commit that referenced this pull request Jan 12, 2024
This edge release introduces a number of different fixes and improvements. More
notably, it introduces a new `cni-repair-controller` binary to the CNI plugin
image. The controller will automatically restart pods that have not received
their iptables configuration.

* Removed shortnames from Tap API resources to avoid colliding with existing
  Kubernetes resources ([#11816]; fixes [#11784])
* Introduced a new ExternalWorkload CRD to support upcoming mesh expansion
  feature ([#11805])
* Changed `MeshTLSAuthentication` resource validation to allow SPIFFE URI
  identities ([#11882])
* Introduced a new `cni-repair-controller` to the `linkerd-cni` DaemonSet to
  automatically restart misconfigured pods that are missing iptables rules
  ([#11699]; fixes [#11073])
* Fixed a `"duplicate metrics"` warning in the multicluster service-mirror
  component ([#11875]; fixes [#11839])
* Added metric labels and weights to `linkerd diagnostics endpoints` json
  output ([#11889])
* Changed how `Server` updates are handled in the destination service. The
  change will ensure that during a cluster resync, consumers won't be
  overloaded by redundant updates ([#11907])
* Changed `linkerd install` error output to add a newline when a Kubernetes
  client cannot be successfully initialised

[#11816]: #11816
[#11784]: #11784
[#11805]: #11805
[#11882]: #11882
[#11699]: #11699
[#11073]: #11073
[#11875]: #11875
[#11839]: #11839
[#11889]: #11889
[#11907]: #11907
[#11917]: #11917

Signed-off-by: Matei David <[email protected]>
@mateiidavid mateiidavid mentioned this pull request Jan 12, 2024
mateiidavid added a commit that referenced this pull request Jan 12, 2024
This edge release introduces a number of different fixes and improvements. More
notably, it introduces a new `cni-repair-controller` binary to the CNI plugin
image. The controller will automatically restart pods that have not received
their iptables configuration.

* Removed shortnames from Tap API resources to avoid colliding with existing
  Kubernetes resources ([#11816]; fixes [#11784])
* Introduced a new ExternalWorkload CRD to support upcoming mesh expansion
  feature ([#11805])
* Changed `MeshTLSAuthentication` resource validation to allow SPIFFE URI
  identities ([#11882])
* Introduced a new `cni-repair-controller` to the `linkerd-cni` DaemonSet to
  automatically restart misconfigured pods that are missing iptables rules
  ([#11699]; fixes [#11073])
* Fixed a `"duplicate metrics"` warning in the multicluster service-mirror
  component ([#11875]; fixes [#11839])
* Added metric labels and weights to `linkerd diagnostics endpoints` json
  output ([#11889])
* Changed how `Server` updates are handled in the destination service. The
  change will ensure that during a cluster resync, consumers won't be
  overloaded by redundant updates ([#11907])
* Changed `linkerd install` error output to add a newline when a Kubernetes
  client cannot be successfully initialised ([#11917])

[#11816]: #11816
[#11784]: #11784
[#11805]: #11805
[#11882]: #11882
[#11699]: #11699
[#11073]: #11073
[#11875]: #11875
[#11839]: #11839
[#11889]: #11889
[#11907]: #11907
[#11917]: #11917

Signed-off-by: Matei David <[email protected]>