Conversation
Force-pushed 85db7b9 to 6404ce3
//TODO(zalimeni) Do we need to check service-enable label here on service/deployments/replicasets?
// Should we expect owner is annotated, and if so, is that consistent for non-replicaset workloads?
// If so, update to register and deregister conditionally.
This 🧵 discusses maybe not needing explicit exclusion for services. I'm not 100% certain whether this controller should expect any opt-outs that will impact service registration, outside of disabled k8s namespaces from config, already checked above.
Input welcome! Lacking a strong opinion, I'll probably convert this to a follow-up ticket as a placeholder for a pre-GA decision.
// We have to fetch the owner and check its type because pod names cannot be disambiguated from pod owner names due to
// the `-` delimiter and unique ID parts also being valid name components.
Feel free to correct me if anyone knows better here. 🧵 for context here.
//TODO(zalimeni) Read from Consul or some other state to determine whether a write is needed.
// For now, we blindly write state on each reconcile to simplify implementation.
_, err := resourceClient.Write(ctx, &pbresource.WriteRequest{Resource: serviceResource})
My thinking on this right now is that we should just try to read the existing Service, and if we find it, check for total equality minus resource version. If there's a diff, rewrite.
If we move to an EndpointSlices/Pods-based approach in the future, we'll need to do some sort of checking for stale workload selector values, but since Endpoints gives us a full snapshot up to the 1K pod limit, we should be able to just upsert today w/o being concerned with drift.
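For illustration, the read-then-compare approach described above could be sketched roughly like this. The `resource` struct and `needsWrite` helper here are hypothetical stand-ins for the real `pbresource.Resource` type and client flow, not code from this PR:

```go
package main

import (
	"fmt"
	"reflect"
)

// resource is a simplified stand-in for the real pbresource.Resource;
// the actual type carries protobuf payloads and more metadata.
type resource struct {
	Version string            // server-assigned; must be masked out of the diff
	Ports   map[string]string // stand-in for the payload we reconcile
}

// needsWrite reports whether desired differs from current once the
// server-assigned Version is ignored. Only when this returns true would
// the reconciler issue a Write, avoiding needless clobbering.
func needsWrite(current, desired resource) bool {
	current.Version = ""
	desired.Version = ""
	return !reflect.DeepEqual(current, desired)
}

func main() {
	current := resource{Version: "42", Ports: map[string]string{"http": "8080"}}
	desired := resource{Ports: map[string]string{"http": "8080"}}
	fmt.Println(needsWrite(current, desired)) // identical payload: skip the write

	desired.Ports["http"] = "9090"
	fmt.Println(needsWrite(current, desired)) // payload drifted: rewrite
}
```

The point is only that the comparison excludes the server-assigned version, so repeated reconciles of an unchanged service become reads rather than writes.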
Was there a specific case you were trying to address by taking the diff and rewriting it? There was some discussion of running multiple versions of the controller concurrently in the future, and I wonder if this could create flapping with different workers fighting over the same service.
I think I naively assumed that if the write doesn't error, the resource service has persisted the service. Some future work is planned for later releases to sync the status of Consul resources back to Kubernetes resources. Do you think this would address any of your concerns?
Was there a specific case you were trying to address by taking the diff and rewriting it?
Mainly just avoiding needless writes - TBF I'm not fully up to speed on the resource API and state store changes, but my basic understanding is reads are cheaper, and we are theoretically capped in the 2-3K/s range for writes. This seemed like an easy way to avoid needless clobbering, esp. when potentially many reconcile loops attempt to write at once w/o versions and the baked-in CAS retries compete w/ each other.
There was some discussion of running multiple versions of the controller concurrently in the future, and I wonder if this could create flapping with different workers fighting over the same service.
I think the flapping would occur regardless of pre-reads? But that's a good callout. I'd hope we'd come up w/ some one-way version tagging similar or identical to the managed-by tag if needed to avoid issues from that. (Do you have a link to that conversation by chance?)
I think I naively assumed that if the write doesn't error, the resource service has persisted the service. Some future work is planned for later releases to sync the status of Consul resources back to Kubernetes resources. Do you think this would address any of your concerns?
That's my assumption too 👍🏻 ATM I'm only thinking of debouncing, though bidirectional sync would add complexity to that.
I think it was in a casual Zoom, so I don't happen to have a link to the convo.
Summary from Slack: not entirely conclusive, but we agree early optimization should be generally avoided until we get a better idea of write performance in the wild.
I think it's fair to argue that debouncing rapid identical writes falls under low-hanging fruit to reduce needless load, but it's leading me to additional questions about how we'll handle state in resources that may be mutated by multiple sources/data flows (not sure how much of this there will be - could resource Meta tagging be an example?), since we also can't just blindly rewrite records in that scenario.
Regardless, this is tagged for follow-up outside this PR.
VirtualPort: uint32(p.Port),
TargetPort:  p.TargetPort.String(),
Protocol:    common.GetPortProtocol(p.AppProtocol),
I'd appreciate a spot check on this mapping JIC I'm missing some nuance - cc @ishustava in case you have input
I don't think targetPort will work in the case where it is a number but the pod defines the port with a name. We will need to look up the corresponding port name in the pod. I'm open to changing how this works on the Pod side, but I'm not sure how you get around the name being optional while targetPort works as either.
I don't think targetPort will work in the case where it is a number but the pod defines the port with a name
(This might be wrong or obvious, feel free to correct me) It seems to me from the k8s docs that there's an inherent tradeoff that applies generally: you need to pick whether to use a name consistently across service and pods (which enables mixed pod port numbers under one name), or you need to pick a consistent pod port number and use that in the service. Either way, it seems selected pods need to adhere to the service's chosen int-or-string value for targetPort.
Our docs for Consul Service.TargetPort say target_port is the name of the workload port, but I took that to mean we defer totally to the value used by the k8s service per the typical scheme above.
It seems wrong to me to reach into the pods to infer anything beyond the workload selector, since the targetPort mapping isn't really a consul-k8s-specific limitation... But I might be missing a reason we'd want to do that.
I'm open to changing how this works on the Pod side
What did you have in mind here?
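For illustration, the int-or-string tradeoff described above can be modeled as follows. `containerPort` and `resolveTargetPort` are hypothetical names sketching the Kubernetes matching rule, not consul-k8s code:

```go
package main

import "fmt"

// containerPort models the relevant bits of a k8s pod container port.
type containerPort struct {
	Name string // optional in Kubernetes
	Port int
}

// resolveTargetPort models the k8s Service targetPort rule: a string
// value selects pod ports by name, a numeric value selects by number.
// Either way, selected pods must adhere to whichever form the Service chose.
func resolveTargetPort(byName string, byNumber int, ports []containerPort) (containerPort, bool) {
	for _, p := range ports {
		if byName != "" && p.Name == byName {
			return p, true
		}
		if byName == "" && p.Port == byNumber {
			return p, true
		}
	}
	return containerPort{}, false
}

func main() {
	ports := []containerPort{{Name: "http", Port: 8080}, {Name: "", Port: 9090}}

	// Name-based targetPort: pods may use mixed port numbers under one name.
	p, ok := resolveTargetPort("http", 0, ports)
	fmt.Println(p.Port, ok)

	// Number-based targetPort: works even when the pod port has no name.
	p, ok = resolveTargetPort("", 9090, ports)
	fmt.Println(p.Port, ok)
}
```

The unnamed-port case in the second lookup is exactly where a numeric targetPort has no name to hand to Consul's `TargetPort`.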
It seems wrong to me to reach into the pods to infer anything beyond the workload selector, since the targetPort mapping isn't really a consul-k8s-specific limitation
Yeah, I don't like the idea of reaching in either. The pods may be slightly different, and the names don't align, making the lookup challenging. But I think I disagree that there isn't something K8s-specific here. We generate a Consul port name per port, even if one wasn't defined in the Kube spec, and Kubernetes doesn't necessarily require you to use any name at all, so there is a mismatch of strong (consul) vs. weak (k8s) references. I think I had the opposite reaction that since the VM case is straightforward, we'd have to "fix it" in consul-k8s.
I'm open to changing how this works on the Pod side
No specific ideas here. I'm just trying not to look like I'm suggesting a singular solution.
Started 🧵 here to figure out how we want this to work
This conversation was ongoing as of EOD Friday so leaving for resolution outside this PR, but I'll try to keep things linked in Slack so the chain of 🧵s isn't lost from here.
For the short-term, the plan ATM is to use Consul Service TargetPort strictly as a string name. In the case where the K8s service targetPort is a number, we'll rely on best-effort inference of the best Consul TargetPort value based on the name used by the most prevalent endpoint subset for that port (e.g. none, or a specific name that isn't being directly targeted in K8s).
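The best-effort inference described above might look roughly like the following. `mostPrevalentPortName` is a hypothetical helper sketching the idea, not the eventual implementation:

```go
package main

import "fmt"

// mostPrevalentPortName picks the port name used by the largest number of
// endpoints for a numerically-targeted port, breaking ties by the
// lexicographically smallest name so the choice is deterministic. An empty
// string means the prevalent endpoints carry no port name at all.
func mostPrevalentPortName(nameCounts map[string]int) string {
	best, bestCount := "", -1
	for name, count := range nameCounts {
		if count > bestCount || (count == bestCount && name < best) {
			best, bestCount = name, count
		}
	}
	return best
}

func main() {
	// Eight endpoints expose the port under the name "http", two expose it unnamed:
	fmt.Println(mostPrevalentPortName(map[string]int{"http": 8, "": 2}))
}
```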
Force-pushed 6404ce3 to 0780abf
DanStough
left a comment
I didn't get a chance to review the test case yet, but everything else looks great! I wanted to send you my open-ended comments in advance, such as how targetPort works and the appProtocol handling.
case "http2":
	return pbcatalog.Protocol_PROTOCOL_HTTP2
case "grpc":
	return pbcatalog.Protocol_PROTOCOL_GRPC
I wonder if we technically need to include prefixes with some of these protocols. I was reading the API docs for appProtocol and it mentions only IANA names can be unprefixed. It's my first time seeing this, so I could be mistaken.
I also think an annotation on the service or pod might be easier for customers to support. e.g. the Bitnami Postgres Helm chart exposes annotations for services and certain pods but not a change of appProtocol.
I wonder if we technically need to include prefixes with some of these protocols. I was reading the API docs for appProtocol and it mentions only IANA names can be unprefixed. It's my first time seeing this, so I could be mistaken.
Yeah, I was wondering the same - we may need to iterate on this part. FWIW, Istio's solution (appProtocol and detection via port name) uses unprefixed values from what I can see, and Open Service Mesh also includes non-standard tcp and gRPC. I wonder whether the intent in the k8s docs was valid Service Name Syntax rather than a registered name - that seems arbitrarily limiting for new protocols, to me, when the purpose appears to be open-ended extension.
I also think an annotation on the service or pod might be easier for customers to support. e.g. the Bitnami Postgres Helm chart exposes annotations for services and certain pods but not a change of appProtocol.
Agreed, it seems like annotations could help ease operator pain here rather than plumbing appProtocol through templates - though maybe better for us to enforce "standard" k8s concepts and push that concern up the tooling chain rather than accommodate it ourselves? I can start a 🧵 in the multi-port channel and see whether a follow-up makes sense.
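If prefixes do turn out to be required for non-IANA names, one hedged option is to strip an optional domain prefix before matching. `normalizeAppProtocol` is an assumed helper sketching the idea, not existing code:

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeAppProtocol strips an optional "domain/" prefix from an
// appProtocol value (the k8s API docs reserve unprefixed values for IANA
// service names) and lower-cases the remainder so it could feed the same
// unprefixed switch used today, e.g. in GetPortProtocol.
func normalizeAppProtocol(appProtocol string) string {
	if i := strings.LastIndex(appProtocol, "/"); i >= 0 {
		appProtocol = appProtocol[i+1:]
	}
	return strings.ToLower(appProtocol)
}

func main() {
	fmt.Println(normalizeAppProtocol("example.com/gRPC")) // prefixed, non-IANA name
	fmt.Println(normalizeAppProtocol("http2"))            // unprefixed IANA-style name
}
```

This would let prefixed and unprefixed spellings of the same protocol resolve identically without committing to one form in user-facing specs.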
Added this as a bullet in follow-up ticket so we don't lose track but probably don't need to figure out right this second.
Circling back to this one to tie up the loose end: @ishustava, do you happen to have perspective on either of Dan's questions above?
My gut is to leave the unprefixed protocols in place for now since there seems to be precedent in other meshes, and wait to see if we need annotations in addition to just appProtocol. If that's not ideal, we can fix now or plan to follow up in 1.18.
Force-pushed 0780abf to 2e50816
zalimeni
left a comment
@DanStough @thisisnotashwin ok, I think this is ready for a next pass. I've added more tests around the essential CRUD bits as well as the pod fetching behavior, since that seemed valuable to lock in.
Assuming no major changes are needed, I'll plan to merge this to unblock testing while I follow up on remaining TODOs and Slack 🧵s.
DanStough
left a comment
Based on all the tradeoffs we discussed to get a version of this ready to test, LGTM.
Force-pushed 2e50816 to b6daa27
Force-pushed 395da7b to 31fa5c0
Implement the basic requirements of a new Endpoints controller that registers Services via Consul's V2 API. Further tests and TODOs will be addressed in follow-up changes.
Force-pushed 31fa5c0 to 202959d
Follows the pattern of iterative review in #2868, and currently targets that branch due to a dependence on common code added there.
This PR is part of a series of PRs that add functionality to the V2 Endpoints Controller. For the MVP task to be complete, there will be a few additional changes to resolve TODOs, debounce upsert writes if needed, and add further tests.
Changes proposed in this PR:
How I've tested this PR: Unit tests
How I expect reviewers to test this PR: 👁️🗨️👃🏻👁️🗨️
Checklist: