Introduce forward looking MCP enhancements#741
Introduce forward looking MCP enhancements#741istio-testing merged 14 commits intoistio:release-1.1from
Conversation
The initial version of Mesh Configuration Protocol (MCP) was introduced to decouple Pilot/Mixer from the k8s kube-apiserver. These enhancements address additional forward looking requirements as we bring Galley and the MCP API to beta quality. * Enable alternative control topologies where the source of configuration is not publicly accessible. * Provide a feedback mechanism to report the observed config state to the user (e.g via CRD status). * Improve performance at scale (e.g. Enterprise use case) * Rationalize the resource model The intent is to introduce these backwards incompatible API changes now before Galley and MCP ship as beta quality and on-by-default. Rationalization of the resource model and incremental improvements can be implemented immediately as they effect the contract between Pilot and Galley. Alternative control topologies and feedback/status are inherently new features and can be implemented later with the same APIs without concerns of breaking compatibility. Design proposal: https://goo.gl/RTKMwF
mcp/v1alpha1/metadata.proto
Outdated
|
|
||
| // Structured data that can be used by source and sink to communicate arbitrary | ||
| // metadata about this resource. | ||
| google.protobuf.Struct annotations = 5; |
There was a problem hiding this comment.
perhaps a map<string,Struct> ?
There was a problem hiding this comment.
I know conceptually the same thing can be implemented with a Struct as well, but having a top-level map<string,..> allows ownership/seggregation by different owners, without having to deal with struct parsing.
There was a problem hiding this comment.
Please don't use Structs. Super expensive and verbose and wire-inefficient.
There may be a use case ( like representing json or xml ) - but this is not it.
There was a problem hiding this comment.
... or we stick with map<string,string> which covers the majority of use cases we want to encourage (e.g. simple annotations). Arbitrary complex annotations would still be possible (via JSON), but they would be discouraged in favor of adding well defined field(s) in the common metadata or resource specific proto.
There was a problem hiding this comment.
map<string,string> sounds reasonable.
| // Type of resource collection that is being requested, e.g. | ||
| // | ||
| // istio/networking/v1alpha3/VirtualService | ||
| // k8s/<apiVersion>/<kind> |
There was a problem hiding this comment.
?? what is the k8s/apiVersion stuff for? Arent we converting k8s services to ServiceEntries and streaming them instead?
There was a problem hiding this comment.
I think the protocol can be used to watch/stream any k8s ( or consul, etc ) object.
There was a problem hiding this comment.
The protocol is more general than the current implementation. Pilot would continue to watch collections of Istio resources directly. e.g.
istio/networking/v1alpha3/VirtualService
istio/networking/v1alpha3/ServiceEntry
// etc ...
istio/authentication/v1alpha1/Policy
// etc ...
The k8s/apiVersion stuff could be used to move k8s resources between networks/clusters where direct access to the k8s apiserver is not possible.
|
Left some top level comments in the doc. |
| // this case the response_nonce is set to the nonce value | ||
| // in the Resources. ACK/NACK is determined by the presence | ||
| // of error_detail. | ||
| message RequestResources { |
There was a problem hiding this comment.
How about ResourceRequest ? That seems to be a more common pattern for naming a request.
There was a problem hiding this comment.
I would normally agree if this was only ever sent by the client. However, in the reverse service direction ResourceRequest is sent by the server.
| // When the RequestResources is an ACK or NACK message in response to a previous RequestResources, | ||
| // the response_nonce must be the nonce in the RequestResources. Otherwise response_nonce must | ||
| // be omitted. | ||
| string response_nonce = 4; |
There was a problem hiding this comment.
One of the biggest pain points in ADS is this 'nonce' and error_detail. It is never clear when a Request is a real request, and when it is just an ack/nack.
Keep in mind rpc.Status can also be 'OK' - so the field should probably be named status.
I would just make things clear - the request should include either an ack (status=OK), nack ( status != ok, response_nonce set) or an update/new request.
There was a problem hiding this comment.
The presence of response_nonce should be sufficient to differentiate ACK/NACK from updated watch, no?
- ACK (nonce!=""; error_details==nil)
- NACK (nonce!=""; error_details!=nil)
- New/Update request (nonce=="", error_details ignored)
Note that this is slightly different than the non-incremental spec where request/response are always paired in per-version nonce and version.
There was a problem hiding this comment.
Perhaps add this cheat-sheet to the comments?
mcp/v1alpha1/mcp.proto
Outdated
|
|
||
| // Opaque metadata extending the client identifier. | ||
| // Opaque metadata extending the node identifier. | ||
| google.protobuf.Struct metadata = 2; |
There was a problem hiding this comment.
This is unrelated to what's defined in metadata.proto, right? Then perhaps this should use a different name than metadata.
How about annotations, and use a map<string, string> just like metadata annotations?
There was a problem hiding this comment.
Metadata is a common concept (e.g. https://github.com/grpc/grpc-go/blob/master/Documentation/grpc-metadata.md). We end up with metadata at different layers, e.g.
- connection/stream metadata (grpc)
- node metadata (e.g. pod labels)
- resource metadata (e.g. k8s resource metadata)
Note that this was also carried over from the envoy/xDS protocol.
There was a problem hiding this comment.
I'm looking at this in the context of the MCP API proper, and reusing the same term for two different meanings within the same API feel confusing to me. I realize that the notion of metadata is generally known in the industry, but this specific API has a type called Metadata and it has nothing to do with this field, and so it would be preferable to chose a different name.
There was a problem hiding this comment.
The name is context sensitive, i.e. node metadata vs. resource metadata. It isn't ideal but I'm lacking a good alternative. I'm not sure annotation is any better since annotation appears under metadata elsewhere.
There was a problem hiding this comment.
IMHO, annotation (with map<string, string> ) still seems like a more consistent option. Even though there is a little bit of mismatch (i.e. an intermediary Metadata resource).
| // When the RequestResources is an ACK or NACK message in response to a previous RequestResources, | ||
| // the response_nonce must be the nonce in the RequestResources. Otherwise response_nonce must | ||
| // be omitted. | ||
| string response_nonce = 4; |
There was a problem hiding this comment.
Perhaps add this cheat-sheet to the comments?
|
On Wed, Jan 2, 2019 at 12:19 PM Jason Young ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In mcp/v1alpha1/mcp.proto
<#741 (comment)>:
> + // Type of resource collection that is being requested, e.g.
+ //
+ // istio/networking/v1alpha3/VirtualService
+ // k8s/<apiVersion>/<kind>
+ string collection = 2;
+
+ // When the RequestResources is the first in a stream, the initial_resource_versions must
+ // be populated. Otherwise, initial_resource_versions must be omitted. The keys are the
+ // resources names of the MCP resources known to the MCP client. The values in the map
+ // are the associated resource level version info.
+ map<string, string> initial_resource_versions = 3;
+
+ // When the RequestResources is an ACK or NACK message in response to a previous RequestResources,
+ // the response_nonce must be the nonce in the RequestResources. Otherwise response_nonce must
+ // be omitted.
+ string response_nonce = 4;
The presence of response_nonce should be sufficient to differentiate
ACK/NACK from updated watch, no?
- ACK (nonce!=""; error_details==nil)
- NACK (nonce!=""; error_details!=nil)
- New/Update request (nonce=="", error_details ignored)
It's not that clear - for example on reconnect you may have an ACK/NACK
from a previous message ( which may
have went to a different server ).
And you can never know if the message has additional resources to watch -
we need to keep doing diffs instead of
having a clear separating between updates and ACK/NACK.
So far I haven't seen any use case or benefit of mixing ACK/NACK with an
update of the watched resources ( and
the forced diff and confusion).
…
Note that this is slightly different than the non-incremental spec where
request/response are always paired in per-version nonce *and* version.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#741 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AAFI6vBxxxVqsOQ2EjFQgL39x-m4jMMcks5u_RRqgaJpZM4ZesPL>
.
|
| @@ -0,0 +1,2 @@ | |||
| *.pb.go linguist-generated=true | |||
There was a problem hiding this comment.
Should we also include the Python files?
There was a problem hiding this comment.
The generated python protobuf code is already hidden by default.
https://github.com/github/linguist/blob/e761f9b013e5b61161481fcb898b59721ee40e3d/lib/linguist/generated.rb#L302
https://github.com/github/linguist/blob/e761f9b013e5b61161481fcb898b59721ee40e3d/lib/linguist/generated.rb#L302
What specifically isn't clear? On re-connect a new stream is established and the sink will send a new request w/empty nonce. The new request may optionally include a list of previously received resources for fast-resume.
I'm not sure I follow the problem here. Why wouldn't we know which resources to watch? With incremental, the list of resources to subscribe/unsubscribe to in the request is already a delta. The source would accumulate these changes to modify what resources are pushed to the sink. |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ayj, ozevren The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
* Introduce forward looking MCP enhancements The initial version of Mesh Configuration Protocol (MCP) was introduced to decouple Pilot/Mixer from the k8s kube-apiserver. These enhancements address additional forward looking requirements as we bring Galley and the MCP API to beta quality. * Enable alternative control topologies where the source of configuration is not publicly accessible. * Provide a feedback mechanism to report the observed config state to the user (e.g via CRD status). * Improve performance at scale (e.g. Enterprise use case) * Rationalize the resource model The intent is to introduce these backwards incompatible API changes now before Galley and MCP ship as beta quality and on-by-default. Rationalization of the resource model and incremental improvements can be implemented immediately as they effect the contract between Pilot and Galley. Alternative control topologies and feedback/status are inherently new features and can be implemented later with the same APIs without concerns of breaking compatibility. Design proposal: https://goo.gl/RTKMwF * fix linter error * add missing generated file * proto-commit * remove python/istio_api/mcp/v1alpha1/envelope_pb2.py * s/envelope/resource * s/client/node * make proto-commit * fix comments * add system_version_info for compatibility with non-incremental MCP * address review comments * s/node/sink_node * address more review comments * update resource name documentation
* Introduce forward looking MCP enhancements The initial version of Mesh Configuration Protocol (MCP) was introduced to decouple Pilot/Mixer from the k8s kube-apiserver. These enhancements address additional forward looking requirements as we bring Galley and the MCP API to beta quality. * Enable alternative control topologies where the source of configuration is not publicly accessible. * Provide a feedback mechanism to report the observed config state to the user (e.g via CRD status). * Improve performance at scale (e.g. Enterprise use case) * Rationalize the resource model The intent is to introduce these backwards incompatible API changes now before Galley and MCP ship as beta quality and on-by-default. Rationalization of the resource model and incremental improvements can be implemented immediately as they effect the contract between Pilot and Galley. Alternative control topologies and feedback/status are inherently new features and can be implemented later with the same APIs without concerns of breaking compatibility. Design proposal: https://goo.gl/RTKMwF * fix linter error * add missing generated file * proto-commit * remove python/istio_api/mcp/v1alpha1/envelope_pb2.py * s/envelope/resource * s/client/node * make proto-commit * fix comments * add system_version_info for compatibility with non-incremental MCP * address review comments * s/node/sink_node * address more review comments * update resource name documentation
The initial version of Mesh Configuration Protocol (MCP) was
introduced to decouple Pilot/Mixer from the k8s kube-apiserver. These
enhancements address additional forward looking requirements as we
bring Galley and the MCP API to beta quality.
Enable alternative control topologies where the source of
configuration is not publicly accessible.
Provide a feedback mechanism to report the observed config state to
the user (e.g via CRD status).
Improve performance at scale (e.g. Enterprise use case)
Rationalize the resource model
The intent is to introduce these backwards incompatible API changes
now before Galley and MCP ship as beta quality and on-by-default.
Rationalization of the resource model and incremental improvements can
be implemented immediately as they effect the contract between Pilot
and Galley. Alternative control topologies and feedback/status are
inherently new features and can be implemented later with the same
APIs without concerns of breaking compatibility.
Design proposal: https://goo.gl/RTKMwF