Skip to content

Introduce forward looking MCP enhancements#741

Merged
istio-testing merged 14 commits intoistio:release-1.1from
ayj:release-1.1-mcp-enhancements
Jan 9, 2019
Merged

Introduce forward looking MCP enhancements#741
istio-testing merged 14 commits intoistio:release-1.1from
ayj:release-1.1-mcp-enhancements

Conversation

@ayj
Copy link
Copy Markdown
Contributor

@ayj ayj commented Dec 21, 2018

The initial version of Mesh Configuration Protocol (MCP) was
introduced to decouple Pilot/Mixer from the k8s kube-apiserver. These
enhancements address additional forward looking requirements as we
bring Galley and the MCP API to beta quality.

  • Enable alternative control topologies where the source of
    configuration is not publicly accessible.

  • Provide a feedback mechanism to report the observed config state to
    the user (e.g via CRD status).

  • Improve performance at scale (e.g. Enterprise use case)

  • Rationalize the resource model

The intent is to introduce these backwards incompatible API changes
now before Galley and MCP ship as beta quality and on-by-default.

Rationalization of the resource model and incremental improvements can
be implemented immediately as they effect the contract between Pilot
and Galley. Alternative control topologies and feedback/status are
inherently new features and can be implemented later with the same
APIs without concerns of breaking compatibility.

Design proposal: https://goo.gl/RTKMwF

The initial version of Mesh Configuration Protocol (MCP) was
introduced to decouple Pilot/Mixer from the k8s kube-apiserver. These
enhancements address additional forward looking requirements as we
bring Galley and the MCP API to beta quality.

* Enable alternative control topologies where the source of
  configuration is not publicly accessible.

* Provide a feedback mechanism to report the observed config state to
  the user (e.g via CRD status).

* Improve performance at scale (e.g. Enterprise use case)

* Rationalize the resource model

The intent is to introduce these backwards incompatible API changes
now before Galley and MCP ship as beta quality and on-by-default.

Rationalization of the resource model and incremental improvements can
be implemented immediately as they effect the contract between Pilot
and Galley. Alternative control topologies and feedback/status are
inherently new features and can be implemented later with the same
APIs without concerns of breaking compatibility.

Design proposal: https://goo.gl/RTKMwF
@ayj ayj requested review from geeknoid and ozevren December 21, 2018 18:56
@googlebot googlebot added the cla: yes Set by the Google CLA bot to indicate the author of a PR has signed the Google CLA. label Dec 21, 2018
@ayj
Copy link
Copy Markdown
Contributor Author

ayj commented Dec 21, 2018

cc @jeffmendoza @Nino-K


// Structured data that can be used by source and sink to communicate arbitrary
// metadata about this resource.
google.protobuf.Struct annotations = 5;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps a map<string,Struct> ?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know conceptually the same thing can be implemented with a Struct as well, but having a top-level map<string,..> allows ownership/seggregation by different owners, without having to deal with struct parsing.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't use Structs. Super expensive and verbose and wire-inefficient.
There may be a use case ( like representing json or xml ) - but this is not it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... or we stick with map<string,string> which covers the majority of use cases we want to encourage (e.g. simple annotations). Arbitrary complex annotations would still be possible (via JSON), but they would be discouraged in favor of adding well defined field(s) in the common metadata or resource specific proto.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

map<string,string> sounds reasonable.

// Type of resource collection that is being requested, e.g.
//
// istio/networking/v1alpha3/VirtualService
// k8s/<apiVersion>/<kind>
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?? what is the k8s/apiVersion stuff for? Arent we converting k8s services to ServiceEntries and streaming them instead?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the protocol can be used to watch/stream any k8s ( or consul, etc ) object.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The protocol is more general than the current implementation. Pilot would continue to watch collections of Istio resources directly. e.g.

istio/networking/v1alpha3/VirtualService
istio/networking/v1alpha3/ServiceEntry
// etc ...
istio/authentication/v1alpha1/Policy
// etc ...

The k8s/apiVersion stuff could be used to move k8s resources between networks/clusters where direct access to the k8s apiserver is not possible.

@rshriram
Copy link
Copy Markdown
Member

Left some top level comments in the doc.

// this case the response_nonce is set to the nonce value
// in the Resources. ACK/NACK is determined by the presence
// of error_detail.
message RequestResources {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about ResourceRequest ? That seems to be a more common pattern for naming a request.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would normally agree if this was only ever sent by the client. However, in the reverse service direction ResourceRequest is sent by the server.

// When the RequestResources is an ACK or NACK message in response to a previous RequestResources,
// the response_nonce must be the nonce in the RequestResources. Otherwise response_nonce must
// be omitted.
string response_nonce = 4;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the biggest pain points in ADS is this 'nonce' and error_detail. It is never clear when a Request is a real request, and when it is just an ack/nack.

Keep in mind rpc.Status can also be 'OK' - so the field should probably be named status.

I would just make things clear - the request should include either an ack (status=OK), nack ( status != ok, response_nonce set) or an update/new request.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The presence of response_nonce should be sufficient to differentiate ACK/NACK from updated watch, no?

  • ACK (nonce!=""; error_details==nil)
  • NACK (nonce!=""; error_details!=nil)
  • New/Update request (nonce=="", error_details ignored)

Note that this is slightly different than the non-incremental spec where request/response are always paired in per-version nonce and version.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps add this cheat-sheet to the comments?


// Opaque metadata extending the client identifier.
// Opaque metadata extending the node identifier.
google.protobuf.Struct metadata = 2;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unrelated to what's defined in metadata.proto, right? Then perhaps this should use a different name than metadata.

How about annotations, and use a map<string, string> just like metadata annotations?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Metadata is a common concept (e.g. https://github.com/grpc/grpc-go/blob/master/Documentation/grpc-metadata.md). We end up with metadata at different layers, e.g.

  • connection/stream metadata (grpc)
  • node metadata (e.g. pod labels)
  • resource metadata (e.g. k8s resource metadata)

Note that this was also carried over from the envoy/xDS protocol.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm looking at this in the context of the MCP API proper, and reusing the same term for two different meanings within the same API feel confusing to me. I realize that the notion of metadata is generally known in the industry, but this specific API has a type called Metadata and it has nothing to do with this field, and so it would be preferable to chose a different name.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name is context sensitive, i.e. node metadata vs. resource metadata. It isn't ideal but I'm lacking a good alternative. I'm not sure annotation is any better since annotation appears under metadata elsewhere.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO, annotation (with map<string, string> ) still seems like a more consistent option. Even though there is a little bit of mismatch (i.e. an intermediary Metadata resource).

// When the RequestResources is an ACK or NACK message in response to a previous RequestResources,
// the response_nonce must be the nonce in the RequestResources. Otherwise response_nonce must
// be omitted.
string response_nonce = 4;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps add this cheat-sheet to the comments?

@costinm
Copy link
Copy Markdown
Contributor

costinm commented Jan 3, 2019 via email

@@ -0,0 +1,2 @@
*.pb.go linguist-generated=true
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also include the Python files?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ayj
Copy link
Copy Markdown
Contributor Author

ayj commented Jan 3, 2019

It's not that clear - for example on reconnect you may have an ACK/NACK
from a previous message ( which may
have went to a different server ).

What specifically isn't clear? On re-connect a new stream is established and the sink will send a new request w/empty nonce. The new request may optionally include a list of previously received resources for fast-resume.

And you can never know if the message has additional resources to watch -
we need to keep doing diffs instead of
having a clear separating between updates and ACK/NACK.

I'm not sure I follow the problem here. Why wouldn't we know which resources to watch? With incremental, the list of resources to subscribe/unsubscribe to in the request is already a delta. The source would accumulate these changes to modify what resources are pushed to the sink.

@ozevren
Copy link
Copy Markdown
Contributor

ozevren commented Jan 9, 2019

/lgtm

@istio-testing
Copy link
Copy Markdown
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ayj, ozevren

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@istio-testing istio-testing merged commit aa7083e into istio:release-1.1 Jan 9, 2019
yangminzhu pushed a commit to yangminzhu/api that referenced this pull request Jan 15, 2019
* Introduce forward looking MCP enhancements

The initial version of Mesh Configuration Protocol (MCP) was
introduced to decouple Pilot/Mixer from the k8s kube-apiserver. These
enhancements address additional forward looking requirements as we
bring Galley and the MCP API to beta quality.

* Enable alternative control topologies where the source of
  configuration is not publicly accessible.

* Provide a feedback mechanism to report the observed config state to
  the user (e.g via CRD status).

* Improve performance at scale (e.g. Enterprise use case)

* Rationalize the resource model

The intent is to introduce these backwards incompatible API changes
now before Galley and MCP ship as beta quality and on-by-default.

Rationalization of the resource model and incremental improvements can
be implemented immediately as they effect the contract between Pilot
and Galley. Alternative control topologies and feedback/status are
inherently new features and can be implemented later with the same
APIs without concerns of breaking compatibility.

Design proposal: https://goo.gl/RTKMwF

* fix linter error

* add missing generated file

* proto-commit

* remove python/istio_api/mcp/v1alpha1/envelope_pb2.py

* s/envelope/resource

* s/client/node

* make proto-commit

* fix comments

* add system_version_info for compatibility with non-incremental MCP

* address review comments

* s/node/sink_node

* address more review comments

* update resource name documentation
louiscryan pushed a commit to louiscryan/api that referenced this pull request Jan 16, 2019
* Introduce forward looking MCP enhancements

The initial version of Mesh Configuration Protocol (MCP) was
introduced to decouple Pilot/Mixer from the k8s kube-apiserver. These
enhancements address additional forward looking requirements as we
bring Galley and the MCP API to beta quality.

* Enable alternative control topologies where the source of
  configuration is not publicly accessible.

* Provide a feedback mechanism to report the observed config state to
  the user (e.g via CRD status).

* Improve performance at scale (e.g. Enterprise use case)

* Rationalize the resource model

The intent is to introduce these backwards incompatible API changes
now before Galley and MCP ship as beta quality and on-by-default.

Rationalization of the resource model and incremental improvements can
be implemented immediately as they effect the contract between Pilot
and Galley. Alternative control topologies and feedback/status are
inherently new features and can be implemented later with the same
APIs without concerns of breaking compatibility.

Design proposal: https://goo.gl/RTKMwF

* fix linter error

* add missing generated file

* proto-commit

* remove python/istio_api/mcp/v1alpha1/envelope_pb2.py

* s/envelope/resource

* s/client/node

* make proto-commit

* fix comments

* add system_version_info for compatibility with non-incremental MCP

* address review comments

* s/node/sink_node

* address more review comments

* update resource name documentation
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla: yes Set by the Google CLA bot to indicate the author of a PR has signed the Google CLA.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants