Re-Introducing Protocol Buffer Representation #626
Signed-off-by: Day, Jeremy(jday) <[email protected]>
ping @evankanderson

Is the ownership story regarding Protobuf resolved, i.e. did Google put it into the CNCF yet?

Ideally this could replace the earlier protobuf definition PR, and possibly we could then discuss Liiklus' proto. cc @bsideup
Signed-off-by: Day, Jeremy(jday) <[email protected]>
@markfisher - I toyed with adding gRPC, but that might be more of a transport-specification thing. Do we consider gRPC a transport? I've updated the schema to represent the required and optional attributes as first-class fields.

I believe gRPC is part of the CNCF, so a proto format for CloudEvents would align with gRPC transport definitions. I'll take a look later today; I think there was also an earlier protobuf definition PR, so we might want to reconcile the two.
protobuf-format.md
Outdated

> NOTE: Field assignment numbers are non-normative.
Field assignment numbers absolutely have to be defined by the spec. Proto wire format encodes tag + type as a prefix, so if the tag doesn't match the expected type, the entire message will be discarded during decoding.
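Evan's point about tag numbers can be made concrete with a pure-Python sketch of the wire-format "key" that prefixes every field (the helper name is illustrative, not from any spec):

```python
# Hypothetical helper: computes the varint "key" that prefixes every
# protobuf field on the wire: (field_number << 3) | wire_type.
def field_key(field_number: int, wire_type: int) -> int:
    return (field_number << 3) | wire_type

# Wire type 2 = length-delimited (strings, bytes, embedded messages).
# A length-delimited field with tag 1 is always prefixed by 0x0A:
assert field_key(1, 2) == 0x0A

# If a sender uses tag 1 but the receiver's schema expects that data
# under tag 2, the receiver sees key 0x12 instead and the field no
# longer lines up with the expected type.
assert field_key(2, 2) == 0x12
```

This is why mismatched tag assignments break decoding: the key encodes only the number and wire type, so both sides must agree on the numbers up front.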
I think my comment was misleading ... what I was trying to imply was that the content in this document is 'representative', but the actual schema is defined in the `spec.proto` file .. I'll clean that up.
* string
* bytes
* protobuf object/message
I thought this was a JSON structure (possibly equivalent to protobuf `Struct`).
Not sure I follow your comment. My intention was to allow the transportation of proto event data in a way that is more idiomatic for proto, i.e. there's no need to marshal the data into a separate field; the pack/unpack feature of `Any` supports this directly.
I'll add an example..
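For context on the `Any` mechanics being discussed: `Any` is itself an ordinary message (`string type_url = 1; bytes value = 2;`), and packing stores the payload's full type name plus its serialized bytes. A hedged, pure-stdlib sketch of that wire shape (`example.Data` and the helper names are made up for illustration):

```python
def varint(n: int) -> bytes:
    # unsigned LEB128, the varint used by the protobuf wire format
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        out.append(byte | (0x80 if n else 0))
        if not n:
            return bytes(out)

def length_delimited(field_number: int, payload: bytes) -> bytes:
    # key (field_number << 3) | 2, then length, then the raw payload
    return varint((field_number << 3) | 2) + varint(len(payload)) + payload

inner = b"\x08\x2a"  # a serialized message whose int32 field 1 is set to 42
packed = (length_delimited(1, b"type.googleapis.com/example.Data")
          + length_delimited(2, inner))

assert packed[:1] == b"\x0a"      # field 1 (type_url), length-delimited
assert b"example.Data" in packed  # the type name travels with the bytes
```

This also shows why Evan later notes that `Any` encodes a type URL alongside the payload: the two can disagree with `datacontenttype` if both are populated.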
```proto
string text_data = 3;

// Protobuf Message data
google.protobuf.Any proto_data = 4;
```
I don't think you want to use `Any` here; `Any` is basically a wrapper for a string type and an enveloped message. For CloudEvents, I think it would be more natural to put the message bytes into `raw_data`, and the message type into a CE extension attribute.
See my previous comment ... if I'm dealing natively with protobuf it's not very idiomatic to marshal it separately.
I don't believe the spec calls for all event data to be encoded as a string.
protobuf-format.md
Outdated

> of the content. This can be determined by examining the data for invalid UTF-8 sequences or by consulting the `datacontenttype` attribute.
>
> * Where the data is a protobuf message it MUST be stored in the `proto_data`
Why complicate this vs. just using `bytes` (and possibly needing to decode the payload a second time), vs. needing to scan the payload to determine which field to put the content into?
Also, is it an error to receive a UTF-8 string or a proto message as `raw_data`? If not, this seems like a case where the recipient will know what to do best.
I get your point, but I'm not sure it complicates things. My thought process was that it was a bit odd to put textual content into a byte array when proto allows us to be explicit.
FWIW, the binary encoding of protobuf lets you "pun" two different protos between different length-delimited field types, like so:

```proto
// Use this type if you are an intermediate router and don't need to care
// about interpreting or validating the payload.
message Envelope {
  string name = 1;
  bytes payload = 2;
}

message Data {
  map<string, int32> values = 1;
  double magnitude = 2;
}

// If you are an endpoint node, decode Envelope as Decoded and you can
// interpret payload as the Data message.
message Decoded {
  string name = 1;
  Data data = 2;
}
```
The main benefit of using `string` vs `bytes` is that proto will generate the correct types in the auto-generated code, possibly verifying that the string is valid UTF-8 along the way (I don't recall if all implementations verify the UTF-8-ness of the bytestream). The cost is that any proto interpretation which understands the message will unpack and verify all the inner payload fields, whether or not they are used by the application.
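The punning works because `bytes` fields and embedded-message fields share wire type 2, so they encode identically. A pure-stdlib sketch under that assumption (helper names are illustrative):

```python
import struct

def varint(n: int) -> bytes:
    # unsigned LEB128 varint, as used by the protobuf wire format
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        out.append(byte | (0x80 if n else 0))
        if not n:
            return bytes(out)

def length_delimited(field_number: int, payload: bytes) -> bytes:
    # one encoder serves strings, bytes, AND embedded messages (wire type 2)
    return varint((field_number << 3) | 2) + varint(len(payload)) + payload

# Serialized `Data`: `double magnitude = 2` -> key (2<<3)|1, then 8 LE bytes.
data_msg = bytes([(2 << 3) | 1]) + struct.pack("<d", 2.5)

# Envelope's `bytes payload = 2` and Decoded's `Data data = 2` wrap the
# same content identically, so the pun is safe on the wire:
envelope_field = length_delimited(2, data_msg)
decoded_field = length_delimited(2, data_msg)
assert envelope_field == decoded_field
assert envelope_field[0] == (2 << 3) | 2  # same key either way
```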
protobuf-format.md
Outdated

```java
public static Spec.CloudEvent plainTextExample() {
    Spec.CloudEvent.Builder ceBuilder = Spec.CloudEvent.newBuilder();
```
You never define the enclosing CloudEvent type, only `extensions` and `data_oneof`.
Spec.CloudEvent is the enclosing type; the example method returns an instance of a CloudEvent.
```proto
bytes ce_binary = 4;
string ce_uri = 5;
string ce_uri_reference = 6;
google.protobuf.Timestamp ce_timestamp = 7;
```
How are unknown attribute keys handled? If I get a string for "datakey" with value "/path", should it be encoded as `ce_string`, `ce_binary`, or `ce_uri_reference`? (The same is also true for timestamp.)
The originator of the event presumably knows the 'type', so they would populate the extension attribute accordingly.
I'll add an example...
What about a component which is adapting structured JSON input from e.g. AWS Kinesis to gRPC transport? This sort of intermediate component may not be aware of the types of all extensions (by design), but without type information from the upstream transport, a `datakey` attribute might be transcribed to a different encoding than the recipient is expecting.
For an example of what this looks like in another typed transport, see the AMQP spec, where clients need to be prepared to accept either `string` or typed values for each attribute.
I think we're conflating 'transport' and 'format' .. this spec is only applicable in structured mode, and in those situations the payload is opaque (apologies if I mis-stated that somewhere; now you've got me worried).
In AMQP this would be transported with an application/cloudevents+x-protobuf content type.
It'd be the gRPC transport specification (if we consider it a transport) that would define how binary and structured payloads are exchanged.
Switching between different structured formats (JSON->Proto, Avro->JSON) is a higher-order function and not a concern of the transport binding, I believe.
Sorry, I thought this was handled, but I'm looking back, and it looks like we talked past each other.
Assume the following structured JSON CloudEvent with no payload:

```json
{
  "specversion" : "1.0",
  "type" : "com.example.someevent",
  "source" : "/mycontext",
  "id" : "A234-1234-1234",
  "time" : "2018-04-05T17:31:00Z",
  "comexampleextension1" : "2018-04-05T17:31:00Z"
}
```

What does the equivalent structured Proto CloudEvent look like? (using text format for readability)

```text
id: "A234-1234-1234"
source: "/mycontext"
spec_version: "1.0"
type: "com.example.someevent"
attributes: [{
  key: "time"
  value: { ce_timestamp: {...} }
}, {
  key: "comexampleextension1"
  value: { ce_string: "...." }
}]
```

Alternately, `comexampleextension1` could be encoded as:

```text
{
  key: "comexampleextension1"
  value: { ce_timestamp: { ... } }
}
```

If there is a generic JSON -> Proto CloudEvents bridge:

- How does it know how to encode `comexampleextension1`?
- Do all downstream clients need to be able to handle both `ce_timestamp` and `ce_string` formats?
- Does the same issue apply to the other types?

This is somewhat of a generic problem with the CloudEvents extension model, but mentioning it again here.
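The bridge's dilemma can be sketched in a few lines: both attributes carry the same JSON string, and nothing in the JSON marks the intent. (The regex and variable names below are illustrative only, not from any spec.)

```python
import re

# A naive bridge can only guess: RFC 3339-looking strings could be
# ce_timestamp or ce_string, and the JSON carries no type marker.
looks_like_rfc3339 = re.compile(
    r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:Z|[+-]\d{2}:\d{2})$")

time_attr = "2018-04-05T17:31:00Z"   # defined by the spec as a timestamp
ext_attr = "2018-04-05T17:31:00Z"    # comexampleextension1: string or timestamp?

assert looks_like_rfc3339.match(time_attr)
assert looks_like_rfc3339.match(ext_attr)  # syntax alone can't disambiguate
```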
protobuf-format.md
Outdated

> direct support of the CloudEvent [type system][ce-types].

```proto
map<string, CloudEventAttribute> extensions = 1;
```
This map is labeled `extensions`. Are the core attributes handled the same way, or by unique tags?
The example at the end suggests that known attributes are assigned unique tags.
Originally I had all the defined attributes in a map structure; I flipped to a model where the defined attributes are explicit (as in the JSON schema).
So in that model only extension attributes need to be carried in the map.
My concern is about what happens when core attributes are added, e.g. in a 1.1 spec.
Clients which were built before the 1.1 spec will map these attributes into the `extensions` map, while post-1.1 clients will map them into a unique tag-assigned field. Recipients of messages will then need to write complicated logic to map pre-1.1 messages to post-1.1 messages, losing most of the benefit of code auto-generation for protocol buffers.
Yes, this is a "perfect is the enemy of the good" case, but my experience has been that not planning for expansion tends to lead to "everything was good until we tried to touch something, and now it's all terrible."
Another acceptable scenario is to accept that we can't add further attributes to the 1.0 proto message definition, and all additional attributes (whether currently documented extensions or not-yet-defined extensions) must be stored in `extensions`.
The last option is to keep a global tag number registry mapping extension names to tags, but that only helps humans and doesn't really help conversion gateways, which would need to be able to load the list dynamically when transforming a message between proto and other encodings.
Maybe there's a half-way house solution where all the 'required' attributes are explicitly defined but the optional ones aren't, on the basis that we can't add any required attributes without changing the major version..
spec.proto
Outdated

```proto
google.protobuf.Timestamp time = 8;

// -- CloudEvent Extension Attributes
map<string, CloudEventAttribute> extensions = 9;
```
See also my comments in #576 (comment)
You may want to define all optional attributes (core and extension) as part of the same string-keyed map, rather than putting a few as tags but not later-defined attributes.
That's an interesting point ... we don't do that in the other format definitions, but it is something to consider.
It does have the potential to introduce complexity where consumers/producers might be using different versions of the spec as new extensions are added - proto can cope with that, but it would increase complexity for consumers.
I'd argue for sticking with the defined 'required' and 'optional' context attributes.
I expect a follow-on to the proto definition would be a set of gRPC transports. At a minimum, this would be unary request-response and streaming request-response, but there might be other useful profiles.
(oh, and see also #576)
@JemDay Nice! I have a couple of questions, since I think I'm missing some important details:
Thanks in advance!
```proto
bytes ce_binary = 4;
string ce_uri = 5;
string ce_uri_reference = 6;
google.protobuf.Timestamp ce_timestamp = 7;
```
If we use `google.protobuf.Timestamp` here, we will always be using UTC for the timezone, while the spec says it should be a timezoned datetime (which carries extra information and can be used to align the notification schedule, for example).
With respect to your prior comment (which I can't comment on directly):
1 - I think I updated the spec after you commented; the context attributes defined in the spec are now represented explicitly.
2 - Again, with the restructure I don't think that's an issue. In proto, typically everything is optional; I believe there is a way to flag 'required', but proto purists may object :-)
3 - For me it's about retaining the type information; a receiver can determine that the content is a URI because the value is in that field.
This might be a question for the larger group; my interpretation was that the intention was to ensure timezone information was included, so representing the value as UTC met that requirement from my point of view.
On closer inspection there's an implication that timestamps MUST be communicated as strings - which seems a bit inefficient, especially when native representations are available.
Proto2 has a notion of optional/required semantics. These were dropped in proto3 due to operational experience -- as protocol buffers evolved over time, fields which were previously "required" might no longer make sense, but converting a required field to optional was an extensive multi-stage rollout process across all consumers of the protocol buffer, even those which did not use the fields explicitly (like intermediate proxies).
Two notes here on details, while I'm still looking at the broader conversation. (Sorry for parachuting in; a colleague pointed me at this issue, and this particular subthread touches on things I know about.)
While proto3 still doesn't have "required" semantics, it has very recently regained the idea of "optional" fields. The word "optional" is pretty much historical now, but it's really about "does this field support the notion of being implicitly set vs explicitly set". An optional int32 proto3 field that has been explicitly set to 0 by user code is serialized even though 0 is the default value. A regular int32 proto3 field will never have its default value serialized, whether it was set by the user or not.
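A minimal proto3 sketch of the presence distinction described above (message and field names are illustrative):

```proto
syntax = "proto3";

message Example {
  // Implicit presence: an explicitly-set 0 is indistinguishable from
  // unset and is never serialized.
  int32 count = 1;

  // Explicit presence (proto3 "optional"): a value set to 0 by user
  // code IS serialized, so set-vs-unset survives the round trip.
  optional int32 limit = 2;
}
```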
Separately, on time zones, I'd like us to be really clear on terminology. RFC 3339 does not allow the time zone to be conveyed. It allows a UTC offset to be specified. They're not the same thing, and that affects scheduling. If I send an event with timestamp that is in UTC+1 (as the UK is right now), and I promise to send another event one day later in local time, you can't tell whether that will be another timestamp with an offset of +1 or not. It's unfortunate that RFC 3339 (and ISO-8601 before it) conflates UTC offsets and time zones, but let's try not to in any CloudEvents documentation :)
This part definitely needs more discussion - if the format needs to be able to store a timestamp with offset from UTC as per RFC 3339, Timestamp
will not do that on its own. We could create our own message for that concept, but it wouldn't be as easy to handle - Timestamp
has built-in goodness in terms of a readable JSON representation, and utility methods in at least some languages (e.g. C#).
I would suggest that in retrospect, the CE spec deciding that a timestamp doesn't just represent an instant in time may have been a mistake - but it's unclear what we need to do now. Being able to convert between formats without loss of information would certainly be a useful property...
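The information loss being discussed can be shown with a small stdlib sketch (timestamp value is illustrative): an RFC 3339 value carries a UTC offset, but `Timestamp` stores only seconds/nanos since the epoch, so conversion normalizes to UTC.

```python
from datetime import datetime, timezone

# RFC 3339 allows a UTC offset; protobuf Timestamp stores only an
# instant, so converting normalizes to UTC and the offset is lost.
ts = datetime.fromisoformat("2018-04-05T17:31:00+01:00")
as_utc = ts.astimezone(timezone.utc)

assert ts == as_utc                          # same instant...
assert as_utc.utcoffset() != ts.utcoffset()  # ...but the offset is gone
assert as_utc.isoformat() == "2018-04-05T16:31:00+00:00"
```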
Signed-off-by: Day, Jeremy(jday) <[email protected]>
@JemDay is this one ready for review?
@duglin - Probably ready for a broader conversation; I do need to go through the comments from @evankanderson a bit more. Items to think about:
protobuf-format.md
Outdated

> ### 1.2 Content-Type
>
> There is no official IANA *media-type* designation for protobuf, as such this specification uses 'application/x-protobuf' to identify such content.
There was a draft for a protobuf media type here: https://tools.ietf.org/html/draft-rfernando-protocol-buffers-00.
We recently decided in our company to use the media type `application/protobuf` proposed in the draft above.
Would you mind switching to that definition instead of the proposed `application/x-protobuf`?
Suggested change:

> specification uses 'application/protobuf' to identify such content based on the draft [Encoding rules and MIME type for Protocol Buffers](https://tools.ietf.org/html/draft-rfernando-protocol-buffers-00).
Hi,
I certainly think this warrants further discussion; I've seen both application/protobuf and application/x-protobuf used.
Looking at Spring, for example, they appear to use x-protobuf.
I'm not sure there's one correct answer to this :-(
Sorry, my mistake - we mainly thought about the `datacontenttype` attribute.
I think the proposal made in the call today to use `application/cloudevents+protobuf` for the `Content-Type` HTTP header field for structured mode is reasonable.
@JemDay: You can ignore my suggested change.
- Only required attributes are formally represented.
- Optional & Extension attributes are carried in the map.
- Documentation & Examples changed as appropriate.
- Added version marker to proto & java package.

Signed-off-by: Day, Jeremy(jday) <[email protected]>
Signed-off-by: Day, Jeremy(jday) <[email protected]>
@JemDay is this one ready for consideration on tomorrow's call?
Signed-off-by: Day, Jeremy(jday) <[email protected]>
Signed-off-by: Day, Jeremy(jday) <[email protected]>
Signed-off-by: Day, Jeremy(jday) <[email protected]>
Signed-off-by: Day, Jeremy(jday) <[email protected]>
@evankanderson - I think all of the concerns have been covered now, and I believe this is pretty closely aligned with the commentary in Issue #576. Any comments?
One larger concern around the attribute model, and it looks like the payload question hasn't been resolved. (See this comment for reference to the other mechanisms for handling payloads.)
IMO, the complexity of deeply handling the payload structure in a format which supports binary strings is probably not worth it compared with the simplicity of having an envelope which can nest another proto within.
Secondarily, it seems like `datacontenttype` and the `Any` message may be providing conflicting guidance about the expected content of the message if you use `Any`, since it encodes a type URL along with a byte payload.
```proto
bytes ce_binary = 4;
string ce_uri = 5;
string ce_uri_reference = 6;
google.protobuf.Timestamp ce_timestamp = 7;
```
I just came across this PR today; what was the outcome of the discussion on the call yesterday? I see a few missed opportunities here:

- Adding the well-known optional attributes like subject, datacontenttype, dataschema and type. This means that you only add extensions in the attributes field (I also agree with @evankanderson that it's probably better to have a repeated field instead of a map).
- Not using the oneof for the data; the funny thing is that the 3 proposed representations are all encoded exactly the same in the proto wire format.
- And most importantly: align the proto tag numbers with the Any type, giving string dataschema = 1 and bytes data = 2. This would make the CloudEvent 100% compatible with Any if both are filled correctly. This gives a lot of flexibility if you know that you are sending Any types.
@evankanderson .. A decision was taken prior to 1.0 not to encode type information for attributes, so in your example that ambiguity would apply to any intermediary translating between JSON and Proto formats.

I'm not clear on the thrust of your question regarding downstream clients - I would assume there's some sort of negotiation prior to receiving events; if I can't handle events in proto format (or Avro) then I wouldn't subscribe to them. If I understand proto then I would understand the type system it exposes. If I were an SDK writer and I exposed a 'setAttribute(type, value)'-style method, a protobuf encoder can retain the type information.

JSON <-> Proto Format Bridge - I think this would be doable.

I don't believe the type issue is specific to proto; it also exists in a pure JSON model where custom extensions are present - the receiver only sees a string.

I'll comment on the payload concern separately...
@alexvanboxel .. Re your comments:
> ## 4. Transport
>
> Transports that support content identification MUST use the following designation:
>
> ```text
> application/cloudevents+protobuf
> ```
It's unclear to me what kind of wire format we are targeting for transporting protobufs. Are you implicitly referring to marshalling protos to the binary encoding (https://developers.google.com/protocol-buffers/docs/encoding)?
Alternatively, we could always convert protos to JSON when putting them on the wire, or maybe even use textprotos.
I was assuming that this was the binary encoding, given that JSON is already covered.
I'd strongly discourage using textprotos - they're not supported in all languages, and not well-defined externally as far as I'm aware.
I've added some language-specific package options as per the example from @jskeet - I'm not conversant in some of these languages, so if anybody can sanity-check them that would be good.
@jskeet - On the subject of Timestamp, my understanding is that the specification simply demands a time instant. I don't believe there's any requirement that original offsets be retained across processing boundaries so long as the semantics are retained.
@JemDay: The specification isn't as clear on this as it might be. The description is "Date and time expression using the Gregorian Calendar" - which is interesting given that the calendar system is actually irrelevant for an instant in time until you choose a text representation. The text representation is specified to be RFC 3339, which does allow for UTC offsets. I'm happy to create a PR to clarify the specification here, in terms of:
Once all of those are in place (or at least agreed to in principle), I'm fine using Timestamp from protos. Does that sound like the kind of PR that would be welcomed? (I don't know whether it's worth mentioning leap seconds in there at all. I've tried to avoid them in all the date/time work I've done, but they could definitely be relevant for some CloudEvents users...)
@jskeet - I think this is a specification clarity issue; I believe it is meant to be an "instant in time", and that we follow RFC 3339 when encoding into a string. Might be simpler to discuss it tomorrow in the WG call before doing a PR (I could even make it part of this one).
With respect to your suggestion regarding using the 'Value' type as a more first-class way to support JSON data: I understand the model, but I'm unclear on how this would get realized in code. Looking at the Java language spec I don't see an option to do something like that.
@JemDay: Timestamps - I'll try to find a WG call that I can come to, but it definitely won't be tomorrow. (The WG timing tends to clash with all kinds of things for me - which is to be expected for an international meeting, of course.) For the ...
@JemDay: Excellent. Please cc me on the PR; as a date/time nerd I'm happy to be picky :)
All - I believe we've reached consensus, so could I ask for a few LGTMs to allow this PR to be merged and closed. I will open a separate PR to clarify the expectations related to Timestamps in the specification; that does not impact this proposal.
There are a few things I'd still like to see addressed - but if others want to get the first draft in and then iterate, that's fine by me. It's still only going to be a draft, right?
spec.proto
Outdated

```proto
import "google/protobuf/any.proto";
import "google/protobuf/timestamp.proto";

option csharp_namespace = "Io.CloudEvents.V1.Proto";
```
This can be decided for sure later on (on a per-language basis), but the CloudEvents C# SDK uses a namespace of `CloudNative.CloudEvents`. We probably want to fit in with that. (So `CloudNative.CloudEvents.V1.Proto` perhaps? Or `CloudNative.CloudEvents.V1.Protobuf`?)
```proto
message CloudEvent {

  // -- CloudEvent Context Attributes
```
Ideally it would be good to get this done before it's merged, but I'm okay for us to do it later.
spec.proto
Outdated

```proto
map<string, CloudEventAttribute> attribute = 5;

// -- CloudEvent Data (Bytes, Text, or Proto)
oneof data_oneof {
```
Ping on this one.
spec.proto
Outdated

```proto
message CloudEventAttribute {

  oneof attr_oneof {
```
And again.
- Added Java directive for separate files.
- Re-pluralized 'attributes' to follow proto convention.

Signed-off-by: Day, Jeremy(jday) <[email protected]>
@jskeet - I believe your concerns are covered now. I re-pluralized the 'attributes' and removed the C# directive; we can add that in as a separate PR. I'm no C# expert, but I believe the major version must be encoded into the package to enable V1 and V2 generated code to live side by side. That packaging should be independent of (but not clash with) the C# SDK packages.
Definitely happy to sort out the C# namespace later, and I agree that we should include the version in the namespace. Not entirely convinced about it being independent of the SDK, but we can decide that a little later.
@jskeet are you ok with this PR as-is? It sounds like there might be some follow-on discussions, but I wanted to make sure we didn't merge it if there are still things that need to be resolved first. |
@duglin: I would prefer to get the oneof names nicer to start with - is there any controversy about those? I don't have any fundamental objections though. |
I'll need @JemDay to speak to that :-) |
@jskeet - Can you suggest alternatives ... we need to differentiate between a "binary", "text", and "Proto Message" data payload. |
@JemDay: All I'm suggesting is removing the `_oneof` suffix. |
Note that if you're only talking about the binary Protobuf representation, then changing the name of a oneof doesn't affect wire compatibility. This also means those naming fixups could be made in a subsequent PR if desired. |
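To make the compatibility point concrete: the binary protobuf encoding prefixes every field with a varint key derived solely from the field number and wire type, so oneof and field names never appear on the wire. A minimal sketch of that key computation (pure Python for illustration, not the real protobuf runtime):

```python
def field_key(field_number: int, wire_type: int) -> bytes:
    """Encode a protobuf field key: (field_number << 3) | wire_type, as a varint."""
    key = (field_number << 3) | wire_type
    out = bytearray()
    while True:
        if key > 0x7F:
            out.append((key & 0x7F) | 0x80)  # low 7 bits, continuation bit set
            key >>= 7
        else:
            out.append(key)
            return bytes(out)

# Field 1, wire type 2 (length-delimited) encodes as the single byte 0x0A,
# regardless of what the field is called in the .proto file.
print(field_key(1, 2).hex())  # → 0a
```

Renaming a oneof (or a field) therefore leaves every serialized byte unchanged; only generated-code identifiers and proto3-JSON keys are affected.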
I'm assuming anything can still be changed in the near term - merging this isn't defining a 1.0 for the protobuf representation is it? Because if it is, that should include source compatibility as well, which would require the names to be finalized. |
There are two different things -- this is defining the wire-format protocol; there are separately per-language SDKs. The Golang SDK, for example, is at v2.2.0 supporting the v1 and v0.3 wire-format specs. |
Well it's defining a .proto file, which I'd expect to be the canonical source for SDKs. It would be pretty silly for someone to create a wire-compatible-but-separate proto file. So why not aim to get both right to start with? |
(I'd also suggest that we should determine that the JSON serialization format of the protobuf data is also standardized - that doesn't affect oneof names as far as I remember, but it does affect tag names, basically making tag renaming a breaking change.) |
Signed-off-by: Day, Jeremy(jday) <[email protected]>
@jskeet - Tidied the names as suggested. The JSON format has its own specification; we could tweak that if needed - I don't think we did that for Avro. |
I'm not suggesting a change to the standard JSON CloudEvent format. I'm suggesting that we make sure that the proto representation doesn't change in any way that breaks source or proto3-JSON format, once we've declared it as 1.0. That's important because it should be reasonable for folks to use the proto3-JSON format in test data, and not be broken by changes. I don't think it's a major imposition - it just means that we'd be standardizing the .proto file rather than just the binary wire format. I'm definitely up for discussing that, but it would be my recommendation. Will double-check the oneofs and LGTM this PR though. |
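For background on why the proto3-JSON format makes field names load-bearing: the default JSON key for a message field is the lowerCamelCase conversion of its .proto name, so a field rename changes the JSON output even though the binary wire format is untouched. A small sketch of that name mapping (an illustration of the rule, not the official library code):

```python
def proto3_json_name(field_name: str) -> str:
    """Default proto3 JSON key: lowerCamelCase of the snake_case .proto field name."""
    head, *rest = field_name.split("_")
    return head + "".join(part.capitalize() for part in rest)

# A hypothetical rename of `binary_data` would change this JSON key,
# breaking any proto3-JSON consumer that relies on it.
print(proto3_json_name("binary_data"))  # → binaryData
```

This is why standardizing the .proto file itself (not just the binary wire format) matters once names are frozen at 1.0.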
I'm happy to have a pass at reformatting comments and adding more of them in a future PR if that would be helpful.
We're in kind of a weird situation. The PR was approved on today's call with the assumption that all parties involved in the PR were ok with the current state. After a few more commits it looks like that is true. However, we technically didn't approve this one, which means we should really wait. BUT, based on the lack of comments on today's call I suspect the people who really care are part of this PR's thread and not the weekly call. So, rather than waiting for next week's call, I'm going to let this sit until EOD tomorrow to see if there are any lingering concerns - then I'll merge. Hope that's ok with everyone. |
ok ... sat long enough. Merging!!! Approved on the 9/24 call |
* Re-Introducing Protocol Buffer Representation
* Required and optional attributes are explicitly represented
* Tweaks and examples to address review comments
* Updated following discussion during 6/28 weekly meeting:
  - Only required attributes are formally represented.
  - Optional & extension attributes are carried in the map.
  - Documentation & examples changed as appropriate.
  - Added version marker to proto & java package.
* Tweaks
* Include spec reference
* Changed media-type designation as per weekly call discussion
* Fixed links to work before merging to master
* Really fixing the links this time
* DO NOT MERGE: data representation proposal, attribute type naming change
* Moved back to one_of representation for Data, updated docs
* Minor tweaks based on review commentary
* Modified proto map property to increase readability of generated code; clarified usage of `dataschema` CloudEvent attribute; updated examples
* Changes based on review comments
* Added language-specific package definitions
* Removed C# directive (can be added back in a later PR); added Java directive for separate files; re-pluralized 'attributes' to follow proto convention
* Tidied 'one-of' names

Signed-off-by: Day, Jeremy(jday) <[email protected]>
Re-introducing protocol buffer support; this takes a slightly different approach to the one originally proposed.

I've used `one-of` constructs which, I'm aware, have the potential to introduce some processing overhead, but they do enable a level of structural correctness. It's been a while since I've used Proto on a day-to-day basis, so I'd appreciate comments.
Signed-off-by: Day, Jeremy(jday) [email protected]