
Re-Introducing Protocol Buffer Representation #626

Merged: 18 commits, Sep 30, 2020

Conversation

JemDay
Contributor

@JemDay JemDay commented May 20, 2020

Re-introducing protocol buffer support; this takes a slightly different approach to the one originally proposed.

I've used one-of constructs, which I'm aware have the potential to introduce some processing overhead, but they do enable a level of structural correctness.

It's been a while since I've used Proto on a day-to-day basis so I'd appreciate comments.

Signed-off-by: Day, Jeremy(jday) [email protected]
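For readers skimming the thread, a minimal sketch of the one-of approach being described, with illustrative field names and tag numbers (assumptions, not the normative spec.proto proposed in this PR):

```proto
// Illustrative sketch only -- names and tags are assumptions, not the
// normative schema proposed in this PR's spec.proto.
syntax = "proto3";

package io.cloudevents.v1.sketch;

import "google/protobuf/any.proto";

message CloudEventSketch {
  // Required context attributes.
  string id = 1;
  string source = 2;
  string spec_version = 3;
  string type = 4;

  // At most one data representation may be set; the oneof buys structural
  // correctness at the cost of a little decoding overhead.
  oneof data {
    bytes binary_data = 5;
    string text_data = 6;
    google.protobuf.Any proto_data = 7;
  }
}
```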

@duglin
Collaborator

duglin commented May 21, 2020

ping @evankanderson

@clemensv
Contributor

Is the ownership story regarding Protobuf resolved, i.e. did Google put it into CNCF, yet?

@markfisher

ideally this could replace the LiiklusEvent as defined here: https://github.com/bsideup/liiklus/blob/master/protocol/src/main/proto/LiiklusService.proto

and possibly, we could then discuss Liiklus' proto service moving into the CloudEvents scope as well (along with unary service perhaps)

cc @bsideup

@JemDay
Contributor Author

JemDay commented May 21, 2020

@markfisher - I toyed with adding gRPC but that might be more of a transport specification thing.

Do we consider gRPC a transport?

I've updated the schema to represent the required and optional attributes as first-class fields.

@evankanderson
Contributor

I believe gRPC is part of CNCF, so a proto format for CloudEvents would align with gRPC transport definitions.

I'll take a look later today; I think there was also an earlier protobuf definition PR, so we might want to reconcile the two.


NOTE: Field assignment numbers are non-normative.
Contributor

Field assignment numbers absolutely have to be defined by the spec. Proto wire format encodes tag + type as a prefix, so if the tag doesn't match the expected type, the entire message will be discarded during decoding.
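To illustrate the point (hypothetical messages, not part of the proposal): the binary wire format identifies fields only by tag number and wire type, never by name, so a writer and a reader that disagree on tag assignments silently misinterpret each other's fields.

```proto
// Hypothetical illustration. The wire format carries only (tag, wire type),
// never field names, so these two views of the "same" message disagree:
syntax = "proto3";

message WriterView {
  string id = 1;      // serialized as field 1
  string source = 2;  // serialized as field 2
}

message ReaderView {
  string source = 1;  // a WriterView's id is decoded here
  string id = 2;      // and its source lands here
}
```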

Contributor Author

I think my comment was misleading ... what I was trying to imply was that the content in this document is 'representative', but the actual schema is defined in the spec.proto file. I'll clean that up.


* string
* bytes
* protobuf object/message
Contributor

I thought this was a json structure (possibly equivalent to protobuf Struct).

Contributor Author

Not sure I follow your comment. My intention was to allow the transportation of Proto event data in a way that is more idiomatic for Proto, i.e. there's no need to marshal the data into a separate field; the pack/unpack feature of Any supports this directly.

I'll add an example.

string text_data = 3;

// Protobuf Message data
google.protobuf.Any proto_data = 4;
Contributor

I don't think you want to use Any here; Any is basically a wrapper for a string type and an enveloped message. For CloudEvents, I think it would be more natural to put the message bytes into raw_data, and the message type into a CE extension attribute.
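For context, google.protobuf.Any (from any.proto) is essentially just a type URL plus the serialized bytes of the enveloped message, which is the wrapping being described here:

```proto
// Abridged from google/protobuf/any.proto: an Any is only a type URL
// plus the enveloped message's serialized bytes.
message Any {
  // Identifies the type of the serialized message, e.g.
  // "type.googleapis.com/google.protobuf.Duration".
  string type_url = 1;

  // The serialized bytes of the enveloped message.
  bytes value = 2;
}
```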

Contributor Author

See my previous comment ... if I'm dealing natively with protobuf it's not very idiomatic to marshal it separately.

I don't believe the spec calls for all event data to be encoded as a string.

Contributor

If you look at the various binary encodings, they encode the payload as a simple bytestream (not a string): HTTP, Kafka, MQTT, AMQP. AVRO has some funniness, but I wouldn't try to emulate it.

of the content. This can be determined by examining the data for invalid UTF-8
sequences or by consulting the `datacontenttype` attribute.

* Where the data is a protobuf message it MUST be stored in the `proto_data`
Contributor

Why complicate this vs. just using bytes (and possibly needing to decode the payload a second time), vs needing to scan the payload to determine which field to put the content into?

Also, is it an error to receive a utf-8 string or a proto message as raw_data? If not, this seems like a case where the recipient will know what to do best.

Contributor Author

I get your point but I'm not sure it complicates things. My thought process was that it was a bit odd to put textual content into a byte array when Proto allows us to be explicit.

Contributor

FWIW, the binary encoding of protobuf lets you "Pun" two different protos between different length-delimited field types, like so:

```proto
// Use this type if you are an intermediate router and don't need to care about interpreting or validating payload.
message Envelope {
  string name = 1;
  bytes payload = 2;
}

message Data {
  map<string, int32> values = 1;
  double magnitude = 2;
}

// If you are an endpoint node, decode Envelope as Decoded and you can interpret payload as the Data message.
message Decoded {
  string name = 1;
  Data data = 2;
}
```

The main benefit of using string vs bytes is that Proto will generate the correct types in the auto-generated code, possibly verifying that the string is valid UTF-8 along the way (I don't recall if all implementations verify the UTF-8-ness of the bytestream). The cost is that any proto interpretation which understands the message will unpack and verify all the inner payload fields, whether or not they are used by the application.


```java
public static Spec.CloudEvent plainTextExample() {
  Spec.CloudEvent.Builder ceBuilder = Spec.CloudEvent.newBuilder();
```
Contributor

You never define the enclosing CloudEvent type, only extensions and data_oneof.

Contributor Author

Spec.CloudEvent is the enclosing type, the example method returns an instance of a CloudEvent.

bytes ce_binary = 4;
string ce_uri = 5;
string ce_uri_reference = 6;
google.protobuf.Timestamp ce_timestamp = 7;
Contributor

How are unknown attribute keys handled? If I get a string for "datakey" with value "/path", should it be encoded as ce_string, ce_binary, or ce_uri_reference? (The same is also true for timestamp)

Contributor Author

The originator of the event presumably knows the 'type', so they would populate the extension attribute accordingly.

I'll add an example...

Contributor

What about a component which is adapting structured JSON input from e.g. AWS Kinesis to gRPC transport? This sort of intermediate component may not be aware of the types of all extensions (by design), but without type information from the upstream transport, a datakey attribute might be transcribed to a different encoding than the recipient is expecting.

Contributor

For an example of what this looks like in another typed transport see the AMQP spec, where clients need to be prepared to accept either string or typed values for each attribute.

Contributor Author

I think we're conflating 'transport' and 'format' ... this spec is only applicable in structured mode, and in those situations the payload is opaque (apologies if I misstated that somewhere, now you've got me worried).

In AMQP this would be transported as an application/cloudevents+x-protobuf content type.

It'd be the gRPC transport specification (if we consider it a transport) that would define how binary and structured payloads would be exchanged.

Switching between different structured formats (JSON->Proto, Avro->JSON) is a higher-order function and not a concern of the transport binding, I believe.

Contributor

Sorry, I thought this was handled, but I'm looking back, and it looks like we talked past each other.

Assume the following structured JSON CloudEvent with no payload:

{
    "specversion" : "1.0",
    "type" : "com.example.someevent",
    "source" : "/mycontext",
    "id" : "A234-1234-1234",
    "time" : "2018-04-05T17:31:00Z",
    "comexampleextension1" : "2018-04-05T17:31:00Z"
}

What does the equivalent structured Proto CloudEvent look like? (using text format for readability)

id: "A234-1234-1234"
source: "/mycontext"
spec_version: "1.0"
type: "com.example.someevent"
attributes: [{
  key: "time"
  value: {
    ce_timestamp: {...}
}, {
  key: "comexampleextension1"
  value: {
    ce_string: "...."
  }
}]

Alternatively, comexampleextension1 could be encoded as:

{
  key: "comexampleextension1"
  value: {
    ce_timestamp: { ... }
  }
}

If there is a generic JSON -> Proto CloudEvents bridge:

  • How does it know how to encode comexampleextension1?
  • Do all downstream clients need to be able to handle both ce_timestamp and ce_string format?
  • Does the same issue apply to the other types?

This is somewhat of a generic problem with the CloudEvents extension model, but mentioning it again here.

direct support of the CloudEvent [type system][ce-types].

```proto
map<string, CloudEventAttribute> extensions = 1;
```
Contributor

This map is labeled extensions. Are the core attributes handled the same way, or by unique tags?

The example at the end suggests that known attributes are assigned unique tags.

Contributor Author

Originally I had all the defined attributes in a map structure; I flipped to a model where the defined attributes are explicit (as in the JSON schema).

So in that model only extension attributes need to be carried in the map.

Contributor

My concern is about what happens when core attributes are added, e.g. in a 1.1 spec.

Clients which were built before the 1.1 spec will map these attributes into the extensions map, while post-1.1 clients will map them into a unique tag-assigned field. Recipients of messages will then need to write complicated logic to map pre-1.1 messages to post-1.1 messages, losing most of the benefit of code auto-generation for protocol buffers.

Yes, this is a "perfect is the enemy of the good" case, but my experience has been that not planning for expansion tends to lead to "everything was good until we tried to touch something, and now it's all terrible."

Another acceptable scenario is to accept that we can't add further attributes to the 1.0 proto message definition, and all additional attributes (whether currently documented extensions or not-yet-defined extensions) must be stored in extensions.

The last option is to keep a global tag number registry mapping extension names to tags, but that only helps humans and doesn't really help conversion gateways, which would need to be able to load the list dynamically when transforming a message between proto and other encodings.

Contributor Author

Maybe there's a half-way house solution where all the 'required' attributes are explicitly defined but the optional ones aren't, on the basis that we can't add any required attributes without changing the major version.
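A rough sketch of that half-way house, with only the required context attributes as dedicated fields and everything else in the map (names and tag numbers are illustrative assumptions, not the merged spec.proto):

```proto
// Illustrative sketch of the "half-way house" model described above;
// names and tag numbers are assumptions, not the merged schema.
syntax = "proto3";

message CloudEventHalfway {
  // Required context attributes get dedicated, tag-assigned fields.
  string id = 1;
  string source = 2;
  string spec_version = 3;
  string type = 4;

  // Optional and extension attributes travel here, keyed by attribute name,
  // so new optional attributes never need new tag numbers.
  map<string, AttributeValue> attributes = 5;
}

message AttributeValue {
  oneof attr {
    string ce_string = 1;
    bytes ce_bytes = 2;
    bool ce_boolean = 3;
    int32 ce_integer = 4;
  }
}
```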

spec.proto Outdated
google.protobuf.Timestamp time = 8;

// -- CloudEvent Extension Attributes
map<string, CloudEventAttribute> extensions = 9;
Contributor

See also my comments in #576 (comment)

You may want to define all optional attributes (core and extension) as part of the same string-keyed map, rather than putting a few as tags but not later-defined attributes.

Contributor Author

@JemDay JemDay May 22, 2020

That's an interesting point ... we don't do that in the other format definitions, but it is something to consider.

It does have the potential to introduce complexity where consumers/producers might be using different versions of the spec as new extensions are added - Proto can cope with that but it would increase complexity for consumers.

I'd argue for sticking with the defined 'required' and 'optional' context attributes.

@evankanderson
Contributor

I'd expect a follow-on to the proto definition would be a set of gRPC transports. At a minimum, this would be unary request-response and streaming request-response, but there might be other useful profiles.

@evankanderson
Contributor

(oh, and see also #576)

@bsideup
Contributor

bsideup commented May 22, 2020

@JemDay Nice! I have a couple of questions since I think I'm missing some important details:

  1. I wonder why we have attributes as map<string, value> while they are strictly defined in the spec?
  2. With a map, how do we plan to deprecate certain attributes? Or define required/optional?
  3. Does it really make sense to have ce_uri and ce_uri_reference, given that we're not gaining any type safety here, since protobuf is missing a URI type? Perhaps we should add a message URI?

Thanks in advance!

bytes ce_binary = 4;
string ce_uri = 5;
string ce_uri_reference = 6;
google.protobuf.Timestamp ce_timestamp = 7;
Contributor

If we use google.protobuf.Timestamp here, we will always be using UTC for the timezone, while the spec says that it should be a timezoned datetime (which carries extra information and can be used to align the notification schedule, for example).

Contributor Author

@JemDay JemDay May 22, 2020

With respect to your prior comment (which I can't comment on directly):

1 - I think I updated the spec after you commented; the context attributes defined in the spec are now represented explicitly.
2 - Again, with the restructure I don't think that's an issue. In Proto typically everything is optional; I believe there is a way to flag 'required' but Proto purists may object :-)
3 - For me it's about retaining the type information; a receiver can determine that the value is a URI because it is in that field.

This might be a question for the larger group; my interpretation was that the intention was to ensure timezone information was included, so representing the value as UTC met that requirement from my point of view.

On closer inspection there's an implication that timezones MUST be communicated as strings - which seems a bit inefficient, especially when native representations are available.

Contributor

Proto2 has a notion of optional/required semantics. These were dropped in proto3 due to operational experience -- as protocol buffers evolved over time, fields which were previously "required" might no longer make sense, but converting a required field to optional was an extensive multi-stage rollout process across all consumers of the protocol buffer, even those which did not use the fields explicitly (like intermediate proxies).

Contributor

Two notes here on details, while I'm still looking at the broader conversation. (Sorry for parachuting in; a colleague pointed me at this issue, and this particular subthread touches on things I know about.)

While proto3 still doesn't have "required" semantics, it has very recently regained the idea of "optional" fields. The word "optional" is pretty much historical now, but it's really about "does this field support the notion of being implicitly set vs explicitly set". An optional int32 proto3 field that has been explicitly set to 0 by user code is serialized even though 0 is the default value. A regular int32 proto3 field will never have its default value serialized, whether it was set by the user or not.
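A small sketch of the field-presence distinction described above (illustrative field names, not spec fields):

```proto
// Illustrative only. With `optional`, proto3 tracks explicit presence:
// an explicitly-set 0 is serialized and the generated has_* accessor
// reports true. Without it, 0 is indistinguishable from "never set"
// and is never serialized.
syntax = "proto3";

message PresenceSketch {
  optional int32 explicit_count = 1;  // explicit presence (has_/clear_ generated)
  int32 implicit_count = 2;           // default value is never serialized
}
```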

Separately, on time zones, I'd like us to be really clear on terminology. RFC 3339 does not allow the time zone to be conveyed. It allows a UTC offset to be specified. They're not the same thing, and that affects scheduling. If I send an event with timestamp that is in UTC+1 (as the UK is right now), and I promise to send another event one day later in local time, you can't tell whether that will be another timestamp with an offset of +1 or not. It's unfortunate that RFC 3339 (and ISO-8601 before it) conflates UTC offsets and time zones, but let's try not to in any CloudEvents documentation :)

Contributor

This part definitely needs more discussion - if the format needs to be able to store a timestamp with offset from UTC as per RFC 3339, Timestamp will not do that on its own. We could create our own message for that concept, but it wouldn't be as easy to handle - Timestamp has built-in goodness in terms of a readable JSON representation, and utility methods in at least some languages (e.g. C#).

I would suggest that in retrospect, the CE spec deciding that a timestamp doesn't just represent an instant in time may have been a mistake - but it's unclear what we need to do now. Being able to convert between formats without loss of information would certainly be a useful property...
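For the sake of discussion, one hypothetical shape for "our own message for that concept" (not something proposed in this PR) would pair the instant with the RFC 3339 UTC offset:

```proto
// Hypothetical only -- NOT proposed in this PR. Retains the RFC 3339 UTC
// offset alongside the instant, at the cost of losing Timestamp's built-in
// JSON formatting and per-language helper methods.
syntax = "proto3";

import "google/protobuf/timestamp.proto";

message OffsetTimestamp {
  google.protobuf.Timestamp instant = 1;  // the instant in time
  sint32 utc_offset_minutes = 2;          // e.g. 60 for +01:00
}
```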

@duglin
Collaborator

duglin commented May 26, 2020

@JemDay is this one ready for review?

@JemDay
Contributor Author

JemDay commented May 26, 2020

@duglin - Probably ready for a broader conversation, I do need to go through the comments from @evankanderson a bit more.

Items to think about:

  • representing attributes explicitly or as a simple map (I believe I saw an issue/question relating to how this is done in Avro)
  • Do we consider gRPC to be a transport, or is it a special case like HTTP webhooks?

### 1.2 Content-Type

There is no official IANA *media-type* designation for protobuf, as such this
specification uses 'application/x-protobuf' to identify such content.
Contributor

There was a draft for protobuf media type here: https://tools.ietf.org/html/draft-rfernando-protocol-buffers-00.
We recently decided in our company to use the media type application/protobuf proposed in the draft above.
Would you mind switching to that definition instead of the proposed application/x-protobuf?

Suggested change:
- specification uses 'application/x-protobuf' to identify such content.
+ specification uses 'application/protobuf' to identify such content based on the draft [Encoding rules and MIME type for Protocol Buffers](https://tools.ietf.org/html/draft-rfernando-protocol-buffers-00).

Contributor Author

@JemDay JemDay Jun 2, 2020

Hi,

I certainly think this warrants further discussion; I've seen both application/protobuf and application/x-protobuf used.

Looking at Spring, for example, they appear to use x-protobuf.

I'm not sure there's one correct answer to this :-(

Contributor

Sorry, my mistake - we mainly thought about the datacontenttype attribute.

I think the proposal made in the call today to use application/cloudevents+protobuf for the Content-Type http header field for structured mode is reasonable.

@JemDay : You can ignore my suggested change.

- Only required attributes are formally represented.
- Optional & Extension attributes are carried in the map.
- Documentation & Examples changed as appropiate.
- Added version marker to proto & java package.

Signed-off-by: Day, Jeremy(jday) <[email protected]>
Signed-off-by: Day, Jeremy(jday) <[email protected]>
@duglin
Collaborator

duglin commented Jun 4, 2020

@JemDay is this one ready for consideration on tomorrow's call?

@JemDay
Contributor Author

JemDay commented Jun 4, 2020

@evankanderson - I think all of the concerns have been covered now and I believe this is pretty closely aligned with the commentary in Issue #576.

Any comments?

Contributor

@evankanderson evankanderson left a comment

One larger concern around the attribute model, and it looks like the payload question hasn't been resolved. (See this comment for reference to the other mechanisms for handling payloads.)

IMO, the complexity of deeply handling the payload structure in a format which supports binary strings is probably not worth it compared with the simplicity of having an envelope which can nest another proto within.

Secondarily, it seems like the datacontenttype and Any message may be providing conflicting guidance about the expected content of the message if you use Any, since it encodes a type URL along with a byte payload.

@alexvanboxel alexvanboxel left a comment

I just came across this PR today; what was the outcome of the discussion on the call yesterday? I see a few missed opportunities here:

  1. adding the well-known optional attributes like subject, datacontenttype, dataschema and type. This means that you only add extensions in the attributes field (I also agree with @evankanderson that it's probably better to have a repeated field instead of a map)
  2. not using the one_of for the data; the funny thing is that the 3 proposed representations are all encoded exactly the same in the proto wire format
  3. and most importantly: align the proto tag numbers with the Any type, giving string dataschema = 1 and bytes data = 2. This would make the CloudEvent 100% compatible with Any if both are filled correctly, which gives a lot of flexibility if you know that you are sending Any types (see the sketch below).
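A sketch of the tag alignment suggested in point 3 (illustrative, not what was merged). Because google.protobuf.Any is `string type_url = 1; bytes value = 2;`, a message that puts a string at tag 1 and bytes at tag 2 shares Any's wire shape for those two fields:

```proto
// Illustrative sketch of point 3 above, not the merged schema.
// google.protobuf.Any is (string type_url = 1; bytes value = 2), so a
// message with the same tags and wire types for its first two fields can
// be parsed as an Any (remaining fields end up as unknown fields).
syntax = "proto3";

message AnyAlignedCloudEvent {
  string dataschema = 1;  // same tag and wire type as Any.type_url
  bytes data = 2;         // same tag and wire type as Any.value

  // Other context attributes would use higher tag numbers.
  string id = 3;
  string source = 4;
}
```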

@JemDay
Contributor Author

JemDay commented Jun 9, 2020

@evankanderson ... A decision was taken prior to 1.0 not to encode type information for attributes, so in your example, if an intermediary were to translate between JSON and Proto formats, the comexampleextension1 attribute can only be interpreted as a String, and as such your first encoding example would be correct. This is a general issue with custom extensions when attempting to move between different formats, or indeed from binary->structured transport representations.

I'm not clear on the thrust of your question regarding downstream clients - I would assume there's some sort of negotiation prior to receiving events; if I can't handle events in proto format (or Avro) then I wouldn't subscribe to them. If I understand proto then I would understand the type system it exposes. If I were an SDK writer and I exposed a 'setAttribute(type, value)' type method, a protobuf encoder can retain the type information.

JSON <-> Proto Format Bridge - I think this would be doable.

I don't believe the type issue is specific to proto; it also exists in a pure JSON model where custom extensions are present: the receiver only sees a string and can't opine on it being a URI, a timestamp, etc.

I'll comment on the payload concern separately...

@JemDay
Contributor Author

JemDay commented Jun 9, 2020

@alexvanboxel ... Re your comments:

    • What you describe was my original submission; conversations led to the current model where only required attributes are formally represented. Primarily this was to allow new optional attributes to be added in future minor spec revisions and supported in a consistent fashion.
    • I think it's more idiomatic to support variant representations in this fashion to simplify SDK development. I don't see this as any different from the JSON Format supporting data payloads of binary (in base64), string, or Object. We could have required JSON data payloads to be encoded as a string but chose to be idiomatic in that format.
    • I may be missing the point, but are you suggesting that instead of an explicit one_of you'd prefer data to simply be an Any, which would then natively support bytes, String, or indeed a proto message? I can buy into that if that's the thrust of your comment ;-)

Comment on lines +129 to +135
## 4. Transport

Transports that support content identification MUST use the following designation:

```text
application/cloudevents+protobuf
```

It's unclear to me what kind of wire format we are targeting for transporting protobufs. Are you implicitly referring to marshalling protos to the binary encoding (https://developers.google.com/protocol-buffers/docs/encoding)?

Alternatively, we can always convert protos to JSON when putting them on the wire, or maybe even use textprotos.

Contributor

I was assuming that this was the binary encoding, given that JSON is already covered.

I'd strongly discourage using textprotos - they're not supported in all languages, and not well-defined externally as far as I'm aware.

@JemDay
Contributor Author

JemDay commented Sep 15, 2020

I've added some language-specific package options as per the example from @jskeet - I'm not conversant in some of these languages, so if anybody can sanity-check them that would be good.

@jskeet - On the subject of Timestamp, my understanding is that the specification simply demands a time instant. I don't believe there's any requirement that any original offsets be retained across processing boundaries so long as the semantic is retained.

@jskeet
Contributor

jskeet commented Sep 16, 2020

@JemDay: The specification isn't as clear on this as it might be. The description is "Date and time expression using the Gregorian Calendar" - which is interesting given that the calendar system is actually irrelevant for an instant in time until you choose a text representation. The text representation is specified to be RFC 3339, which does allow for UTC offsets.

I'm happy to create a PR to clarify the specification here, in terms of:

  • It's just intended to be an instant in time
  • Any UTC offset present in a local representation need not be preserved across processing boundaries
  • We should specify a required level of precision (milliseconds?) but again allow for more information to be present-but-not-necessarily-preserved

Once all of those are in place (or at least agreed to in principle), I'm fine using Timestamp from protos.

Does that sound like the kind of PR that would be welcomed?

(I don't know whether it's worth mentioning leap seconds in there at all. I've tried to avoid them in all the date/time work I've done, but they could definitely be relevant for some CloudEvents users...)

@JemDay
Contributor Author

JemDay commented Sep 16, 2020

@jskeet - I think this is a specification clarity issue; I believe it is meant to be an "instant in time", and that we follow RFC 3339 when encoding into a string. Might be simpler to discuss it tomorrow in the WG call before doing a PR (I could even make it part of this one).

With respect to your suggestion regarding using the 'Value' type as a more first-class way to support JSON data: I understand the model, but I'm unclear on how this would get realized in code. Looking at the Java language spec I don't see an option to do something like mergeFrom(javax.json.JsonValue jsonValue); on the Value builder. Without these sorts of constructs I'm not sure what we gain.

@jskeet
Contributor

jskeet commented Sep 16, 2020

@JemDay: Timestamps - I'll try to find a WG call that I can come to, but it definitely won't be tomorrow. (The WG timing tends to clash with all kinds of things for me - which is to be expected for an international meeting, of course.)

For the Value part supporting JSON data - based on last week's call, I don't think it's worth doing.

@JemDay
Contributor Author

JemDay commented Sep 16, 2020

@jskeet: Thanks, I believe we're aligned.

I chatted with @clemensv and there is no requirement for the offset information to be propagated; I'll try tweaking the spec to clarify that - so we're good to go with the Timestamp usage as defined.

@jskeet
Contributor

jskeet commented Sep 16, 2020

@JemDay: Excellent. Please cc me on the PR; as a date/time nerd I'm happy to be picky :)

@JemDay
Contributor Author

JemDay commented Sep 21, 2020

All - I believe we've reached consensus, so could I ask for a few LGTMs to allow this PR to be merged and closed? I will open a separate PR to clarify the expectations related to Timestamps in the specification; that does not impact this proposal.

Contributor

@jskeet jskeet left a comment

There are a few things I'd still like to see addressed - but if others want to get the first draft in and then iterate, that's fine by me. It's still only going to be a draft, right?

spec.proto Outdated
import "google/protobuf/any.proto";
import "google/protobuf/timestamp.proto";

option csharp_namespace = "Io.CloudEvents.V1.Proto";
Contributor

This can be decided for sure later on (on a per-language basis) but the CloudEvents C# SDK uses a namespace of CloudNative.CloudEvents. We probably want to fit in with that. (So CloudNative.CloudEvents.V1.Proto perhaps? Or CloudNative.CloudEvents.V1.Protobuf?)


message CloudEvent {

// -- CloudEvent Context Attributes
Contributor

Ideally it would be good to get this done before it's merged, but I'm okay for us to do it later.

spec.proto Outdated
map<string, CloudEventAttribute> attribute = 5;

// -- CloudEvent Data (Bytes, Text, or Proto)
oneof data_oneof {
Contributor

Ping on this one.

spec.proto Outdated

message CloudEventAttribute {

oneof attr_oneof {
Contributor

And again.

- Added Java directive for separate files.
- re-pluralized 'attributes' to follow proto convention.

Signed-off-by: Day, Jeremy(jday) <[email protected]>
@JemDay
Contributor Author

JemDay commented Sep 22, 2020

@jskeet - I believe your concerns are covered now. I re-pluralized the 'attributes' and removed the C# directive; we can add that in as a separate PR. I'm no C# expert but I believe the major version must be encoded into the package to enable V1 and V2 generated code to live side-by-side. That packaging should be independent of (but not clash with) the C# SDK packages.

@jskeet
Contributor

jskeet commented Sep 23, 2020

Definitely happy to sort out the C# namespace later, and I agree that we should include the version in the namespace. Not entirely convinced about it being independent of the SDK, but we can decide that a little later.

@duglin
Collaborator

duglin commented Sep 24, 2020

@jskeet are you ok with this PR as-is? It sounds like there might be some follow-on discussions, but I wanted to make sure we didn't merge it if there are still things that need to be resolved first.

@jskeet
Contributor

jskeet commented Sep 24, 2020

@duglin: I would prefer to get the oneof names nicer to start with - is there any controversy about those? I don't have any fundamental objections though.

@duglin
Collaborator

duglin commented Sep 24, 2020

I'll need @JemDay to speak to that :-)

@JemDay
Contributor Author

JemDay commented Sep 24, 2020

@jskeet - Can you suggest alternatives ... we need to differentiate between a "binary", "text", and "Proto Message" data payload.

@jskeet
Contributor

jskeet commented Sep 24, 2020

@JemDay: All I'm suggesting is removing the _oneof suffix from data_oneof and attr_oneof. Keep the oneofs themselves - it's just the name that will be nicer without the suffix.

@evankanderson
Contributor

Note that if you're only talking about binary Protobuf representation, then changing the name of data_oneof and attr_oneof is a wire-compatible change, since the name of the oneof is not transmitted and is only an affordance for library code.

This also means those naming fixups could be made in a subsequent PR if desired.
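Concretely (illustrative field names), these two definitions serialize to identical bytes, since only the member fields' tags reach the wire:

```proto
// Illustrative: a oneof's name never appears on the wire, only its member
// field tags, so renaming data_oneof to data is wire-compatible.
syntax = "proto3";

message BeforeRename {
  oneof data_oneof {
    bytes binary_data = 1;
    string text_data = 2;
  }
}

message AfterRename {
  oneof data {
    bytes binary_data = 1;
    string text_data = 2;
  }
}
```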

@jskeet
Contributor

jskeet commented Sep 24, 2020

I'm assuming anything can still be changed in the near term - merging this isn't defining a 1.0 for the protobuf representation is it? Because if it is, that should include source compatibility as well, which would require the names to be finalized.

@evankanderson
Contributor

There are two different things -- this is defining the wire-format protocol; there are separately per-language SDKs. The Golang SDK, for example is at v2.2.0 supporting the v1 and v0.3 wire-format specs.

@jskeet
Contributor

jskeet commented Sep 24, 2020

this is defining the wire-format protocol

Well it's defining a .proto file, which I'd expect to be the canonical source for SDKs. It would be pretty silly for someone to create a wire-compatible-but-separate proto file. So why not aim to get both right to start with?

@jskeet
Contributor

jskeet commented Sep 24, 2020

(I'd also suggest that we should determine that the JSON serialization format of the protobuf data is also standardized - that doesn't affect oneof names as far as I remember, but it does affect tag names, basically making tag renaming a breaking change.)

Signed-off-by: Day, Jeremy(jday) <[email protected]>
@JemDay
Contributor Author

JemDay commented Sep 24, 2020

@jskeet - Tidied the names as suggested.

The JSON format has its own specification, we could tweak that if needed - I don't think we did that for Avro.

@jskeet
Contributor

jskeet commented Sep 24, 2020

I'm not suggesting a change to the standard JSON CloudEvent format. I'm suggesting that we make sure that the proto representation doesn't change in any way that breaks source or proto3-JSON format, once we've declared it as 1.0. That's important because it should be reasonable for folks to use the proto3-JSON format in test data, and not be broken by changes.

I don't think it's a major imposition - it just means that we'd be standardizing the .proto file rather than just the binary wire format. I'm definitely up for discussing that, but it would be my recommendation.

Will double-check the oneofs and LGTM this PR though.

Contributor

@jskeet jskeet left a comment

I'm happy to have a pass at reformatting comments and adding more of them in a future PR if that would be helpful.

@duglin
Collaborator

duglin commented Sep 24, 2020

We're in kind of a weird situation. The PR was approved on today's call with the assumption that all parties involved in the PR were ok with the current state. After a few more commits it looks like that is true. However, we technically didn't approve this one, which means we should really wait. BUT, based on the lack of comments on today's call, I suspect the people who really care are part of this PR's thread and not the weekly call. So, rather than waiting for next week's call I'm going to let this sit until EOD tomorrow to see if there are any lingering concerns - then I'll merge.

Hope that's ok with everyone.

@duglin
Collaborator

duglin commented Sep 30, 2020

ok ... sat long enough. Merging!!!
Thanks to all, especially @JemDay , for all of the effort on pushing this thru!

Approved on the 9/24 call

@duglin duglin merged commit 6cbb44a into cloudevents:master Sep 30, 2020
n3wscott pushed a commit to n3wscott/cloudevents-spec that referenced this pull request Nov 30, 2020
* Re-Introducing Protocol Buffer Representation

Signed-off-by: Day, Jeremy(jday) <[email protected]>

* required and optional attributes are explicitly represented

Signed-off-by: Day, Jeremy(jday) <[email protected]>

* tweaks and examples to address review comments

Signed-off-by: Day, Jeremy(jday) <[email protected]>

* Updated following discussion during 6/28 weekly meeting.

- Only required attributes are formally represented.
- Optional & Extension attributes are carried in the map.
- Documentation & Examples changed as appropiate.
- Added version marker to proto & java package.

Signed-off-by: Day, Jeremy(jday) <[email protected]>

* tweaks

Signed-off-by: Day, Jeremy(jday) <[email protected]>

* Include spec reference

Signed-off-by: Day, Jeremy(jday) <[email protected]>

* Changed media-type designation as-per weekly call discussion

Signed-off-by: Day, Jeremy(jday) <[email protected]>

* Fixed links to work before merging to master

Signed-off-by: Day, Jeremy(jday) <[email protected]>

* Really fixing the links this time

Signed-off-by: Day, Jeremy(jday) <[email protected]>

* DO NOT MERGE: data representation proposal, attribute type naming change

Signed-off-by: Day, Jeremy(jday) <[email protected]>

* Moved back to one_of representation for Data, updated docs

Signed-off-by: Day, Jeremy(jday) <[email protected]>

* Minor tweaks based on review commentary

Signed-off-by: Day, Jeremy(jday) <[email protected]>

* - Modified proto map property to increase readability of generated code.
- Clarified usage of `dataschema` CloudEvent attribute.
- Updated examples.

Signed-off-by: Day, Jeremy(jday) <[email protected]>

* Changes based on review comments

Signed-off-by: Day, Jeremy(jday) <[email protected]>

* Added language specific package definitions

Signed-off-by: Day, Jeremy(jday) <[email protected]>

* - Removed c# directive, can be added back in a later PR.
- Added Java directive for separate files.
- re-pluralized 'attributes' to follow proto convention.

Signed-off-by: Day, Jeremy(jday) <[email protected]>

* Tidied 'one-of' names

Signed-off-by: Day, Jeremy(jday) <[email protected]>
@duglin duglin mentioned this pull request Dec 7, 2020