Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

XXX_unrecognized broke code #594

Closed
minaevmike opened this issue May 3, 2018 · 12 comments
Closed

XXX_unrecognized broke code #594

minaevmike opened this issue May 3, 2018 · 12 comments

Comments

@minaevmike
Copy link

minaevmike commented May 3, 2018

After adding XXX_unrecognized to generated structs, they can't be used as map key, i think this may broke some code. May be adding this fields must be optional? Or type of this filed can be changed to string?

@minaevmike minaevmike changed the title XXX_NoUnkeyedLiteral broke code XXX_unrecognized broke code May 3, 2018
@dsnet
Copy link
Member

dsnet commented May 3, 2018

It is unlikely that XXX_NoUnkeyedLiteral causes this as it is of type struct{}, but rather the presence of XXX_unrecognized of type []byte. The addition of XXX_unrecognized was to support unknown field preservation in proto3 (see protocolbuffers/protobuf#272).

In general, the proto specification does not formally define what equality means for protos. Since comparability is undefined, and using protos as map keys requires comparability, then that means using protos as map keys is undefined behavior. The fact that proto3 structs were comparable before was unfortunate because it gave the illusion that this was the case when it wasn't.

The way around this is to define your own Go struct that is comparable and copy the fields you care about into that struct and use that instead as the map key.

@dsnet dsnet closed this as completed May 3, 2018
@awalterschulze
Copy link

I appreciate where you are coming from Joe.
Pushing the combination of library and code generation to a sane place is a tricky job.
I really believe you are the only person who can solve the problems in the go protobuf eco system.

Here is an example of a user @minaevmike that now has to use the generated protobuf struct simply as a proxy object.
This will mean, as you suggested, that code will need to be written and executed that copies between the generated proxy object and struct that is really needed.
Now when fields are added to the protobuf, someone will need to remember to also update the copy function, because no compiler that I know of is going to help you to remember that.
This copying is both expensive and hard to maintain.
This is main reason that gogoprotobuf had to become more than a code generator, but a library as well.
Now you and I get to deal with trying to push these two libraries back together.

@leighmcculloch
Copy link

While I understand a solution is needed to support preserving fields, I think the solution to add fields to all requests should actually be considered a breaking change. While comparability of generated requests and replies was undefined, it's released behavior.

Additionally it makes the APIs messy, with the XXX fields being exposed in documentation, auto complete, and is leaking optional functionality into every application using the requests and replies.

Beyond the comparability for map keys, this makes it significantly hard to write tests for GRPC handlers written in Go. It's now impossible to compare two replies without checking each field individually, where-as it was trivial to compare two replies as a whole before.

I get what you're saying @dsnet that there was motivation to add these fields to achieve compatibility with proto3, but can we explore alternatives that do not add these fields to the request or reply themselves? Or make their inclusion optional and opt-in?

@nhooyr
Copy link

nhooyr commented Jul 29, 2018

Given this change, I think the version tag for the commit that does should be v2.0.0 because it's a breaking change.

@leighmcculloch
Copy link

I realize a comment in my comment above was incorrect, the values can still be compared by using the generated Equal method, just not in maps as a map key. 🎉

@nhooyr
Copy link

nhooyr commented Jul 29, 2018

In that case

@dsnet
In general, the proto specification does not formally define what equality means for protos. Since comparability is undefined, and using protos as map keys requires comparability, then that means using protos as map keys is undefined behavior. The fact that proto3 structs were comparable before was unfortunate because it gave the illusion that this was the case when it wasn't.

proto3 structs are comparable given the existence of proto.Equal. If what you say is true, we should remove proto.Equal as well.

@nhooyr
Copy link

nhooyr commented Jul 30, 2018

@leighmcculloch To be clear, I don't think we should. I like having my protobuf structs have equality defined on them. It makes everything so much easier. But if you take @dsnet's argument to its logical conclusion, it should be removed.

@dsnet
Copy link
Member

dsnet commented Jul 30, 2018

What constitutes a violation of "backwards compatibility" is a tricky subject that combines a number of different factors:

  1. First and foremost is what the compatibility agreement says. For Go protobufs, this is documented here:
  • In that list, it mentions our right to make changes related to "[Protocol Buffer] Specification errors or changes". For the issue at hand, unknown field preservation was a specification change that happened upstream (see proto3 and unknown fields protocolbuffers/protobuf#272).
  • In that list, it also says that we may be "Adding, removing, or changing methods or fields in generated structs that start with XXX", which is exactly what happened here.
  1. However, just because the compatibility statement says we can make a change does not necessarily mean that we should. For example, the protobuf specification says that string fields must be valid UTF-8. A few months ago, we were not doing UTF-8 validation at all, which was causing problems with interoperability with protobuf implementations in other languages. After a rocky rollout, we settled on enforcing UTF-8 validation on proto3 (to be consistent with most other languages) and not for proto2 (as a pragmatic position to avoid breaking too many targets). If we enabled strict UTF-8 enforcement in proto2 as well, we would technically still be in compliance with point 1, but pragmatically, the cost in this case outweighed any benefits (there is benefit in maintaining validation for proto3, but that's a separate discussion).

In regards to a major version bump, this is actually more breaking that you would expect. For example, if you have a new major proto version (call it protov2), then protov2.Message and proto.Message actually have distinctly different types and are not interoperable. A major version change comes with great costs and should only be done if there are significant benefits that the major version brings. It is not as simple as "a change broke my code, therefore it should be a major version bump". If we are going roll out with a major version change of the proto package, it had better make a lot more changes than this relatively small (in relation to the entirety of the proto ecosystem) issue of broken comparability of generated messages.

@dsnet
Copy link
Member

dsnet commented Jul 30, 2018

In regards to the possibility of an opt-out, if the purpose of the opt-out is to allow messages to be comparable again, then that opt-out is really just circumventing the immediate problem and kicking the can down the road. That doesn't change the fact that that some other field may be added in the future that again break comparability.

The XXX pollution is unfortunate, but that is an issue orthogonal to this one. See #276

@sitano
Copy link

sitano commented Aug 2, 2018

@dsnet the design behind that decision of preserving unknown fields in the models is broken. this intermediate state which the parser tries to keep between de/ser ought to lay in the serialization context which is NOT a model. I understand that maybe decision maker had no choice cuz API does not provide any intermediate state for working with de/ser, but I can't understand why it was decided to tradeoff this huge and popular usage pattern - models code generation.

protoc3 dropped unknown fields. then protoc3.5 added them back. your arguments that you/google had that right are not supporting and does not prove it was the right thing to do.

in the end, yes, you follow the specification, but why then golang/protobuf has only single release v.1.0.x? and does not have any other, any of the previous, and the compat was broken to all 0.x.x versions.

devs could at least TAG a version before breaking everyones code.

p.s. remembering dep vs vgo case.

@dsnet
Copy link
Member

dsnet commented Aug 2, 2018

@sitano, if your issue is with the addition of unknown fields in proto3 without bumping the version of the proto syntax version, then this is not the place to address that (see protocolbuffers/protobuf#272). This repo's primary purpose is to implement the proto specification for Go. Whether the specification should have changed or not is irrelevant to this specific issue (again bring it up at protocolbuffers/protobuf#272). This is irrelevant to Go.

If your issue is with the addition of XXX fields that broke comparability, then the compatibility agreement reserves our right to add new fields as is necessary to implement protos. In Go, it is not safe to depend on type comparability unless the type explicitly documents that it permits comparability or all of the directly nested fields of the type are directly under your control (and you can ensure comparability). For example, tar.Header was comparable prior to the addition of the Xattrs and PAXRecords fields. However, the addition of the Xattrs field broke comparability (since maps are incomparable). However, note that this supposedly breaking change was made in Go1.3 as opposed to Go2. Why? Because the Go project also reserves the right to add fields to a struct.

Perhaps it is a weakness of the language that it is too easy for users to rely on comparability in a poorly forwards-compatible way, but I digress.

@sitano
Copy link

sitano commented Aug 5, 2018

This repo's primary purpose is to implement the proto specification for Go...

yes, all these specs are different and you could emit different releases for different version of specs instead of ignoring issues with backward compatibility to the existing code bases. broken design and liability do not prove you are right at least for me.

I've decided my compatibility problems with moving to the gogo protoc implementation. I understand everything you are saying. I just think that such a process of evolution of this very popular project is not very fair. It's very one-sided.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants