-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
APIv2: handling of nil map values #964
Comments
I vote for the C++ semantics. Doing otherwise will complicate our API for a dubious protobuf semantic. However, we should gather data on what (if anything) depends on this. |
Related: #965 |
C++ semantics seems more reasonable to me, and that the current v1 and v2 behavior are just emergent behavior of the library. Although, now that I think about it, you say, “We need to either deliberately choose…” which totally says the current behavior is emergent. 😆 If we were to make a deliberate choice, let us then choose the clearest alignment with protobuf semantics: a nil message just does not exist. |
Global testing turns up three failures in Google. Two are tests explicitly checking behavior when a map value is nil. So far as I can tell, neither expects this to occur in practice. One is treating a nil message value as indicating a default, but can be easily updated to treat an empty message in the same way. Since there's currently no way for a C++ client to indicate a default, that seems like a plain bugfix. (I assume there are no C++ clients to this service.) I think this argues for the C++ semantics. |
Final decision is to retain the new, C++-compatible semantics: A |
Please reconsider. It is breaking the notion of "wire protocol", making it "we send/receive something that C++ thinks it is right". The definition of "wire protocol" or "data transfer/storage protocol" (IMHO) is to preserve whatever got mangled in no matter how wrong it was on 2nd level rules. Most important part - it de-serializes your data back to whatever original was. Rinse-repeat will bring exactly the same result as original was. With this change above is no longer true. As much as Google internal use covers, it is still does not cover everything. Including bugs, assumptions and historical reasons in outside of Google world. We use proto for storage of data, and I dread the moment when old data deserialized with some values that were set to nil will all of a sudden have "empty" structs attached to them. To me, if protocol format description document allows it - it is legal. (If protocol allows dropping of required fields - so be it. I guess that's why there are no required fields anymore, but wire encoding did not change - it always allowed it. Encoders were enforcing it.) Key-Value maps (that used in serialization of maps and lists) in protocol buffers described that Key kind of MUST be something (because nil keys are not allowed), but is "Value" part described as "MUST present"? I don't think it is even possible with all fields being optional in proto3 One more thing - I think you will have to support "bad" maps anyway because Thank you very much for your time. |
The v2 behavior strictly adheres to the protobuf data model. It was a bug in the v1 implementation that transient information not in the protobuf data model was accidentally leaking to/from the wire format.
The Go protobuf compatibility agreement reserves the right to fix bugs. This change in behavior is to fix buggy behavior in v1 to be compliant with the protobuf data model. That being said, I understand that this change could potentially be problematic for existing data that is written to disk. If so, we would need to understand whether this is a real problem or just a hypothetical. And if real, how often it happens. Supporting the incorrect behavior is costly as well because it causes incompatibility with the other language implementations. |
No, an author of a widely accepted document “required considered harmful” argued quite successfully that it a) served no good purpose that was not better fulfilled by other means, and b) marking a field as “required” was sticky, i.e. once you apply it, there is no safe and forward-and-backward-compatible way remove it. So there were lots of protobufs out there shoving dummy values into these fields, because they were marked “required” but lost their purpose, and were no longer used. Basically, there was nothing about the “required” feature that had any redeeming value. So, by the time proto3 was being defined, it was decided to drop the feature entirely. But more specifically:
The protocol never allowed dropping of required fields. That one could produce an invalid wire-encoded message that was missing a required field is irrelevant, the protocol is a full set of agreements on behavior and handlings, not just a wife format; and that protocol always required that required fields appear. |
Problematic? I see it as "devastating" - data that was serialized (maps with nil values) all of a sudden becomes maps with empty values. It is not the same data anymore. |
Are you speaking from experience regarding a real problem or is this hypothetical?
The generated Go code is not a 1:1 match to the protobuf data model. There are other ways where you can express things in the Go data structures that cannot be represented in the protobuf data model. One simply cannot expect that a round-trip marshal/unmarshal reproduces the exact same Go data structure. This is a false expectation and never documented as being valid. |
(I have no skin in the game, but...) Still seems hypothetical. Would be nice to have more data to back this up, and whether or not real code actually misbehave as a result; do the tests still pass? Also, remember v2 is a breaking change anyway. Now is the time to fix this. |
I would need to dive in in (that I know of) 3 different projects that are using protobuf for serialization/storage here. I can do it for 1 (will take at least a day), 2 others are out of my reach and level of understanding of how their data is stored and used. I am not sure how to quantify that time with my manager, I think I will be asked (again) why did we choose protobuf for serialization if it is not deemed stable.
I, that that "one", am expecting exactly that from serialization format. Especially when I am using same language, same library. |
Thank you @meling. You made me thinking - maybe it is a time to change the spec to what really happens in a wild and allow nil values for the maps? (But it that would be impossible if Java or C++ libs are using something that effectively prohibits that, like some home-grown map classes) |
The v1 API permits nil values in message-valued maps. For example:
The v1 API will:
The v2 API will currently:
The C++ proto API makes no distinction between empty and absent map values. A nil map value will not be preserved if it passes through a C++ unmarshal/marshal layer.
We need to either deliberately choose to drop support for nil messages as map values (as distinct from empty ones--you can have a nil value, it's just no different from an empty
&somepb.Message{}
), or decide that this would be too hazardous a change and preserve the v1 semantics in the v2 API.The text was updated successfully, but these errors were encountered: