Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decode unknown enum values #233

Closed
stepancheg opened this issue Jun 28, 2017 · 14 comments
Closed

Decode unknown enum values #233

stepancheg opened this issue Jun 28, 2017 · 14 comments

Comments

@stepancheg
Copy link
Owner

stepancheg commented Jun 28, 2017

Currently rust-protobuf results in error when decoding unknown enum values.

Protobuf 3 requires that message should preserve unknown enum values, not simply drop them.

Proposal

struct<E : ProtobufEnum> ProtobufEnumOrUknown<E> { ... }

will be introduced in the protobuf library.

Generated fields will be of that type:

color: ProtobufEnumOrUnknown<Color>; // for protobuf3 signular enums
color: Option<ProtobufEnumOrUnknown<Color>>; // for protobuf2 singular enums
colors: Vec<ProtobufEnumOrUnknown<Color>>; // for repeated enums

Getters and setters will work with enums as previously (e. g. Color).

@sashahilton00
Copy link

Is there an easy solution/workaround to avoid this behaviour? Even just ignore the unknown enums? We just experienced this error recently in https://github.com/librespot-org/librespot where the enum added was not important but everyone was required to update anyway.

@stepancheg
Copy link
Owner Author

As for temporary workaround I'd suggest patching your .proto files before code generation by simply removing unused enum fields (or by replacing them with i32).

I'm currently working on proper fix.

@stepancheg
Copy link
Owner Author

Current patch set: https://github.com/stepancheg/rust-protobuf/commits/unknown-enum-value

Comments, suggestions are welcome! Especially since this change will be somewhat backward incompatible.

@plietar
Copy link

plietar commented Mar 22, 2018

In protobuf2 an unknown enum value should be ignored (has_foo() returns false) and added to the list of unknown_fields.

From https://developers.google.com/protocol-buffers/docs/proto#updating

In the current Java and C++ implementations, when unrecognized enum values are stripped out, they are stored along with other unknown fields. Note that this can result in strange behavior if this data is serialized and then reparsed by a client that recognizes these values. In the case of optional fields, even if a new value was written after the original message was deserialized, the old value will be still read by clients that recognize it. In the case of repeated fields, the old values will appear after any recognized and newly-added values, which means that order will not be preserved.

I personally find this horrible, and ProtobufEnumOrUnknown sounds like a much better solution, but matching the official implementation's behaviour should be considered too.

@stepancheg
Copy link
Owner Author

@plietar thank you very much for pointing to this valuable piece of spec.

I'm not sure what would be better.

Seems like syntax=proto3 ProtobufEnumOrUnknown is a clear winner.

The question what to do with syntax=proto2 on unknown values. There are options:

  • return error as now. People complain, that's against any spec, bad
  • simply discard unknown fields. That's bad for debugging and for load/modify/write case
  • match Google protobuf2 spec, store in unknown fields. In Google they did it I guess for backward compatibility, because they have a huge codebase, which is probably not a big issue for rust-protobuf, especially because it returns error now.
  • implement ProtobufEnumOrUnknown as for proto3

So we need to choose between options (3) and (4).

Anyway, ProtobufEnumOrUnknown should go to version 1.5 of rust-protobuf, and for 1.4 branch I think I should implement storing unknown values in unknown fields for both proto2 and proto3.

stepancheg added a commit that referenced this issue Mar 22, 2018
This fix long standing issue of protobuf returning error when parsing
unknown enum values.

New behavior matches proto2 (even is code is generated for proto3).

This is backward-incompatible change.

Issue #233
@stepancheg
Copy link
Owner Author

I've created a patch which implement storing unknown values in unknown fields: #276

This is source-compatible change, but behavior is changed. I want to merge it into 1.4 branch and create a new version.

This is somewhat dangerous change, so I'd appreciate code review.

stepancheg added a commit that referenced this issue Mar 25, 2018
This fix long standing issue of protobuf returning error when parsing
unknown enum values.

New behavior matches proto2 (even is code is generated for proto3).

This is backward-incompatible change.

Issue #233
stepancheg added a commit that referenced this issue Mar 25, 2018
This fix long standing issue of protobuf returning error when parsing
unknown enum values.

New behavior matches proto2 (even is code is generated for proto3).

This is backward-incompatible change.

Issue #233
@stepancheg
Copy link
Owner Author

Merged that diff and released as 1.5.0.

Keeping the issue open to implement ProtobufEnumOrUnknown.

@hicqu
Copy link
Contributor

hicqu commented Jun 5, 2018

Hi, Any progress for implement ProtobufEnumOrUnknown? We need the feature to avoid check unknown_fields every time.

@hicqu
Copy link
Contributor

hicqu commented Jun 5, 2018

Could ProtobufEnumOrUnknown be used for both proto 3 and proto 2? personally I think it's better than store it in unknown fields, even if it's not compatible with offical implement.

@stepancheg
Copy link
Owner Author

@hicqu could you describe your use case? Why do you check for unknown enum values in unknown fields? I mean, unset enum and unknown enum are usually the same for the most applications, and storing enums in unknown fields is usually needed only to a) not failing during parsing b) preserving during decoding/encoding.

I'm asking to better understand what people need to make a proper decision.

@stepancheg
Copy link
Owner Author

@hicqu I also think that storing enums in ProtobufEnumOrUnknown is OK, and compatibility with official implementation is not a big issue. My largest concern is ergonomics: users will need to explicit unwrap it instead of using it directly.

@hicqu
Copy link
Contributor

hicqu commented Jun 6, 2018

My use case is about compatibility.
Suppose we are rolling update a distribution system, in which nodes are sending messages with old enum enum E { A = 0, B = 1 }. After rolling update the enum changes to enum E { A = 0, B = 1, C = 2}, so if nodes after the rolling update send messages with E::C to other nodes, they will get E::A instead and store 2 into unknown fields.

It won't be a problem if E::A means Invalid. However many Enums we used have a valid default value. So our code must change from

if enum_from_message == EnumType::DefaultValue {
    // do special logic.
}

to

if enum_from_message == EnumType::DefaultValue &&
    message.get_unknown_fields().get(proto_enum_field_id).is_none() {
    // do special logic.
}

It's hard to check all places have used Enums. So ProtobufEnumOrUnknown is very helpful for us.

@BusyJay
Copy link
Contributor

BusyJay commented Jun 8, 2018

How about represent the enum as i32? And generated code looks like following:

#[repr(i32)]
pub enum EnumA {
    Value0 = 0,
    Value1 = 1,
    Value2 = 2,
}

struct B {
    a: i32,
}

impl B {
    pub fn set_a(&mut self, a: EnumA) {
        self.a = a as i32;
    }

   pub fn set_a_value(&mut self, a: i32) {
        self.a = a;
   }
    
    pub fn get_a(&self) -> EnumA {
        match self.a {
            0 => EnumA::Value0,
            1 => EnumA::Value1,
            2 => EnumA::Value2,
            _ => EnumA::Value0,
        }
    }
    
    pub fn get_a_value(&self) -> i32 {
        self.a
    }
}

This way we can keep similar behavior as other languages' implementations and also follow proto3's open enum semantics while keeping the interface clean and neat.

@stepancheg
Copy link
Owner Author

Protobuf in master now generates enum fields as ProtobufEnumOrUnknown<E>.

Example

It's not late to revert this change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants