Skip to content

Serialize Enum to underscored String#9905

Closed
caspiano wants to merge 6 commits intocrystal-lang:masterfrom
caspiano:enum-serialisation
Closed

Serialize Enum to underscored String#9905
caspiano wants to merge 6 commits intocrystal-lang:masterfrom
caspiano:enum-serialisation

Conversation

@caspiano
Copy link
Contributor

@caspiano caspiano commented Nov 12, 2020

This PR makes enums default JSON/YAML serialization target an underscored string.

For backwards compatibility, I could merge a modified converter from #9887 to allow the specification of the serialisation target of an enum via a converter in a (JSON|YAML)::Field annotation

Follows from the discussion from #9887

Additionally, enums using the Flags annotation are now serialized to an array of downcased strings

@[Flags]
enum Example
  Foo
  Bar
end

Example::All.to_json # => ["foo","bar"]

To retain the previous serialization behaviour for your enum, you can do the following

enum Example
  Foo
  Bar
  
  def to_json(json)
    json.number to_i
  end
  
  def to_yaml(json)
    yaml.scalar to_i
  end
 end

@Fryguy
Copy link
Contributor

Fryguy commented Nov 12, 2020

The PR title says "underscored string", but the PR seems to be about "lowercased strings". Is it the latter? When I see "underscored string" I think something like "one_two_three". Perhaps this just needs a spec showing an underscored string?

@straight-shoota straight-shoota added breaking-change May have breaking effect in some edge cases. Mostly negligible, but still noteworthy. topic:stdlib:serialization labels Nov 12, 2020
@caspiano caspiano changed the title Serialise Enum from underscored String Serialize Enum to underscored String Nov 13, 2020
@caspiano caspiano force-pushed the enum-serialisation branch 2 times, most recently from 1adecd0 to ffd228a Compare November 15, 2020 13:53
Copy link
Member

@straight-shoota straight-shoota left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks 💚

@vladfaust
Copy link
Contributor

What if I need a custom string format, e.g. "myEnum": "oneHundred" for my JSON protocol?

@straight-shoota
Copy link
Member

For custom formats you should use a custom converter.
Alternatively, you can also override #to_json in a specific enum type.

@caspiano
Copy link
Contributor Author

@vladfaust, as @straight-shoota said, you can define your own #to_json. I think this PR's primary intention is to move away from the default serialization to integers, which will cause issues if the members (unnumbered) of an enum are reordered.

The underscored serialization target was chosen as a sane default.

@straight-shoota
Copy link
Member

@crystal-lang/crystallers I think this is a good change and I'd like to get this merged before 1.0. Anyone care to review?

@asterite
Copy link
Member

But why? I don't see a very big discussion about this. Only non-core member replied.

@straight-shoota
Copy link
Member

That's why I'm asking for reviews from core-members.

@asterite
Copy link
Member

I'm not sure about this. No use case or reason to change this was given

@Sija
Copy link
Contributor

Sija commented Jan 17, 2021

@asterite For me the reason is twofold, this format is:

  1. moar descriptive - given imaginary status flag: value 3 tells you nothing, but suspended is immediately understandable
  2. widely used in most APIs

@straight-shoota
Copy link
Member

Use case was originally described in #9881

@straight-shoota straight-shoota linked an issue Jan 17, 2021 that may be closed by this pull request
@straight-shoota
Copy link
Member

The main reason why I prefer names for stringification is that it's more reliable.

Unless explicitly assigned (like in HTTP::Status), the values can easily change between builds when members are added, removed or reordered.

#9881 (comment)

@asterite
Copy link
Member

This all depends on what APIs use most commonly. For example in databases enums are usually represented with numbers. Numbers are more efficient because they occupy less space. The same is true for serialization.

I'd say there's no right or wrong here, except this is a breaking change we'll be doing for no reason.

@Sija
Copy link
Contributor

Sija commented Jan 17, 2021

This all depends on what APIs use most commonly. For example in databases enums are usually represented with numbers. Numbers are more efficient because they occupy less space. The same is true for serialization.

@asterite I disagree. JSON is mostly used for inter-service communication, meaning explicitness is favoured over byte efficiency - for that you wouldn't use JSON/YAML anyway.

If you need any real world examples check out Twitter API docs (search for enum)

@asterite
Copy link
Member

In github the enum values are uppercase.

I don't have a strong opinion so others will need to approve this.

@asterite
Copy link
Member

I guess this is fine. But @bcardiff will have to decide if it's worth including this in 0.36.0, or ever.

@Sija
Copy link
Contributor

Sija commented Jan 17, 2021

In github the enum values are uppercase.

@asterite In this api reference from GitHub docs they're downcased 🤔

@asterite
Copy link
Member

@Sija
Copy link
Contributor

Sija commented Jan 17, 2021

@asterite yeah, just noticed, but that's for GraphQL

@Sija
Copy link
Contributor

Sija commented Jan 17, 2021

Also see this discussion on SO.

@bcardiff
Copy link
Member

If we are going forward and change the default serialization, I would have a guidance on how to keep the old behavior at least as a note in this PR.

This is a discussion of what is the default serialization, but the user should be able to switch between them. In that regard, there is little need for a breaking-change to me.

I understand the benefit of having a human readable representation, but what happens with flag enum in this PR?

It's hard to me to measure the impact of this change. I imagine that some JSON serialization could be present between server and client or server and other systems that could break silently with this. I can only think of @jwoertink to weight the impact.

@jwoertink
Copy link
Contributor

As far as Lucky goes, I don't think this will really impact much if anything. Our Enum usage within Avram is wrapped by a struct. If anything, we would just add a method to handle the conversion.

I do agree with @bcardiff that we should have some documentation, at the very least, on how to keep the current implementation if you need. (i.e. using both to_json and from_json with Integers for enums) but the change itself seems ok to me 👍

@straight-shoota
Copy link
Member

For reference on impact, I regularly use such code in any serializable enum type: https://github.com/shardbox/shardbox-core/blob/19fb582433bbd914ba31295c76cf411d322c2a71/src/repo.cr#L17-L31

Flag enums is a problem, indeed because Foo.parse("BAR | BAZ") doesn't work atm. Maybe it should? Or we stay with numeric serialization for flag enums.

@asterite
Copy link
Member

For flags enum we can use an array of strings

@caspiano
Copy link
Contributor Author

Serializing to an array of strings for a Flags enum is great! So many use-cases

@caspiano
Copy link
Contributor Author

@bcardiff updated original PR comment to include guidance on retaining old serialisation behaviour
@asterite updated PR to include serialisation of Flags enums to an array of downcased strings

@bcardiff
Copy link
Member

I don't think is a good decision to decode from number or strings and encode to strings.

If strings (or array of strings) are to be used for enum by default what I think should be available in the std-lib is a converter that would be used to encode and decode to numbers as before.

When that converter is available and the user can choose to use it, then I think we can use strings as default encode and decode (without number fallback).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

breaking-change May have breaking effect in some edge cases. Mostly negligible, but still noteworthy. pr:needs-review topic:stdlib:serialization

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[RFC] Enum JSON string converter

8 participants