RFC: Add Enums module for enum support through the `@enum` macro #10168

quinnj · 2015-02-11T23:44:20Z

This builds largely on and supersedes the work by @vtjnash in #2988 (minus the bitmask business), with some modifications in line with comments in #3080 and the python PEP for enums.

Basically the usage is:

In  [3]: @enum Fruit apple orange kiwi

In  [4]: Fruit <: Base.Enums.Enum

Out [4]: true

In  [5]: apple

Out [5]: apple::Fruit

In  [6]: orange

Out [6]: orange::Fruit

In  [7]: Fruit(0)

Out [7]: apple::Fruit

In  [8]: Fruit(2)

Out [8]: kiwi::Fruit

In  [9]: f(x::Fruit) = "hey, I'm a Fruit"

Out [9]: f (generic function with 1 method)

In  [10]: f(apple) == "hey, I'm a Fruit"

Out [10]: true

In  [11]: f(Fruit(3))

Out [11]: ERROR: LoadError: invalid enum value for Fruit: 3
while loading In[11], in expression starting on line 1

 in error at error.jl:19
 in call at Enums.jl:22
 in call at Enums.jl:20

The RFC status is because there are still a few design questions that may be worth discussing:

Currently negative enum values are allowed (e.g. @enum Fruit apple=-1); I couldn't really think of a reason to disallow them; I think in some languages they're disallowed however
There's no overflow checking when assigning default enum member values (e.g. @enum Fruit apple=0xff orange, orange will have value 0x00); I don't think this is a big deal in practice, but while writing tests it came up
@JeffBezanson made a comment here about being able to add another subtype under Enums.Enum (could be used in a module for groups of enums, for example); we could allow this through syntax like @enum YellowFruits=Fruit banana lemon, but I'm not sure if that's pretty or useful enough
Scoping: in python, you create an enum by creating a class that subtypes enum and enum members are fields of that class; this provides a nice scoping for enum members that belong to the enum and don't leak out into the current module namespace. In this implementation, enum member values are just const enum_value = EnumType(x), which isn't bad, except if you inadvertently do something like @enum MyImportantEnum zero=0 one=1, which overwrites the bindings for the Base functions zero and one. One idea I had for this would be, in the @enum macro call, to wrap the enum type definition and enum member assignments in a baremodule. With allow overloading of a.b field access syntax #1974, you could then access enum member values only by doing something like Fruit.apple (which is python's syntax) and this would access the enum member value within the auto-generated baremodule.

I'd appreciate any comments/feedback.

StefanKarpinski · 2015-02-11T23:45:43Z

test/enums.jl

+
+@test_throws MethodError convert(Base.Enums.Enum, 1.0)
+
+Base.Enums.@enum Fruit apple orange kiwi


Wouldn't it make more sense to import the @enums macro and then use it bare? I'd expect that we export this from Base if this gets merged.

It actually is exported from Base in this PR, I just tend to write tests as fully qualified as possible :)

That's fair, although I might argue that the fact that it's exported is something one might want to test for. After all, if it somehow was suddenly deleted from Base's exports that would certainly break a lot of code.

That's a good point.

JeffBezanson · 2015-02-11T23:57:42Z

I like this approach, and I think this is fine as-is. I agree scoping is the most significant remaining design issue. However I think we can live with this, as you can already wrap an enum in a module if you want the values hidden better.

Should we automatically pick the smallest integer type that works for the given values? E.g. an enum with 3 values, and no specified integer values, could use Int8.

quinnj · 2015-02-12T00:16:56Z

Yeah, I originally thought of picking the smallest value, but went this this approach instead (using promote_type on all the given enum member values). Basically, it lets the user "declare" the types by using whatever they want, within the bounds of promote_type. We could change it, but I thought this was a pretty flexible approach.

JeffBezanson · 2015-02-12T00:20:18Z

Yes, that approach is good. But in addition to that, we could pick Int8 for @enum Fruit apple orange kiwi, where there is no particular user-requested type.

quinnj · 2015-02-12T03:29:50Z

That's true. I'll work on putting that in plus a few other tweaks.

quinnj · 2015-02-12T05:10:48Z

test/enums.jl

+@test ff.n === 0xff
+@test overflowed.n === 0x00
+
+@test_throws MethodError eval(macroexpand(:(Base.Enums.@enum Test1 _zerofp=0.0)))


Am I missing a better way to test error-throwing macros? This felt a little weird.

This is pretty much it, but just eval should be sufficient. It does macro expansion.

quinnj · 2015-02-12T05:23:03Z

Alright, changed enum field name to val instead of n and implemented the default type "slimming" + updated tests.

…to fit all values

quinnj · 2015-02-12T22:23:45Z

The CI tests all passed except a timeout on OSX (which I've tested locally). Any other comments/feedback? Responses to the questions I mentioned above?

StefanKarpinski · 2015-02-12T22:28:07Z

Should we be concerned about C storage compatibility as a default? Is there much call for passing arrays of enums? I can't recall many C APIs that do that, so I'm thinking no.

jakebolewski · 2015-02-12T22:33:45Z

I believe the enum type is up to the compiler, the C spec only requires that the enum type be big enough to hold all elements.

vtjnash · 2015-02-12T23:06:42Z

To be sure, C will let you set flags to choose a different size, but you're generally safe assuming int.

Possibly not for arrays, but it's common enough for structs.

quinnj · 2015-02-19T16:28:35Z

Any other comments? Good to merge?

Conflicts: base/exports.jl test/runtests.jl

quinnj · 2015-02-21T03:49:36Z

Rebased and added docs/NEWS entry. I'll merge this weekend unless there are other objections.

quinnj · 2015-02-21T04:08:37Z

Hmmmm, linux travis failed trying to download Rmath and OSX travis failed with the error below, which I believe is unrelated:

ERROR: LoadError: MethodError: `<<` has no method matching <<(::Function, ::Function)
Closest candidates are:
  <<(::Any, !Matched::Int32)
  <<(::Any, !Matched::Integer)
 in include at /private/tmp/julia/lib/julia/sys.dylib
 in include_from_node1 at loading.jl:128
 in process_options at /private/tmp/julia/lib/julia/sys.dylib
 in _start at /private/tmp/julia/lib/julia/sys.dylib
while loading /private/tmp/julia/share/julia/test/runtests.jl, in expression starting on line 1

tkelman · 2015-02-21T07:07:03Z

Failed in the same way on both win32 and win64 on appveyor, so seems to be related.

RFC: Add Enums module for enum support through the `@enum` macro

jakebolewski · 2015-02-21T18:00:46Z

Overloading names here to return the enum values is a bit weird. I would expect that names(::Type) would return the field names that I can lookup with getfield. Maybe names should be renamed to fieldnames as it is more descriptive. It would be nice to export the abstract Enum type as well.

The functionality is great, but as you pointed out it is very easy to overwrite functions in Base due to naming conflicts. Perhaps a note in the documentation could be added about this?

jakebolewski · 2015-02-21T20:12:27Z

You have already claimed the Enum name, no harm in exporting it.

Makes it easier to write

julia> @enum Foo bar baz

julia> isa(baz, Enum)
ERROR: UndefVarError: Enum not defined

julia> isa(baz, Base.Enums.Enum)
true

On Sat, Feb 21, 2015 at 3:08 PM, Jacob Quinn [email protected]
wrote:

Yeah, I thought about exporting Enum but also couldn't really think of a
use for it, so I left it unexported for now. Changing names to fieldnames
seems like a good idea. I've always been a little thrown off by how
DataFrames overloads names for column names as well.

I'll add a note about overloading Base names.

—
Reply to this email directly or view it on GitHub
#10168 (comment).

nalimilan · 2015-02-21T22:33:19Z

+1 for renaming names to fieldnames or fields. In addition to DataFrames, we need names in the context of NamedArrays (arrays with row/column/dimension names), and I can't find different terminology that would be natural.

Ken-B · 2015-02-22T21:43:41Z

Sorry if it's an obvious question, but why does enum counting start at zero? I would have guessed it starts at one like with Arrays.

To me the documentation also suggests it starts at one.

jakebolewski · 2015-02-22T22:18:49Z

C compatibility.

On Sun, Feb 22, 2015 at 4:43 PM, Ken-B [email protected] wrote:

Sorry if it's an obvious question, but why does enum counting start at
zero? I would have guessed it starts at one like with Arrays.

To me the documentation also suggests it starts at one.

—
Reply to this email directly or view it on GitHub
#10168 (comment).

jakebolewski · 2015-02-23T20:48:24Z

Is this the behavior we want?

julia> @enum(Foo, bar=1, baz=1)

julia> bar
barbaz::Foo

julia> is(bar, baz)
true

quinnj · 2015-02-23T20:53:05Z

Hmmm....good question. It's the same behavior as C, but C is also much more "simple" in it's implementation of just mapping to integers and that being more transparent. It seems like it could possibly be useful to allow, but also a tad confusing when you do something like convert(Foo,1) with multiple values of 1 defined.

StefanKarpinski · 2015-02-23T21:22:20Z

I suspect that we should initially err on the side of conservatism and consider this to be an error. Just because C allows it doesn't mean it's a great idea; let's wait and see if anyone actually needs this, and if so, we'll have a little more sense of what it should do. Concatenating the symbol names, for example, seems a bit weird.

jakebolewski · 2015-02-23T21:24:38Z

I needed it today (mapping a C enum), but I don't know if we want to retain that behavior. Agree about the concatenating symbols part.

StefanKarpinski · 2015-02-23T21:25:20Z

Well, what behavior would make sense for your application?

vtjnash · 2015-02-23T21:29:28Z

I've seen it used in Gtk to make aliases (eg they might make one with DOUBLE, TWO, and 2 as a modifier for CLICK. Which one gets used then can be dependent on user choice: limitations of a language binding variable naming rules vs. length)

jakebolewski · 2015-02-23T21:31:04Z

For the C interop case I think allowing this behavior and having bar == baz and bar !== baz would make sense, but I don't know if the current design allows this.

StefanKarpinski · 2015-02-23T21:38:00Z

With immutables, I don't think that's possible (or sensible) – how would you distinguish bar form baz? I think we just need to pick one of the names as the canonical name for the value – either the first or the last seem like the obvious options. In your case, which one was better?

quinnj · 2015-02-23T21:43:36Z

Yeah, definitely feels uncomfortable messing with is. We could make the type declaration with type instead of immutable, but that seems like a waste. Even if we picked one as the canonical, do we still make a const baz = bar statement as well? Seems like it would be weird to declare an enum member and have it mysteriously vanish until I dig through all the declarations and realize two happened to have the same value. I'd rather just error when we see two with the same value. Or add the const assignment.

StefanKarpinski · 2015-02-23T21:46:15Z

It's inherently impossible to make bar !== baz without making enums heavier – each enum value would need to store not only its value but also its symbol somehow. I really don't think we want to do that.

jakebolewski · 2015-02-23T21:47:22Z

I'm not really advocating that as a solution, only I feel that would be the desired behavior.

StefanKarpinski · 2015-02-23T21:48:40Z

Ok, so since that's impossible, I think we just need to pick either the first or last symbol as the canonical one. Which makes more sense in your use case?

StefanKarpinski · 2015-02-23T21:49:01Z

My experience with C enums makes me think that the first one is probably the one we'd want to go with.

jakebolewski · 2015-02-23T21:54:34Z

Usually this crops up to retain backwards compatibility in the API, so picking the first one would make sense.

StefanKarpinski · 2015-02-23T22:07:11Z

Ok, so there we have it: allow multiple bindings for enum values, but the first one is the canonical one.

quinnj · 2015-02-23T22:11:09Z

What about names? Currently, it shows everything that was declared, but with proposed change, bar::Foo would show up twice since it was declared twice, but both are now shown as the first canonical. Should we call unique on the enum member value array for names?

StefanKarpinski · 2015-02-23T22:12:51Z

Hmm. That's a little tough. It seems like all the alternative names should show up, even if more than one refer to the same value. Either that or only the canonical names should show up.

jakebolewski · 2015-02-23T22:27:18Z

I'm coming around to the fact that this should just be an error. I've never come across a situation where this is absolutely necessary. We are just complicating things for convenience.

StefanKarpinski · 2015-02-23T22:27:59Z

Even better!

ivarne · 2015-02-23T22:37:54Z

The most important difference is that Julia provides a two way mapping, but C is restricted to a one way mapping. Assigning multiple names for the same value is convenient in some cases, but isn't the same need served by having the user manually type

@enum(Foo, bar=1)
const Baz == bar

quinnj · 2015-02-23T23:22:25Z

I think I'm ok going either way. If we allow duplicates in declaration, then I think names should just return the list of canonical names.

zenna · 2015-04-03T22:55:36Z

Is there an established way to automatically export all the values of an enum in a module?

Add Enums module for enum support through the macro

0396092

quinnj changed the title ~~RFC: Add Enums module for enum support through the macro~~ RFC: Add Enums module for enum support through the @enum macro Feb 11, 2015

StefanKarpinski reviewed Feb 11, 2015
View reviewed changes

quinnj added 2 commits February 11, 2015 21:57

Change .n field to .val in enums

20ff9f5

Unqualify usage of @enum to test that export is working correctly

02cb520

quinnj reviewed Feb 12, 2015
View reviewed changes

If no enum member values are given, pick the smallest signed integer …

f2d7f0f

…to fit all values

quinnj force-pushed the jq/enum branch from b2f5dcd to f2d7f0f Compare February 12, 2015 06:13

quinnj added 2 commits February 12, 2015 07:45

Remove enum.jl from examples

cce9a74

Remove running of examples/enum.jl from test suite

9533436

quinnj added 2 commits February 20, 2015 15:16

Merge branch 'master' into jq/enum

fb8894a

Conflicts: base/exports.jl test/runtests.jl

Add docs and NEWS entry

d004891

quinnj added 2 commits February 21, 2015 00:16

Merge branch 'master' into jq/enum

bedd890

Resolve merge conflicts

5f346d9

quinnj added a commit that referenced this pull request Feb 21, 2015

Merge pull request #10168 from JuliaLang/jq/enum

4319353

RFC: Add Enums module for enum support through the `@enum` macro

quinnj merged commit 4319353 into master Feb 21, 2015

jakebolewski mentioned this pull request Feb 24, 2015

Error on duplicate values in declaration of@enum #10309

Closed

quinnj mentioned this pull request Feb 24, 2015

Enum inner reworking to remove usage of Val{T} #10315

Closed

quinnj deleted the jq/enum branch February 25, 2015 17:51

mauro3 mentioned this pull request Jul 30, 2018

1-based @enums #28351

Closed


		@test_throws MethodError convert(Base.Enums.Enum, 1.0)

		Base.Enums.@enum Fruit apple orange kiwi

RFC: Add Enums module for enum support through the @enum macro #10168

RFC: Add Enums module for enum support through the @enum macro #10168

Conversation

quinnj commented Feb 11, 2015

StefanKarpinski Feb 11, 2015

Choose a reason for hiding this comment

quinnj Feb 11, 2015

Choose a reason for hiding this comment

StefanKarpinski Feb 11, 2015

Choose a reason for hiding this comment

quinnj Feb 12, 2015

Choose a reason for hiding this comment

JeffBezanson commented Feb 11, 2015

quinnj commented Feb 12, 2015

JeffBezanson commented Feb 12, 2015

quinnj commented Feb 12, 2015

quinnj Feb 12, 2015

Choose a reason for hiding this comment

JeffBezanson Feb 12, 2015

Choose a reason for hiding this comment

quinnj commented Feb 12, 2015

quinnj commented Feb 12, 2015

StefanKarpinski commented Feb 12, 2015

jakebolewski commented Feb 12, 2015

vtjnash commented Feb 12, 2015

quinnj commented Feb 19, 2015

quinnj commented Feb 21, 2015

quinnj commented Feb 21, 2015

tkelman commented Feb 21, 2015

jakebolewski commented Feb 21, 2015

jakebolewski commented Feb 21, 2015

nalimilan commented Feb 21, 2015

Ken-B commented Feb 22, 2015

jakebolewski commented Feb 22, 2015

jakebolewski commented Feb 23, 2015

quinnj commented Feb 23, 2015

StefanKarpinski commented Feb 23, 2015

jakebolewski commented Feb 23, 2015

StefanKarpinski commented Feb 23, 2015

vtjnash commented Feb 23, 2015

jakebolewski commented Feb 23, 2015

StefanKarpinski commented Feb 23, 2015

quinnj commented Feb 23, 2015

StefanKarpinski commented Feb 23, 2015

jakebolewski commented Feb 23, 2015

StefanKarpinski commented Feb 23, 2015

StefanKarpinski commented Feb 23, 2015

jakebolewski commented Feb 23, 2015

StefanKarpinski commented Feb 23, 2015

quinnj commented Feb 23, 2015

StefanKarpinski commented Feb 23, 2015

jakebolewski commented Feb 23, 2015

StefanKarpinski commented Feb 23, 2015

ivarne commented Feb 23, 2015

quinnj commented Feb 23, 2015

zenna commented Apr 3, 2015

RFC: Add Enums module for enum support through the `@enum` macro #10168

RFC: Add Enums module for enum support through the `@enum` macro #10168