Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Add Enums module for enum support through the @enum macro #10168

Merged
merged 10 commits into from
Feb 21, 2015
Merged

Conversation

quinnj
Copy link
Member

@quinnj quinnj commented Feb 11, 2015

This builds largely on and supersedes the work by @vtjnash in #2988 (minus the bitmask business), with some modifications in line with comments in #3080 and the python PEP for enums.

Basically the usage is:

In  [3]: @enum Fruit apple orange kiwi

In  [4]: Fruit <: Base.Enums.Enum

Out [4]: true

In  [5]: apple

Out [5]: apple::Fruit

In  [6]: orange

Out [6]: orange::Fruit

In  [7]: Fruit(0)

Out [7]: apple::Fruit

In  [8]: Fruit(2)

Out [8]: kiwi::Fruit

In  [9]: f(x::Fruit) = "hey, I'm a Fruit"

Out [9]: f (generic function with 1 method)

In  [10]: f(apple) == "hey, I'm a Fruit"

Out [10]: true

In  [11]: f(Fruit(3))

Out [11]: ERROR: LoadError: invalid enum value for Fruit: 3
while loading In[11], in expression starting on line 1

 in error at error.jl:19
 in call at Enums.jl:22
 in call at Enums.jl:20

The RFC status is because there are still a few design questions that may be worth discussing:

  • Currently negative enum values are allowed (e.g. @enum Fruit apple=-1); I couldn't really think of a reason to disallow them; I think in some languages they're disallowed however
  • There's no overflow checking when assigning default enum member values (e.g. @enum Fruit apple=0xff orange, orange will have value 0x00); I don't think this is a big deal in practice, but while writing tests it came up
  • @JeffBezanson made a comment here about being able to add another subtype under Enums.Enum (could be used in a module for groups of enums, for example); we could allow this through syntax like @enum YellowFruits=Fruit banana lemon, but I'm not sure if that's pretty or useful enough
  • Scoping: in python, you create an enum by creating a class that subtypes enum and enum members are fields of that class; this provides a nice scoping for enum members that belong to the enum and don't leak out into the current module namespace. In this implementation, enum member values are just const enum_value = EnumType(x), which isn't bad, except if you inadvertently do something like @enum MyImportantEnum zero=0 one=1, which overwrites the bindings for the Base functions zero and one. One idea I had for this would be, in the @enum macro call, to wrap the enum type definition and enum member assignments in a baremodule. With allow overloading of a.b field access syntax #1974, you could then access enum member values only by doing something like Fruit.apple (which is python's syntax) and this would access the enum member value within the auto-generated baremodule.

I'd appreciate any comments/feedback.

@quinnj quinnj changed the title RFC: Add Enums module for enum support through the macro RFC: Add Enums module for enum support through the @enum macro Feb 11, 2015

@test_throws MethodError convert(Base.Enums.Enum, 1.0)

Base.Enums.@enum Fruit apple orange kiwi
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it make more sense to import the @enums macro and then use it bare? I'd expect that we export this from Base if this gets merged.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It actually is exported from Base in this PR, I just tend to write tests as fully qualified as possible :)

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair, although I might argue that the fact that it's exported is something one might want to test for. After all, if it somehow was suddenly deleted from Base's exports that would certainly break a lot of code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point.

@JeffBezanson
Copy link
Sponsor Member

I like this approach, and I think this is fine as-is. I agree scoping is the most significant remaining design issue. However I think we can live with this, as you can already wrap an enum in a module if you want the values hidden better.

Should we automatically pick the smallest integer type that works for the given values? E.g. an enum with 3 values, and no specified integer values, could use Int8.

@quinnj
Copy link
Member Author

quinnj commented Feb 12, 2015

Yeah, I originally thought of picking the smallest value, but went this this approach instead (using promote_type on all the given enum member values). Basically, it lets the user "declare" the types by using whatever they want, within the bounds of promote_type. We could change it, but I thought this was a pretty flexible approach.

@JeffBezanson
Copy link
Sponsor Member

Yes, that approach is good. But in addition to that, we could pick Int8 for @enum Fruit apple orange kiwi, where there is no particular user-requested type.

@quinnj
Copy link
Member Author

quinnj commented Feb 12, 2015

That's true. I'll work on putting that in plus a few other tweaks.

@test ff.n === 0xff
@test overflowed.n === 0x00

@test_throws MethodError eval(macroexpand(:(Base.Enums.@enum Test1 _zerofp=0.0)))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I missing a better way to test error-throwing macros? This felt a little weird.

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty much it, but just eval should be sufficient. It does macro expansion.

@quinnj
Copy link
Member Author

quinnj commented Feb 12, 2015

Alright, changed enum field name to val instead of n and implemented the default type "slimming" + updated tests.

@quinnj
Copy link
Member Author

quinnj commented Feb 12, 2015

The CI tests all passed except a timeout on OSX (which I've tested locally). Any other comments/feedback? Responses to the questions I mentioned above?

@StefanKarpinski
Copy link
Sponsor Member

Should we be concerned about C storage compatibility as a default? Is there much call for passing arrays of enums? I can't recall many C APIs that do that, so I'm thinking no.

@jakebolewski
Copy link
Member

I believe the enum type is up to the compiler, the C spec only requires that the enum type be big enough to hold all elements.

@vtjnash
Copy link
Sponsor Member

vtjnash commented Feb 12, 2015

To be sure, C will let you set flags to choose a different size, but you're generally safe assuming int.

Possibly not for arrays, but it's common enough for structs.

@quinnj
Copy link
Member Author

quinnj commented Feb 19, 2015

Any other comments? Good to merge?

@quinnj
Copy link
Member Author

quinnj commented Feb 21, 2015

Rebased and added docs/NEWS entry. I'll merge this weekend unless there are other objections.

@quinnj
Copy link
Member Author

quinnj commented Feb 21, 2015

Hmmmm, linux travis failed trying to download Rmath and OSX travis failed with the error below, which I believe is unrelated:

ERROR: LoadError: MethodError: `<<` has no method matching <<(::Function, ::Function)
Closest candidates are:
  <<(::Any, !Matched::Int32)
  <<(::Any, !Matched::Integer)
 in include at /private/tmp/julia/lib/julia/sys.dylib
 in include_from_node1 at loading.jl:128
 in process_options at /private/tmp/julia/lib/julia/sys.dylib
 in _start at /private/tmp/julia/lib/julia/sys.dylib
while loading /private/tmp/julia/share/julia/test/runtests.jl, in expression starting on line 1

@tkelman
Copy link
Contributor

tkelman commented Feb 21, 2015

Failed in the same way on both win32 and win64 on appveyor, so seems to be related.

quinnj added a commit that referenced this pull request Feb 21, 2015
RFC: Add Enums module for enum support through the `@enum` macro
@quinnj quinnj merged commit 4319353 into master Feb 21, 2015
@jakebolewski
Copy link
Member

Overloading names here to return the enum values is a bit weird. I would expect that names(::Type) would return the field names that I can lookup with getfield. Maybe names should be renamed to fieldnames as it is more descriptive. It would be nice to export the abstract Enum type as well.

The functionality is great, but as you pointed out it is very easy to overwrite functions in Base due to naming conflicts. Perhaps a note in the documentation could be added about this?

@jakebolewski
Copy link
Member

You have already claimed the Enum name, no harm in exporting it.

Makes it easier to write

julia> @enum Foo bar baz

julia> isa(baz, Enum)
ERROR: UndefVarError: Enum not defined

julia> isa(baz, Base.Enums.Enum)
true

On Sat, Feb 21, 2015 at 3:08 PM, Jacob Quinn [email protected]
wrote:

Yeah, I thought about exporting Enum but also couldn't really think of a
use for it, so I left it unexported for now. Changing names to fieldnames
seems like a good idea. I've always been a little thrown off by how
DataFrames overloads names for column names as well.

I'll add a note about overloading Base names.


Reply to this email directly or view it on GitHub
#10168 (comment).

@nalimilan
Copy link
Member

+1 for renaming names to fieldnames or fields. In addition to DataFrames, we need names in the context of NamedArrays (arrays with row/column/dimension names), and I can't find different terminology that would be natural.

@Ken-B
Copy link
Contributor

Ken-B commented Feb 22, 2015

Sorry if it's an obvious question, but why does enum counting start at zero? I would have guessed it starts at one like with Arrays.

To me the documentation also suggests it starts at one.

@jakebolewski
Copy link
Member

C compatibility.

On Sun, Feb 22, 2015 at 4:43 PM, Ken-B [email protected] wrote:

Sorry if it's an obvious question, but why does enum counting start at
zero? I would have guessed it starts at one like with Arrays.

To me the documentation also suggests it starts at one.


Reply to this email directly or view it on GitHub
#10168 (comment).

@jakebolewski
Copy link
Member

Is this the behavior we want?

julia> @enum(Foo, bar=1, baz=1)

julia> bar
barbaz::Foo

julia> is(bar, baz)
true

@quinnj
Copy link
Member Author

quinnj commented Feb 23, 2015

Hmmm....good question. It's the same behavior as C, but C is also much more "simple" in it's implementation of just mapping to integers and that being more transparent. It seems like it could possibly be useful to allow, but also a tad confusing when you do something like convert(Foo,1) with multiple values of 1 defined.

@StefanKarpinski
Copy link
Sponsor Member

I suspect that we should initially err on the side of conservatism and consider this to be an error. Just because C allows it doesn't mean it's a great idea; let's wait and see if anyone actually needs this, and if so, we'll have a little more sense of what it should do. Concatenating the symbol names, for example, seems a bit weird.

@jakebolewski
Copy link
Member

I needed it today (mapping a C enum), but I don't know if we want to retain that behavior. Agree about the concatenating symbols part.

@StefanKarpinski
Copy link
Sponsor Member

Well, what behavior would make sense for your application?

@vtjnash
Copy link
Sponsor Member

vtjnash commented Feb 23, 2015

I've seen it used in Gtk to make aliases (eg they might make one with DOUBLE, TWO, and 2 as a modifier for CLICK. Which one gets used then can be dependent on user choice: limitations of a language binding variable naming rules vs. length)

@jakebolewski
Copy link
Member

For the C interop case I think allowing this behavior and having bar == baz and bar !== baz would make sense, but I don't know if the current design allows this.

@StefanKarpinski
Copy link
Sponsor Member

With immutables, I don't think that's possible (or sensible) – how would you distinguish bar form baz? I think we just need to pick one of the names as the canonical name for the value – either the first or the last seem like the obvious options. In your case, which one was better?

@quinnj
Copy link
Member Author

quinnj commented Feb 23, 2015

Yeah, definitely feels uncomfortable messing with is. We could make the type declaration with type instead of immutable, but that seems like a waste. Even if we picked one as the canonical, do we still make a const baz = bar statement as well? Seems like it would be weird to declare an enum member and have it mysteriously vanish until I dig through all the declarations and realize two happened to have the same value. I'd rather just error when we see two with the same value. Or add the const assignment.

@StefanKarpinski
Copy link
Sponsor Member

It's inherently impossible to make bar !== baz without making enums heavier – each enum value would need to store not only its value but also its symbol somehow. I really don't think we want to do that.

@jakebolewski
Copy link
Member

I'm not really advocating that as a solution, only I feel that would be the desired behavior.

@StefanKarpinski
Copy link
Sponsor Member

Ok, so since that's impossible, I think we just need to pick either the first or last symbol as the canonical one. Which makes more sense in your use case?

@StefanKarpinski
Copy link
Sponsor Member

My experience with C enums makes me think that the first one is probably the one we'd want to go with.

@jakebolewski
Copy link
Member

Usually this crops up to retain backwards compatibility in the API, so picking the first one would make sense.

@StefanKarpinski
Copy link
Sponsor Member

Ok, so there we have it: allow multiple bindings for enum values, but the first one is the canonical one.

@quinnj
Copy link
Member Author

quinnj commented Feb 23, 2015

What about names? Currently, it shows everything that was declared, but with proposed change, bar::Foo would show up twice since it was declared twice, but both are now shown as the first canonical. Should we call unique on the enum member value array for names?

@StefanKarpinski
Copy link
Sponsor Member

Hmm. That's a little tough. It seems like all the alternative names should show up, even if more than one refer to the same value. Either that or only the canonical names should show up.

@jakebolewski
Copy link
Member

I'm coming around to the fact that this should just be an error. I've never come across a situation where this is absolutely necessary. We are just complicating things for convenience.

@StefanKarpinski
Copy link
Sponsor Member

Even better!

@ivarne
Copy link
Sponsor Member

ivarne commented Feb 23, 2015

The most important difference is that Julia provides a two way mapping, but C is restricted to a one way mapping. Assigning multiple names for the same value is convenient in some cases, but isn't the same need served by having the user manually type

@enum(Foo, bar=1)
const Baz == bar

@quinnj
Copy link
Member Author

quinnj commented Feb 23, 2015

I think I'm ok going either way. If we allow duplicates in declaration, then I think names should just return the list of canonical names.

@zenna
Copy link

zenna commented Apr 3, 2015

Is there an established way to automatically export all the values of an enum in a module?

@mauro3 mauro3 mentioned this pull request Jul 30, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants