
feat: adds enumerations #695

Open
wants to merge 1 commit into
base: wdl-1.3

claymcleod
Collaborator

@claymcleod claymcleod commented Jan 15, 2025

This commit adds enumerations to WDL.

Based on my discussions and on browsing previous threads surrounding the idea, it sounds like adding enumerations to WDL is nearly universally agreed to improve the user experience of limiting assignment to a set of valid values within a particular context; it's simply the details that need to be hashed out.

In this PR, enumerations in WDL are valued, meaning that each variant has an assigned value. These values can be either explicitly or implicitly typed. When it comes to the details, I try to take a common sense approach and a "middle of the road" position on the spectrum of conciseness versus flexibility. In particular:

  • The inner types can be elided if they are trivially computed (leans towards conciseness).
  • When you'd like to be explicit with the inner type, you can be (leans towards flexibility).
  • Enumeration variants with no assigned value are assumed to be typed as a String with an assigned value equivalent to the variant name (leans towards common sense and conciseness).
  • Enumerations that have the same type can be compared by their inner value. Enumerations that are not of the same type cannot be compared, as that is equivalent to a type mismatch (leans towards a common sense approach, as comparing two enums of different types implies different contexts, even if the inner values of the variants are the same).
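
To make the rules above concrete, here is a minimal Python model of them. It is only a sketch of the semantics described in this list, not WDL and not part of the proposed spec text; the names `Variant`, `make_enum`, and `variants_equal` are hypothetical, and `None` stands in for "no assigned value".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """One enum variant: the owning enum, the variant name, and its inner value."""
    enum_name: str
    name: str
    value: object


def make_enum(enum_name, variants):
    """Build a name -> Variant mapping.

    `variants` maps each variant name to its assigned value, or to None when
    no value is assigned. In that case the value defaults to the variant name
    as a string (assumption: this mirrors the default-String rule above).
    """
    return {
        name: Variant(enum_name, name, name if value is None else value)
        for name, value in variants.items()
    }


def variants_equal(left, right):
    """Compare two variants by inner value; different enums are a type mismatch."""
    if left.enum_name != right.enum_name:
        raise TypeError(
            f"cannot compare {left.enum_name} with {right.enum_name}: type mismatch"
        )
    return left.value == right.value


# Hypothetical enums used only for illustration.
color = make_enum("Color", {"Red": None, "Green": None})
flag = make_enum("VerbosityFlag", {"Quiet": "", "Info": "-v"})
```

Here `color["Red"].value` is the string `"Red"`, while comparing `color["Red"]` with `flag["Info"]` raises `TypeError` even if the inner values happened to match.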

Other detailed notes

On the removal of the .name accessor

In a previous iteration of the enum concept, the stringified variant name could be accessed with a .name accessor. With the adoption of flexibly valued enums, I'd argue that an enforced separation of concerns between the variant name and the value assigned to the variant is clearer:

  • The variant name itself is only relevant within the context of the enum within the type system. Strictly speaking, it is not intended to be serialized and has no meaning outside the type system in and of itself.
  • When a variant is evaluated (say, via serialization), the inner value assigned to that variant is substituted into the context. This allows for flexibility across a variety of situations, depending on how you design the enum. For example, you could use an enum like this to evaluate a verbosity flag:
enum VerbosityFlag {
  Quiet = "",
  Info = "-v",
  Debug = "-vv",
  Trace = "-vvv"
}
  • Furthermore, the behavior of the old model is still represented by the default case in the current model: simply define the enum with no type and assigned value, and you will be able to access the stringified version of the variant via the assigned value.
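
As a rough illustration of the substitution behavior described above, the following Python sketch models the VerbosityFlag enum as a plain mapping and substitutes the inner value into a command line. The `render_command` helper and the tool name `mytool` are hypothetical and exist only for this example; the actual WDL evaluation rules are whatever the spec text ends up defining.

```python
# Model of the VerbosityFlag enum: variant name -> assigned inner value.
VERBOSITY_FLAG = {
    "Quiet": "",
    "Info": "-v",
    "Debug": "-vv",
    "Trace": "-vvv",
}


def render_command(tool, verbosity):
    """Substitute the variant's inner value (not its name) into the command."""
    flag = VERBOSITY_FLAG[verbosity]
    # An empty inner value contributes nothing to the command line.
    parts = [tool] + ([flag] if flag else [])
    return " ".join(parts)
```

With this model, `render_command("mytool", "Debug")` yields `mytool -vv`, and the variant name `Debug` never appears in the output.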

Checklist

  • Pull request details were added to CHANGELOG.md
  • Valid example WDLs were added or updated in SPEC.md (see the guide on writing markdown tests)

@claymcleod claymcleod force-pushed the enums branch 3 times, most recently from 94cb3e3 to 53d28e9 Compare January 15, 2025 06:07
@claymcleod claymcleod self-assigned this Jan 15, 2025
@claymcleod claymcleod force-pushed the enums branch 4 times, most recently from 6bc8f2b to 92e82ac Compare January 16, 2025 05:40
@claymcleod claymcleod marked this pull request as ready for review January 16, 2025 05:44
@claymcleod
Collaborator Author

I think this is ready for a preliminary review—I haven't added any tests yet, but that's because I want to make sure the idea, as I have codified it here, has support from a good number of people before investing the time to do that.

@claymcleod claymcleod changed the title feat: add enums feat: adds enumerations Jan 16, 2025
This commit adds enumerations to WDL. Enumerations in WDL are valued,
meaning that they have an assigned value for each variant therein.
These values can either be explicitly or implicitly typed. Ultimately,
enumerations are aimed at improved UX regarding a limited set of valid
values within a particular context.

Relevant issues: openwdl#139, openwdl#658

Closes openwdl#139.

Co-authored-by: jdidion <[email protected]>
Contributor

@markjschreiber markjschreiber left a comment

I like the proposal. I think you should add that Enum values cannot be mutated or reassigned (they are immutable) and that Enums are closed once declared (you can't extend them to add new values).

Contributor

@markjschreiber markjschreiber left a comment

Open question: can the values of an enum be scattered? In most programming languages, enums are iterable.

Comment on lines +3438 to +3443
# ERROR: because the enum is implicitly typed, the type cannot be unambiguously
# resolved, which results in an error.
enum FavoriteNumber {
  ThreePointOh = 3,
  FourPointOh = 4.0
}
Contributor

Given other places where we deduce a common type in WDL (e.g. [1, 2.0] -> Array[Float]), I would expect this to unambiguously resolve to Float as all of the assigned types coerce to it, rather than be considered an error.

I would expect it to error if the first assignment was to "3", for example.
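
The expectation in this comment can be sketched as a small Python model of common-type deduction. This is a simplified assumption, not the spec's algorithm: it covers only Boolean, Int, Float, and String literals and treats Int to Float as the only implicit coercion. The function names are hypothetical.

```python
def literal_type(value):
    """Map a Python literal to the WDL primitive type name it stands for here."""
    # bool must be checked before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "Boolean"
    if isinstance(value, int):
        return "Int"
    if isinstance(value, float):
        return "Float"
    if isinstance(value, str):
        return "String"
    raise TypeError(f"unsupported literal: {value!r}")


def common_type(values):
    """Deduce one type for all assigned values, or fail if it is ambiguous."""
    types = {literal_type(v) for v in values}
    if len(types) == 1:
        return types.pop()
    # Assumption: Int coerces to Float, as in [1, 2.0] -> Array[Float].
    if types == {"Int", "Float"}:
        return "Float"
    raise TypeError(f"cannot resolve a common type for {sorted(types)}")
```

Under this model, the values `3` and `4.0` deduce to `Float`, while `"3"` and `4.0` raise an error.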

Contributor

@peterhuene peterhuene Jan 16, 2025

Additionally, given that the type parameter to the enum declaration is required to be primitive and that there aren't that many primitive type coercions, I wonder if we can do without supporting an explicit syntax, at least initially.

The only use I can foresee with explicitly specifying the type would be if one wanted to have an enum of File or Directory, which makes little sense to me (an enum of String would have its values implicitly coerced to File and Directory where needed anyway).

Contributor

@peterhuene peterhuene Jan 16, 2025

Although, reading it again, there doesn't seem to be an explicit requirement that a type in the explicit syntax be primitive. Can we do something like:

enum Foo[Array[String]] {
   Foo = ["foo", "bar", "baz"],
   Bar = ["qux", "quux", "quuux"],
}

?

If this is desired, then we probably should support the explicit syntax. If not, I personally think we can eliminate it.

Additionally, we should call out that the RHS expression in the assignment must only contain literal expressions and that string interpolation isn't supported in this context as these are global declarations.

Collaborator Author

Given other places where we deduce a common type in WDL (e.g. [1, 2.0] -> Array[Float]), I would expect this to unambiguously resolve to Float as all of the assigned types coerce to it, rather than be considered an error.

I would expect it to error if the first assignment was to "3", for example.

Ah yes, you are correct. I will update this example.

Collaborator Author

@claymcleod claymcleod Jan 16, 2025

Although, reading it again, there doesn't seem to be an explicit requirement that a type in the explicit syntax be primitive. Can we do something like:

enum Foo[Array[String]] {
   Foo = ["foo", "bar", "baz"],
   Bar = ["qux", "quux", "quuux"],
}

?

If this is desired, then we probably should support the explicit syntax. If not, I personally think we can eliminate it.

Additionally, we should call out that the RHS expression in the assignment must only contain literal expressions and that string interpolation isn't supported in this context as these are global declarations.

Exactly. I can see a map being useful: for example, perhaps you have an enum that can help you scatter over all contigs versus just a defined list of canonical contigs.

That being said, perhaps it's not necessary to explicitly define non-primitive types either. For instance, the above is unequivocally resolvable to Array[String]. Would it be fair to just say the type must be unambiguously resolvable?

I'll add the clarification on string interpolation.

Contributor

@peterhuene peterhuene Jan 16, 2025

It should be unambiguous to deduce a type or otherwise error; the only catch is File and Directory since they don't have a literal type (but otherwise coerce from String just fine).

@claymcleod
Collaborator Author

claymcleod commented Jan 16, 2025

Open question, can the values of an enum be scattered? In most programming languages Enums are iterable.

Great point. In keeping with the spirit of what was done for similar methods, like getting the keys of an object, I will add a variants standard library function.

@markjschreiber
Contributor

Could the function just be called enumerate?

@claymcleod
Collaborator Author

claymcleod commented Jan 16, 2025

Could the function just be called enumerate?

Personally, I think it makes more sense to call it variants. This is consistent with the existing convention, where the function name matches the name of the concept being enumerated within Objects. For example, keys names what is being enumerated rather than the enumeration action itself (enumerate, enumerate_keys, enumerate_values). Since each member of an enum is called a "variant", the name variants would align with the pre-existing convention. Furthermore, you might want to enumerate more than just the variants, so it has the added benefit of being co-introduced with an overload to the values function for an enum.
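
For illustration only, the naming convention argued for here can be modeled in Python. The return shapes are assumptions, since the thread does not specify them: `variants` is assumed to return the variant names in declaration order, and the `values` overload is assumed to return the inner values in the same order.

```python
# Model of an enum as an ordered mapping: variant name -> inner value.
VERBOSITY_FLAG = {
    "Quiet": "",
    "Info": "-v",
    "Debug": "-vv",
    "Trace": "-vvv",
}


def variants(enum):
    """Assumed behavior: list the variant names, in declaration order."""
    return list(enum.keys())


def values(enum):
    """Assumed behavior of the enum overload: list the inner values, in order."""
    return list(enum.values())


# Iterating over the variants is the Python analogue of scattering over them.
flags_by_variant = [(name, VERBOSITY_FLAG[name]) for name in variants(VERBOSITY_FLAG)]
```

The pairing mirrors keys and values on objects: each function is named after what it returns rather than after the act of enumerating.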

3 participants