Tagged unions #1132
Prerequisite for #1134
degory added a commit to degory/ghul-vsce that referenced this issue on Mar 22, 2024
Enhancements: - Highlight the `union` keyword (see degory/ghul#1132) - Highlight !!! (error) and ??? (not inferred) types as invalid / red in hover
degory added a commit that referenced this issue on Mar 26, 2024
Tagged Union Types
Introduction
This issue proposes the addition of tagged union types to the ghūl programming language. Tagged unions (also known as discriminated unions or sum types) allow defining a type as a union of distinct cases, each of which can carry its own data.
Syntax
The syntax for defining tagged unions in ghūl is:
`union` keyword followed by the union name, then an optional list of generic type parameters, and finally an `is`/`si` delimited body
`name: type` pairs.
Semantics
`NONE` in `Option[T]`).
`Option[T]`, `Result[T, E]`).
Immutability
Construction
Unions are constructed using the `new` keyword followed by the qualified variant name (including any necessary actual generic type arguments) and any necessary field values. Note that type arguments are applied to the union type, not the variant:
Although unit variants contain no fields, construction still requires parentheses, for consistency with use of `new` with other parameterless constructors in ghūl:
For convenience, constructor functions can be defined for commonly used unions, to avoid having to specify generic argument types:
Representation
The compiler can choose the most efficient representation for each union type based on its characteristics:
`struct`) can be used, with a discriminator field to identify the active variant and fields for the variant data. In particular, unions with only two variants, one of which is a unit variant, are an obvious candidate for struct representation.
`class`) hierarchy can be used, with a base class for the union and derived classes for each variant.
The initial implementation may be limited to representing a union and its variants as a base class and derived classes, in which case struct representation support will be delivered under a separate issue.
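As a loose analogy for the two representations discussed above, here is a minimal Python sketch. It is not ghūl and not what the compiler emits; the names `OptionStruct`, `OptionBase`, `Some`, `Nothing`, and the tag values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


# Struct-style: one flat value holding a discriminator plus storage for variant data.
@dataclass(frozen=True)
class OptionStruct(Generic[T]):
    tag: int                    # hypothetical tag values: 0 = NONE, 1 = SOME
    value: Optional[T] = None   # only meaningful when tag == 1


# Class-hierarchy style: a base class for the union, one derived class per variant.
class OptionBase(Generic[T]):
    pass


@dataclass(frozen=True)
class Some(OptionBase[T]):
    value: T


@dataclass(frozen=True)
class Nothing(OptionBase[T]):
    pass


def unwrap_struct(o: OptionStruct[T], default: T) -> T:
    # The active variant is found by inspecting the discriminator field.
    return o.value if o.tag == 1 else default


def unwrap_class(o: OptionBase[T], default: T) -> T:
    # The active variant is found by a runtime type test on the derived class.
    return o.value if isinstance(o, Some) else default
```

The struct form avoids a separate object per variant, which is why a two-variant union with one unit variant suits it; the class form needs no unused fields when variants carry different data.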
Requirements Checklist
`union` keyword, name, optional generic type parameters, and `is`/`si` delimited body
`new` keyword, qualified variant name, generic type arguments applied to union type, and field values
Optional:
Implementation Notes
The parser, the type expression syntax trees, and the generic type system all need some changes in order to handle variants of a generic union type.
Type Expression Syntax Trees and Parser
Need to update syntax trees and the parser to represent and handle type member expressions where the part to the left of the dot is a generic type application, such as `Option[T].SOME`.
Type System and Generic Specialization
Specialization of variants of generic unions needs special handling.
In principle, we could specialize variants in the usual way, treating them as members of their owning union and replacing all references to the parent union's formal type arguments within the variant with the actual type arguments supplied in the generic application type expression.
However, the .NET representation of the variants will be as classes in their own right. Hence, it's more appropriate to make them generic classes with the same formal type arguments as the owning union, as this will better align with the actual .NET representation.
For example:
will be represented as:
When declaring symbols, we need to ensure:
`System.Object`
Because variants contain no code that could reference any symbols, there's actually no need for them to be inside their parent's scope. It may actually be easier, however, not to attempt to break them out of it - it likely won't matter in practice either way. Plus, the one symbol in the parent we do potentially want to reference from the variants (the tag, which we might need to reference from compiler-generated code) will be inherited.
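The idea of making each variant a generic class with the same formal type arguments as its owning union can be pictured with a rough Python analogy. This is only a sketch under stated assumptions: it is neither ghūl nor the actual .NET output, and attaching the variants as attributes of `Option` is a stand-in for member lookup in the union's scope.

```python
from typing import Generic, TypeVar

T = TypeVar("T")


class Option(Generic[T]):
    """Stands in for the union; variants inherit the tag member from it."""
    tag = "?"


class SOME(Option[T]):
    """A variant as a class in its own right, generic over the union's T."""
    tag = "SOME"

    def __init__(self, value: T) -> None:
        self.value = value


class NONE(Option[T]):
    """A unit variant: no fields, but still generic over the union's T."""
    tag = "NONE"


# Make the variants reachable as members of the union (illustrative only).
Option.SOME = SOME
Option.NONE = NONE

# Type arguments are applied to the variant class itself.
x = Option.SOME[int](42)
```

Here `x` is an instance of `SOME` and also of `Option`, and it sees the inherited `tag` member, which mirrors the point that the tag stays accessible from the variants.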
For a type expression like `Option[int].SOME`, we can't construct a GENERIC type or a GENERIC symbol that represents it, nor do we want to since we actually want to apply the type arguments to the variant, not the union. Hence, we should transform `Option[int].SOME` into `Option.SOME[int]`.
We could do this blindly in the parser, before we even know what kind of symbol `Option` will be declared as, since the only kinds of generic types that contain (accessible) type members will be unions. This wouldn't be very future-proof, however, and it might result in confusing error messages.
It would be preferable to defer rewriting `Option[int].SOME` into `Option.SOME[int]` until the resolve-types phase, at which point we'll be able to check that `Option` really is a union, that it's generic and takes one type parameter, and that `SOME` is a member of it. We can then pretend the user had actually written a qualified identifier generic type application `Option.SOME[int]` and proceed to specialize that. In the case of any errors resolving the type, though, we will still have the information needed to report it as `Option[int].SOME`.
We should also notify the symbol use listener that both `Option` and `SOME` are referenced here, to ensure hover, symbol navigation, and rename all work (in particular, rename will fail to rename `Option` otherwise as it won't be referenced in any phase after resolve-types).
For a type expression `Option[int].SOME`, we should not attempt to specialize `Option[int]` (making a copy of `Option[T]` with `int` substituted for `T`) and search for member `SOME` within it. Instead, we should search the unspecialized non-generic `Option` class for its member `SOME[T]` and then specialize that with actual type argument `int`.
Within the body of a variant, any references in type expressions to other variants of the containing union should not be legal. References to the containing union are allowed - this is required to support recursive union types such as lists and trees. If a union is generic, then references to its type within the body of one of its variants need appropriate formal type arguments, which can (and probably should) be the formal type arguments of the union itself.
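The deferred rewrite of `Option[int].SOME` into `Option.SOME[int]` described above could look roughly like the following Python sketch. Everything here is hypothetical: the node classes (`Name`, `GenericApp`, `Member`), the `UnionSymbol` record, and the `uses` list standing in for the symbol use listener are illustrative assumptions, not the ghūl compiler's actual data structures.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass(frozen=True)
class Name:                 # a plain identifier, e.g. Option
    name: str


@dataclass(frozen=True)
class GenericApp:           # base[args...], e.g. Option[int]
    base: "TypeExpr"
    args: Tuple["TypeExpr", ...]


@dataclass(frozen=True)
class Member:               # left.name, e.g. Option[int].SOME
    left: "TypeExpr"
    name: str


TypeExpr = Union[Name, GenericApp, Member]


@dataclass(frozen=True)
class UnionSymbol:          # hypothetical record of a declared union
    name: str
    arity: int
    variants: Tuple[str, ...]


def show(e: TypeExpr) -> str:
    if isinstance(e, Name):
        return e.name
    if isinstance(e, GenericApp):
        return f"{show(e.base)}[{', '.join(show(a) for a in e.args)}]"
    return f"{show(e.left)}.{e.name}"


def rewrite_variant_reference(expr: TypeExpr,
                              unions: Dict[str, UnionSymbol],
                              uses: List[str]) -> TypeExpr:
    """Rewrite Union[args].VARIANT to Union.VARIANT[args] once symbols are known."""
    if not (isinstance(expr, Member)
            and isinstance(expr.left, GenericApp)
            and isinstance(expr.left.base, Name)):
        return expr  # not the shape we care about; leave untouched
    written = show(expr)  # errors are reported using the user's own spelling
    union = unions.get(expr.left.base.name)
    if union is None:
        raise TypeError(f"{written}: {expr.left.base.name} is not a union")
    if len(expr.left.args) != union.arity:
        raise TypeError(f"{written}: expected {union.arity} type argument(s)")
    if expr.name not in union.variants:
        raise TypeError(f"{written}: {expr.name} is not found in {show(expr.left)}")
    uses.extend([union.name, expr.name])  # stand-in for the symbol use listener
    return GenericApp(Member(Name(union.name), expr.name), expr.left.args)
```

Because the original expression is kept until all checks pass, a failure can still be reported against `Option[int].SOME` rather than the rewritten form, and both `Option` and `SOME` are recorded as referenced.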
We need to synthesize an `init(...)` method for each variant that takes an argument for each of the variant's fields and then initializes the fields accordingly.
This can be done similarly to synthesized read and assign accessor methods - we can just construct appropriate syntax trees and inject them into the variant's syntax tree before the define symbols pass (e.g. in add accessors for properties pass).
However, unit variant instances need to be singletons and this complicates things. If the code that handles `new` expressions needs to be updated to handle this case, then perhaps instead we can add two new IR values - one for instantiating variants with arguments and the other for getting the singleton value for a given unit variant.
Implementation Checklist
`Option[T].SOME`)
`init()` method
`Option[T].SOME` to `Option.SOME[T]`
`SOME` is not found in `Option[T]`, not that `SOME[T]` isn't found in `Option`)
`Option[int].SOME`, to support hover, symbol navigation, and rename functionality
`init()` method, by a new IR value, or by an innate call
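As a rough illustration of the `init()` synthesis and the unit-variant singleton requirement from the implementation notes, here is a small Python sketch. The `make_variant` helper is hypothetical, and the approach (generating an initializer from a field list and caching the single instance of a field-less variant) is only an analogy, not how the ghūl compiler does or will do it.

```python
from typing import Tuple


def make_variant(name: str, fields: Tuple[str, ...]) -> type:
    """Build a variant class with a synthesized initializer for its fields."""

    def init(self, *values):
        # One argument per declared field, assigned in declaration order.
        if len(values) != len(fields):
            raise TypeError(f"{name} expects {len(fields)} value(s)")
        for field, value in zip(fields, values):
            setattr(self, field, value)

    namespace = {"__init__": init}

    if not fields:
        # Unit variant: every construction yields the same cached instance.
        cache = []

        def new(cls):
            if not cache:
                cache.append(object.__new__(cls))
            return cache[0]

        namespace["__new__"] = staticmethod(new)

    return type(name, (), namespace)


SOME = make_variant("SOME", ("value",))
NONE = make_variant("NONE", ())
```

With this sketch, `NONE()` always returns the same object while each `SOME(...)` call produces a fresh one, which is the distinction the two proposed IR values would capture.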