Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bounded polymorphism and typeclasses - ideas #502

Open
atacratic opened this issue May 17, 2019 · 18 comments
Open

Bounded polymorphism and typeclasses - ideas #502

atacratic opened this issue May 17, 2019 · 18 comments
Labels

Comments

@atacratic
Copy link
Contributor

atacratic commented May 17, 2019

I wanted to throw in my tuppence on what Unison could do about typeclasses (see comment below) so I'm opening this issue as a place to put such things.

I'm aware that this is one of those topics that can generate endless discussion, so I propose that this issue just be for concrete proposals and observations, rather than opinions and preferences. Even so maybe Paul will just want to close it...

Quoting this post from the blog for starters:

What about bounded polymorphism and typeclasses? This deserves a longer post, but here are a few answers:

  • For just name overloading, Unison supports that via type-directed name resolution.
    • For effectful programming, you use algebraic effects, no need to pass dictionaries around.
    • Functions like <, ==, etc, work for any type (actually we’ll probably restrict their type a bit just to keep parametricity intact). But these are builtins, you can’t implement your own (could we somehow make that possible though?).
    • The only thing that Unison has which is as general as typeclasses is explicit dictionary passing. This can sometimes be a bit tedious.
    • Typeclasses are super convenient but I don’t think that as is they’re a good fit for Unison, and they have problems even in the Haskell world (relevant post). Also, not for nothing: they’re a pain to implement.
    • Summary: We’re thinking about it. If you’ve got ideas, let us know!

There was an interesting discussion around typeclasses after the meetup (some folks headed over to a local pub for drinks and food). All the approaches to bounded polymorphism (typeclasses a la Haskell, modules a la ML / Scala, dictionary passing + implicit parameters, OO style subtyping ..) have strengths and weaknesses. And that’s the problem: there’s no approach that’s a total slam dunk, which makes you think “YES, PROBLEM 100% SOLVED.” Every approach has its issues. I’m hoping for some bolt of insight about this…

[edit] As of 29 Aug '19 this issue contains two proposals:

  1. A proposal based around typeclass constraints being encoded as regular function parameters - starting at the first post below.
  2. A proposal to use abilities as a typeclass mechanism, so encoding a constraint as an ability requirement in the function's signature.
@atacratic
Copy link
Contributor Author

atacratic commented May 17, 2019

This is a proposal for adding typeclasses to Unison, highlighting differences to the Haskell system. The basic concepts of classes, instances and (appropriately sugared) dictionary-passing are all there, with some variations.

What do people think? What do people see as the main drawbacks?

The basic gist is

  • all instances are named - no canonical instances
  • do it like Idris does: dictionaries are fundamentally just function arguments, that you can control explicitly, but which (due to some funky syntax sugar) you often don't need to.

All instances are named instances

We want Unison code to live in an 'open universe', where (as Paul said here) "people can after the fact discover common structure and layer it onto existing data types".

So it's a problem if we let one author define a canonical instance for a type - any instance someone else defines is now second-class.

So let's not have any canonical instances. Let's say all instances are named, and the coder needs to choose which they want to use.

So to define an instance you'd say something like

[showFoo] Show Foo where
  show (Foo x) = "Foo (" ++ show x ++ ")"

and then your instance is named showFoo. This just defines a regular term of type Show Foo.

One upside of letting go of canonical instances, is we're no longer steered towards making unnatural choices: we don't have to choose whether the default Monoid Int is doing addition or multiplication, for example.

And particularly for Unison, it's good not to have to deal with the environment data of 'which typeclass instance goes with which type': we don't have modules/files to put this information in, and we certainly don't want to have that information existing separately for every term.

The user can see and control the dictionary-passing, when they need to

The choice and plumbing of dictionaries needs to be under the coder's control. So they can specify them explicitly, and sometimes this is mandatory: Unison will never silently assume that the coder means to use showFoo rather than some other instance.

Explicitly visible dictionary arguments

Suppose we have a typeclass method like this...

show : Show a => a -> String

Now consider these definitions, and how they're explicitly passing and pattern-matching on dictionary arguments, using a @{} syntax.

hi1 : Foo -> String
hi1 x = "Hi " ++ (show @{showFoo} x)

hi2 : Show a => a -> String
hi2 @{s} x = "Hi " ++ (show @{s} x)

The => arrow and the @{} are just sugar - ultimately these are just function arguments like any other, except these argument positions (the first arguments of show and hi2) have been flagged as getting this special syntactic treatment.

Elided dictionary arguments

To make a given instance the default in some scope, you write a use statement.

hi3 : Foo -> String
hi3 x = use showFoo : Show Foo     -- pretty-printed code looks like this,
        "Hi " ++ (show x)          -- but the type annotation is actually optional

In the following case, a use statement is unnecessary.

hi4 : Show a => a -> Bool          -- the Show a argument is not optional on the signature,
hi4 x = "Hi " ++ (show x)          -- but it is optional in the pattern match, and if omitted
                                   -- is treated as if we'd done a `use` on it: it's treated as
                                   -- being the canonical instance, for the scope of this function.

There's a nice parallel between use statements for resolving fully qualified names, and a use statement that's controlling the resolution of a dictionary parameter. In both cases,

  • the underlying Unison ABT we're creating is always full resolved and specified
  • it is being presented to the user in an abbreviated way
  • resolving the abbreviation is local - you don't have to look outside the function to see what's going on.

The pretty-printer will render the code clearly even in cases where there are several instances in play for the same type:

hi5 : Bool -> Foo -> String
hi5 full x = if full then
               use showFooFull : Show Foo
               "Hi " ++ (show x)
             else
               use showFoo : Show Foo
               "Hi " ++ (show x)

Classes as records

A class declaration is just a record declaration but where some of the accessors use => instead of ->.

So whereas

type Show a = { show : a -> String }

produces (amongst other things) an accessor function

show : Show a -> a -> String

if we instead make it a class

class Show a = { show : a -> String }

then the generated accessor is:

show : Show a => a -> String

Coherence

There isn't any 😄 Or rather, if your function consumes/produces data that includes a choice of instance amongst its invariants, you'd better say so in your API docs, like this:

-- For both the input set and output map, `use OrdFoo : Ord Foo`
bar : Set Foo -> (Foo -> v) -> (Foo -> v) -> Map Foo v

(Hopefully comments like that can be attached to the argument/result in question, and in a standard format.)

Maybe it's because I'm not a true haskeller, but I don't see this as a problem. Of all the various invariants and assumptions we code into the data we work with, which we don't manage to capture in types, questions of which instance we're using are only a small minority.

Maybe there's some future in which a dependently-typed Unison lets us reach greater perfection by encoding instance choice in types, but til then I don't think we need to sweat it.

Subclasses?

Ideally you'd only need to specify Ord k, not also Eq k. Haskell typeclasses can be extended by subclasses to capture this. Would we want the same?

Maybe we should look into more general record subtyping / row polymorphism / extensible records stuff as an approach to this? So, one record type extending another when it has a superset of the fields. To me, record subtyping seems to be a general setting into which interface extension fits quite nicely, so I like this idea, but I am a bit scared off by what an ordeal it seems to be for haskell.

This is a big question, but maybe for starters we should see how things shake out without doing anything special here?

Background reading

Terminology

In the above, I've used 'class' and 'instance' to make it more familiar. But I'd actually propose using the Idris terminology of 'interface' and 'implementation' instead. It emphasises that it's fine to have multiple interfaces to a given type - rather than a type belonging irrevocably to a given class in just one way.

Also we wouldn't actually talk about 'dictionaries', but probably 'interface arguments' or 'interface parameters'.

Less important notes

  • Maybe we wouldn't bother with the
[showFoo] Show Foo where
  show (Foo x) = "Foo (" ++ show x ++ ")"

syntax, but instead make it even clearer what's going on: we're just defining a term.

showFoo : Show Foo
showFoo = Foo {
  show (Foo x) = "Foo (" ++ show x ++ ")"
}
  • The @{arg} syntax comes from Idris. Idris also permits any argument to be named, like (age : Int), see here for more. I'd be interested in having that in Unison (even though the main motivator is dependent types which we don't have.)

  • With the use showFoo : Show Foo:

    • maybe we wouldn't choose the keyword use since we're already using that for FQN imports
    • I was thinking that the type annotation is optional, but checked if present. And that the pretty-printer always emits the annotation.
  • I'm not a huge fan of the => arrow, flagging a dictionary argument on its left. (Why is it indicating something about the thing to its left when it's pointing to the right?) Also it makes things look more mysterious and less like regular arguments, which is the intuition I think we should be going for. However I can't see anything better. Other ideas:

    • show : @{Show a} -> a -> String
    • show : {Show a} -> a -> String
    • show : auto Show a -> a -> String
    • show : interface Show a -> a -> String
  • Note you don't have to have the dictionary arguments all on the left. You can have:
    baz : Ord v => Ord k1 => Map k1 v -> Ord k2 => Map k2 v -> Set v

  • (Would we want baz' : (Ord v, Ord k) => Map k v -> Set v syntax?)

  • You can pattern match on a dictionary argument, and then use it, if you want to:

hi2' : Show a => a -> Bool
hi2' @{s} x = use s : Show a
              "Hi " ++ (show x)
  • Maybe one day a user can declare their preferences for what instances they want to be their defaults. 'Yes, for me, Monoid Int shall by default be addition.' Then those preferences could be factored in by the pretty-printer: drop use statements that agree with their preferences, and keep those which don't. So this is seeing instance defaults as just another 'view preference', rather than a property of the code itself.

@pchiusano
Copy link
Member

Interesting proposal! Will noodle on it.

@atacratic
Copy link
Contributor Author

A few issues/embellishments on the proposal above...

View preference

I was pondering the following example, and doubting whether it's really OK to make the user type out all the instances they're using!

hi6 : [Foo] -> String
hi6 x = use showFoo : Show Foo
        use showList : Show [a]
        "Hi " ++ (show x)

Making the user spell out those two use statements is a usability penalty, and will make Haskell people think 'oh please, this is painful...' (although there is an upside for comprehensibility coming from the locality property).

The notes above suggest one approach to making this less verbose: allow the user to specify their preferred default instances as part of some 'view preferences' - kind of their own personal lens through which they view the code. Given the usability worry, I'm now thinking this is a necessary part of this proposal, rather than just an add-on, so I'd just like to develop the idea a little bit further.

Suppose you could configure Unison with a preference to automatically use (unique) instances from some parts of the branch, say everything underneath Unison.Prelude plus whatever else. ('Unique' meaning that if there are two instances matching some type then they are both ignored, and the user has to say explicitly which they want in their code.) Then you'd get back to...

hi7 : [Text] -> String
hi7 x = "Hi " ++ (show x)  -- `use` statements not necessary assuming Unison.Prelude has a unique
                           -- `Show Text` and a unique `Show [a]`.

If someone else has different view preferences then they'd see the code pretty-printed with use showText : Show Text etc.

Wiring together instances

Back in the hi6 example above, note that Unison is working out how to concoct a Show [Foo] itself - it doesn't make you specify showList @{showFoo} by hand.

I'm not sure where it happens, but somewhere between saying that Unison can do that trick, and allowing the user to throw in use statements and view preferences with any (non-conflicting) instances they want, we probably give Unison a tricky job to do, working out which instances to combine into Show [Foo] and the like, and how to do it. Clearly it can be done, because Haskell does it. Is this the "not for nothing: they’re a pain to implement" from Paul's quote above?

Is there any world in which we side-step cases that make this hard, at least for now? Maybe by restricting what instances can be in play simultaneously? Or making the user wire up more challenging cases manually?

Zero-parameter typeclasses

Zero-parameter typeclasses seem like they'd be useful, and I can't see why they shouldn't be allowed. It's useful to program to an interface, regardless of whether it's an interface to a specific type.

This makes more sense once you're calling them 'interfaces' rather than 'typeclasses'.

Example:

class Strategy = {
  foo : Bar -> Baz
  -- plus more methods... (otherwise you'd just pass a function into `eg`)
}

eg : Strategy => Bar ->{IO} () 
eg b = let b' = foo b in
      ...

@TomasMikula
Copy link
Contributor

What I find compelling about this proposal is how little it requires of the representation of typechecked Unison programs—just one flag on function parameters indicating whether the actual argument might be left to the compiler to figure out. The exact strategy for figuring out these implicit arguments can be left to the compiler front-end (and possibly differ by the particular Unison dialect).

@atacratic
Copy link
Contributor Author

I want to list a few of the properties of the proposal sketched out above, partly because that helps think about it and assess it, and partly because I've got a proposal for how subclasses could work (further down this post), which threatens one of the properties.

Properties of this proposal

  1. Instances are just values (with a bit of syntax sugar around argument-passing). The user can understand and control them - no magic.
  2. Classes are just record types, the same as a user can define. The user can interact with and destructure instances like any other data.
  3. Open-world: not having canonical instances means that anyone can define an instance for any type, and it will be as 'first class' as anyone else's.
  4. Minimal footprint on the ABT, as @TomasMikula said above.
  5. Instance selection is explicit and can be reasoned about locally (although the instance selection view preference element weakens this a bit).
  6. Coherence is entirely the user's problem.

Note that it's not obvious why you'd want or need properties 1 and 2.

  • Haskell does without them: it treats them as completely special separate beasts in the type system, and then notes the happy coincidence that they have a 'helpful operational interpretation' as dictionaries (last para of 3.1 in the SPJ et al 'exploring the design space' paper).
  • Idris has property (1), but if it has property (2) then it's only internally - it's not visible to the user, they can't (I don't think) destructure instances like any old data.
  • I think in this proposal, as in Idris, (1) is necessary, since you could be working with a range of named instances of the same type.
  • (2) is more questionable. Why would the user need to understand the desugared internal representation of an instance, or pattern match on its fields? I think it's the 'no magic' principle that draws me to (2) - knowing how something is elaborated down to a simple core system is actually a pretty powerful way of understanding and reasoning about it. But there might be a 'constraints liberate' argument for hiding this stuff, maybe.

I dwell on this because the subclassing proposal below is janky enough in how it represents class types, that you might want to hide that under the covers, so losing property (2).

Subclasses again

So far this proposal has a gap around subclassing. (I should probably be calling it 'class extension'.) I said:

  • maybe we could get subclasses via extensible records (if your record value literally is both an Eq k and an Ord k, then it's easy)
  • maybe we don't need subclasses anyway?

I'm pretty ignorant about the technicalities of extensible records, but I have a suspicion that they'd be a nice approach to this (and you could maintain property (2) nicely, if that was important to you). Also the fact that elm and purescript both have them (1 - last three screenshots, 2) makes me think they'd be worth looking into anyway. But maybe it's a bit drastic to demand we add them to Unison.

The next section sketches out a way of getting subclassing without extensible records.

An approach to subclasses via products

It's a bit janky in a couple of respects, but we could get typeclass extension by making the witness types into products. Paul alluded to this in one of his comments under here.

Suppose for starters we've got the basic Eq typeclass.

class Eq a = {
  (==) : a -> a -> Boolean
}

And suppose we extend it as follows.

class Eq a => Ord a = {
  compare : a -> a -> Ordering
}

Let's say that this desugars to a product type, something like...

class Ord a = (Eq a, {
  compare : a -> a -> Ordering
})
-- (supposing here for the sake of exposition some kind of anonymous
-- inline record declaration syntax, but that's not essential)

...and generates the following accessors.

Ord.== : Ord a => a -> a -> Boolean
(Ord.==) @{o} = (Eq.==) @{fst o}

Ord.compare : Ord a => a -> a -> Ordering
Ord.compare @{o} = compare (snd o)

Ord.getEq : Ord a => Eq a
Ord.getEq @{o} = fst o

Maybe to define an instance you'd say something like this:

[OrdFoo] (eqFoo : Eq Foo) => Ord Foo where
  compare (Foo x) (Foo y) = ...

When actually using these classes, the interesting case is when you have an Ord but need an Eq. Note that in the following example this is no problem.

findExact : Ord k => k -> Tree k v -> Optional (k, v)
findExact k t = let f = (rk, _) -> if rk == k then Some k else None
                in join $ map f (find k t)

Even though we're using == in the definition of f, we don't need an Eq k, because we generated the accessor Ord.== which takes an Ord k, and goes and finds the underlying Eq k instance itself.

But in the following example we can't dodge the need for an Eq k.

foo : Eq k => k -> Int
foo = ...

baz : Ord k => k -> Int
baz = let f = foo @{getEq} k in ...

The user would have to specify how to go from the Ord k to the Eq k, using getEq.

Now, you could imagine putting a little bit of type-driven smarts into the compiler, for it to work this out itself, but I'm not assuming that in this sketch, since it's getting into dangerous 'implicit conversions' territory. Perhaps in the restricted case of populating a typeclass parameter it would be OK.

Less important detail

  • Purescript reverses the arrow in class Eq a => Ord a, making it instead class Eq a <= Ord a, so that => can always be read as logical implication. Looks a bit weird to me but maybe something to consider.

@ChrisPenner
Copy link
Contributor

For super/subclassing it may be sufficient to provide constraint aliases; then you can do something like this:

class Eq a where
a == b = ...

class Compare a where
compare a b = ...

type Ord = (Eq a, Compare a)

This happens in Haskell from time to time when you have a ball of constraints that are typically all satisfied together. This means instances of Compare would specify an Eq constraint manually if they need it, but using Ord literally just means Compare + Eq.

@TomasMikula
Copy link
Contributor

TomasMikula commented May 29, 2019

One interesting situation is when there are two subclasses of a superclass in scope and the code calls a function from the superclass, as in

class Eq a = { ... }
class Eq a => Ord a = { ... }
class Eq a => Hash a = { ... }

foo : Ord a, Hash a => a -> a -> Boolean
foo x y = x == y -- ambiguous ==, one from Ord, one from Hash

This is not a problem when there is typeclass coherence. Without coherence, it would be worthwhile to consider how any given proposal would resolve this issue.

@pchiusano
Copy link
Member

Hey just a quick comment, I think this discussion is interesting and please continue (I'm reading everything) but just FYI we probably won't have time to do much with this in the short term, are focused on more basic stuff for M1. :)

@TomasMikula
Copy link
Contributor

TomasMikula commented May 29, 2019

Another variation would be to make class extension only a matter of code generation. That is,

class Eq a = {
  (==) : a -> a -> Boolean
}

class Eq a => Ord a = {
  compare : a -> a -> Ordering
}

would desugar to

class Eq a = {
  (==) : a -> a -> Boolean
}

class Ord a = { -- note no more relation to Eq
  (==) : a -> a -> Boolean
  compare : a -> a -> Ordering
}

Like extensible records, this retains property (2) classes are just record types, but otherwise needs explicit conversion from Ord to Eq, like the product encoding.

@atennapel
Copy link

@atacratic
Copy link
Contributor Author

On the topic of approaches to this via abilities/effects, see slack discussions starting with the idea from @runarorama here.

@atacratic
Copy link
Contributor Author

This comment discusses @runarorama's idea of using abilities as Unison's take on typeclasses. This is my current favorite approach.

I initially thought that the ability mechanism was too powerful to be used as a typeclass system, like trimming flowers with a chainsaw... but now I'm seeing abilities as a generalization of typeclasses, and have a feeling of 'oh, the answer was in front us all along'.

My main worry is around performance of the generated code, thinking ahead to the day when that's the sort of thing we worry about! I think we'd need a path to being able to say that 'typeclass-style' abilities don't give you a performance penalty on account of using the ability mechanism. That line of thinking (which I'm not very confident in) has brought me round to saying that typeclass-style abilities should get their own syntax (maybe interface Monoid a where ... instance textMonoid : Monoid text where), but the text below only adopts that idea towards the end.

In a nutshell, the proposal is:

  • just define an ability for your typeclass, e.g. ability Monoid a (or maybe interface Monoid a)
  • deal with it just like you would any ability, except
    • use a new language facility called tilde conversion (written up in Automatic code generation via 'tilde conversions', enabling abilities as typeclasses #765) so that instead of writing code like handle monoidText in mconcat xs you can just write ~(mconcat xs) (there's some configuration so that the ~ can find your monoidText)
    • maybe there's a kind of instance declaration syntax so you don't have to write full-blown handlers for this kind of ability.

Abilities as typeclasses - what it could look like

Let's step through how we'd express a Monoid a typeclass in Unison.

Ability declaration

use .base
use .base.io

ability Monoid a where
  mempty : a
  mappend : a -> a -> a

We use an ability as our class declaration. This feels nice. An ability declaration looks just like a Haskell class declaration.

Using a method

Now let's use our typeclass:

mconcat : [a] ->{Monoid a} a
mconcat xs = List.foldb id mappend Monoid.mempty xs

We've expressed mconcat's Monoid a constraint as an ability requirement. That puts the Monoid a in a different place in the signature than you see in Haskell, but that's fine. It's a bit strange to have to attach the Monoid a to a specific function arrow though. That's a consequence of the fact that abilities are more general than classes, allowing their methods to cause effects during computation, rather than just evaluate to values.

Overall this also feels nice.

Calling a polymorphic function

We've got our function using bounded polymorphism, now let's try and call it at a specific known type.

hi : [Text] -> Text
hi xs = handle monoidText in mconcat ("Hi " +: xs)

This is less nice. To avoid having a ->{Monoid Text} leaking into the signature of hi, we've had to write a handle expression wrapping the call to mconcat, specifically naming a handler that knows how to do Monoid Text. I think for this approach to be viable, we need to find a way to save the user from having to write, or even see, this handle expression.

Haskell doesn't make you do type anything at all here: when a typeclass constraint can be resolved to something with no type variables, it goes and finds/constructs an instance for you to satisfy the constraint, without you having to say anything.

Would we want to be able to write the following?

hi' : [Text] -> Text
hi' xs = mconcat ("Hi " +: xs)

For that to work we'd need to have told Unison that

  • we're happy for it to convert from '{Monoid Text, e} a to '{e} a silently wherever it can
  • there is a handler of type Request (Monoid Text) a -> a to help it do that.

By same the argument as mentioned in 'All instances are named instances' further up this thread, I don't think we want a public canonical handler Request (Monoid Text) a -> a, because it demotes other people's handlers to second class status.

But we could have a personal one - say you tell ucm a namespace containing handlers to use for this sort of thing. But as soon as the choice of handler becomes something variable like this, you're going to need to print it sometimes. If someone had a different preferred handler than you, then you want to see their code printed as handle theirMonoidText in mconcat ("Hi " +: xs).

I think it's too tricksy to sometimes elide the handler entirely, and sometimes not. Given that there's semantically meaningful information lurking in the program around the call to mconcat we've got to give at least some cue to that effect, always.

I'm proposing the least intrusive possible cue, a single character 'conversion' operator:

hi'' : [Text] -> Text
hi'' xs = ~(mconcat ("Hi " +: xs))

Note the ~ (a tilde) around the mconcat call. That's an instruction to use some configured canonical handler information to cook up a '{Monoid Text, e} a -> '{e} a conversion. If you're viewing code from someone else with a different configured handler, then you'll see their full code, no ~.

The ~ idea is developed at length in #765, which calls it 'tilde conversion'. It does more than it might seem at first glance: for example, it needs to be able to cook up a handler for Show [Text], given a handler for Show Text and a handler for Show [a]. I'm also proposing that the conversion mechanism be more general that just eliminating effects - rather that it be able to construct chains of conversions between types. This extra generality is debatable, and the use cases to motivate it are not super-persuasive, but it seems arbitrary to restrict it to conversions of the form '{E, e} a -> '{e} a.

Defining a handler

Here's how that looks:

monoidText : Request (Monoid Text) a -> a
monoidText r = case r of
  { Monoid.mempty -> k } -> handle monoidText in k ""
  { Monoid.mappend x y -> k} -> handle monoidText in k (x ++ y)
  { a } -> a

(Note how it has a special form in which each case uses k once, passes the result to a recursive handle expression, then returns - this is how all typeclass-style handlers will look.)

Plus throw in a bit of extra ceremony required by the tilde conversion mechanism (or I guess we could make Unison do this under the covers):

.base.conversions.monoidTextConversion : '{Monoid Text, e} a -> '{e} a
.base.conversions.monoidTextConversion a = '(handle monoidText in !a)

Yuk, seven (or five) lines of obscure stuff, just to say what Haskell says with

instance Monoid Text where
  mempty = ""
  mappend = (++)

I guess we could consider expanding the following...

instance monoidText : Monoid Text where
  mempty = ""
  mappend = (++)

into the full term of type Request (Monoid Text) a -> a shown above. That gives the 'typeclass-style' handler a special status in the language syntax, and increases the total stuff for a user to learn, as they have to learn both syntaxes and the relationship between them. So seems like it could be a bad idea, although I'm not totally sure.

Performance of the generated code

This section ends up arguing towards the conclusion that the typeclass use case should get its own kind of ability (called 'interface' say).

Let's take another look at this:

mconcat : [a] ->{Monoid a} a
mconcat xs = List.foldb id mappend Monoid.mempty xs

I think a key concern here is our prospects for compiling this to fast code. I believe today ability request calls are dispatched in the runtime by throwing exceptions. I'm sure this is something that could be attacked with optimization strategies at some point, but fundamentally in the general case an ability request dispatch is likely to be different to a regular method dispatch (e.g. consider the case where the continuation is used twice, and how that would interact with how optimizations are able to treat the stack frame.)

But I think we need to get to the point where we can see a plausible future path to compiling typeclass-style request calls to regular method dispatches. Otherwise, by forcing people to go into the more general setting of abilities to get polymorphic function dispatch, we've forced them to take a performance hit, which is never going to go away.

So what is a 'typeclass-style' handler? It's one of the form

h : Request Foo a -> a
h = case r of 
  { Foo.x p q  s-> k } -> handle h in k (x' p q  s)
  
  { a } -> a

(A) Each case is linear in the continuation k, in the sense of linear types, i.e. uses it exactly once. That usage (B) is in tail position, apart from the handle expression, which (C) must be recursive, rather than calling some other handler.

Would it help if Unison had linearity tracking in its type system? It would help for (A). Maybe we could say that
mconcat : [a] ->{1 Monoid a} a
could do typeclass-style compilation of mempty and mappend calls, and we'd know that

h : 1 Request Foo a -> a

was linear in the continuation (and request arguments), and we'd require that a linearly-used ability was handled by a linear handler. But even if that worked out, that doesn't give us conditions (B) or (C).

So I suspect actually we'd need just a separate kind of ability (interface Monoid maybe), which is restricted to behave in this way.

That then strengthens the case for the instance Monoid handler syntax... but perhaps overall weakens the case for using the ability mechanism for the typeclass use case at all: if they're being compiled to different things, and declared and handled with different syntax, then all we've really reused is the requirement tracking in the type system. Still, that's a whole lot more than nothing. Do we really want a second mechanism for declaring constraints and inferring them, running in parallel with abilities? That's quite a penalty on learnability and usability of the type system.

The above line of reasoning would be bogus if we didn't actually need to compile method dispatches differently between the ability and interface cases - for example if the optimization for the interface case was handled as part of some later global optimization pass over the build. I don't know how that would work though. (Note here I'm assuming there is still, one day, a whole-codebase build step used in perf-critical cases.)

Other issues

{Monoid a, Monoid b}

As Runar pointed out, we don't have good support for ability lists where multiple abilities share the same head type constructor. That would be a deal-breaker for effectively using abilities as a typeclass mechanism. Hopefully this could be fixed? I don't know enough to comment. I guess one issue would be resolving an mappend call to one or other ability - that would have to be done in a type-driven way. I guess there might also be an issue operationally with how the requests are dispatched to the right handler at run-time.

Subclasses

Is there a path to something like Haskell's Semigroup a => Monoid a subclass relationship?. Probably we don't want to try and bolt on an ability subtyping system to the core type theory. That leaves something like saying that Monoid a is actually a kind of type alias for an ability list, equivalent to maybe {Semigroup a, Monoid' a}. That does seem to pass at least an initial plausibility check.

Commentary

  • It's actually nice that ability inference means that we often don't need to manually thread our typeclass constraints through all our signatures.

  • I initially thought that the ability mechanism was too powerful to be used as a typeclass system. And that mismatch was embodied in the performance issue mentioned above. But, subject to identifying some path out of the performance problem, I'm now seeing abilities as exactly a generalization of typeclass/interface style system. And the alternative of adding another system for tracking constraints/requirements in types, when we've already got abilities, seems like a duplication (OK, maybe not too bad if typeclass instances are just a sugared regular parameter as per the previous proposal).

@anovstrup
Copy link
Contributor

#499 raises the idea of pluggable syntax for Unison. It occurs to me that if the mechanism for that is sufficiently granular, then it may provide an elegant way forward on this issue without fear of committing to an approach that may turn out to be a poor fit for the language in the long run.

Specifically, if existing language constructs (whether through the ability system or regular function parameters) are already sufficiently general to encode bounded polymorphism in a way that could be handled efficiently, then maybe more ergonomic syntactic sugar could be provided via the plugin mechanism so that different approaches could be explored in parallel (and could even co-exist in the long term, with developers opting for the sugarings / surface encodings that suite their own preferences).

@scottjmaddox
Copy link

What about combining Rust's Trait (i.e. Typeclass) coherence rules (which it sounds like are the informal rules in Haskell) with a Unison-specific mechanism for patching dependencies with glue impls (i.e. instances)?

Unison's design makes patching dependencies almost trivial. In particular, if a library that introduces a new type is missing an impl for a Trait you would like to use on that type, you can simply patch the library to introduce the impl. There could even be a dedicated class of "glue libraries" for providing these impls, that Unison could provide first-class support for.

I suppose there's still the possibility that you could depend on one "glue library" in your code, and then later on try to add a dependency that depends on a conflicting "glue library". In those cases you might have to do a non-trivial patch to resolve the conflict. But I suspect the ecosystem would quickly coalesce on standard, compatible implementations.

@atacratic
Copy link
Contributor Author

Sounds interesting @scottjmaddox but I didn't understand it yet! 😢 A few more lines spelling out the proposal, with free-standing text rather than by reference to other languages, would help me.

@scottjmaddox
Copy link

@atacratic Sorry for the confusion, this is very much a half-baked idea.

Unfortunately, I don't think I can really explain it without referring to other languages, since:

  1. I don't have a deep enough understanding of the theory behind Traits (Typeclasses), and
  2. the primary complaint about Traits (Typeclasses) appears to be a detail about their coherence rules in Rust/Haskell.

But let me see if I can better to communicate the concept (as ill formed as it might be):

First, lets define some terms: I am going to use "Trait" and "impl", rather than "Typeclass" and "instance", because I'm more familiar with Rust than with Haskell. If you're more familiar with Haskell, then please substitute "Typeclass" for "Trait", and "instance" for "impl".

Now, as far as I can tell, the primary complaint about Traits is that there is an inherent conflict with Modules. If you want both Traits and Modules in your language, then the issue of orphan impls arises. An orphan impl is an impl that is defined in a module other than the module in which either the Trait or the implementing Type is defined. Orphan impls are a problem because they can conflict. So allowing orphan impls in your language means you can end up with two modules that cannot be used in the same codebase (or at least, cannot be used without some approach for resolving the conflict, which could result in unintuitive behavior).

In Rust, orphan impls are outlawed. So you can only impl a Trait either in the module that defines the trait, or in the module that defines the implementing Type. In practice, this works reasonably well, although it can cause frustrations at times, for example when you want to use the Serde crate to serialize/deserialize a type defined in another crate that does not implement Serde's Serialize/Deserialize traits. Typically, the way this is handled is by either providing a compile-time feature flag to add these impls, or by writing a wrapper Type with the desired impls.

In Unison, it seems like it should be possible to completely bypass these complaints about Traits, by leveraging Unison's unique ability to patch modules. Unison could provide a tool for easily (perhaps even automatically) patching external dependencies with conflicting orphan impls. So rather than banning orphan impls, as in Rust, you could allow them and provide a tool for semi-automated or fully-automated merging of orphan impls.

Going a step further, Unison could provide a sort of package manager for single-impl-packages, i.e. "glue packages". When you want to use an impl for a trait and type defined in two different external dependencies, you would attempt to import the appropriate glue package. If it doesn't exist in the ecosystem yet, you could write it and publish it. This would encourage the ecosystem to standardize on particular glue impls, and keep conflicts to a minimum.

Does that make the idea any clearer?

@atacratic
Copy link
Contributor Author

@scottjmaddox interesting! 🙂 The orphan impls situation sounds the same as in Haskell, where rules excluding orphans maintain 'coherence' (avoiding mixing the use of different instances).

I don't know if I'd phrase it as 'bypass the constraint': plenty of people think that Haskell's enforcement of coherence is a good feature. There's a heading 'Coherence' above that touches on this, although it assumes knowledge of the haskell background.

I wonder what it would mean to 'merge' two orphan impls? Maybe finding an example use case would be a good way to get clarity there.

@atacratic
Copy link
Contributor Author

Looks relevant...

Tom Schrijvers, Bruno C. d. S. Oliveira, Philip Wadler, and Koar Marntirosian. 2019. COCHIS: Stable and coherent implicits. J. Funct. Program. 29 (2019), e3. https://doi.org/10.1017/S0956796818000242

https://pdfs.semanticscholar.org/76ad/639df681bc1757e270245851b9d7dd7ab810.pdf abstract:

Implicit Progamming (IP) mechanisms infer values by type-directed resolution, making programs more compact and easier to read. Examples of IP mechanisms include Haskell’s type classes, Scala’s implicits, Agda’s instance arguments, Coq’s type classes, and Rust’s traits. The design of IP mechanisms has led to heated debate: proponents of one school argue for the desirability of strong reasoning properties; while proponents of another school argue for the power and flexibility of local scoping or overlapping instances. The current state of affairs seems to indicate that the two goals are at odds with one another and cannot easily be reconciled. This paper presents COCHIS, the Calculus Of CoHerent ImplicitS, an improved variant of the implicit calculus that offers flexibility while preserving two key properties: coherence and stability of type substitutions. COCHIS supports polymorphism, local scoping, overlapping instances, firstclass instances, and higher-order rules, while remaining type safe, coherent and stable under type substitution. We introduce a logical formulation of how to resolve implicits, which is simple but ambiguous and incoherent, and a second formulation, which is less simple but unambiguous, coherent and stable. Every resolution of the second formulation is also a resolution of the first, but not conversely. Parts of the second formulation bear a close resemblance to a standard technique for proof search called focusing. Moreover, key for its coherence is a rigorous enforcement of determinism.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

9 participants