Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix intersection-typed function calls with union-typed arguments #514

Merged
merged 17 commits into from
Mar 1, 2023

Conversation

erszcz
Copy link
Collaborator

@erszcz erszcz commented Feb 2, 2023

This PR extracts the intersection-typed function call logic improvements from #512 to offer a smaller chunk of code in a single PR. Without this functionality in place it will be very hard to finish #512 - that is fixing the new self-check errors there.

The good news is that this PR fixes a number of tests (including some new ones) and solves the problem of calling a function with an intersection type with an argument with a union type.

The bad news is that this PR makes it obvious why overlapping spec clauses, mentioned in #461, are bad. They make it impossible to tell which clause's result type is the correct result type of the call. I'm not sure how to solve this yet or if it's possible at all. This means that we might have to forego the nicety of preserving (non)emptiness of lists through function calls :'(

Constraints aren't handled properly yet, so some tests fail because of that.

…nion_arg_should_pass.erl

And rearrange some already existing tests.
Copy link
Collaborator

@zuiderkwast zuiderkwast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's solved by this PR? No new should pass or should fail, only some changes in known problems? Did you forget to git add them?

This means that we might have to forego the nicety of preserving (non)emptiness of lists through function calls :'(

I don't understand why. An empty list and a non-empty list are not overlapping. If a function f([]) -> []; ([A, ...]) -> [A, ...] is called with a non-empty or empty list, it's type can be preserved, can't it?

src/typechecker.erl Show resolved Hide resolved
src/typechecker.erl Outdated Show resolved Hide resolved
src/typechecker.erl Outdated Show resolved Hide resolved
@erszcz
Copy link
Collaborator Author

erszcz commented Feb 3, 2023

What's solved by this PR? No new should pass or should fail, only some changes in known problems? Did you forget to git add them?

This PR is still a draft as you can see. Things are not moved to their intended destinations yet, although they are in the repo. I'll be cleaning it up. What works now, but didn't, is for example https://github.com/erszcz/Gradualizer/blob/ed424bac9657dd2048efb5aaeb056dc66df270c3/test/known_problems/should_pass/call_intersection_function_with_union_arg_should_pass.erl.

On master:

14:31:10 erszcz @ x6 : ~/work/erszcz/gradualizer (master %)
$ ./bin/gradualizer call_intersection_function_with_union_arg_should_pass.erl
call_intersection_function_with_union_arg_should_pass.erl: h/1 call arguments on line 16 at column 5 don't match the function type:
fun((a) -> a)
fun((b) -> b)
Inferred argument types:
a | b
call_intersection_function_with_union_arg_should_pass.erl: h/1 call arguments on line 25 at column 9 don't match the function type:
fun((a) -> a)
fun((b) -> b)
Inferred argument types:
a | b
call_intersection_function_with_union_arg_should_pass.erl: j/2 call arguments on line 41 at column 5 don't match the function type:
fun((t1, u1) -> one)
fun((t2, u2) -> two)
Inferred argument types:
t1 | t2, u1 | u2

This PR:

14:32:46 erszcz @ x6 : ~/work/erszcz/gradualizer (fix-intersection-typed-function-calls *%)
$ ./bin/gradualizer call_intersection_function_with_union_arg_should_pass.erl
14:32:48 erszcz @ x6 : ~/work/erszcz/gradualizer (fix-intersection-typed-function-calls *%)

An empty list and a non-empty list are not overlapping.

But a list() and a nonempty_list() are overlapping. This means such specs might have to go away:

%% Preserve the (non)empty property of the input list.
-spec map(fun((A) -> B), [A, ...]) -> [B, ...];
         (fun((A) -> B), [A]) -> [B].

which is sad, as they're very convenient. It's often the case that we want to take hd(L) and it would be nice to be able to do it without ?assert_type(L, nonempty_list()). With such specs the type checker can tell that. Dropping such specs means the type checker will drop that information, too. Tradeoffs everywhere :/

Specifically, calling intersection-typed functions with union-typed arguments.
Overlapping clauses lead to an unsolvable problem when selecting which clause of an
intersection-typed function to call.
Strictly speaking, they're not that accurate anymore,
since any list containing even a single binary will lead to returning the result as a binary.
It's not reflected in the current specs.
@erszcz erszcz force-pushed the fix-intersection-typed-function-calls branch from ed424ba to b4b008b Compare February 3, 2023 19:25
@erszcz
Copy link
Collaborator Author

erszcz commented Feb 4, 2023

Quick summary of where we're at.

The following:

-spec i2(t(), u()) -> one | two.
i2(T, U) ->
    j(T, U).

-spec j(t1, u1) -> one;
       (t2, u2) -> two.
j(t1, u1) -> one;
j(t2, u2) -> two.

doesn't pass anymore - it's properly detected to be an error ✅

We're detecting overlapping spec clauses when functions with such are called (just a printout right now, but can be thrown as a proper error) ✅

The price to pay for the above is slowness... big enough to significantly slow down the self-check ❌

Moreover, there's a relatively high number of self-check errors on this branch, too (around 200 lines) ❌

Another loss is that:

-spec i(a, b) -> {a, b};
       (d, e) -> {d, e}.
i(V, U) -> {V, U}.

-spec j({a, b} | {d, e}) -> {a, b} | {d, e}.
j({V, U}) -> i(V, U).

does NOT type check anymore ❌
This is caused by V and U getting assigned union types in the venv (likely by add_type_pat), so this case incorrectly collapses into the first one from this post.

Next steps:

  • introduce overlapping_spec error
  • check if self-check errors can be reduced by dropping overlapping specs from Gradualizer's code base
  • profile

@erszcz
Copy link
Collaborator Author

erszcz commented Feb 9, 2023

Some new results:

introduce overlapping_spec error

It's not as simple as I thought. The condition that we detect doesn't necessarily mean that the called function has an overlapping spec - that's only one of the possibilities, the others being less defined. I think we have to stay with the more general call_intersect error for now.

profile

I've run cprof and eflame on the Gradualizer checking its own typechecker.erl and it turned out that subtype/3 is called a lot of times. Adding caching to subtype/3 has brought down self-check times to numbers (slightly, yet still) lower than on the master branch 🚀

The code needs cleanup, of course, but things don't look bad as of now.

rebar.config Outdated Show resolved Hide resolved
@erszcz erszcz force-pushed the fix-intersection-typed-function-calls branch from 7b11ac8 to e872104 Compare February 9, 2023 21:25
@erszcz
Copy link
Collaborator Author

erszcz commented Feb 10, 2023

Ahh, it's not just overlapping spec clauses that cause problems. The problem is that we get multiple clauses upon selection based on argument types - overlapping specs might lead to this, but it might also happen if argument types are any of the "gradual" types, i.e. any() (subtype of anything), list() (subtype of any list), tuple() (subtype of any tuple), ... All of these might lead to multiple clauses being selected and therefore dictate the call return type.

If we allow such gradual types to match multiple spec clauses, then we can just as well allow overlapping spec clauses. This means we have to take the LUB of the clauses' return types as the return type of the call. In the light of overlapping specs, it's useless, though, since for a spec like:

-spec map(fun((A) -> B), [A, ...]) -> [B, ...];
         (fun((A) -> B), [A]) -> [B].

We would still need to deal with a call result type like [B, ...] | [B], which gives nothing when it comes to the example with non-emptiness preservation:

hd(lists:map(F, NonEmptyList))

I think it's fair to say that "the less we know, the less we can guarantee", and therefore where it's possible to return a single clause's return type, but where multiple clauses match the call argument types return the union of these clauses' return types. This way the more precise types the programmer provides, the more specific feedback Gradualizer will return.

@erszcz erszcz force-pushed the fix-intersection-typed-function-calls branch from c0206bb to bf32ae0 Compare February 10, 2023 09:43
@erszcz
Copy link
Collaborator Author

erszcz commented Feb 10, 2023

Ok, I think this PR is ready functionally-wise. I'll clean up the code and git history a bit and then switch it to "Ready for review".

@erszcz erszcz force-pushed the fix-intersection-typed-function-calls branch from b6244bd to f4f78ce Compare February 15, 2023 17:36
@erszcz erszcz force-pushed the fix-intersection-typed-function-calls branch from f4f78ce to 792d13f Compare February 15, 2023 18:36
@erszcz erszcz marked this pull request as ready for review February 15, 2023 18:42
@erszcz
Copy link
Collaborator Author

erszcz commented Feb 15, 2023

Ok, I think this is finally ready. @zuiderkwast WDYT?

@erszcz
Copy link
Collaborator Author

erszcz commented Feb 28, 2023

If there are no review comments I'll be merging this shortly

@josefs
Copy link
Owner

josefs commented Feb 28, 2023

I think it'd be worth to do some kind of impact assessment of not supporting overlapping specs. How common are they in the wild? Does OTP use them? My initial philosophy for Gradualizer was to go to great lengths to support the kind of code that people actually write.

It will still be possible to write overlapping specs. What's your plan for what Gradualizer should do with them?

To be clear, I'm not necessarily against this change. But it feels like quite a big step to rule out perfectly well-formed specs, just because they are overlapping. So it'd be good think about what the impact will be and ensure that the behaviour of Gradualizer is not too surprising, should someone write an overlapping spec by accident, say.

@erszcz
Copy link
Collaborator Author

erszcz commented Mar 1, 2023

I think it'd be worth to do some kind of impact assessment of not supporting overlapping specs. How common are they in the wild? Does OTP use them? My initial philosophy for Gradualizer was to go to great lengths to support the kind of code that people actually write.

I think these specs aren't common at all. Erlang docs warn about overlapping specs not being supported by Dialyzer:

A current restriction, which currently results in a warning by Dialyzer, is that the domains of the argument types cannot overlap.

I think all (or at least most?) of them in the tests and Gradualizer source code were introduced by me after #461, when it turned out they can be useful for preserving some properties "through" a function call, but it was not clear yet that function call checking in general doesn't handle union typed args to intersection typed functions properly.

It will still be possible to write overlapping specs. What's your plan for what Gradualizer should do with them?

Strictly speaking, this PR doesn't disallow them, they're still handled as best as possible, i.e. if a function spec has overlapping clauses, the result type of the function call is assumed to be the union of clauses' result types.

However, in practice this usually means we lose the ability to preserve non-emptiness of a list through the call in a case like map/2:

-spec map(fun((A) -> B), [A, ...]) -> [B, ...];
         (fun((A) -> B), [A]) -> [B].

because, for example, a non-empty list of [1,2,3] matches both [A, ...] and [A] and the union of [B, ...] and [B] is, of course, just [B].

Ultimately, though, I think we have to pay this price for being able to properly check that the call to h/1 is valid:

-spec f(a | b) -> a | b.
f(V) ->
    h(V).

-spec h(a) -> a;
       (b) -> b.
h(V) -> V.

In other words, we have to check that all clauses of a multi-clause spec, in total, cover a union-typed argument, not that each of the spec clauses does it. The above example fails on master, but does not fail with this PR. Otherwise we'll be swamped with false positives about perfectly valid code.

@zuiderkwast
Copy link
Collaborator

zuiderkwast commented Mar 1, 2023

Breaking something to fix something else is what I was worried about too.

I think it's fine to drop overlapping specs, but what about gradually-overlapping? We use any() to denote a dynamic type which means we trust the developer when any is used. Does that mean we should support a spec like

-spec f(integer()) -> apa;
       (any()) -> bepa.

... and trust that this particular instance of any() doesn't include integer()?

@erszcz
Copy link
Collaborator Author

erszcz commented Mar 1, 2023

Breaking something to fix something

Breaking something that was introduced recently as an experiment to fix something that's fundamental, but never worked properly, seems like a good enough reason to me ;)

@zuiderkwast I think that to be on the safe side we have to assume that f/1 returns apa | bepa. That's what this PR does. It might be cumbersome for the programmer having to handle the alternative, but we can't guarantee anything more.

@zuiderkwast
Copy link
Collaborator

zuiderkwast commented Mar 1, 2023

Yes I agree, let's merge this.

Of course f returns apa|bepa. It's the parameter type that's overlapping according to Dialyzer semantics. (I failed to find an example where it matters to us.)

@zuiderkwast
Copy link
Collaborator

In the paper about gradual typing with union-intersection-negation types, they rewrite the spec until it's not overlapping anymore. The above would be rewritten to f(integer()) -> apa; (any() & ~integer()) -> bepa. (where & is intersection and ~ is negation).

@erszcz erszcz merged commit ba4476f into josefs:master Mar 1, 2023
@erszcz erszcz deleted the fix-intersection-typed-function-calls branch March 1, 2023 12:23
@josefs
Copy link
Owner

josefs commented Mar 1, 2023

In the paper about gradual typing with union-intersection-negation types, they rewrite the spec until it's not overlapping anymore. The above would be rewritten to f(integer()) -> apa; (any() & ~integer()) -> bepa. (where & is intersection and ~ is negation).

Yes, this could be a good way forward. It introduces an ordering of the type declarations, so it matters in what order they are written. But it's a fairly small price to pay for the extra expressivity. The current situation of allowing people to write overlapping types but discouraging it and not handling it well is not a good status quo. It's very easy to accidentally write overlapping types and, as you point out @zuiderkwast, it becames even easier when you involve gradual types.

erszcz added a commit to erszcz/erlang_ls that referenced this pull request Jun 16, 2023
This version brings approx. 30 new PRs. The highlights are:

- Improve map exhaustiveness checking [erlang-ls#524](josefs/Gradualizer#524) by @xxdavid
- Fix all remaining self-check errors [erlang-ls#521](josefs/Gradualizer#521) by @erszcz
- Fix intersection-typed function calls with union-typed arguments [erlang-ls#514](josefs/Gradualizer#514) by @erszcz
- Experimental constraint solver [erlang-ls#450](josefs/Gradualizer#450) by @erszcz
erszcz added a commit to erszcz/erlang_ls that referenced this pull request Jun 16, 2023
This version brings approx. 30 PRs. The highlights are:

- Improve map exhaustiveness checking [erlang-ls#524](josefs/Gradualizer#524) by @xxdavid
- Fix all remaining self-check errors [erlang-ls#521](josefs/Gradualizer#521) by @erszcz
- Fix intersection-typed function calls with union-typed arguments [erlang-ls#514](josefs/Gradualizer#514) by @erszcz
- Experimental constraint solver [erlang-ls#450](josefs/Gradualizer#450) by @erszcz
erszcz added a commit to erszcz/erlang_ls that referenced this pull request Jun 16, 2023
This version brings approx. 30 PRs. The highlights are:

- Improve map exhaustiveness checking [erlang-ls#524](josefs/Gradualizer#524) by @xxdavid
- Fix all remaining self-check errors [erlang-ls#521](josefs/Gradualizer#521) by @erszcz
- Fix intersection-typed function calls with union-typed arguments [erlang-ls#514](josefs/Gradualizer#514) by @erszcz
- Experimental constraint solver [erlang-ls#450](josefs/Gradualizer#450) by @erszcz
robertoaloi pushed a commit to erlang-ls/erlang_ls that referenced this pull request Jun 17, 2023
This version brings approx. 30 PRs. The highlights are:

- Improve map exhaustiveness checking [#524](josefs/Gradualizer#524) by @xxdavid
- Fix all remaining self-check errors [#521](josefs/Gradualizer#521) by @erszcz
- Fix intersection-typed function calls with union-typed arguments [#514](josefs/Gradualizer#514) by @erszcz
- Experimental constraint solver [#450](josefs/Gradualizer#450) by @erszcz
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants