RFC: curry underscore arguments to create anonymous functions #24990
Note that if this is merged, using an underscore as an rvalue should probably become an error (currently it is deprecated). As an lvalue, it is fine — we can keep using it for "discarded value" (#9343). |
My gut feeling is that I'd rather not rush this and that a few ad hoc legacy currying functions in Base aren't going to kill us. Although I have to say that the simplicity of the rule has the right feel to me. |
I agree that we don't want to rush new features like this, but I feel like this idea has been bouncing around for a long time (since 2015), the reception has been increasingly positive, and we keep running into cases where it would help as we transition to a more functional style (thanks to fast higher-order functions). |
(I guess technically this is partial function application, not currying. Note that Scala does something very similar with underscores, and it allows multiple underscores. In some circumstances Scala apparently requires you to explicitly declare the type of the underscore argument, though, at least if you want its type inference to work.) |
What potential backwards incompatibilities does this rule expose us to? I know Stefan spent a while trying to find a simple set of rules for determining the "tightness" of a partial application expression and found there were some difficulties. Merging this would close the door on changing the tightness rule in 1.0. Is there anything else? |
@yurivish, using |
Scala's rule for "tightness" is (Scala Language Specification, version 2.11, section 6.23.1):
which seems essentially the same as the one I've used here (i.e. the innermost expression that is not itself an underscore "binds" it). Scala is more general in two ways:
Both of these could easily be added later, after my PR, since they are a superset of my functionality. Regarding Stefan's rules, I found them pretty complicated and confusing: why should |
@stevengj Of course, you're right – I think I misused the phrase "backwards incompatibility". I meant to ask whether there were any "terse partial function application" syntaxes that we may want to introduce in the future that would conflict with the functionality implemented in this branch. If it turns out that future enhancements would almost certainly be supersets of this one, then, well, fantastic. 😄 I'd personally love to see something like this in the language so long as it's not limiting our options too much down the line. Edit, written before the last paragraph in the preceding post: Hmm. Expressions with operators (like |
I don't have strong feelings one way or another about this change but I've played around with it a little and I noticed something kind of odd:
julia> _ + 1
#1 (generic function with 1 method)
julia> _
WARNING: deprecated syntax "underscores as an rvalue".
ERROR: UndefVarError: _ not defined
That is, the behavior at the top level is surprising. I kind of wonder whether it might be better to adjust the parsing behavior in that context, kind of like how generators require parentheses at the top level:
julia> i for i in 1:10
ERROR: syntax: extra token "for" after end of expression
julia> (i for i in 1:10)
Base.Generator{UnitRange{Int64},getfield(, Symbol("##3#4"))}(getfield(, Symbol("##3#4"))(), 1:10)
Also, I realize this is expected based on the stated rules, but it took me by surprise:
julia> map(_, [1,2,3])
#9 (generic function with 1 method)
I'm not entirely sure what I expected with that expression, but it wasn't that. 😛 Behavior aside, I love how tiny the implementation actually is! It's impressively minimal for a potentially powerful feature. |
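For readers following along, the two function results above can be written out by hand under the PR's stated rule, where only the immediately surrounding call becomes a lambda. This is a sketch with explicit lambdas, since the underscore syntax is not available in released Julia; the names inc and apply_to_vec are made up for illustration.

```julia
# Hand-written equivalents; `inc` and `apply_to_vec` are illustrative names.
inc = x -> x + 1                       # what `_ + 1` would stand for
apply_to_vec = f -> map(f, [1, 2, 3])  # what `map(_, [1,2,3])` would stand for

println(inc(1))
println(apply_to_vec(inc))
```

Seen this way, map(_, [1,2,3]) is a function that still waits for the function to map, which is why the REPL prints a generic function rather than an array.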
Really cool :) @ararslan The behavior you highlight as odd seems natural to me. (An aside: what is an unbracketed |
Really nice, also for the data ecosystem where the frequently used function |
As |
@ararslan, we could certainly implement I would rather not require parens around e.g. |
@rfourquet, this is totally orthogonal to the piping syntax (#20331); I'm not sure why you think the latter would affect this. We've been discussing a currying/partial-application shorthand for literally years now, and all of the discussions seem to have been converging towards underscore syntax, which has a long track record in Scala. |
The underscore syntax is actually already used, for example in the Query package, via macros (see docs). The only difference with Query is that there is no tight binding there, for example |
This link concerns a removal of the piping syntax without plan to re-introduce it later, which was discussed recently at #5571. It's true that currying syntax can exist independently of piping syntax, but the design of piping syntax, which deals directly with curried functions, could influence how the currying syntax should be designed. |
I think it would be short-sighted to tie currying syntax to piping syntax. Currying (partial function application) is useful for lots of things beside piping. |
I could imagine a "loose-binding" syntax like |
Is |
@yurivish, in this PR, |
@stevengj I deleted my comment right after posting when I realized what I said didn't make sense .(my example was But I thought it didn't make sense for a different reason — because it would turn the entire expression into an anonymous function. It seems your approach is even more conservative than I realized. 😄 Would |
@yurivish, yes |
What if we lower with an "exactly one call" rule, lowering into a special The if we define |
I love the idea of using language features to build flexibility while the syntax is strict! |
Imho the advantage of being statically understandable probably outweighs the flexibility gained from doing that. |
discoverability surely will be important! we could introduce a certain way of hinting which arguments take part in autocompose. |
There's an increasing consensus on recent triage discussions to just implement the "argless lambda" version of this which starts the anonymous function with |
hate to be cold water but imo the marginal value of code legibility with headless |
This is why this issue never gets anywhere. Any time there's something straightforward core devs can get on board with, everyone piles on with what they dislike about it and with random alternative proposals and considerations. The end result is no progress at all. I have a hard time understanding how |
I don't think either of those are particularly legible what's wrong with
I mean 😬 maybe that's the lesson to learn here is there isn't a great solution |
Tbh, I don't see why this is objectively a negative in this case. Both in terms of syntax (eg, I find it surprising that a single symbol Meanwhile, there are actual existing solutions in packages, some of them quite popular (eg Accessors.jl and its macros). They may not cover the area completely, but that's typically because of the inherent complexity and unlikely to be helped by putting one of the solutions into Base. |
Same thing meaning appropriately different things in different places is the whole core feature of multiple dispatch. It's just about getting used to how it's used. In the cases you showed, I see there's some profit to be made by adding a way to replicate the argument into all places (which is easier than the other way around). And currying is about applying multiple arguments, not replicating a single argument into multiple slots. (Thus, for your usecase, consider combining the proposal with a replication function that just replicates a single argument into as many slots as you want it). For further thoughts, take a look at my comment in the other issue (#38713 (comment)). Plus, on a related but different note, take a look at #53946. So, to frame it that way, the underscore is the object that stands for "a slot to curry into" within the argument-less anonymous function syntax. To me, that's a plausible perspective.
Can't be a
Take a look at #53946 for that particular use case. And you're totally right. The cases you gave in the previous paragraphs aren't particularly well-fit for this proposal. That's why I'd not use them as examples for this proposal. Plus, as said, currying is about applying multiple arguments, not about putting a single argument in multiple places.
How it scales for cases like This example is a particularly good one due to the closeness of the arguments Also goes particularly well with the The core benefit is that we already have trained eyes to scan for Additionally, to me, the leading
Which shouldn't be a lot of effort, as far as I understand it, since it's a simple lowering pass that just collects underscores from left to right and augments the already existing One of the biggest advantages I see with this approach, is (as @StefanKarpinski said it), there's nothing that could be broken by implementing it that way, since that syntax isn't used anywhere. So by introducing it, we technically cannot hurt anyone. It's just not everyone will want to use it. (Plus fence-less variants could still be added at a later point). So I'm totally on board for "Just do it", as @o314 presented it. For "how to use it when it's there" read the Tl;Dr:'s at the end. And for all those arguing that But to be fair, I am already sold on this perspective for its cleanliness and was one of the first appreciators of that idea (#24990 (comment)) and even created a standalone issue for it (#38713).
Tl;Dr: When to use the proposal? Whenever the "what will happen to the arguments?" is clearly inferrable from the used functions and operators and the argument list is close in code so that using an ordinary lambda would feel like name duplication.
Tl;Dr: When to not use the proposal? Whenever the "what will happen to the arguments?" can't be easily inferred from the used functions and operators or the actually used arguments are far away. Use named functions with named arguments in that case.
|
Which may be socially optimal. The marginal benefit of adding syntax for a rather special case is small, so not doing it can be a reasonable choice. Unless a proposal has a large benefit, it is perfectly fine to reserve syntax that currently errors for future expansion of the language. Options have a value, and core devs could consider not filling up every nook and cranny of the syntax space with some gimmick as a perfectly fine choice. |
That's a valid opinion and writing
If repeated underscores mean the same thing each time, then, as @rapus95 said, this is not a feature for currying anymore, it's just a feature for creating single-argument functions. That's just not nearly as useful or general. It's also not how similar underscore currying syntax works in other languages that have it, such as Scala. |
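The two readings of repeated underscores being contrasted here can be spelled out with ordinary lambdas. This is only a sketch: f is a stand-in function and the variable names are invented for illustration.

```julia
f(a, b) = (a, b)   # stand-in function, purely for illustration

# Reading 1: each underscore is its own argument, in order of appearance
# (the reading the PR description adopts, following Scala).
distinct_slots = (x, y) -> f(x, y)

# Reading 2: every underscore refers to the same single argument.
same_slot = x -> f(x, x)

println(distinct_slots(1, 2))
println(same_slot(1))
```

Under reading 2 the shorthand can only ever produce one-argument functions, which is the loss of generality the comment above points at.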
I think after reading through the first 100 comments or so for the first time, I was in the camp of "underscore to replace a single argument in a single function sounds nice". After the next 300 comments or so, I think I am now more in the camp of "good lord it's impossible to find consensus, probably should just move on". I definitely don't like (subjective of course!) multiple underscores in a function, because the ambiguity of whether it means the same argument or distinct arguments will always overwhelm the convenience for me, though realistically if |
I'm in the camp of I think the original sin here is that trying to handle more than a single function call with |
I hesitate to make another suggestion here, given the length of the discussion - but maybe just as a wild idea: If we would use |
Though, to be fair, having some core developers getting to a close-to consensus after more than 6 years of being an open issue and debated multiple times without getting to a consensus doesn't fit the description "filling up every nook and cranny". It rather hints that they regularly stumble over situations where they would've liked to use the particular syntax but ended up with "ah, still not implemented".
That ambiguity shouldn't exist in the first place because underscore is meant to be the thing that will never stick to any value. If you use it as a left-hand side, it'll just pass the value straight through to the dump and you won't be able to access the value again. This proposal just reverses sides for this approach (as lambdas go left to right while assignments go right to left). You put in an argument and it will be handed right through to the first pit(=underscore) but aside of that it will be lost, you can't access it in a later place.
You're totally right that we shouldn't use this proposal in that situation. Luckily there's Disclaimer: Arguing about the following points and to some extent invalidating them by no means shall invalidate personal preferences and personal opinions! It's intended as a non-emotional argument. If someone feels attacked by it, that's not my intent. And I'd very much like to help shift perspectives and create the missing key insights, to get to a similar conclusion/perspective as I have it, in order to see the cleanliness, mathematical elegance and benefits of it! To date, I feel like there are 3 types of people around (excluding those in favor of the proposal)
Point 3) isn't a blocker IMO, because it mixes multiple situations. Everything that is needed to make it compatible is a replication function that replicates a single object into multiple places. The other way around (dropping multiple arguments into different places if the proposal would just replicate a single argument) is harder to construct and thus less general, plus, still not currying. But I'd love to help to create the missing piece (replication) in another place so that we can mix replication, property currying (#53946) and this proposal together to make best use of all those features. For example, we could go for Point 2) Conciseness and clarity over shortness is IMO a very strong argument in favor of the Point 1) Well, yes, that's to some extent a matter of taste, in which I would refer to "not using it" as being a good strategy. And regarding the "it's confusing" argument, I assume that will change once the proposal has settled and is used in many places. Then it will feel natural. The only thing that prevents getting the natural feel for it right now, most presumably, is forcefully sticking to a perspective that's conflicting with that proposal. So regarding this point I'd as well say it's a non-blocker as it will resolve itself with time. Tl;Dr: I'm in favor of adding the argless to the language nonetheless while educating people on intended use cases (it's not meant as a general replacement for functions and lambdas or just passing the named function itself) and thinking about a concise and short syntax for the replication (but independently of the underscore proposal) for those in 3). |
I'm totally on board with the idea of a single function call, but then the secondary sin is that Julia's syntax is fancy enough that it can be tricky to know what "counts" as a single function call. Is tuple construction IMO, we need a clear precedence boundary. |
We already have Someone on discourse posted a quote that "all new features start at -10 points" which is pertinent here. |
Disclaimer again: I don't want to heat this up, so there's sincere curiosity behind my questions further down. I can't expect a sincere answer to them, but I'll know for myself that I at least tried.
To quote myself
twice
thrice
four times
So I wonder, is there any particular strategy behind reiterating single-argument and other cases which are already accepted as bad-fit for this proposal? What would you need to discuss the (to our perspective) well-fit cases instead of bad-fit cases? Or to shift the focus away from cases which this proposal isn't designed for. It's like saying do block notation doesn't have a big benefit because why would I write
identity() do x, y
    return x+y
end
instead of
function (x,y)
    return x+y
end
Well yes, that's a miserable example to show the benefits of the do-block syntax. I'm totally with you there. And on top, the given examples could be reduced to And likewise, for the current currying proposal. All examples you gave are a particularly bad fit. No one wants to disagree there. But we measure by benefits in the designed-for case, instead of "how hard can you go against the design idea". Otherwise, we'd have a restrictive totally static, non-composable, and many more less-elegant adjectives language. |
Ok, but is it a negative? IMO, with multiple arguments, explicit single-letter names make the code easier to understand while not adding much overhead.
[citation needed] :) Note that this doesn't preclude multi-arg functions: they could use stuff like
It's how similar syntax works in other languages, such as anonymous functions in Mathematica: there, I'm sure there are languages leaning either way! In Julia, there is lots of prior art (in packages) with |
Of course "significant" is a subjective term, especially in this context, but arguably just dropping a single character from beginning of an anonymous function is stretching the concept. My take from this (and related discussions) so far is that given the syntactic complexity of Julia, "curry underscore" can do very little and the debate is about allocating this modicum of expressiveness. Coding styles differ so different people want to use it for various things, and there is no clear consensus. In which case, is it worth adding extra syntax for so little gain?
I think that there is a fourth category you missed: people who may have been initially interested in this feature, but after having seen the ramifications and limitations they don't think it would improve the language. It's not that underscores are confusing (the latest single-argument proposal with the |
If we add the "argless lambda" as in
This is subjective, I find all the following more readable except the
Another category missing on top of @tpapp's 4th:
although @mbauman makes a very good point:
This is probably the most powerful objection to the fence-less single call rule. Presumably the rule is not counted as "something straightforward" in @StefanKarpinski's comment because of this. I think the answer is that we should expand the documentation on syntax sugar. Every piece of sugar that corresponds to a function call should be documented there. This would explicitly define what counts as a single function call: anything not in the list would not count. Actually documenting those sugars is something we should do anyway I think, because knowing this aspect of lowering is essential to really understand Julia syntax, and cannot be swept under the rug when you consider function dispatch (e.g. how overloading |
IMO this issue is going nowhere, and nobody will agree on how to make a useful syntax for this. However, there's a different way we can go about this which is likely to be much less controversial: JuliaLang/JuliaSyntax.jl#212 @c42f has shown with various analyses that it's really rare in the ecosystem for one to have |
Yeah, the crux of my point is that because intuitions vary on what this should do (see above 500 comments), we're in all the more need of an understandable and straightforward rule to clearly describe whatever we've chosen it to do. It's possible there exists a simple rule without a "fence", but goodness the edge cases are tricky. For example, the current implementation here notes that we'd need to add special support for broadcasting and kwcalls and chained comparisons; those are broken because they are internally composed of multiple expressions. They also all feel like they have an obvious answer. But there are so many more cases. With the current implementation here:
What's the rule? Or is it just a whole pile of special cases? My dreams of a fenceless |
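One way to see why broadcasting, keyword calls, and chained comparisons need special handling is to look at how the parser represents them. The sketch below only inspects quoted expressions (the parser treats `_` as an ordinary symbol, so quoting works); the variable names are invented for illustration.

```julia
# Quote the three problem cases and inspect the parsed structure.
bcast  = :(f.(x, _))      # broadcast call
kwcall = :(f(x; y = _))   # keyword-argument call
chain  = :(3 ≤ _ ≤ 10)    # chained comparison

println(bcast.head, " containing ", bcast.args[2].head)
println(kwcall.head, " containing ", kwcall.args[2].head)
println(chain.head)
```

In none of the three is the underscore a direct positional argument of a plain call node: it sits inside a tuple under the dot node, inside a kw node under parameters, or inside a comparison node. A rule phrased as "the immediately surrounding function call" therefore has to say explicitly what it does with each of these shapes.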
Note that #29875 (which introduces a pretty useful feature IMO, one that is consistent with current usage of |
If #54653 becomes reality, it would be really neat if we'd emit |
This PR addresses #554, #5571, and #22710 by "currying" underscores in function calls like f(_,y) into anonymous function expressions x -> f(x,y). (Note that _.foo works and turns into x -> x.foo since it is equivalent to a getfield call, and _[i] works and turns into x -> x[i], since it is equivalent to a getindex(_,i) call.)
This will help us get rid of functions like equalto (#23812) or occursin (#24967), is useful for "destructuring" as discussed in #22710, and should generally be convenient in lots of cases to avoid having to explicitly do x -> f(x,y).
Some simplifying design decisions that I made:
The currying is "tight", i.e. it only converts the immediately surrounding function call into a lambda, as suggested by @JeffBezanson (and as in Scala). So, e.g. f(g(_,y)) is equivalent to f(x -> g(x,y)). (Note that something like find(!(_ in c), y) will work fine, because the ! operator works on functions; you can also use _ ∉ c.) Any other rule seems hard to make comprehensible and consistent.
(Superseded earlier text: Only a single underscore is allowed. f(_,_) throws an error: this case seems ambiguous to me (do you want x -> f(x,x) or x,y -> f(x,y)?), so it seemed better to punt on this for now. We can always add a meaning for multiple underscores later.) Similar to Scala, multiple underscores are converted into multiple arguments in the order they appear, e.g. f(_,y,_) is equivalent to (x,z) -> f(x,y,z). See rationale below.
The implementation is pretty trivial. If people are in favor, I will add
f.(x, _)
f(x; y=_)
3 ≤ _ ≤ 10, since they parse as a single expression?
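The rules in this description (tight binding, one lambda argument per underscore in order of appearance, and field access and indexing treated like calls) can be tried out today with a small macro. The sketch below is an approximation written for illustration, not the PR's implementation; tight_curry, CURRY_HEADS, and @curry are hypothetical names, and broadcasting, keyword arguments, and chained comparisons are deliberately left unhandled.

```julia
# Hypothetical illustration of the "tight" rule; not the PR's implementation.
# Expression kinds that bind a bare `_` operand: calls, indexing, field access.
const CURRY_HEADS = (:call, :ref, Symbol("."))

function tight_curry(ex)
    ex isa Expr || return ex
    # Recurse first, so inner expressions bind their own underscores.
    args = Any[tight_curry(a) for a in ex.args]
    if ex.head in CURRY_HEADS
        params = Symbol[]
        for i in eachindex(args)
            if args[i] === :_
                p = gensym("arg")   # one fresh parameter per underscore
                push!(params, p)
                args[i] = p
            end
        end
        if !isempty(params)
            return Expr(:->, Expr(:tuple, params...), Expr(ex.head, args...))
        end
    end
    return Expr(ex.head, args...)
end

macro curry(ex)
    esc(tight_curry(ex))
end

add_one    = @curry _ + 1                  # x -> x + 1
pair       = @curry tuple(_, :y, _)        # (x, z) -> tuple(x, :y, z)
first_elem = @curry _[1]                   # x -> x[1]
get_a      = @curry _.a                    # x -> x.a
mapped     = @curry map(_ + 1, [1, 2, 3])  # map(x -> x + 1, [1, 2, 3])
```

Because the recursion rewrites inner expressions before looking at the outer one, map(_ + 1, [1, 2, 3]) turns only the inner _ + 1 into a lambda and then calls map immediately, which matches the f(g(_,y)) example above.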