Proposal: bind a name to a destructurable place with @ #5471

Closed
cosmicexplorer opened this issue Nov 5, 2024 · 11 comments

Comments

@cosmicexplorer
Contributor

This is a feature request for new syntax. I believe this would be entirely backwards-compatible.

Problem

Consider the function:

f = ({x}) ->
  # if key name 'x' is changed,
   # this silently starts retrieving x from outer scope:
  {a, b} = x
  otherFunc x, a, b

# only way to write this e.g. in repl:
f = (o) -> otherFunc o.x, o.x.a, o.x.b

# saving the result of a function call cannot be done while destructuring:
x = g()
[a, b] = x

Because otherFunc requires both the parent object x and some of its fields a and b, we have to perform an additional destructuring step. This is error-prone, as the name x must be duplicated in the required boilerplate. It also means this logic cannot be written in one line unless destructuring is avoided entirely!
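
The renaming hazard can be reproduced in plain JavaScript, which is roughly the shape CoffeeScript emits for these forms. This is only a sketch: the outer x, the renamed key y, and the otherFunc stub are assumptions made up for illustration.

```javascript
// Hypothetical outer binding that happens to share the name `x`.
const x = { a: "outer a", b: "outer b" };

// Stub standing in for the issue's otherFunc.
const otherFunc = (parent, a, b) => ({ parent, a, b });

// Original version: the parameter `x` shadows the outer `x`.
const before = ({ x }) => {
  const { a, b } = x;
  return otherFunc(x, a, b);
};

// After renaming the key to `y` but forgetting the second statement,
// `x` silently resolves to the outer binding instead of the argument.
const after = ({ y }) => {
  const { a, b } = x;
  return otherFunc(x, a, b);
};

const input = { a: 1, b: 2 };
const beforeResult = before({ x: input });
const afterResult = after({ y: input });
console.log(beforeResult.a, afterResult.a);
```

Nothing fails when the key is renamed; the second call simply reads its fields from the wrong object.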

This syntactic limitation applies equally to all forms of destructuring assignment. The proposed solution frames the existing "destructuring" mechanics in terms of what I'm calling "place expressions", expounding on the second half of "destructuring assignment".

Proposal: @ to bind a name to a place

Abstractly:

  • Any expression you can destructure (place-expr) can also be given a name before applying the requested destructuring:
    • place-expr = <name>@<destructuring-expr> (except for object keys which use @:)
    • The choice of @ was motivated by the @ operator from rust, where let and match introduce similarly "destructurable" place expressions.
  • This staged destructuring desugars directly into a series of destructuring assignments.

Concretely:
This new syntax should be entirely backwards-compatible, and will show up in four places:

  1. top-level assignment: <name>@<destructuring-expr> = <value-expr>
  2. function arguments: (<name>@<destructuring-expr>) -> <value-expr>
  3. destructured object key: {<name>@: <destructuring-expr>, ...}
  4. destructured array element: [<name>@<destructuring-expr>, ...]

More concretely:

# 1. top-level assignment
x@[a, b] = f()
# desugars to:
# x = f()
# [a, b] = x
assert.deepEqual x[...2], [a, b]

# 2. function arguments
f = (x@{a}) ->
  assert.deepEqual x.a, a
# desugars to:
# f = (x) ->
#   {a} = x
#   assert.deepEqual x.a, a

# 3. destructured object key
{x@: [b, c]} = {x: [2, 3]}
# desugars to:
# {x} = {x: [2, 3]}
# [b, c] = x
assert.deepEqual x, [b, c]

# 4. destructured array element
[x@{y}, ...z] = [{y: 3}, 4]
# desugars to:
# [x, ...z] = [{y: 3}, 4]
# {y} = x
assert.deepEqual x.y, y
assert.deepEqual y, 3
assert.deepEqual z, [4]

These four separate places are actually all the same place: a place expression. The new syntax is backwards-compatible because it extends CoffeeScript's simple, elegant, and wholly underrated framework for binding names to values.
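
As a rough check of the four desugarings, here is the equivalent written directly in JavaScript. This is a sketch, not compiler output: the stub f and the numeric suffixes on names (x1, x3, ...) are assumptions added only so that all four cases fit in one file.

```javascript
// Stub standing in for the issue's f().
const f = () => [1, 2, 3];

// 1. x@[a, b] = f()
const x1 = f();
const [a1, b1] = x1;

// 2. (x@{a}) -> ...
const g = (x) => {
  const { a } = x;
  return x.a === a;
};

// 3. {x@: [b, c]} = {x: [2, 3]}
const { x: x3 } = { x: [2, 3] };
const [b3, c3] = x3;

// 4. [x@{y}, ...z] = [{y: 3}, 4]
const [x4, ...z4] = [{ y: 3 }, 4];
const { y: y4 } = x4;

console.log(x1.slice(0, 2), [a1, b1], g({ a: 5 }), x3, [b3, c3], y4, z4);
```

In each case the named binding comes first and the nested pattern is applied to it in a second step, which is the "staged" order the proposal describes.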

Prior Discussion

From searching issues, I have found multiple feature requests which I believe to be the result of this exact conundrum:

  1. Implicit kwargs object? #2475

I'm afraid I'm going to be a stick in the mud about cluttering up function signatures as usual. The explicit name of the argument with the destructing below looks best to me.

This is a pretty clear statement, but notably it was in response to vague mentions of haskell terms without a concrete proposal. The only concrete proposal was an analogy from livescript to use ({bar, baz}:kwargs), but it was not explained what that syntax actually meant or what equivalent js was generated.

In particular, this issue focused heavily on the immediate case of function arguments, which are much less flexible than other forms of destructuring in CoffeeScript because they need to conform to the limitations of javascript functions. The current proposal instead leverages an existing mechanism in CoffeeScript to directly address the awkwardness this issue tries to describe but fails to generalize.

  2. Add example of destructuring assignment in parameter list to documentation #2663 (comment)

    • as-patterns are mentioned again, and a haskell code snippet is provided, but there is no proposal for how to express this in CoffeeScript syntax.
  3. Allow to change variables names in destructuring assignements #2879

    • This one was unclear about the nature of the problem, and I'm not sure it's the same thing, but it did raise a new use case: binding the result of a function call. For example:
x = f()
# if someone later adds some more code in between,
# then a and b are not clearly linked to the result of f():
{a, b} = x
otherFunc x, a, b

Background: Place Expressions

This proposal attempts to differentiate itself by characterizing and expanding on the very powerful paradigms CoffeeScript has already developed. While the term "destructuring" has become popular to describe any sort of declarative syntax for extracting fields from a value, the full phrase is "destructuring assignment": and this second task is rarely given as much thought.

Comparison: "destructuring" in js

Let's see how javascript is getting along these days:

For both object and array destructuring, there are two kinds of destructuring patterns: binding pattern and assignment pattern, with slightly different syntaxes.

It's true that the surface syntax is slightly different, but it would be more appropriate to emphasize their wildly different semantics. This distinction is worth dwelling on.

Binding patterns: let and const

Their "binding" patterns are exposed via a top-level let or const statement, which imposes the mutability semantics of let or const onto every attempted name binding. Since reassigning a const binding is an error (a TypeError thrown when the assignment runs), if there is a mixture of mutable and immutable variables to extract, the programmer has to extract their data in sequential "destructuring" statements:

All variables share the same declaration, so if you want some variables to be re-assignable but others to be read-only, you may have to destructure twice — once with let, once with const.

The page then continues without any sense of shame:

const obj = { a: 1, b: { c: 2 } };
const { a } = obj; // a is constant
let {
  b: { c: d },
} = obj; // d is re-assignable

Of course, there's a much easier alternative:

const obj = { a: 1, b: { c: 2 } };
let { a, b: { c: d } } = obj;
// a is mutable now, but the code seems to run fine

Without "destructuring" at all:

const obj = { a: 1, b: { c: 2 } };
const a = obj.a; // a is constant
let d = obj.b.c; // d is re-assignable

The recommended "destructuring" approach takes more lines of code than the old-school version, because it artificially splits the description of the input data across sequential statements which must be executed in order. This tightly couples the structure of the input data to our internal representation. It would seem much more reasonable for the let/const syntax to apply to the right-hand side of the =!

Destructuring: it's not uniform syntax, it's consistent semantics!

javascript has many built-in control structures (especially loops) which will generate a var-like binding, or dereference one if it exists already, but do not have the exact same semantics as a var binding. So of course, the language added this "destructuring" syntax to every control structure:

In many other syntaxes where the language binds a variable for you, you can use a binding destructuring pattern. These include:

From one point of view, this sounds great! Instead of having to learn the pitfalls of all these implicit binding sites, we can just use destructuring! Indeed, this is exactly what CoffeeScript succeeds at! The language never "binds a variable for you" -- every binding is intentional, and has the exact same semantics because the compiler helps you succeed!

But in javascript, the binding destructuring syntax actually just adds complexity instead of simplifying it, because it has absolutely no bearing on the semantics of how those variables are declared or dereferenced: those mechanics still vary wildly, and you still have to learn the pitfalls of all these implicit binding sites! Importantly, abusing destructuring syntax like this without a consistent semantics only further obscures intent! The reason CoffeeScript destructuring works is because the compiler ensures a consistent semantics for variable declarations! Meanwhile, a loop variable written without a declaration in non-strict js silently pollutes the global namespace!
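
To make the point about loop variables concrete, here is a small sketch of how the same destructuring pattern in a for...of head behaves under three different declaration regimes. The variable names are made up; new Function is used because its body is non-strict unless it opts in, so the result does not depend on how the surrounding file is loaded.

```javascript
// No declaration, non-strict: the pattern is an assignment pattern,
// and assigning to undeclared names creates global properties.
const sloppyLoop = new Function(`
  for ([leakedKey, leakedValue] of Object.entries({ n: 1 })) {}
  return [typeof leakedKey, typeof leakedValue];
`);

// No declaration, strict: the same assignment throws instead.
const strictLoop = new Function(`
  "use strict";
  for ([strictKey, strictValue] of Object.entries({ n: 1 })) {}
`);

// With const: the pattern is a binding pattern, scoped to the loop.
const declaredLoop = new Function(`
  for (const [k, v] of Object.entries({ n: 1 })) {}
  return typeof k;
`);

const sloppyTypes = sloppyLoop();
const leakedToGlobal = "leakedKey" in globalThis;

let strictError = null;
try {
  strictLoop();
} catch (err) {
  strictError = err;
}

const declaredType = declaredLoop();
console.log(sloppyTypes, leakedToGlobal, strictError && strictError.name, declaredType);
```

The destructuring syntax is identical in all three loops; only the surrounding declaration (or its absence) decides where the names end up.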

Destructuring assignment: a self-documenting executable data model

One of the best features of destructuring assignment is how compactly it represents complex queries over unstructured data, by attaching a specific label to each component it extracts! Destructuring assignment attaches meaning to data in an executable specification. Consider the following hypothetical assignment:

{
  filePath: radiationLevels,
  convergence: {iterations: numDayNightCycles, finalResolution: numParticles},
} = executeSimulation()
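
For readers who want to run it, the same assignment can be sketched in JavaScript. The executeSimulation stub and its return values are assumptions invented for this example; the issue does not define them.

```javascript
// Hypothetical stub; the real function is not shown in the issue.
function executeSimulation() {
  return {
    filePath: "/tmp/radiation.csv",
    convergence: { iterations: 12, finalResolution: 4096 },
  };
}

// Each extracted component gets a label describing what it means to the caller.
const {
  filePath: radiationLevels,
  convergence: { iterations: numDayNightCycles, finalResolution: numParticles },
} = executeSimulation();

console.log(radiationLevels, numDayNightCycles, numParticles);
```

Note that convergence itself is not bound to any name here, only its fields are, which is the gap the @ proposal is aimed at.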

I think there's a lot more we could do with place expressions (start adding predicates, transformations, type annotations, dependency injection, ...). But I have been playing around with my own language for a while as a playground for that stuff, and I don't think CoffeeScript really needs much! The strong distinction between place and value expression (and corresponding effort on the compiler to achieve that) are exactly what makes it unique!

@cosmicexplorer
Contributor Author

Especially if we enforce that the @ is always directly attached to the right side of a symbol (i.e. x@, not x @), I think parsing this should be relatively simple, and I would be surprised if it is not fully backwards-compatible (but will amend the proposal should that be the case). I also have quite a few more thoughts about the incredibly shallow understanding of what "destructuring" means, or what it could be, that comes across in javascript's depiction of it.

@vendethiel
Collaborator

vendethiel commented Nov 5, 2024

FWIW, these are called as-patterns in the ML languages (OCaml, Haskell, etc.). (The name "places" is more often used to talk about assignables, like setfs in Common Lisp or addresses in Rust.)
The oldest issue I can think of is #363, but see also #2620, #2276, #1708, #1617, and #3462.

I'm going to pull the answer from these:

Closing this request as it’s not a priority at the moment. That said, if someone feels like implementing this as a PR that doesn’t introduce any breaking changes, it would be considered.

Thank you for your well-redacted issue nevertheless

@cosmicexplorer
Contributor Author

cosmicexplorer commented Nov 5, 2024

Thanks so much for the prompt response! ^_^ Part of what I redacted was a discussion of setf and pcase from emacs, but I do not think pcase is really feasible in CoffeeScript -- I was more trying to express my appreciation for how CoffeeScript already has such a strong concept of places. I also appreciate the references to prior issues & relevant jargon--was finding it difficult to search for this.

I will look at a PR! Thanks again!

@vendethiel
Collaborator

I do not think pcase is really feasible in CoffeeScript

Adding pattern matching without macros or language support does not seem possible, no.

was finding it difficult to search for this.

Even with the name, looking up "as patterns" in any search engine matches a lot of non-references.

@cosmicexplorer
Contributor Author

Adding pattern matching without macros or language support does not seem possible, no.

Sorry, just to be clear -- are you describing these as likely prerequisites for a potential path to that functionality in CoffeeScript, or are you saying macros and/or language support for pattern matching are likely out of scope for CoffeeScript? I was under the impression that either of those features were unlikely to be accepted in CoffeeScript, so I was trying to indicate that I would respect that in any further proposals/PRs. I focused on this proposal (gonna try to prototype it now) bc it avoids introducing any new semantics/control flow, and the codegen remains extremely simple, but I am definitely interested in researching further extensions.

My vaporware programming language is heavily inspired by CoffeeScript, and the biggest differences are in its more extensive pattern matching and pipelining control flow. I have been heavily inspired by R's magrittr pipelining approach, as well as R's NSE framework for stuff like macros but more introspectable and extensible. A transpiling language like CoffeeScript does seem amenable to macros, but I generally assume most languages are against that sort of thing (although it does mean users have to ask much more from language maintainers).

Notably, macros can also be used to implement an extensible compile-time type system (Racket has done incredible work on this). I think JSDoc generation is a fantastic way to interop with typescript communities, but I'm less confident about deeper typescript integration and would love for CoffeeScript to have a type system as flexible as CoffeeScript itself! But I don't know what that would look like at all right now (my vaporware language does not yet have macros for this reason).

Anyway, I'll focus for now on a PR for @ bindings and see if it's actually as unambiguous/backwards compatible as I think it is. "@ binding" is also hard to search for and not very descriptive--it's more like "staged destructuring" both as a concept and in implementation.

@vendethiel
Collaborator

Sorry, just to be clear -- are you describing these as likely prerequisites for a potential path to that functionality in CoffeeScript

More likely would be that someone PRs something that TC39 added in its pattern matching proposal.

or are you saying macros and/or language support for pattern matching are likely out of scope for CoffeeScript?

I think if TC39 settles on a pattern matching syntax, and someone PRs that with a syntax that doesn't break anything, it has a chance to be included.

I am however saying that macros are out of scope. You can take a look at #3171, or the BlackCoffee fork.

@cosmicexplorer
Contributor Author

Thanks so much!! :D :D really really appreciate taking the time to elaborate here, especially on how CoffeeScript makes decisions in general, and looking to TC39.

I am however saying that macros are out of scope.

This is what I expected ❤️ but wanted to make sure! Will totally check out those links.

Also, I have created a staged-destructuring branch for this proposal (see https://github.com/jashkenas/coffeescript/compare/main...cosmicexplorer:coffeescript:staged-destructuring?expand=1). I haven't tried to modify the parser yet, but was able to add the @lhs = yes field to IdentifierLiteral nodes in place context by extending the existing propagateLhs method, so there is now a tag on all nodes which are in place context (i.e. being assigned to or destructured). The next step is to create a new node for <IdentifierLiteral>@<destructuring-expr>. I think I will try to modify the lexer and parser first, to see if I'm correct that it's not a breaking change.
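
The idea of tagging every node in place context can be sketched on a toy AST. To be clear, this is not CoffeeScript's nodes.coffee or its propagateLhs method: the node shapes, the helper constructors, and the markPlace and tagAssignment functions below are all hypothetical and exist only to illustrate the traversal.

```javascript
// Toy node constructors (hypothetical, not CoffeeScript's real node classes).
const id = (name) => ({ type: "Identifier", name });
const arr = (...elements) => ({ type: "ArrayPattern", elements });
// properties is a list of [key, valueNode] pairs.
const obj = (...properties) => ({ type: "ObjectPattern", properties });
const assign = (target, value) => ({ type: "Assign", target, value });

// Recursively tag every node reachable from an assignment target.
function markPlace(node) {
  node.lhs = true;
  if (node.type === "ArrayPattern") node.elements.forEach(markPlace);
  if (node.type === "ObjectPattern") {
    node.properties.forEach(([, value]) => markPlace(value));
  }
  return node;
}

// Only the target side of an assignment is in place context.
function tagAssignment(node) {
  markPlace(node.target);
  return node;
}

// Toy tree for: [a, {k: b}] = rhs
const tree = tagAssignment(
  assign(arr(id("a"), obj(["k", id("b")])), id("rhs"))
);
console.log(JSON.stringify(tree, null, 2));
```

With a tag like this in place, a later pass can tell a name being bound from a name being read without re-deriving the context.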

@cosmicexplorer
Contributor Author

Ended up playing around a lot with the grammar, and identified some miscompiles in function param destructuring:

# Prior miscompiles (before this commit) from failing to sufficiently differentiate value vs place
# expressions in function params (different than LHS of assignment--cannot assign to existing
# values except this-properties e.g. `@x`).

; coffee -c -b -s --no-header <<EOF
({a: (x)}) -> x
EOF
# (function(arg) {
#   var arg, x;
#   x = arg.undefined;
# });

; coffee -c -b -s --no-header <<EOF
({["x"]}) -> x
EOF
# (function({["x"]: "x"}) {
#   return x;
# });

; coffee -c -b -s --no-header <<EOF
([x[1]]) -> x
EOF
# (function([x[1]]) {
#   return x;
# });

# This is an ICE!
; coffee -c -b -s --no-header <<EOF
({x: x[1]}) -> x
EOF
# TypeError: Cannot read properties of undefined (reading 'value')
#     at atParam (.../lib/coffeescript/nodes.js:6543:54)
# ...

These cases (which currently generate invalid js or ICE) are now recognized as invalid at parse time:

COMP=y AST_PATH='.body.expressions[0].body.expressions[0]' coffee semantics.coffee "$(echo -e '({a: (x)}) -> x')"
{
  text: undefined,
  expected: [ "'IDENTIFIER'", "'JSX_TAG'", "'['", "'@'", "'{'" ]
}
[stdin]:1:6: error: unexpected (
({a: (x)}) -> x

This stricter parsing mode should not modify any runtime semantics, but if we want backwards compatibility, it should probably not fail at coffeescript compile time.
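
JavaScript draws a similar line between place positions that may target existing values (assignment patterns) and positions that may only introduce new names (parameters), which may help explain why the generated outputs above are invalid. In this sketch the helper name parses is made up, and new Function is used only to ask the engine whether a snippet is syntactically valid.

```javascript
// Hypothetical helper: true when the engine accepts `source` as a function body.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// Assignment patterns may target existing places,
// including parenthesized names and member expressions.
const parenInAssignment = parses("let x; ({a: (x)} = {a: 1});");
const memberInAssignment = parses("const x = []; [x[1]] = [5];");

// Parameter (binding) patterns may only introduce new names.
const parenInParam = parses("(({a: (x)}) => x);");
const memberInParam = parses("(([x[1]]) => x);");
const literalInParam = parses('(({["x"]: "x"}) => x);');

console.log({
  parenInAssignment,
  memberInAssignment,
  parenInParam,
  memberInParam,
  literalInParam,
});
```

The last two parameter forms mirror the JavaScript that the compiler emitted for the miscompiled inputs, so rejecting those inputs at CoffeeScript parse time only surfaces an error that the JavaScript engine would have raised anyway.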

I am going to split these changes to the parser (with fixes for the miscompiles I found) out into a separate branch, and will propose a PR for that before looking at extending any semantics with @ or the like.

@cosmicexplorer
Contributor Author

Have been making a lot of progress on harnesses for testing changes, and optimized jison quite a bit. See #5473. I think the semantics.coffee debugging tool might be a useful contribution at some point as well.

@cosmicexplorer
Contributor Author

I created #5474 for a necessary prerequisite to this work.

@cosmicexplorer
Contributor Author

cosmicexplorer commented Nov 24, 2024

@aurium mentioned "as" here, and I think it's a much better choice than @ for this staged destructuring proposal: #5307 (comment).
