Skip to content

Support splats on left-hand sides of multiple assignment expressions#10410

Merged
straight-shoota merged 10 commits intocrystal-lang:masterfrom
HertzDevil:feature/multi-assign-lhs-splat
Nov 10, 2021
Merged

Support splats on left-hand sides of multiple assignment expressions#10410
straight-shoota merged 10 commits intocrystal-lang:masterfrom
HertzDevil:feature/multi-assign-lhs-splat

Conversation

@HertzDevil
Copy link
Contributor

Implements #132 by supporting one level of decomposition on the left-hand sides of multiple assignment expressions. The first case is when the right-hand side consists of a single expression:

a, b, *c, d, e, f = exp

# the above is equivalent to:
__temp_1 = exp
if __temp_1.size < 5 # number of non-splat variables on the LHS
  ::raise("BUG: multiple assignment count mismatch")
end
a = __temp_1[0]
b = __temp_1[1]
c = __temp_1[2..-4]
d = __temp_1[-3]
e = __temp_1[-2]
f = __temp_1[-1]

__temp_1 must respond to #size, #[](Int), and #[](Range). This includes Array, BitArray, Slice, String, Tuple, and to some extent Regex::MatchData. The main obstacle was getting Tuple to work here without allocating an Array, but now that #10379 is merged, the following is now possible:

a, *b, c = {1, true, 'a', ""}
typeof(b) # => Tuple(Bool, Char)
# without #10379: probably Array(Bool | Char | Int32 | String)

The second case is when there are multiple expressions on the RHS:

a, b, *c, d, e, f = exp1, exp2, ..., exp_last2, exp_last1

# the above is equivalent to:
a = exp1
b = exp2
*c = ::Tuple.new(exp3, ..., exp_last4) # empty tuple if there are 5 expressions
d = exp_last3
e = exp_last2
f = exp_last1

No temporaries are created, and it is a compile-time error to specify more non-splat assignment targets than RHS expressions (the shortest code that could trigger this is a, b, *c, d = 1, 2).

As with the splat-less case, each assignment target, including the splat target, can be a variable, a setter call, a []= call, or the underscore. The underscore can be used to omit some of the RHS values, since writing to it has no effect:

_, a, *_, b = exp # Ruby uses `*` as the underscore is a regular variable

# the above is equivalent to:
__temp_1 = exp
if __temp_1.size < 3
  ::raise("BUG: multiple assignment count mismatch")
end
a = __temp_1[1]
b = __temp_1[-1]

Like other places that perform splat collection, using more than one splat on the LHS is an error.

@asterite
Copy link
Member

asterite commented Feb 18, 2021

What does the community think of us having something like https://rubyheroes.com/ but for Crystal? I might have a nominee or two in my head...

@bcardiff bcardiff added this to the post-1.0 milestone Feb 18, 2021
@HertzDevil
Copy link
Contributor Author

Thinking that if the splat variable happens to be the underscore then we should omit that assignment altogether to avoid creating a temporary copy, since that range index can be fairly large.

@oprypin
Copy link
Member

oprypin commented Feb 19, 2021

If this PR is merged, it opens an easy migration path from the following:

a, b = [1, 2, 3, 4, 5]  # Silently ignores the last 3 items

to

a, b, *_ = [1, 2, 3, 4, 5]  # Exactly the same but explicit

-- this is good syntax to use if anyone is actually relying on the ignoring behavior, rather than having it by accident.

What I'm leading at, and the reason why there would need to be a "migration path" for anything, is that this silent ignoring behavior is very unexpected in my opinion, and should be replaced with an error about size mismatch (one like you see at the top of this PR). That is, a size check should be added to it.

You can say what you want about Ruby also ignoring this,... but somehow the error conditions in the new situations that this PR introduces make sense, don't they? Well, Ruby also silently ignores all of those cases, even up to a, b, *c, d = [1], and we're not copying that.

Additionally, this pull request literally makes those two syntaxes (see code blocks above) aliases of each other.
Particularly weird to me is the asymmetry of it. At face value, why should a, b = [1, 2] be an alias to a, b, *_ and not, say, *_, a, b??

Also I don't know any other programming language that ignores a, b = {1, 2, 3} but errors on a, b, c = {1, 2}. Ruby and JavaScript ignore both; Python and C++ error both.

Instead of gaining aliases I propose to use the pre-existing a, b = [1, 2] syntax for something better -- something I miss all the time. I almost always want the reassurance that I'm not ignoring extra items, and I don't want to write out the size check manually (it could even get outdated). Laziness here can definitely cause silent bugs.

With this in mind, it's sad to see the [post-1.0] label attached. Without this migration path we can't easily deprecate the error silencing, and after 1.0 we can't deprecate it at all.

@asterite
Copy link
Member

I agree with @oprypin : it would be nice if a, b = [1, 2, 3] would be a runtime error. That's a breaking change, but it will be impossible to do after 1.0 without breaking things.

I would like to think that if someone uses a, b = exp that's because they expect exp to have exactly two elements. And if they didn't expect that, the migration path is really straight-forward: just change it to a, b, * = exp.

@asterite
Copy link
Member

I'm thinking that if we really wanted to avoid a breaking change in some way, we could have a flag that says that a, b = ... should behave the old way, but let the default be the new, safer way. That means that if someone had Crystal code working on 0.36.1 and they upgrade to 1.0 and it breaks because of this, they can use that flag to get the old behavior. And they can do the upgrade at their time while still having their code working on 1.0.

Though maybe it's a lot of work...

@HertzDevil
Copy link
Contributor Author

The other day I thought about this and believe that maybe Crystal doesn't need 1-to-n multiple assignment after all if it leads to confusion between Ruby's lax multiple assignment rules (dropping elements at the end, including with block parameters) and pattern matching semantics (count mismatches, with or without LHS splats). Maybe it's time we have general pattern matching...

(We could still define 1-to-n multiple assignment in terms of some pattern matching construct afterwards.)

@HertzDevil HertzDevil marked this pull request as draft February 20, 2021 06:28
@oprypin
Copy link
Member

oprypin commented Feb 20, 2021

@HertzDevil I don't see how that comment is constructive here. Maybe you changed your mind at a whim, but there's nothing wrong with this pull request in that light. Let's not match Ruby, let's match Python 🙃

if it leads to confusion between Ruby's lax multiple assignment rules (dropping elements at the end, including with block parameters)

I have already outlined in my comment exactly how to remove the confusion.

@oprypin oprypin removed this from the 2.0.0 milestone Oct 28, 2021
@oprypin
Copy link
Member

oprypin commented Oct 28, 2021

@HertzDevil Is this pull request in a reviewable state? Or, why is it a draft?
I was thinking to get the ball rolling on this pull request first.

The rest I said in #11145 (comment)

@HertzDevil HertzDevil marked this pull request as ready for review October 29, 2021 10:02
@oprypin oprypin self-requested a review October 29, 2021 23:50
Copy link
Member

@oprypin oprypin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, that's quite a lot of CI failures. And the spec failures reproduce locally. bin/crystal spec/compiler/codegen/multi_assign_spec.cr

@oprypin
Copy link
Member

oprypin commented Nov 2, 2021

Thank you @HertzDevil !

@HertzDevil HertzDevil marked this pull request as ready for review November 4, 2021 05:52
@HertzDevil
Copy link
Contributor Author

HertzDevil commented Nov 4, 2021

Just to make sure again, should we require the value on the right-hand side of 1-to-n assignments to be Indexable yet? Do we do anything to these cases?

value = {0 => "x", 1 => "y", 2 => "z"}

# existing 1-to-n assignments (in a separate PR)
a, b, c = value

# 1-to-n splat assignments
# ill-formed because of `b = value[1..-2]`
a, *b, c = value

# 1-to-n splat assignments
# raises because of `b = value[-2]` and `c = value[-1]`
*_, b, c = value

# okay???
a, *b, c = Hash(Int32 | Range(Int32, Int32), String).new { |hsh, key| hsh[key] = 0 }

Related: #10937 (comment)

@straight-shoota
Copy link
Member

I'd say a restriction to Indexable would be good. The first case happens to work by chance, but I don't think it's an adequate use case. Multi-assign destruction is based on a collection with sequential items (that's expressed as Indexable). The fact that the same API for accessing this data is shared by other data types (such as Hash) is rather a coincidence. IMO it makes sense to require Indexable. If you want to use a hash as datasource like in the examples, you can just wrap it in a custom Indexable (which would just delegate to the hash).

@straight-shoota
Copy link
Member

straight-shoota commented Nov 4, 2021

We could consider adding .as(Indexable) instead of a dedicated compiler type check. That still adds a dependency on the Indexable type to the compiler, but it's not a built-in type.

@straight-shoota
Copy link
Member

This is probably a topic for a follow-up discussion, though. I'd prefer to merge this as is. There was no type restriction prior to this change. We can discuss and add that afterwards (just need to make sure it lands before 1.3 if it's a breaking change on newly-introduced behaviour).

@straight-shoota
Copy link
Member

I extracted the discussion about Indexable restriction for values to #11414 and added that to the 1.3.0 milestone to keep track of it.

@straight-shoota straight-shoota merged commit 9e52bc9 into crystal-lang:master Nov 10, 2021
@HertzDevil HertzDevil deleted the feature/multi-assign-lhs-splat branch November 14, 2021 14:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants