Destructured Assignment #3897

Closed
marler8997 opened this issue Dec 12, 2019 · 19 comments
Labels
proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
@marler8997
Contributor

marler8997 commented Dec 12, 2019

Destructured Assignment

EDIT: proposal modified to use a syntax based on suggestion from @MasterQ32

I'm exploring a new "variable declaration syntax" that requires the variable name match the field name of a struct. For now I'm calling it "Destructured Assignment". For example:

Normal assignment:

const TypeInfo = @import("builtin").TypeInfo;

Destructured Assignment

const (TypeInfo) = @import("builtin");

In this case the parentheses ( ) surrounding the variable name enable "destructured assignment", which assigns the variable the value of the field with the same name on the right-hand side of the assignment.

To generalize, normal assignment from a struct field is of the form const V = S.F; and destructured assignment is of the form const (V) = S;. The two forms have a subtle difference in semantics. In normal assignment, a new variable V is being declared whose name is unrelated to the field name F. In destructured assignment, the compiler verifies the new variable name V matches the name of a field in S. It's a subtle difference but supporting it can prevent some types of errors and enable other features as I'll explain below. The following example shows a potential error that is not possible with destructured assignment:

// potential error, PackedSliceEndian != PackedIntSliceEndian
pub const PackedSliceEndian = @import("std").packed_int_array.PackedIntSliceEndian;

// if the previous was an error, it's not possible with destructured assignment
pub const (PackedSliceEndian) = @import("std").packed_int_array;

Multi-Variable Declaration

Because this feature requires each variable name to match a field name of the struct, it provides a one-to-one mapping from variable to field, which enables multiple variables to be declared at once.

// normal assignment
const V1 = S.F1;
const V2 = S.F2;
const V3 = S.F3;
const V4 = S.F4;

// multiple destructured assignment
const (V1, V2, V3, V4) = S;

Note that these 2 types of assignment are not expressing the same intention. The destructured assignment explicitly requires that V1 through V4 are valid field names in S but the normal assignment does not. Also note that you could forgo the multi-variable part and just use single-variable destructured assignment to get this intention:

const (V1) = S;
const (V2) = S;
const (V3) = S;
const (V4) = S;

Here are some examples multi-variable assignment would enable:

Multi-value assignment from labeled blocks:

// note that destructured assignment guarantees that these variable names
// will match the names of the fields in the break statement
const (a, b, c) = init: {
    break :init {.a=1, .b=2, .c=3};
};

In this example, if we changed the break statement to break :init {.a=1, .b=2, .d=3}, then we would get an error indicating that the struct does not have a field named c. Note that this also works with functions:

var (width, height) = getSize();
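
For comparison, here is a minimal sketch of how both cases can be written without the proposed syntax, by naming the intermediate struct and copying each field by hand. The `Size` and `Triple` types, their field types, and the returned values are assumptions made for illustration (the proposal does not define them), and the code assumes a recent Zig release.

```zig
const std = @import("std");

// Hypothetical types; the proposal only shows the field names.
const Size = struct { width: u32, height: u32 };
const Triple = struct { a: u8, b: u8, c: u8 };

// Hypothetical implementation that returns fixed values.
fn getSize() Size {
    return .{ .width = 640, .height = 480 };
}

pub fn main() void {
    // Function case: the temporary `size` is what the proposal would remove.
    const size = getSize();
    const width = size.width;
    const height = size.height;

    // Labeled-block case: the block result also needs a temporary name.
    const triple = init: {
        break :init Triple{ .a = 1, .b = 2, .c = 3 };
    };
    const a = triple.a;
    const b = triple.b;
    const c = triple.c;

    std.debug.print("{d}x{d} {d} {d} {d}\n", .{ width, height, a, b, c });
}
```

Nothing in this version ties a local name to a field name: writing `const width = size.height;` by mistake would still compile, which is the class of error the proposal aims to rule out.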

Multiple module import declarations:

const std = @import("std");

// normal assignment
const c = std.c;
const debug = std.debug;
const os = std.os;
const fs = std.fs;
const mem = std.mem;

// destructured assignment
const (c, debug, os, fs, mem) = std;

To see a real-world example, std.zig exposes some types from other modules. This change would ensure that the exposed names always match the internal names. It would look something like this:

pub const (AlignedArrayList,
           ArrayList) = @import("array_list.zig");
pub const (AutoHashMap,
           HashMap,
           StringHashMap) = @import("hash_map.zig");
pub const (BloomFilter) = @import("bloom_filter.zig");
pub const (BufMap) = @import("buf_map.zig");
pub const (BufSet) = @import("buf_set.zig");
pub const (Buffer) = @import("buffer.zig");
pub const (BufferOutStream) = @import("io.zig");
pub const (ChildProcess) = @import("child_process.zig");
pub const (DynLib) = @import("dynamic_library.zig");
pub const (Mutex) = @import("mutex.zig");
pub const (PackedIntArray,
           PackedIntArrayEndian,
           PackedIntSlice,
           PackedIntSliceEndian) = @import("packed_int_array.zig");
pub const (PriorityQueue) = @import("priority_queue.zig");
pub const (Progress) = @import("progress.zig");
pub const (ResetEvent) = @import("reset_event.zig");
pub const (SegmentedList) = @import("segmented_list.zig");
pub const (SinglyLinkedList,
           TailQueue) = @import("linked_list.zig");
pub const (SpinLock) = @import("spinlock.zig");
pub const (Target) = @import("target.zig");
pub const (Thread) = @import("thread.zig");

Here is the grammar change required to support this:

- VarDecl <- (KEYWORD_const / KEYWORD_var) IDENTIFIER (COLON TypeExpr)? ByteAlign? LinkSection? (EQUAL Expr)? SEMICOLON
+ VarDecl <- (KEYWORD_const / KEYWORD_var) Identifiers (COLON TypeExpr)? ByteAlign? LinkSection? (EQUAL Expr)? SEMICOLON
+ Identifiers <- IDENTIFIER / LPAREN IDENTIFIER (COMMA IDENTIFIER)* RPAREN
@ikskuh
Contributor

ikskuh commented Dec 13, 2019

This looks pretty much like a selective usingnamespace which does not import all symbols but a selected list.
To be honest, i don't like the syntax at all as it is not obvious what it does nor is it common syntax. It would be better to have either a more common destructuring syntax like this:

const (A, B) = c; // requires c to _only_ have A and B
const (A, B, _) = c; // requires c to have A and B

This would work for both static members via type name and fields via struct instances.

Or we could improve/extend usingnamespace to allow selective import:

usingnamespace @import("std") with Mutex, Thread; // would only import std.Mutex and std.Thread

This would be my preferred way as it would allow keeping a single assignment syntax

andrewrk added the proposal label Dec 13, 2019
andrewrk added this to the 0.7.0 milestone Dec 13, 2019
@marler8997
Contributor Author

@MasterQ32 that sounds like a reasonable variation. It means that you can see that it's a destructured assignment earlier on in the declaration and is more consistent with other languages like Python.

const a b ..=.. c;

VS

const (a, b) = c;

I'm going to edit the proposal description with your syntax variation.

@marler8997
Contributor Author

marler8997 commented Dec 13, 2019

The usingnamespace variation makes sense for imports, but doesn't work well with other use cases such as destructuring expressions like labeled blocks or function calls.

var (width, height) = getSize();
// VS
usingnamespace getSize() with width, height;
const (a, b, c) = init: {
    break :init {.a=1, .b=2, .c=3};
};
// VS
usingnamespace init: {
    break :init {.a=1, .b=2, .c=3};
} with a, b, c;

I'm also not sure how you would declare the new variables as const or var. Also, the common pattern seems to be that new symbols usually appear first in the declaration (function symbols will change to this as well.) rather than in the middle or at the end.

@mogud
Contributor

mogud commented Dec 13, 2019

So with this, usingnamespace can be replaced by something like:

// 1
const (...) = @import("std.debug");

// 2
const (*) = @import("std.debug");

@ikskuh
Contributor

ikskuh commented Dec 13, 2019

This would be my preferred way as it would allow keeping a single assignment syntax

I want to explain this a bit further: I am against destructuring of structure values. Why?

Your example only proposes destructuring on assignment, but if done right, it should be allowed on every access to variables. This means: if, for, while will require destructuring as well:

const (a, b) = c;
var (a,b) = c;
while(c) |(a,b)| { }
while(c) |*(a,b)| { } // does this even make sense?!
for(c) |(a,b)| { }
for(c) |*(a,b)| { }
for(c) |(a,b), i| { }
for(c) |*(a,b), i| { }
if(c) |(a,b)| { }
if(c) |*(a,b)| { }
…

There's a whole lot of new options that must be taken into account, and this will make the language a lot more complicated. It goes against "Only one obvious way to do things." and even more against "Reduce the amount one must remember."

Or you disallow the destructuring of mutable variables, which will make it a special case only available for const values, which is also not that great.

tl;dr: I don't like adding destructuring of values as it adds a lot of stuff to the language. Having a selective alternative to usingnamespace would be a nice thing

@marler8997
Contributor Author

marler8997 commented Dec 13, 2019

@MasterQ32 thanks for your feedback. I agree there's a lot to analyze in how this interacts with the rest of the language and there may be alternatives that are simpler or more complex. We should be sure to explore all of these. In my initial exploration, I found that only supporting this in "variable declaration" makes it simple (a new grammar rule to support (ID, ID, ID...) in VarDecl) and I think covers most of the use cases for it.

You said "if it is done right it should be allowed on every access to variable". Then you went on to describe how the "right" way of doing this is overly complicated :) You may have said it was the "right" way but I think you showed that it actually isn't. I intentionally limited this proposal to variable declaration because it is simple but still covers most use cases I could think of. Every language feature has a balance and it's important to find the right balance between simplicity and flexibility.

The questions we should be asking are what use cases each variation of a new feature enables and how much each variation complicates the language. Note that this proposal has 2 parts as well.

  1. Support destructuring one field from a struct
  2. Support destructuring multiple fields from a struct.

Only supporting one field is less complicated, but supporting multiple provides some nice "functional" language benefits because it allows you to create multiple variables from a single expression without having to give that expression a name. For example, supporting multiple variables allows you to return multiple values from a labeled block but only supporting one does not. However it also makes it more complicated, so how do we find the right balance? I do this by trying to identify real-world use cases and trying to think of all the ways this can interact with other parts of the language. Community feedback and group effort is also critical here, someone always thinks of something that no one else has thought of. I think you showed that a feature like this needs to be designed carefully, or it risks creating an explosion of complication.

@marler8997
Contributor Author

marler8997 commented Dec 13, 2019

So with this, usingnamespace can be replaced by something like:

// 1
const (...) = @import("std.debug");

// 2
const (*) = @import("std.debug");

Syntactically that looks quite nice. However, I think usingnamespace is intentionally verbose because instances of it should be easy to find (i.e. via grep) and they should "stand out" when you're looking at code. They introduce new variables into your scope that have not been explicitly typed out, so it's important to be able to know when this happens and easily find it.

In fact, one of the benefits of this feature is that it will encourage people to use usingnamespace less often because now there is a convenient alternative. Instead of:

usingnamespace @import("foo.zig");

I would hope to see more of:

const (a, b, c, d, e, f) = @import("foo.zig");

However, today you would have to do this to avoid usingnamespace:

const a = @import("foo.zig").a;
const b = @import("foo.zig").b;
const c = @import("foo.zig").c;
const d = @import("foo.zig").d;
const e = @import("foo.zig").e;
const f = @import("foo.zig").f;

@ikskuh
Contributor

ikskuh commented Dec 13, 2019

You may have said it was the "right" way but I think you showed that it actually isn't.

That was probably the whole idea behind this. The "right" was meant in: "Why does it work at this place but not in any other? I like the feature in this place and want to use it everywhere!"

Note: All following statements are only on value destructuring as i think that usingnamespace should be expanded for selective namespace imports instead of expanding the assignment syntax. But if assignment syntax is expanded, usingnamespace is probably obsolete.

For real-world use cases for value destructuring, I can quote some of my C++ code:

// Iteration over entity-component pairs in a ECS
for(auto [ entity, component ] : xcs::get_components<NameComponent>(universe))
{
    std::cout << "entity " << entity.id << " is named '" << component.name << "'";
}

This would be possible in Zig if we'd allow destructuring everywhere.

For other real-world examples, SDL_CreateWindowAndRenderer would be a thing:

const (window, renderer) = try SDL.createWindowAndRenderer(…);

On the two parts:

  1. Support destructuring one field from a struct

I personally don't think this is an existing use case for value destructuring as i cannot imagine a real world use case for this. If a function returns a struct instead of a single primitive value, it has a reason to do so. Otherwise i could just change the return type if not all of the results are relevant.

  2. Support destructuring multiple fields from a struct.

As shown above, there are use cases where multi-value return types are quite useful to keep the code from being filled with clutter, so it may increase the readability of the code.

@marler8997
Contributor Author

marler8997 commented Dec 13, 2019

Note: All following statements are only on value destructuring as i think that usingnamespace should be expanded for selective namespace imports instead of expanding the assignment syntax. But if assignment syntax is expanded, usingnamespace is probably obsolete.

I think usingnamespace still has a place even with destructured assignment. You use it when you don't care about polluting your scope with undeclared symbols (you shouldn't use this often). This is a valid use case when you just want to forward all the symbols from one place to another. However, usingnamespace is easy to abuse because people don't want to type out a bunch of declarations assigning fields from the same structure.

usingnamespace @import("foo.zig");
// OR
const a = @import("foo.zig").a;
const b = @import("foo.zig").b;
const c = @import("foo.zig").c;
const d = @import("foo.zig").d;
const e = @import("foo.zig").e;
const f = @import("foo.zig").f;

By making it easier to assign multiple values from a struct, hopefully people will do this more often:

const (a, b, c, d, e, f) = @import("foo.zig");

Another thought I have on supporting "selective imports" with usingnamespace is that it dilutes the meaning of usingnamespace. Since usingnamespace shouldn't be used often and is easy to abuse, you want to be able to easily find where it's being used. However, if you start supporting "selective imports", now you're going to see usage all over the place and it's not easy to see which ones are importing selectively and which are importing everything. If we decide not to support destructured assignment but only to support selective imports instead, I would suggest another keyword or some way to easily differentiate the two so as not to obfuscate when people are mixing entire scopes together without limiting it to explicit symbols. That being said, this isn't a problem if we leave usingnamespace as it is and add support for destructured assignment.

Support destructuring one field from a struct
I personally don't think this is an existing use case for value destructuring as i cannot imagine a real world use case for this. If a function returns a struct instead of a single primitive value, it has a reason to do so. Otherwise i could just change the return type if not all of the results are relevant.

The first half of the description is only about destructuring one field. The main use case is that you want to pull in a symbol from a module, and your intent is that the local name matches the name in the module. Although, I can't really think of a lot of use cases outside of that for single destructured assignment.

@emekoi
Contributor

emekoi commented Dec 13, 2019

so selective imports are going to work on both fields and top level declarations?

@ghost

ghost commented Dec 14, 2019

There is already an accepted proposal for destructured assignment of tuples (not decls): #498

@nektro
Contributor

nektro commented Jul 10, 2020

Would prefer this greatly over usingnamespace as it has use cases reaching far beyond just imports.

@Mouvedia

Mouvedia commented Sep 20, 2020

// C#
var (foo, _, bar) = qux;

// Python
foo, _, bar = qux
head, *tail = [1, 2, 3, 4, 5]

// javascript
const { fName: firstName, lastName } = { fName: 'John', lastName: 'Doe' };
const [head, ...tail] = [1, 2, 3, 4, 5];

// Ruby
head, *tail = [1, 2, 3, 4, 5]

I'd want to have at least 3 features:

  • rename (e.g. if the library's API sucks, or to avoid conflicts)
  • ignore/discard (_)
  • rest

andrewrk modified the milestones: 0.7.0, 0.8.0 Oct 27, 2020
@marler8997
Contributor Author

marler8997 commented Dec 4, 2020

I recently attended a specs meeting where I learned more about Andrew's perspective on new language features. He described that some languages (like the D language) will accept features if they seem like a good idea/improvement and have a low cost to implement. However, he says Zig has a much higher bar. Namely, before accepting a feature he says to ask:

  • Do we need this feature (how ugly is the solution without the proposed feature)

The next question I wanted to gauge is how ugly the alternative solution can be and still not be enough to warrant a language feature. I brought up the std.ArrayList example where we currently have 2 versions of ArrayList that only differ in where the allocator is stored (field or function parameter). All the code in their method bodies is duplicated because having one version of ArrayList call the other resulted in a problem that couldn't be solved well with current language features. Some features have been proposed to fix this, however, he feels the ugliness of code duplication (at least in this case) isn't enough to warrant new language features.

Now that I have a better gauge on how high the bar is for new features, I am certain that this proposal does not meet that bar. The alternative solution to this proposal is much less ugly than the std.ArrayList example. For that reason I'm certain it won't be accepted so I'll be closing it to save others' time, and hopefully, this explanation helps others gauge the viability of future proposals and helps converge the community's perspective on what Zig is trying to be.

@Mouvedia

Mouvedia commented Dec 4, 2020

@marler8997 I understand that you opened the issue but there are 15 👍 on it currently.
That puts it in the top 15 for issues with the most 👍

This is extremely useful/convenient and an almost ubiquitous feature in programming languages, nowadays.
I'd prefer if you would reopen it; IMHO when dealing with very popular requests only core members should be allowed to close issues.

@nektro
Contributor

nektro commented Dec 4, 2020

I am certain that this proposal does not meet that bar.

@marler8997 I'm going to have to respectfully disagree here. I would argue Zig code is in fact significantly more ugly without this proposal. Additionally, given the amount of positive +1's on this issue I find it very odd you chose to close this issue prematurely.

@andrewrk
Member

andrewrk commented Dec 4, 2020

Note that we have an accepted proposal for destructured assignment: #498

I support what @marler8997 said in #3897 (comment) and to further clarify:

This proposal is reasonable and solves a problem. However, as pointed out, language features are added in Zig only if they are necessary to avoid footguns when solving real world use cases. The null hypothesis for any addition to the language is "no" and then it is an uphill battle to get it added.

It's hard to make a language that is small and simple. It requires saying "no" to a lot of reasonable ideas.

@Mouvedia

Mouvedia commented Dec 4, 2020

@andrewrk could you give us 3 example snippets for the 3 use cases (rename, discard, rest) using the #498 proposal?
Also how would you achieve the usingnamespace trick with your proposal?

I am asking because I am seeing stuff I don't know about.
e.g. -> %DivResult {

@nektro
Contributor

nektro commented Dec 4, 2020

Note that we have an accepted proposal for destructured assignment: #498

I really hope that doesn't get added.


7 participants