destructuring via new useall keyword
#21879
Comments
If the second line holds, then how would this help?
In the places I've wanted struct destructuring, I have not wanted to do it to multiple instances of the same type, and this is the exact form I'd want. The practice in general has to be done very carefully so you don't break #7769 assumptions. |
I know it's probably never going to happen, but sometimes it would be nice to opt in to having destructors. I think destructors would handle most use cases of this better. One could imagine a different container type that calls destructors on fields that are also of that container type:

const Bar = object {
array: std.ArrayList(u8),
// I don't know what the syntax for defining a destructor would look like
fn ~(this: *Bar) {
this.array.deinit();
}
};
const Foo = object {
my_destructable_type: Bar,
};
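For contrast, a minimal sketch of the status quo this would improve on, where cleanup is an ordinary method that each containing type must remember to forward to by hand (names here are illustrative):

const std = @import("std");

const Bar = struct {
    array: std.ArrayList(u8),
    pub fn deinit(self: *Bar) void {
        self.array.deinit();
    }
};

const Foo = struct {
    my_destructable_type: Bar,
    // Forgetting to add a line here when a field is added is
    // exactly the class of bug this thread is about.
    pub fn deinit(self: *Foo) void {
        self.my_destructable_type.deinit();
    }
};
|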
This pattern additionally doesn't map well to helping there. I'd be very curious to see more evidence of places in existing codebases where having this keyword would have prevented a bug / memory leak.

edit: to answer my own question, the example that comes to mind the most is one where it would've been really nice to make this diff:

- const f = x.to_float();
+ const r, const g, const b, const a = x.to_float(); |
The way I read the proposal, it's not about deiniting. The first section continues with: "I'm writing serialization/parser code and for example this feature would help me guarantee I don't forget to add appropriate code whenever I add a field to my structs." In the parsing code you can already get some of this help today by not giving the field a default value.
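A minimal sketch of that existing technique (types and field names are illustrative): a field with no default value must be set at every initialization site, so adding such a field breaks stale sites at compile time.

const Packet = struct {
    id: u32,
    // no default: every init site must set it
    checksum: u16,
};

fn parse() Packet {
    // compile error: 'checksum' has no default and is not set here
    return .{ .id = 42 };
}
|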
As someone who doesn't use any IDE, not even editor tab-complete, I always appreciate syntax that's as terse as possible. So I'd like to suggest a slightly different syntax, and perhaps a more restrictive use case as well.

const Foo = struct {
a: u8,
b: u8,
};
fn foo() Foo {
return .{ .a = 1, .b = 2 };
}
test Foo {
useall f = foo();
var foo_a = f.a;
const foo_b = f.b;
foo_a += 1;
// Where any additional attempt to use f might also be a compile error
const invalid_result = f.a + foo_b;
// error: f.a was assigned to foo_a
// note: foo_a declared here [line number]
const valid_use = foo_a + foo_b; // 4
// f can't be modified or accessed at all following the first use of each field.
const new_a = f.a; // error: f.a was already assigned
// note: f.a assigned to foo_a on [line number]
}
// Extended example as requested.
const Bar = struct {
a: u8 = 1,
b: u8 = 2,
// c was just added, and the following test results in a compile error
c: u8 = 3,
};
test Bar {
useall b = Bar{};
// Duplicating the same unfortunate bug
var bar_a = b.b;
var bar_b = b.a;
// error: [b.c] is unused
}
test "Bar fixed" {
useall b = Bar{};
// Duplicating the same unfortunate bug
var bar_a = b.b;
var bar_b = b.a;
// We don't actually need to use b.c in this code, so we can suppress the error
_ = b.c;
bar_a += 10;
bar_b += 20;
std.debug.print("{}", .{bar_a}); // 12 [note the bug/typo above]
std.debug.print("{}", .{bar_b}); // 21 [note the bug/typo above]
// The following lines would each generate a compile error
std.debug.print("{}", .{b.a}); // would print 1 but for the compile error
std.debug.print("{}", .{b.b}); // would print 2 but for the compile error
std.debug.print("{}", .{b.c}); // would print 3 but for the compile error
}

You could then apply this same typing system to a function definition as well:

fn needsFoo(power: u8, foo: useall Foo) u8 {
var fa = foo.a;
// error: variable foo.b is unused.
fa *|= power;
return fa;
}
fn fooOptional(power: u8, foo: useall Foo) u8 {
var fa = foo.a;
// we don't really need foo.b here, but my linter won't let me omit useall Foo
_ = foo.b;
fa *|= power;
return fa;
}

edit number... I've lost count (sorry everyone else 😅). Duplicating the original example would be:

fn add(options: useall struct { a: i32, b: i32 }) i32 {
return options.a;
// this would be a compile error because options.b is unused
} |
A compile error can be emitted on field addition using the following method in current Zig (latest nightly). This code:

const std = @import("std");
test "use all struct fields, give me a compile error when a struct field is added" {
const MyStruct = struct {
field1: []const u8 = "this is field1",
field2: []const u8 = "this is field2",
field3: []const u8 = "this is field3",
};
const MyStructFieldEnum = std.meta.FieldEnum(MyStruct);
const my_struct = MyStruct{};
inline for (std.meta.fields(MyStruct)) |struct_field| {
const current_field = std.enums.nameCast(MyStructFieldEnum, struct_field.name);
const current_value = @field(my_struct, struct_field.name);
switch (current_field) {
.field1 => std.debug.print("{s}\n", .{current_value}),
.field2 => std.debug.print("{s}\n", .{current_value}),
.field3 => std.debug.print("{s}\n", .{current_value}),
}
}
}

Prints:

this is field1
this is field2
this is field3
And when I add a new field:

const MyStruct = struct {
field1: []const u8 = "this is field1",
field2: []const u8 = "this is field2",
field3: []const u8 = "this is field3",
new_field: []const u8 = "oops new field",
};

I get the following compile error, because the switch no longer handles every field (roughly):

error: switch must handle all possibilities
|
I don't understand the restriction of having to use the next N lines to reference all fields. Relaxing it would not disallow a simple IDE autocomplete to destructure in the next N lines, and it would notably help in the more general case. |
With the alternate semantics, this could be useful for printing an AST and getting a compile error when a new feature is added:

const FunctionDefinition = struct {
name: []const u8,
args: []FunctionArg,
@"inline": bool,
};
fn printFunctionDefinition(node: FunctionDefinition, out: *std.ArrayList(u8)) !void {
useall node;
if (node.@"inline") try out.writer().print("inline ", .{});
try out.writer().print("function {s}(", .{node.name});
for (node.args) |arg| try printFunctionArg(arg, out);
try out.writer().print(");", .{});
}

The AST is updated with a new syntax feature:

const FunctionDefinition = struct {
name: []const u8,
args: []FunctionArg,
@"inline": bool,
+ return_type: *Type,
};

Zig now gives an error, alerting you to the needed update:

error: node.return_type is not used in this scope.
useall node;
^ |
Replying to my above comment: on 7th thought, the syntax in this case might better be represented as:

fn add(useall options: struct { a: i32, b: i32 }) i32 {
return options.a;
// this would be a compile error because options.b is unused
}

The same way other parameter qualifiers are written. |
One potential issue I see with this is that normally the function signature communicates constraints to the caller, whereas useall would constrain the body of the function itself. |
I don't think I would use a feature like this much, but it seems like now is the time to bikeshed. Perhaps a block operator would fit existing syntax better:

useall (bar()) |b| {
_ = b.a;
// etc
}

This would also make the scope of the involved objects clear: all fields of b would have to be used before the block ends.

I would also like to point out that with some minor syntactic contortions, this feature can already essentially be implemented in userspace: https://zig.godbolt.org/z/dqn8cfsjh
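In that userspace spirit, a minimal comptime check of my own (a sketch, not the code behind the godbolt link): you list the fields you used, and compilation fails if any field of the type is missing from the list.

const std = @import("std");

/// Comptime-asserts that `used` names every field of T.
fn useAll(comptime T: type, comptime used: []const std.meta.FieldEnum(T)) void {
    comptime {
        const fields = @typeInfo(T).@"struct".fields;
        var seen = [_]bool{false} ** fields.len;
        for (used) |tag| seen[@intFromEnum(tag)] = true;
        for (seen, fields) |was_used, field| {
            if (!was_used) @compileError("field '" ++ field.name ++ "' of " ++ @typeName(T) ++ " is unused");
        }
    }
}

test useAll {
    const Foo = struct { a: u8, b: u8 };
    const f = Foo{ .a = 1, .b = 2 };
    useAll(Foo, &.{ .a, .b }); // removing .b here fails to compile
    _ = f.a + f.b;
}
|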
Why a keyword and not a builtin? The use in the OP reminds me more of builtins like @setRuntimeSafety() or @setEvalBranchQuota(), whereas @GrayHatter's suggestion feels more like a keyword, though @kj4tmp makes a very good point about its use for function parameters. The OP example using a builtin could look like this:

test "example" {
const f = foo();
@useAll(f);
const foo_a = f.a;
const foo_b = f.b;
const b = bar();
@useAll(b);
const bar_a = b.b; // bug not caught by this proposal
const bar_b = b.a; // bug not caught by this proposal
}

Or, given its use is to check that fields have been used/accessed, would an @allFieldsUsed() whose position within the scope matters have any advantage? e.g.:

test "example" {
const f = foo();
const foo_a = f.a;
const foo_b = f.b;
@allFieldsUsed(f);
const b = bar();
defer @allFieldsUsed(b);
const bar_a = b.b; // bug not caught by this proposal
const bar_b = b.a; // bug not caught by this proposal
}

As a builtin it could support variadic arguments, e.g.:

test "example" {
const f = foo();
const foo_a = f.a;
const foo_b = f.b;
const b = bar();
const bar_a = b.b; // bug not caught by this proposal
const bar_b = b.a; // bug not caught by this proposal
@allFieldsUsed(f, b);
} |
I agree with @cdurkin that introducing a built-in function instead would allow for clear semantics while minimizing potential confusion. This built-in would enforce a compile-time error if any fields in a structure are not referenced in the current scope, but without requiring all fields to be explicitly declared in a particular order. This would also help avoid verbosity in larger structs, where fields would otherwise be declared only to satisfy the keyword. For example, consider the following usage:

const std = @import("std");
const User = struct {
info: UserInfo,
key: []const u8,
};
const UserInfo = struct {
name: []const u8 = "no_name",
age: u8 = 0,
is_registered: bool = false,
};
fn registerNewAdultUser(user_info: UserInfo) ?User {
useall user_info; // enforced by the `useall` keyword
const name = user_info.name;
const is_registered = user_info.is_registered;
const age = user_info.age; // declared only to satisfy useall; now an unused-local compile error
if (is_registered) {
return null;
}
return .{
.info = user_info,
.key = name,
};
}

In contrast, a built-in function like @UseAllFields would be more flexible, as it can be placed anywhere in the function:

fn registerNewAdultUser(user_info: UserInfo) ?User {
@UseAllFields(user_info); // more flexible as it can be placed anywhere in the function
const name = user_info.name;
const is_registered = user_info.is_registered;
if (is_registered) {
return null;
}
// Compile-time error here if `age` is not referenced, as it is currently unreferenced
// in the current scope.
return .{
.info = user_info,
.key = name,
};
}

This approach allows the compiler to enforce that all fields are referenced within the function's current scope without requiring a declaration order, promoting readability and flexibility, especially for large structures. Even for you, the maintainers: if you realize later that the feature needs to change, a builtin is easier to adjust or remove than a keyword. |
I think, for something like the following, checking that all fields are written to could also be useful:

pub const S = struct {
a: u32,
b: u64,
pub fn update(noalias s: *S, other: *const S) void {
// hypothetical syntax, replacement via a builtin
// would probably make more sense here
useall (s, .update);
s.a = other.a +% 2;
s.b = other.b *% 4;
// all fields were updated, works
}
Adding another field and not writing to it would then produce a compile error, even if the field is read from:

--- a/a.zig
+++ b/b.zig
@@ -1,13 +1,15 @@
pub const S = struct {
a: u32,
b: u64,
+ c: bool,
- pub fn update(noalias s: *S, other: *const S) void {
+ pub fn update(noalias s: *S, other: *const S) bool {
// hypothetical syntax, replacement via a builtin
// would probably make more sense here
useall (s, .update);
s.a = other.a +% 2;
s.b = other.b *% 4;
+ // we're reading from `s.c` but forgot
+ // to update it, so an error is emitted
+ return s.c;
- // all fields were updated, works
}
};
The big issue I see with this approach is that [...]. I also think the use-case for a feature like [...].
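As an aside, part of this write-every-field guarantee is available in today's Zig by reinitializing the whole struct at once; a sketch reusing the names from the example above:

pub fn update(noalias s: *S, other: *const S) void {
    // A full struct literal must name every field, so adding `c`
    // to S turns this into a compile error until it is handled here.
    s.* = .{
        .a = other.a +% 2,
        .b = other.b *% 4,
    };
}
|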
We could avoid adding a language feature altogether by simply implementing struct destructuring in userland.

Code:

const std = @import("std");
pub fn destructure(
/// `DestructureOrder(@TypeOf(value))`
comptime order: anytype,
value: anytype,
) Destructured(@TypeOf(value), order) {
const is_pointer = @typeInfo(@TypeOf(value)) == .pointer;
var result: Destructured(@TypeOf(value), order) = undefined;
inline for (order, 0..) |field_tag, i| {
const ptr = &@field(value, @tagName(field_tag));
result[i] = if (is_pointer) ptr else ptr.*;
}
return result;
}
pub fn DestructureOrder(T: type) type {
const Deref = switch (@typeInfo(T)) {
.pointer => |ptr_info| switch (ptr_info.size) {
.One => ptr_info.child,
else => T,
},
else => T,
};
return switch (@typeInfo(Deref)) {
.@"struct" => |info| [info.fields.len]std.meta.FieldEnum(Deref),
else => @compileError("Cannot destructure " ++ @typeName(Deref)),
};
}
pub fn Destructured(T: type, order: DestructureOrder(T)) type {
const Deref, //
const ptr_info //
= switch (@typeInfo(T)) {
.pointer => |ptr_info| switch (ptr_info.size) {
.One => .{ ptr_info.child, ptr_info },
else => unreachable,
},
else => .{ T, {} },
};
const is_pointer = @TypeOf(ptr_info) != void;
var fields: [order.len]std.builtin.Type.StructField = undefined;
for (
&fields,
order,
@typeInfo(@TypeOf(.{{}} ** order.len)).@"struct".fields,
) |*dst_field, field_tag, ref_field| {
const src_field = @typeInfo(Deref).@"struct".fields[@intFromEnum(field_tag)];
dst_field.* = src_field;
dst_field.name = ref_field.name;
dst_field.alignment = 0;
if (is_pointer) {
const dummy_ptr: T = @ptrFromInt(std.mem.alignBackward(usize, std.math.maxInt(usize), ptr_info.alignment));
dst_field.type = @TypeOf(&@field(dummy_ptr, @tagName(field_tag)));
if (dst_field.is_comptime) {
dst_field.default_value = @typeInfo(@TypeOf(.{&@field(dummy_ptr, @tagName(field_tag))})).@"struct".fields[0].default_value;
} else {
dst_field.default_value = null;
}
}
}
return @Type(.{ .@"struct" = .{
.layout = .auto,
.backing_integer = null,
.is_tuple = true,
.decls = &.{},
.fields = &fields,
} });
}
test "destructure by val" {
const b, const a = destructure(.{ .b, .a }, .{
.a = 3,
.b = std.testing.random_seed,
});
try std.testing.expectEqual(3, a);
try std.testing.expectEqual(std.testing.random_seed, b);
}
test "destructure by ref" {
const orig = .{
.a = 3,
.b = std.testing.random_seed,
};
const b, const a = destructure(.{ .b, .a }, &orig);
try std.testing.expectEqual(&orig.b, b);
try comptime std.testing.expectEqual(&orig.a, a);
} |
This is a member of a class of related problems: code validation. A partial solution using a keyword doesn't cut the domain at the joints. Short list of validations from my notes:
This is intentionally an attention-sparing subset focusing on easy things. The list is several times larger. The common factors are these:
So I think the durable solution involves the following:
I would add that the most important restriction is that validators cannot affect or inform the actual act of compilation. All that they do is affirm that the code meets the additional constraints or invariants dictated by the validation metadata. I feel that this would be of real value, solving both known problems, and unanticipated ones which will arise in the future. |
That kind of implicit magic absolutely gives me heartburn, and I suspect it would give IDEs similar grief. This should absolutely be explicitly scoped via the syntax in some way, something like the explicit block form suggested above.
Also note, I prefer the builtin over the keyword. This should be opt-in, as I can easily see something like a wrapped C or C++ library adding an extra field that then gets reflected, and suddenly you have compile errors and have to fix 12 layers of code, with interlocking versions, that you don't own. |
I really like @mnemnion's idea. I think it would be really cool if we had a framework akin to the test one, where you could enforce some invariants by creating a block similar to test. I'm not sure if this would need a separate block, how it should work, or how it should be organized, but it would be really helpful: both for the writer, as this is essentially an advanced assert-oriented way of programming, and for readers, to quickly get at a glance what is guaranteed by the code.

const std = @import("std");
const MyStruct = struct {
a: usize,
b: usize,
pub fn init() MyStruct {
return .{
.a = 0,
.b = 0,
};
}
pub fn update(my_struct: *MyStruct, a: usize, b: usize) void {
my_struct.a = a;
my_struct.b = b;
}
};
const testing = std.testing;
test "MyStruct - fn init" {
const my_var: MyStruct = .init();
try testing.expect(my_var.a == 0 and my_var.b == 0);
}
test "MyStruct - fn update" {
var my_var: MyStruct = .init();
my_var.update(1, 1);
try testing.expect(my_var.a == 1 and my_var.b == 1);
}
const validation = std.validation;
validate "MyStruct - fn update properties" {
try validation.expectUseAllFields(MyStruct.update, MyStruct);
try validation.expectNoSysCall(MyStruct.update);
try validation.expectNoErrors(MyStruct.update);
try validation.expectPure(MyStruct.update);
}

This would be a clean way to enforce invariants that would serve the writer/user and improve maintainability. I'm not saying what I'm suggesting is the way to do it, nor am I suggesting that it should be implemented, but the idea of being able to enforce things is great. Tests do that to a certain extent, but there could be an entire class of properties that could be enforced, serving a lot of different purposes: documentation, maintainability, and maybe optimization.

EDIT: Now that I think about it, this could even help with autodoc, by adding tags to each function's documentation, where each property could have a name and colour. That could be very useful. |
Problem Statement
It's too easy to forget to add logic when you add a new field to a struct. This is particularly obvious in the deinit function for a struct where some fields own resources. However, that's typically only a leak. Other mistakes are generally more insidious. The compile error for unused locals is very useful, but it's not available when accepting a parameter as a struct. By destructuring into locals, you can get an error for an unused struct field.
To illustrate this, consider the following code change.
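The before/after snippets were elided in this extraction; a plausible reconstruction, based on the add example that reappears verbatim later in the thread:

fn add(a: i32, b: i32) i32 {
    return a + b;
}

Change it to stop using b, and the compiler objects:

fn add(a: i32, b: i32) i32 {
    return a; // error: unused function parameter
}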
Now we transition the function to accept arguments as a struct:
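Reconstructed to match the duplicate of the original example quoted earlier in the thread:

fn add(options: struct { a: i32, b: i32 }) i32 {
    return options.a; // options.b is silently never read
}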
No compile error! 😱
Zig programmers should not be punished for using structured data.
Prior Art

std.meta function to make it a compile error if you forget to use a struct field #21730

Problematic Solution
The obvious approach causes name shadowing:
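A sketch of what that obvious approach might look like; this destructuring syntax is hypothetical:

fn add(options: struct { a: i32, b: i32 }) i32 {
    // hypothetical: bring every field into scope as a like-named local
    const .{ a, b } = options;
    // a and b would shadow any existing locals or parameters with those names
    return a + b;
}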
Also structs may have many fields.
No Language Changes Solution
This is problematic because you have to repeat the names, which can lead to bugs:
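A reconstruction (the original snippet was not preserved), with a bug of the same flavor as the swapped-name examples elsewhere in this thread:

fn add(options: struct { a: i32, b: i32 }) i32 {
    const a = options.a;
    const b = options.a;
    return a + b;
}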
In fact I put a bug in there. Did you spot it?
It's also a pain in the ass so people would not do it even if it was "recommended".
Linter-Based Solution
This is almost entirely solvable with a linter tool, something like this:
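A sketch of such a lint annotation; the comment syntax is invented for illustration:

fn add(options: struct { a: i32, b: i32 }) i32 {
    // lint: useall options
    const a = options.a;
    const b = options.b;
    return a + b;
}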
The idea behind this is that it emits errors for these:
What's nice about this is you could put in the lint comment only, and have it auto-populate the following lines based on type information.
The name mismatch bug from above could be detected:
However you could tell the linter it's OK:
Avoiding bugs is too useful to be limited to linter functionality, however. We want people reading the codebase to know that if such code compiles, all fields are used. So let's try to come up with a way to solve this in the language.
Actual Proposal
I propose a compromise between language simplicity and bug avoidance.
My reasoning is that:
Semantics clarifications:
- Directly following the useall keyword must be one statement per field of the struct.
- For a struct with N fields, those N statements form the useall statement set (ending after the Nth statement).
- Each statement may reference a field only as &s.field or s.field. Complex expressions are not allowed.

Alternate semantics (not proposed due to complexity and dubious value):

I think this additional language complexity is acceptable due to this careful balance:
- Without a useall statement, the code behaves exactly the same. You can safely ignore the feature. In fact, an alternate Zig compiler could simply ignore this statement and emit correct code. Similarly, first and third party tooling can safely and correctly ignore this statement.
- Programmers can write useall foo; and then have their tooling auto-generate the destructure lines.
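Putting the clarified rules together, a sketch of the shape as I read it (my illustration, not code from the proposal):

fn add(options: struct { a: i32, b: i32 }) i32 {
    useall options;
    const a = options.a; // one statement per field, only s.field or &s.field
    const b = options.b; // error: unused local constant
    return a;
}

Because each field is forced into its own local, forgetting a field (or never reading a local) surfaces as the ordinary unused-local compile error.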