Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not use undefined for the ptr field of Allocator/Random instances #21760

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

squeek502
Copy link
Collaborator

@squeek502 squeek502 commented Oct 21, 2024

The use of undefined here meant that it was always unsafe/illegal to compare the ptr fields of Allocator/Random. However, this restriction is not mentioned anywhere and the ptr field is being compared in real code, e.g. std.process.Child.collectOutput, which can lead to undefined behavior in release modes (see #21756).

There are two ways to address this:

  1. Codify that comparing ptr is always illegal/unsafe, and remove all existing comparisons.
  2. Set ptr to a legal dummy value, similar to the value returned by Allocator.alloc when a zero-length slice is requested.

The first option seems like footgun central (at least until usage of undefined is fully safety-checked in safe modes [#211]), so I went with the second option.

Closes #21756
Closes #17704

@rohlem
Copy link
Contributor

rohlem commented Oct 21, 2024

Since this is common to all pointer types, a utility method might make sense - something like std.mem.validUnusedPointer(PointerType)?

Confirmed that the pointer values returned for 0-length allocations are unchanged. Tested the following:

```zig
allocator.alloc(u8, 0);
allocator.alignedAlloc(u8, 256, 0)
allocator.alignedAlloc(u8, 2048, 0)
allocator.create(u0)
```

and tested `realloc` with:

```zig
const slice = try allocator.alloc(u8, 1);
const realloced = try allocator.realloc(slice, 0);
```

and

```zig
const slice = try allocator.alignedAlloc(u8, 256, 1);
const realloced = try allocator.realloc(slice, 0);
```
The use of undefined here meant that it was always unsafe/illegal to compare the ptr fields of Allocator/Random. However, this restriction is not mentioned anywhere and the ptr field *is* being compared in real code, e.g. `std.process.Child.collectOutput`, which can lead to undefined behavior in release modes (see ziglang#21756).

There are two ways to address this:
1. Codify that comparing `ptr` is always illegal/unsafe, and remove all existing comparisons.
2. Set `ptr` to a legal dummy value, similar to the value returned by Allocator.alloc when a zero-length slice is requested.

The first option seems like footgun central (at least until usage of `undefined` is fully safety-checked in safe modes), so I went with the second option.

Closes ziglang#21756
@squeek502
Copy link
Collaborator Author

squeek502 commented Oct 21, 2024

Good idea, added that as std.mem.dummyPointer and used it for the ptr fields and the std.mem.Allocator implementation. From the dummyPointer commit message:

Confirmed that the pointer values returned for 0-length allocations are unchanged. Tested the following:

allocator.alloc(u8, 0);
allocator.alignedAlloc(u8, 256, 0)
allocator.alignedAlloc(u8, 2048, 0)
allocator.create(u0)

and tested realloc with:

const slice = try allocator.alloc(u8, 1);
const realloced = try allocator.realloc(slice, 0);

and

const slice = try allocator.alignedAlloc(u8, 256, 1);
const realloced = try allocator.realloc(slice, 0);

@jibal
Copy link

jibal commented Oct 22, 2024

As a general rule, undefined should only be used to save execution time when you know that all code paths will result in the value either being set or never used, and only when it has an actual impact as with initializing large buffers (or smaller buffers in long loops). It should never be used when it is visible externally, as was being done here.

(It might be a good idea to downright disallow undefined pub scalars.)

@rohlem
Copy link
Contributor

rohlem commented Oct 22, 2024

@jibal I do think it makes sense to have a global var initialized to undefined and initialize it before the first usage.
For const it's much less useful, but it's still correct if that particular struct field should never be used in code.

I think the main surprise is that assigning an undefined value is valid, and that after a = x; b = x readers may expect that a == b.
The two exceptions to this are NaN (which are specific to floating point types) which yields false,
and current undefined semantics (which currently apply to any type, as we currently have no maybeundefined attribute in the language) which is illegal behavior.
(Not saying this should change, just an observation.)

@jibal
Copy link

jibal commented Oct 22, 2024

Why have a pub field that should never be used? These are footguns and totally unnecessary UB. Anyway, I suppose having the compiler ban it is not the Zig way, but IMO library code should not be planting these mines.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants