add safety checks for pointer casting #2414

Open
andrewrk opened this issue May 3, 2019 · 4 comments
Labels
accepted: This proposal is planned.
proposal: This issue suggests modifications. If it also has the "accepted" label then it is planned.
Milestone

Comments

Member

andrewrk commented May 3, 2019

I'm excited about this one. This connects a lot of dots and is part of the unofficial Make The Safe Build Modes More Safe project (#2301).

Here are some of the features of Zig this depends on:

  • there is no such thing as zero initialization
  • all values must be explicitly initialized (but initialization to undefined is OK)
  • most types (e.g. non-packed non-extern structs) intentionally have no well-defined in-memory layout

The proposal is to add a secret safety field to types which have no well-defined in-memory layout, similar to how unions have a secret safety tag field. The secret safety field has an integer which denotes the type id. A unique integer id will be generated for every type across an entire compilation.

Next, augment the rules about undefined values (see #1947) with this: in safe build modes, the bit pattern of undefined shall be 0xaa (repeating) across the store size of the type, and for types which have no well-defined in-memory layout, the bit pattern 0xaa repeated across the store size shall not match a valid state.

This makes it possible to add safety checks to @ptrCast, @intToPtr, and @fieldParentPtr. It will be detectable illegal behavior (see #2402) if the actual element type does not match the target type specified in the cast, or if the memory has an undefined value.

Sometimes it is desired to @ptrCast or @intToPtr when you know the memory is undefined. For these cases we introduce @ptrCastUndef and @intToPtrUndef which simultaneously cast and assign undefined to the memory. These functions allow the programmer to change the type of memory in a legal way.
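
The effect of the proposed check can be approximated in user code. The sketch below is a hypothetical model, not the compiler's implementation: `type_id`, `safety_id`, and `checkedCast` are invented names, the ids are assigned by hand rather than generated per compilation, and `extern struct` is used only so that the id is guaranteed to sit at offset 0 (the proposal targets types without a defined layout, where the compiler would place the field itself). It assumes the builtin syntax of Zig 0.12 or later.

```zig
const std = @import("std");

// Hand-assigned ids stand in for the compiler-generated per-type integer.
const Point = extern struct {
    pub const type_id: u32 = 1;
    safety_id: u32 = type_id, // models the "secret safety field"
    x: i32,
    y: i32,
};

const Color = extern struct {
    pub const type_id: u32 = 2;
    safety_id: u32 = type_id,
    r: u8,
    g: u8,
    b: u8,
};

// Returns null where the proposal would report detectable illegal behavior:
// the stored id does not match T, which also covers memory still holding
// the 0xaa "undefined" pattern, because 0xaaaaaaaa is not a valid id here.
fn checkedCast(comptime T: type, ptr: *anyopaque) ?*T {
    const id_ptr: *const u32 = @ptrCast(@alignCast(ptr));
    if (id_ptr.* != T.type_id) return null;
    const typed: *T = @ptrCast(@alignCast(ptr));
    return typed;
}

test "cast succeeds only for the matching type" {
    var p = Point{ .x = 3, .y = 4 };
    const erased: *anyopaque = &p;
    try std.testing.expect(checkedCast(Point, erased) != null);
    try std.testing.expect(checkedCast(Color, erased) == null);
}

test "memory filled with 0xaa never passes the check" {
    var scratch: [@sizeOf(Point)]u8 align(@alignOf(Point)) = undefined;
    @memset(&scratch, 0xaa); // what a safe build mode writes for undefined
    try std.testing.expect(checkedCast(Point, &scratch) == null);
}
```

In this model, the role of `@ptrCastUndef` would be played by a helper that first overwrites the memory with 0xaa and then returns the retyped pointer, so no stale id from the previous type can survive the cast.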

andrewrk added the proposal label on May 3, 2019
andrewrk added this to the 0.6.0 milestone on May 3, 2019
andrewrk added the accepted label on Nov 29, 2019
andrewrk modified the milestones: 0.6.0, 0.7.0 on Feb 10, 2020
andrewrk modified the milestones: 0.7.0, 0.8.0 on Oct 9, 2020
andrewrk modified the milestones: 0.8.0, 0.9.0 on May 19, 2021
andrewrk modified the milestones: 0.9.0, 0.10.0 on Nov 20, 2021
andrewrk modified the milestones: 0.10.0, 0.11.0 on Apr 16, 2022
Contributor

iacore commented Aug 6, 2022

This could be done with "butterfly" data before the pointed address with allocator's help.

memory layout (v marks the pointed-to address):

                  v
other data before | user data here

This is used in V8 (JS runtime) to allow fast indirection with attached metadata (infrequently accessed).

Basically, you store the type info and undefined-ness on the left side of the butterfly:

  • set type info on allocation
  • set undefined-ness on free, so that use-after-free is detected the same way as use of undefined
  • unset undefined-ness on dereference-write
  • check undefined-ness on dereference-read
  • check type info on @intToPtr
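
A rough illustration of these steps, under stated assumptions: `Header`, `Butterfly`, and all method names are hypothetical, the type id is passed in by hand, and instead of involving an allocator the metadata is placed in front of the payload with an `extern struct` wrapper and recovered with `@fieldParentPtr` (Zig 0.12 or later syntax). A real implementation would reserve the header space inside the allocator.

```zig
const std = @import("std");

// Metadata stored on the "left side" of the user-visible pointer.
const Header = extern struct {
    type_id: u32,
    is_undefined: bool,
};

fn Butterfly(comptime T: type, comptime id: u32) type {
    return extern struct {
        // extern layout keeps `header` directly before `payload`.
        header: Header = .{ .type_id = id, .is_undefined = true },
        payload: T = undefined,

        const Self = @This();

        // Step back from the user pointer to the metadata in front of it.
        fn headerOf(payload: *T) *Header {
            const self: *Self = @alignCast(@fieldParentPtr("payload", payload));
            return &self.header;
        }

        // dereference-write: store the value and unset undefined-ness
        fn write(payload: *T, value: T) void {
            payload.* = value;
            headerOf(payload).is_undefined = false;
        }

        // dereference-read: refuse to read while the value is undefined
        fn read(payload: *T) ?T {
            if (headerOf(payload).is_undefined) return null;
            return payload.*;
        }

        // free: mark as undefined, so later reads fail like use-undefined
        fn invalidate(payload: *T) void {
            headerOf(payload).is_undefined = true;
        }
    };
}

test "undefined-ness is tracked next to the payload" {
    const Cell = Butterfly(u64, 7);
    var cell = Cell{};
    const user_ptr = &cell.payload; // the only pointer user code would hold

    try std.testing.expect(Cell.read(user_ptr) == null);
    Cell.write(user_ptr, 42);
    try std.testing.expect(Cell.read(user_ptr).? == 42);
    Cell.invalidate(user_ptr);
    try std.testing.expect(Cell.read(user_ptr) == null);
    try std.testing.expect(Cell.headerOf(user_ptr).type_id == 7);
}
```

The sketch keeps a single undefined flag per object, so it does not address the partially defined struct case raised under Problems.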

Problems

Interop with C may break

What if the user only defines a struct partially? Would each field need its own undefined bit?

Existing @ptrCast with C union will break (since type info doesn't match)

It's better to have generation + memory address (to prevent use-after-free with memory reuse).


vadim-za commented Sep 29, 2024

What if I want to abuse the pointer casting to temporarily cast to a wrong type (for efficiency's sake)?

E.g. imagine a sentinel-based doubly linked list (like intrusive lists in boost). Something like

const Item = struct {
    data1: Data1,
    data2: Data2,
    link: Link,
};
const Link = struct {
    next: *Item,
    prev: *Item,
};
const List = struct {
    sentinel: Link,
};

(in reality List and Link would be generic structs of course).

Instead of setting next and prev pointers to null at the ends of the list (as Zig's std implementation does), they would point to the sentinel (the main benefit is that this avoids branching in list modification operations, compared to using nulls).
However, the sentinel is only a Link, but not an Item. Strictly speaking, next and prev must have type *Link in order to be able to point to the sentinel, but that would generate extra unnecessary pointer arithmetic upon list iteration and traversal, since one would need to convert from *Link to *Item in order to return *Item to the caller. So, it is potentially more efficient to pretend that sentinel is a part of a larger imagined Item object. This also drastically simplifies the list inspection in a debugger, as one can easily see the entire contents of the list items (instead of only the link data) by following the pointers.

This proposal seems to invalidate the respective implementation. The "workaround" function @ptrCastUndef also doesn't help here at all. Would one need to give up on tricks like that?

Edit: actually, maybe one doesn't need @ptrCast here. The implementation would mostly rely on @field and @fieldParentPtr; I'm not sure whether those two would also be subject to runtime checks. One might, however, need to use allowzero pointers for Items, as the formal Item pointer obtained for the sentinel might be zero, in which case some @ptrCasts would be necessary.
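
For comparison, here is a minimal sketch of the strictly typed variant described above, in which next and prev are *Link and each Item is recovered with @fieldParentPtr during traversal. It is not the commenter's implementation: the field `value` and the methods `init`, `append`, and `sum` are invented for illustration, the list is non-generic, and Zig 0.12 or later builtin syntax is assumed.

```zig
const std = @import("std");

const Link = struct {
    next: *Link,
    prev: *Link,
};

const Item = struct {
    value: u32,
    link: Link = undefined,
};

const List = struct {
    sentinel: Link = undefined,

    // An empty list is a sentinel pointing at itself, so there are no nulls.
    // The List must not be moved after this call.
    fn init(self: *List) void {
        self.sentinel.next = &self.sentinel;
        self.sentinel.prev = &self.sentinel;
    }

    // Branch-free insertion before the sentinel, i.e. at the tail.
    fn append(self: *List, item: *Item) void {
        const last = self.sentinel.prev;
        item.link.prev = last;
        item.link.next = &self.sentinel;
        last.next = &item.link;
        self.sentinel.prev = &item.link;
    }

    fn sum(self: *List) u32 {
        var total: u32 = 0;
        var cur = self.sentinel.next;
        while (cur != &self.sentinel) : (cur = cur.next) {
            // The *Link to *Item conversion the comment refers to. It is
            // never applied to the sentinel, which is not part of an Item.
            const item: *Item = @alignCast(@fieldParentPtr("link", cur));
            total += item.value;
        }
        return total;
    }
};

test "sentinel list traversal" {
    var list = List{};
    list.init();
    var a = Item{ .value = 1 };
    var b = Item{ .value = 2 };
    list.append(&a);
    list.append(&b);
    try std.testing.expect(list.sum() == 3);
}
```

Because every @fieldParentPtr call here starts from a Link that really is embedded in an Item, this variant would presumably pass the proposed type check; the cost is the per-step pointer adjustment that the comment wants to avoid.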


vadim-za commented Sep 29, 2024

Also what if I want to cast without knowing in advance, whether the memory is undefined or already contains precious data? What if I want a pointer to memory containing garbage (e.g. returned by an allocator), which is not equal to undefined?


vadim-za commented Nov 2, 2024

> What if I want to abuse the pointer casting to temporarily cast to a wrong type (for efficiency's sake)?

Maybe runtime safety checking of pointed-to types could be disabled on a case-by-case basis. In particular, I could imagine the following approach:

const S1 = struct { ... };
const S2 = struct { ... };

const S = struct { field1: S1, field2: S2 };
var var2: S2 = undefined;
const p: *punnable align(@alignOf(S2)) S = @fieldParentPtr("field2", &var2);

fn someFunc(s: *punnable align(@alignOf(S2)) S) void {
    const s2 = &s.field2;
    s2.doSomething();
}

fn someOtherFunc() void {
    someFunc(p);
}

The qualifier punnable indicates that the pointer should be exempt from runtime safety checking.

Note 1. In theory, the punnable qualifier can be used for bitcasting a part of a value to a type of a different (smaller) size, where one cannot use @bitCast.
Note 2. If Zig ever gets TBAA (hopefully not), punnable can take the role of the may_alias specifier.

Projects
Status: To do
3 participants