Stack allocation often harder than heap allocation #13640

enderger · 2022-11-23T23:02:16Z

Zig Version

0.11.0-dev.184+c4f7663c9

Steps to Reproduce and Observed Behavior

Create a structure which contains a pointer to itself
Try to instantiate said structure without either allocating on the heap or creating an imperative list of substructures

Expected Behavior

The language heavily pushes for stack allocation to be easier than heap allocation, but currently doesn't provide any way to just allocate memory on the stack inline. This either means that you have to write out a load of disjointed variables to express recursive structures (as pointers cannot derive their mutability from their parent) or allocate large amounts of unmanaged memory on the heap. This is problematic, as you either need to write structures that take up a large heap memory and mechanistically free all memory at once (forcing far more data onto the heap than actually needed), have horribly disjointed and hard to follow initialization, or leak memory. To fix this, there probably should be a builtin which creates a pointer to a hidden variable initialized to a constant in the current scope (in other words, expanding the current stack frame and getting a pointer) so that stack based constructs are able to take advantage of struct initializer syntax without either calculating the size of the buffer needed for an allocator or having a massive number of variables.

nektro · 2022-11-23T23:06:38Z

const S = struct {
    field: i32,
    me: *S = undefined,
};

pub fn main() void {
    var the = S{ .field = 42 };
    the.me = &the;
}

enderger · 2022-11-23T23:11:04Z

More meant a case like this:

const SExpr = union(enum) {
  Atom: []const u8,
  Pair: struct {
     car: *SExpr,
     cdr: *SExpr,
  }
}

The problem arises when you try to initialize deeply nested pairs, since you now need a variable to hold every SExpr without allocating it on the heap.

enderger · 2022-11-23T23:14:45Z

Sorry for the oddly specific example, this problem has made quite a lot of trouble when trying to implement a parser for S expressions since every part of the tree has to be assigned to a named variable whose sole purpose is to hold one part without putting it on the heap.

dee0xeed · 2022-11-27T19:02:18Z

While doing some exercises I permanently bumped into similar (or not very similar, not sure) problems. With pure stack allocation we do not know actual address of an entity (and therefore addresses of all its' sub-entities) until return from .init(). This may significantly affect data structures design and also initialization sequence - they can be more simple in case of heap allocation.

SpexGuy · 2022-11-27T19:24:16Z

If it helps, you can use heap code to allocate out of a stack buffer with code like this:

const needed_sexprs = 40;
var mem: [@sizeOf(SExpr) * needed_sexprs]u8 align(@alignOf(SExpr)) = undefined;
var fba = std.heap.FixedBufferAllocator.init(&mem);
const tree = buildTree(fba.allocator()) catch @panic("stack mem not big enough");

mlugg · 2022-12-01T03:06:23Z

While doing some exercises I permanently bumped into similar (or not very similar, not sure) problems. With pure stack allocation we do not know actual address of an entity (and therefore addresses of all its' sub-entities) until return from .init(). This may significantly affect data structures design and also initialization sequence - they can be more simple in case of heap allocation.

There are proposed language features to deal with this issue. Improved result location semantics (#2765) will allow you to access the return address of a function, meaning you effectively get a pointer to it for free, and pinned structs (#7769) will allow you to enforce that the struct is never directly copied (to make sure the result location stuff is being used properly and the value is never accidentally copied, invalidating the pointers). Until those are implemented, of course, you can just pass out pointers into init functions if needed.

dee0xeed · 2022-12-01T12:23:10Z

Improved result location semantics (#2765) will allow you to access the return address of a function, meaning you effectively get a pointer to it for free

I guess this is organized in a way similar to something like this:

on the caller side
** reserve a space for the result (on stack, as if it is another one local var)
** push parameters on stack
** make a call
on the callee side:
** we know the location of the result and just fill it

Hence addresses are known from the very beginning and are valid until the caller returns.
Did I get the idea right?

mlugg · 2022-12-04T06:57:25Z

<snip>
Did I get the idea right?

Kind of, yeah. What happens is that the caller passes as an implicit argument a pointer to the memory the return value should occupy - this actually probably already happens internally in basically any standard imperative language, it's also how most (all?) C calling conventions handle large return values.

Passing a pointer rather than just implicitly writing to memory below the stack frame allows result locations to be chained through multiple calls. For instance, if you have this code:

fn f() MyStruct {
    return .{
        // ...
    };
}

fn g() MyStruct {
    return f();
}

pub fn main() void {
    const foo = g();
    // ...
}

Result location semantics guarantee zero intermediate copies, since g and then f will be internally passed a pointer to foo's stack memory. The improved semantics would allow f to directly access that pointer, likely through a builtin, so foo could be self-referential.

enderger added the bug Observed behavior contradicts documented or intended behavior label Nov 23, 2022

andrewrk added use case Describes a real use case that is difficult or impossible, but does not propose a solution. and removed bug Observed behavior contradicts documented or intended behavior labels Jan 23, 2023

andrewrk added this to the 1.0.0 milestone Jan 23, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Stack allocation often harder than heap allocation #13640

Stack allocation often harder than heap allocation #13640

enderger commented Nov 23, 2022

nektro commented Nov 23, 2022

enderger commented Nov 23, 2022 •

edited

Loading

enderger commented Nov 23, 2022

dee0xeed commented Nov 27, 2022

SpexGuy commented Nov 27, 2022 •

edited

Loading

mlugg commented Dec 1, 2022

dee0xeed commented Dec 1, 2022

mlugg commented Dec 4, 2022 •

edited

Loading

Stack allocation often harder than heap allocation #13640

Stack allocation often harder than heap allocation #13640

Comments

enderger commented Nov 23, 2022

Zig Version

Steps to Reproduce and Observed Behavior

Expected Behavior

nektro commented Nov 23, 2022

enderger commented Nov 23, 2022 • edited Loading

enderger commented Nov 23, 2022

dee0xeed commented Nov 27, 2022

SpexGuy commented Nov 27, 2022 • edited Loading

mlugg commented Dec 1, 2022

dee0xeed commented Dec 1, 2022

mlugg commented Dec 4, 2022 • edited Loading

enderger commented Nov 23, 2022 •

edited

Loading

SpexGuy commented Nov 27, 2022 •

edited

Loading

mlugg commented Dec 4, 2022 •

edited

Loading