Proposal for struct-like datatypes #2858

Open
Mudloop opened this issue Jun 26, 2024 · 8 comments

@Mudloop

Mudloop commented Jun 26, 2024

Feature suggestion

Hi,

I had this idea for C#-style structs that I believe could be implemented using a Transform.
I know structs have been brought up a couple of times, so there's clearly some demand.

Take this code:

@struct export class Vector2 {
	x: f32 = 0;
	y: f32 = 0;
	constructor(x: f32, y: f32) {
		this.x = x;
		this.y = y;
	}
}
export class Test {
	position: Vector2 = new Vector2(0, 0);
}

That could be transformed into this:

@final @unmanaged export class Vector2 {
	x: f32 = 0;
	y: f32 = 0;
	@inline constructor(x: f32, y: f32) {
		// offsetof<T>() is the instance size; sizeof<T>() of a class is only the pointer size
		let ret = changetype<Vector2>(memory.data(offsetof<Vector2>()));
		ret.x = x;
		ret.y = y;
		return ret;
	}
}
export class Test {
	private _position_x: f32 = 0;
	private _position_y: f32 = 0;
	@inline get position(): Vector2 {
		return new Vector2(this._position_x, this._position_y);
	}
	set position(value: Vector2) {
		this._position_x = value.x;
		this._position_y = value.y;
	}
}

Which essentially makes the Vector2 class allocate on the stack when instantiated, while still being able to assign it to objects.

Obviously it gets more complex once nested structs and generics are involved. Things like offsetof<Test>("position") would also need to be transformed, and respecting the original constructor's code would become tricky.

Also, I believe it would need to turn things like this:

let v1 = new Vector2(0, 0);
let v2 = v1;

Into something like this:

let v1 = new Vector2(0, 0);
let v2 = changetype<Vector2>(memory.data(offsetof<Vector2>()));
// memory.copy takes the destination first, then the source
memory.copy(changetype<usize>(v2), changetype<usize>(v1), offsetof<Vector2>());
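
To make the intended copy-on-assignment behaviour concrete, here is a minimal sketch in plain TypeScript rather than AssemblyScript. A Float32Array stands in for linear memory, and the names allocVector2, copyVector2 and nextFree are hypothetical helpers invented for illustration, not part of any AssemblyScript API.

```typescript
// Plain TypeScript simulation; a Float32Array stands in for Wasm linear memory.
const FIELDS_PER_VECTOR2 = 2; // x and y

const memory = new Float32Array(16);
let nextFree = 0; // hypothetical bump pointer, counted in elements rather than bytes

// Reserves room for one Vector2 and returns its "pointer" (an element index).
function allocVector2(x: number, y: number): number {
  const ptr = nextFree;
  nextFree += FIELDS_PER_VECTOR2;
  memory[ptr] = x;
  memory[ptr + 1] = y;
  return ptr;
}

// Value-type assignment: copy the fields into fresh storage.
// copyWithin takes the destination first, like memory.copy.
function copyVector2(src: number): number {
  const dst = nextFree;
  nextFree += FIELDS_PER_VECTOR2;
  memory.copyWithin(dst, src, src + FIELDS_PER_VECTOR2);
  return dst;
}

const v1 = allocVector2(1, 2);
const v2 = copyVector2(v1); // `let v2 = v1` with value semantics
memory[v2] = 10;            // `v2.x = 10` must not touch v1
```

After the last line, v1 still holds (1, 2) while v2 holds (10, 2), which is the behaviour a struct-like type would be expected to have.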

Not entirely sure how to deal with passing it around as a parameter or returning it, because my understanding of how memory works in AssemblyScript is a bit limited.

Would love to hear some thoughts on this idea.

@Mudloop
Author

Mudloop commented Jun 26, 2024

For nested structs, something like this:

@struct export class StructA {
	val: f32 = 0;
	constructor(val: f32) {
		this.val = val;
	}
}
@struct export class StructB {
	val: f32 = 0;
	nested: StructA;
	constructor(val: f32, nested: StructA) {
		this.val = val;
		this.nested = nested;
	}
}
export class Tester {
	struct: StructB = new StructB(0, new StructA(0));
}

Could become:

@final @unmanaged export class StructA {
	val: f32 = 0;
	/* @ts-ignore */
	@inline constructor(val: f32) {
		let ret = changetype<StructA>(memory.data(offsetof<StructA>()));
		ret.val = val;
		return ret;
	}
}
@final @unmanaged export class StructB {
	_val: f32 = 0;
	_nested_val: f32 = 0;
	@inline
	get val(): f32 { return this._val; }
	set val(value: f32) { this._val = value; }
	@inline
	get nested(): StructA { return new StructA(this._nested_val); }
	set nested(value: StructA) { this._nested_val = value.val; }
	/* @ts-ignore */
	@inline constructor(val:f32, nested: StructA) {
		let ret = changetype<StructB>(memory.data(offsetof<StructB>()));
		ret.val = val;
		ret.nested = nested;
		return ret;
	}
}
export class Tester {
	_struct_val: f32 = 0;
	_struct_nested_val: f32 = 0;
	@inline get struct(): StructB {
		return new StructB(this._struct_val, new StructA(this._struct_nested_val));
	}
	set struct(value: StructB) {
		this._struct_val = value.val;
		this._struct_nested_val = value.nested.val;
	}
}

Makes my head spin a little, might have made some mistakes, so consider this pseudo-code, and I hope the concept is clear.

Edit: it probably shouldn't call constructors when reading the encapsulated "structs" and should copy the memory directly instead, because we wouldn't want to run the constructor every time a struct gets copied.

@CountBleck
Member

One caveat is that there is no stack in AS, except for the shadow stack used for garbage collection.

@JairusSW
Contributor

JairusSW commented Jun 26, 2024

@CountBleck, I suppose you could allocate a page or two and call that the stack like https://github.com/fabricio-p/as-malloc does

@CountBleck
Member

I believe this was discussed elsewhere and a long while back, but multi-value would probably be better suited for AS. Binaryen implements multi-value using a special tuple type, so I believe you can have tuples as local variables (that get exploded into their constituent variables).

That likely won't solve this use case though, since it would be pass-by-value and not pass-by-reference, and I'm not sure whether nesting tuples is supported at all.

Another good fit would be GC types, which are pass-by-reference and support nesting...but they can't be stored to regular classes since they're opaque.

@JairusSW you could definitely use memory.data(N) to preallocate a page or two and have a transform that instruments the allocations and resets a global stack pointer on function returns...returning structs might be a bit more involved though :P
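
As a rough illustration of that idea, the sketch below simulates a preallocated region with a global stack pointer that is reset when a function returns. It is plain TypeScript, not AssemblyScript, and stackAlloc, withFrame and stackPointer are hypothetical names standing in for whatever a transform would actually generate.

```typescript
// Plain TypeScript simulation of a bump-pointer stack inside a preallocated region.
const STACK_BYTES = 64 * 1024; // one 64 KiB Wasm page, reserved up front
const stack = new DataView(new ArrayBuffer(STACK_BYTES));
let stackPointer = 0; // hypothetical global; grows upward in this sketch

// Reserves `size` bytes in the current frame and returns the offset.
function stackAlloc(size: number): number {
  const aligned = (size + 7) & ~7; // keep allocations 8-byte aligned
  if (stackPointer + aligned > STACK_BYTES) throw new RangeError("stack overflow");
  const ptr = stackPointer;
  stackPointer += aligned;
  return ptr;
}

// What an instrumenting transform might wrap around each function body.
function withFrame<T>(body: () => T): T {
  const saved = stackPointer;
  try {
    return body();
  } finally {
    stackPointer = saved; // everything allocated inside `body` is released here
  }
}

// Builds a temporary Vector2 on the simulated stack and returns a plain number.
function lengthSquared(x: number, y: number): number {
  return withFrame(() => {
    const v = stackAlloc(8); // two f32 fields
    stack.setFloat32(v, x, true);
    stack.setFloat32(v + 4, y, true);
    const vx = stack.getFloat32(v, true);
    const vy = stack.getFloat32(v + 4, true);
    return vx * vx + vy * vy;
  });
}

const result = lengthSquared(3, 4);
```

Returning a number is the easy case. Returning the struct itself would hand out an offset whose storage is released when the frame exits, so the caller would have to copy it into its own frame first, which is presumably the "more involved" part.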

@Mudloop
Author

Mudloop commented Jun 26, 2024

> One caveat is that there is no stack in AS, except for the shadow stack used for garbage collection.

Wait, that's confusing me - then what is __stack_pointer for?
If you call memory.data, I thought that reserved some memory on the stack, which would get freed once the current function is exited. And in contrast, if you call heap.alloc, that permanently reserves some memory (until manually freed).
Is that wrong?

@CountBleck
Member

__stack_pointer is for that shadow stack I mentioned. The shadow stack is there so managed objects in local variables don't get prematurely garbage collected.

memory.data(123) reserves a block of memory at compile-time, not unlike a global uint8_t some_data[123] = {0}; declaration in C.

@Mudloop
Author

Mudloop commented Jun 27, 2024

> __stack_pointer is for that shadow stack I mentioned. The shadow stack is there so managed objects in local variables don't get prematurely garbage collected.
>
> memory.data(123) reserves a block of memory at compile-time, not unlike a global uint8_t some_data[123] = {0}; declaration in C.

Oh ok, it's slowly starting to make sense.

Correct me if I'm wrong, but for the use case I showed, I think this wouldn't actually be a problem: since the constructor is inlined, the "reinterpreted" Vector2 it returns will still be unique.

My main reason for wanting something like this (besides GC / performance) is that without value types, doing enemy1.position = enemy2.position would lock their positions together unless that's handled by setters. That can easily lead to bugs.
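
The aliasing problem, and the setter-based workaround, can be sketched in plain TypeScript, which shares these reference semantics with AssemblyScript classes. The class names SharedEnemy and CopyingEnemy are made up for this illustration.

```typescript
class Vector2 {
  constructor(public x: number, public y: number) {}
}

// Reference semantics: the field stores whatever object it is given.
class SharedEnemy {
  position: Vector2 = new Vector2(0, 0);
}

// Value semantics emulated by hand: the setter copies the fields,
// and the getter returns a fresh object each time.
class CopyingEnemy {
  private _x = 0;
  private _y = 0;
  get position(): Vector2 {
    return new Vector2(this._x, this._y);
  }
  set position(value: Vector2) {
    this._x = value.x;
    this._y = value.y;
  }
}

const a = new SharedEnemy();
const b = new SharedEnemy();
a.position = b.position; // both fields now point at one object
b.position.x = 5;
const sharedX = a.position.x; // a moved along with b

const c = new CopyingEnemy();
const d = new CopyingEnemy();
d.position = new Vector2(1, 1);
c.position = d.position; // copies the numbers, not the reference
d.position = new Vector2(9, 9);
const copiedX = c.position.x; // c is unaffected by d's later move
```

One catch with the hand-written version: d.position.x = 5 would only mutate the temporary object returned by the getter, so field-level writes silently do nothing. That is part of why doing this through a transform or real value types is attractive.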

Tuples / multi-value sound like they would work, but I'm unclear on whether that's implemented in AssemblyScript at this point, or whether it's planned.

@Mudloop
Author

Mudloop commented Jun 29, 2024

Oh, I see the flaw with this approach. If I called the inlined constructor in a loop and added the results to an array, they would all be the same reference / pointer. Unless of course that got transformed too, but yeah, the complexity adds up quickly.
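
A small simulation of that flaw, in plain TypeScript rather than AssemblyScript: since memory.data reserves its block at compile time, a single inlined call site always yields the same address. STATIC_SLOT and newVector2 below are hypothetical stand-ins for that one call site and the transformed constructor.

```typescript
// Plain TypeScript simulation; a Float32Array stands in for Wasm linear memory.
const linearMemory = new Float32Array(8);

// Stand-in for one inlined `memory.data(...)` call site: the offset is fixed at compile time.
const STATIC_SLOT = 0;

// Mimics the transformed constructor: writes the fields into the static slot
// and returns its address.
function newVector2(x: number, y: number): number {
  linearMemory[STATIC_SLOT] = x;
  linearMemory[STATIC_SLOT + 1] = y;
  return STATIC_SLOT;
}

const pointers: number[] = [];
for (let i = 0; i < 3; i++) {
  pointers.push(newVector2(i, i * 10));
}

// Every entry is the same address, and only the last write survives.
const allSame = pointers.every((p) => p === pointers[0]);
const lastX = linearMemory[pointers[0]];
```

Here allSame is true and every array entry reads back the values from the final iteration, so the array holds three references to one Vector2 rather than three distinct ones.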

Guess I’ll just wait for tuple support.
