RTS: Self-contained heap representation for bignums #2280
mergify[bot] merged 9 commits into master
To simplify GCs, we store the libtommath bignums as self-contained objects on the heap (so for the purposes of the GC, they are just data). The idea is that the `TAG_BIGINT` object stores both the `mp_int` record and the `mp_digit *` array. See `rts/motoko-rts/src/bigint.rs` for more details.
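As a rough illustration of the layout described above, the following sketch puts a tag, an `mp_int`-like record, and the digit array into one contiguous allocation, and recomputes the `dp` pointer on every access so that a plain byte copy of the object (as a moving GC would do) stays valid. Everything here is an assumption for illustration: `MpInt`, `MpDigit`, the tag value, `new_object`, and `digits` are stand-ins and not the actual definitions in `rts/motoko-rts/src/bigint.rs`.

```rust
use std::mem::size_of;
use std::ptr;

type MpDigit = u32; // assumption: stand-in for libtommath's mp_digit
const TAG_BIGINT: u32 = 1; // hypothetical tag value

/// Stand-in for libtommath's mp_int record.
#[allow(dead_code)]
#[repr(C)]
struct MpInt {
    used: i32,
    alloc: i32,
    sign: i32,
    dp: *mut MpDigit,
}

/// Hypothetical heap object: tag, then the record; the digits follow
/// immediately after this struct in the same allocation.
#[allow(dead_code)]
#[repr(C)]
struct BigInt {
    tag: u32,
    mp_int: MpInt,
}

/// Address of the first digit, directly behind the fixed-size part.
unsafe fn payload_addr(obj: *mut BigInt) -> *mut MpDigit {
    obj.add(1) as *mut MpDigit
}

/// Fix up `dp` before handing the record out, so a stale pointer left over
/// from before a move is never observed.
unsafe fn mp_int_ptr(obj: *mut BigInt) -> *const MpInt {
    (*obj).mp_int.dp = payload_addr(obj);
    ptr::addr_of!((*obj).mp_int)
}

/// Build one object inside a word buffer that plays the role of the heap.
fn new_object(digits: &[MpDigit]) -> Vec<u64> {
    let bytes = size_of::<BigInt>() + digits.len() * size_of::<MpDigit>();
    let mut buf = vec![0u64; (bytes + 7) / 8];
    let obj = buf.as_mut_ptr() as *mut BigInt;
    unsafe {
        (*obj).tag = TAG_BIGINT;
        (*obj).mp_int = MpInt {
            used: digits.len() as i32,
            alloc: digits.len() as i32,
            sign: 0,
            dp: ptr::null_mut(), // deliberately not set; fixed up on access
        };
        ptr::copy_nonoverlapping(digits.as_ptr(), payload_addr(obj), digits.len());
    }
    buf
}

/// Read the digits back through the fixed-up record.
unsafe fn digits(obj: *mut BigInt) -> Vec<MpDigit> {
    let mp = mp_int_ptr(obj);
    std::slice::from_raw_parts((*mp).dp, (*mp).used as usize).to_vec()
}

fn main() {
    let original = new_object(&[1, 2, 3]);
    let mut moved = original.clone(); // a "GC move" is just a byte copy
    drop(original);
    let obj = moved.as_mut_ptr() as *mut BigInt;
    let ds = unsafe { digits(obj) };
    assert_eq!(ds, vec![1, 2, 3]);
    println!("{:?}", ds);
}
```

Because the object carries no pointer that must survive a move, the collector in this sketch can treat it as opaque data, which is the property the description is after.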
In terms of gas, 1 test regressed, 2 tests improved, and the mean change is -1.4%.
    #[no_mangle]
    pub unsafe extern "C" fn bigint_add(a: SkewedPtr, b: SkewedPtr) -> SkewedPtr {
        let r = bigint_alloc();
        let mut i = tmp_bigint();
There seems to be a pattern here of alloc then persist - @osa1 would it be just as efficient, but safer, to use a higher-order function here like your with_mp_int(f) functions?
(I may be mis-remembering the name)
I was worried about the compiler not reliably inlining a with_mp_int function, and suddenly function pointers being passed around, so I shied away from it and wrote the code I expect to see in the wasm. But maybe I should give it a try… is there a way to guarantee that it will be inlined?
Yeah, I was worried about that too with with_mp_int, but @osa1 proved me wrong by looking at the generated code. But probably not worth the hassle TBH. Also, I think you've got one or two places that don't follow the pattern - not every tmp_BigInt is persisted - or is that maybe a bug?
No, not a bug, these are indeed temporary (as the name says).
I think it's fine: not every bigint_alloc needs to be persisted, and the types ensure that you don't forget to call persist if you mean to use it. So we'd be trading two simple functions for two(?) higher-order functions.
OK, as long as the GC can safely traverse and collect the unpersisted ones.
The trade-off is less about the number of functions than about the number of calls you need to make (or could forget to make) at each use site, I think. But I'm ok with either approach.
The unpersisted mp_int lives on the heap. The allocated thing on the heap is safe (and must be even without persist, e.g. for the tmp values).
You can't forget; the type won't allow it. Else I'd agree that a safer idiom might be advisable.
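To make the two idioms discussed in this thread concrete, here is a toy sketch. It assumes a temporary type that is distinct from the pointer type the function must return, so skipping `persist` is a type error, and it shows a generic higher-order wrapper next to it. All names (`Heap`, `TmpBigInt`, `with_tmp_bigint`, `bigint_add_hof`) and the single-`u64` "bignum" are hypothetical and not the PR's actual code. On the inlining question: a generic `impl FnOnce` parameter is statically dispatched, so no function pointer is passed, and `#[inline(always)]` is a strong hint rather than a hard guarantee.

```rust
/// Hypothetical stand-in for a pointer the rest of the runtime may hold.
#[derive(Debug, PartialEq, Clone, Copy)]
struct SkewedPtr(usize);

/// Toy heap: each "bignum" is a single u64 slot.
struct Heap {
    objects: Vec<u64>,
}

/// Temporary value: usable for arithmetic, but not a SkewedPtr.
struct TmpBigInt {
    value: u64,
}

fn tmp_bigint() -> TmpBigInt {
    TmpBigInt { value: 0 }
}

impl TmpBigInt {
    /// The only way to turn a temporary into a SkewedPtr.
    fn persist(self, heap: &mut Heap) -> SkewedPtr {
        heap.objects.push(self.value);
        SkewedPtr(heap.objects.len() - 1)
    }
}

/// Idiom 1: explicit alloc, compute, persist.
fn bigint_add(heap: &mut Heap, a: SkewedPtr, b: SkewedPtr) -> SkewedPtr {
    let mut r = tmp_bigint();
    r.value = heap.objects[a.0] + heap.objects[b.0];
    r.persist(heap) // returning `r` itself would not type-check
}

/// Idiom 2: a higher-order wrapper that always persists.
#[inline(always)]
fn with_tmp_bigint(heap: &mut Heap, f: impl FnOnce(&Heap, &mut TmpBigInt)) -> SkewedPtr {
    let mut r = tmp_bigint();
    f(&*heap, &mut r);
    r.persist(heap)
}

fn bigint_add_hof(heap: &mut Heap, a: SkewedPtr, b: SkewedPtr) -> SkewedPtr {
    with_tmp_bigint(heap, |h, r| r.value = h.objects[a.0] + h.objects[b.0])
}

fn main() {
    let mut heap = Heap { objects: vec![40, 2] };
    let s1 = bigint_add(&mut heap, SkewedPtr(0), SkewedPtr(1));
    let s2 = bigint_add_hof(&mut heap, SkewedPtr(0), SkewedPtr(1));
    assert_eq!(heap.objects[s1.0], 42);
    assert_eq!(heap.objects[s2.0], 42);
}
```

The first idiom leaves room for temporaries that are never persisted; the second removes one call per use site but forces every temporary through `persist`.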
Co-authored-by: Claudio Russo <claudio@dfinity.org>
If osa is MIA, I'm happy to take another look and approve if you are confident.
Confident enough, with a second look from you
    unsafe extern "C" fn mp_calloc(n_elems: usize, elem_size: Bytes<usize>) -> *mut libc::c_void {
        debug_assert_eq!(elem_size.0, core::mem::size_of::<mp_digit>());
        let size = Bytes((n_elems * elem_size.0) as u32); // Overflow check?
        let payload = mp_alloc(size) as *mut u32;
Can the zeroing loop below be done more efficiently with a memset or something?
I’ll check if the compiler does something smart. But memset itself can’t do more than write a word of zero at a time.
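For reference, a minimal sketch of what a bulk zeroing call and the overflow check flagged in the excerpt could look like. It uses `std::ptr::write_bytes`, which fills `count * size_of::<T>()` bytes and is usually lowered to a memset-style operation; whether that beats the hand-written loop on the wasm target would still need to be measured. The helper names `checked_payload_size` and `zero_words` are hypothetical and not part of the RTS.

```rust
use std::ptr;

/// Byte size of `n_elems` elements as a u32, or None if the product
/// overflows usize or does not fit in 32 bits.
fn checked_payload_size(n_elems: usize, elem_size: usize) -> Option<u32> {
    let bytes = n_elems.checked_mul(elem_size)?;
    if bytes > u32::MAX as usize {
        None
    } else {
        Some(bytes as u32)
    }
}

/// Zero `n_words` 32-bit words starting at `payload` in one call.
unsafe fn zero_words(payload: *mut u32, n_words: usize) {
    ptr::write_bytes(payload, 0, n_words);
}

fn main() {
    let mut buf = vec![0xdead_beefu32; 8];
    unsafe { zero_words(buf.as_mut_ptr(), buf.len()) };
    assert!(buf.iter().all(|&w| w == 0));
    assert_eq!(checked_payload_size(4, 4), Some(16));
    assert_eq!(checked_payload_size(usize::MAX, 4), None);
}
```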
    }

    #[no_mangle]
    unsafe extern "C" fn bigint_div(a: SkewedPtr, b: SkewedPtr) -> SkewedPtr {
Aside: Do we have divrem that returns both at once?
    /// libtommath function that tries to change it. For example, we cannot confuse input and
    /// output parameters of mp_add() this way.
    pub unsafe fn mp_int_ptr(self: *mut BigInt) -> *const mp_int {
        (*self).mp_int.dp = self.payload_addr();
Would avoiding the fix-up dirty fewer pages? Can this just be done during evacuation?
It could be done there, but I guess a point of this exercise was to make the GC oblivious of the bigint stuff. And this is safer. So not sure.
crusso left a comment
But see comments - will leave changes up to you