Skip to content

Comments

RTS: Self-contained heap representation for bignums#2280

Merged
mergify[bot] merged 9 commits intomasterfrom
joachim/bignum-object
Feb 3, 2021
Merged

RTS: Self-contained heap representation for bignums#2280
mergify[bot] merged 9 commits intomasterfrom
joachim/bignum-object

Conversation

@nomeata
Copy link
Contributor

@nomeata nomeata commented Jan 29, 2021

To simplify GCs, we store the libtommath bignums as self-contained
objects on the heap (so for the purposes of the GC, they are just data).

The idea is that the TAG_BIGINT object stores both the mp_int record
and the mp_digit * array. See rts/motoko-rts/src/bigint.rs for more
details.

To simplify GCs, we store the libtommath bignums as self-contained
objects on the heap (so for the purposes of the GC, they are just data).

The idea is that the `TAG_BIGINT` object stores both the `mp_int` record
and the `mp_digit *` array. See `rts/motoko-rts/src/bigint.rs` for more
details.
@nomeata nomeata requested a review from osa1 January 29, 2021 10:24
@nomeata nomeata mentioned this pull request Jan 29, 2021
1 task
@dfinity-ci
Copy link

dfinity-ci commented Jan 29, 2021

In terms of gas, 1 tests regressed, 2 tests improved and the mean change is -1.4%.
In terms of size, 3 tests regressed and the mean change is +0.4%.

#[no_mangle]
pub unsafe extern "C" fn bigint_add(a: SkewedPtr, b: SkewedPtr) -> SkewedPtr {
let r = bigint_alloc();
let mut i = tmp_bigint();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be a pattern here of alloc then persist - @osa1 would it be just as efficient, bug safer, to use a higher-order function here like your with_mp_int(f) functions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I may be mis-remembering the name)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was worried about the compiler not reliably inlining a with_mp_int function, and suddenly function pointers being passed around, so I shied away from it and wrote the code I expect to see in the wasm. But maybe I should give it a try… is there a way to guarantee that it will be inlined?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was worried about that too with with_mp_int, but @osa1 proved me wrong by looking at the generated code. But probably not worth the hassle TBH. Also, I think you've got one or two place that don't follow the pattern - not every tmp_BigInt is persisted - or is that maybe a bug?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, not a bug, these are indeed temporary (as the same says).

I think it's fine: not every bigint_alloc needs to be persisted, and the types ensure that you don't forget to call persist if you mean to use it. So we'd be trading two simple functions for two(?) higher order functions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, as long the GC can safely traverse and collect the unpersisted ones.
The trade-off is less about the number of functions but the number of calls you need to make (or could forget to make) at each use site, I think. But I'm ok with either approach.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unpersited mp_int lives on the heap. The allocated thing on the heap is safe (and must be even without persist, e.g. for the tmp values)

You can't forget, the type won't allow it. Else I'd agree that a safer idiom might be advisable.

Copy link
Contributor

@crusso crusso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks promising, but @osa1 is the one to review properly

@crusso
Copy link
Contributor

crusso commented Feb 2, 2021

If osa is MIA, I'm happy to take another look and approve if you are confident.

@nomeata
Copy link
Contributor Author

nomeata commented Feb 3, 2021

Confident enough, with a second look from you

unsafe extern "C" fn mp_calloc(n_elems: usize, elem_size: Bytes<usize>) -> *mut libc::c_void {
debug_assert_eq!(elem_size.0, core::mem::size_of::<mp_digit>());
let size = Bytes((n_elems * elem_size.0) as u32); // Overflow check?
let payload = mp_alloc(size) as *mut u32;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the zeroing loop below be done more efficiently with a memset or something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’ll check if the compiler does something smart. But memset itself can’t do more than write a word of zero at a time.

}

#[no_mangle]
unsafe extern "C" fn bigint_div(a: SkewedPtr, b: SkewedPtr) -> SkewedPtr {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside: Do we have divrem that returns both at once?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, not yet

/// libtommath function that tries to change it. For example, we cannot confuse input and
/// output paramters of mp_add() this way.
pub unsafe fn mp_int_ptr(self: *mut BigInt) -> *const mp_int {
(*self).mp_int.dp = self.payload_addr();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would avoiding the fix up dirty fewer pages? Can this just be done during evacuation?

Copy link
Contributor Author

@nomeata nomeata Feb 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be done there, but I guess a point of this exercise was to make the GC oblivious of the bigint stuff. And this is safer. So not sure.

Copy link
Contributor

@crusso crusso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But see comments - will leave changes up to you

@nomeata nomeata added the automerge-squash When ready, merge (using squash) label Feb 3, 2021
@mergify mergify bot merged commit f9112c8 into master Feb 3, 2021
@mergify mergify bot deleted the joachim/bignum-object branch February 3, 2021 14:14
@mergify mergify bot removed the automerge-squash When ready, merge (using squash) label Feb 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants