New GC: in-place compaction by osa1 · Pull Request #2223 · caffeinelabs/motoko

osa1 · 2021-01-05T16:48:23Z

This implements the GC algorithm "A fast garbage compaction algorithm" by
Jonkers, also described in The GC Handbook section 3.3.

This algorithm allows fast bump allocation and efficient use of Wasm linear
memory as before, without copying live data to a new space.

The bitmap and mark stack for marking are allocated after the heap pointer. The
bitmap has a constant size (heap_size / word_size_in_bits) so we allocate it
first. The mark stack grows dynamically (initial size is 64 words). Right
before marking the heap looks like:

  | static data | heap | bitmap | mark stack |

Because we don't allocate anything during marking, it's possible to grow the
mark stack by just bumping the heap pointer.
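
A minimal sketch of the bitmap sizing and bit operations described above. It assumes 32-bit words and one mark bit per heap word; the names `WORD_BITS`, `bitmap_words` and `Bitmap` are illustrative and are not taken from the RTS code in this PR.

```rust
/// Bits per 32-bit Wasm word (assumption: one mark bit per heap word).
const WORD_BITS: usize = 32;

/// Bitmap size in words for a heap of `heap_words` words, rounded up.
pub fn bitmap_words(heap_words: usize) -> usize {
    (heap_words + WORD_BITS - 1) / WORD_BITS
}

/// Toy mark bitmap: bit `i` is set when heap word `i` starts a marked object.
pub struct Bitmap {
    words: Vec<u32>,
}

impl Bitmap {
    /// Allocates a zeroed bitmap covering `heap_words` heap words.
    pub fn new(heap_words: usize) -> Self {
        Bitmap { words: vec![0; bitmap_words(heap_words)] }
    }

    /// Marks heap word `idx`.
    pub fn set(&mut self, idx: usize) {
        self.words[idx / WORD_BITS] |= 1u32 << (idx % WORD_BITS);
    }

    /// Returns whether heap word `idx` is marked.
    pub fn get(&self, idx: usize) -> bool {
        self.words[idx / WORD_BITS] & (1u32 << (idx % WORD_BITS)) != 0
    }
}
```

With this rounding, a heap of 13,453,988 words needs a 420,438-word bitmap and a heap of 5,372 words needs 168 words, which agrees with the perf numbers reported in this description.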

After marking the algorithm makes two passes over the bitmap and moves live
objects to the beginning of the heap. See the references for details.
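
To make the two passes concrete, here is a toy model of Jonkers-style threaded compaction. It is a sketch under simplifying assumptions, not the code in this PR: memory is a `Vec<usize>`, the first `n_roots` words are root slots, every object is a header word holding its field count followed by that many pointer fields (no scalars, no null), headers are distinguished from threaded links by a tag in the top bit, and a `Vec<bool>` stands in for the mark bitmap. All names are hypothetical.

```rust
/// Top bit marks a word as an object header; untagged words are indices.
const TAG: usize = 1 << (usize::BITS - 1);

pub fn header(n_fields: usize) -> usize {
    TAG | n_fields
}

/// Adds `slot` to the chain hanging off the header of the object it points to.
fn thread(mem: &mut [usize], slot: usize) {
    let obj = mem[slot];
    mem[slot] = mem[obj];
    mem[obj] = slot;
}

/// Writes `new_addr` into every slot threaded at `obj`, then restores the header.
fn unthread(mem: &mut [usize], obj: usize, new_addr: usize) {
    let mut w = mem[obj];
    while w & TAG == 0 {
        let next = mem[w];
        mem[w] = new_addr;
        w = next;
    }
    mem[obj] = w;
}

/// Marks objects reachable from the roots, using an explicit mark stack.
fn mark(mem: &[usize], n_roots: usize) -> Vec<bool> {
    let mut marked = vec![false; mem.len()];
    let mut stack: Vec<usize> = mem[..n_roots].to_vec();
    while let Some(obj) = stack.pop() {
        if !marked[obj] {
            marked[obj] = true;
            let n = mem[obj] & !TAG;
            stack.extend_from_slice(&mem[obj + 1..obj + 1 + n]);
        }
    }
    marked
}

/// Compacts live objects to the start of the heap; returns the new heap end.
pub fn compact(mem: &mut Vec<usize>, n_roots: usize) -> usize {
    let marked = mark(mem, n_roots);
    let live: Vec<usize> = (n_roots..mem.len()).filter(|&i| marked[i]).collect();
    for root in 0..n_roots {
        thread(mem, root);
    }
    // Pass 1: fix roots and forward pointers, thread the fields of live objects.
    let mut free = n_roots;
    for &obj in &live {
        unthread(mem, obj, free);
        let n = mem[obj] & !TAG;
        for field in obj + 1..obj + 1 + n {
            thread(mem, field);
        }
        free += n + 1;
    }
    // Pass 2: fix backward pointers, then slide each object down.
    let mut free = n_roots;
    for &obj in &live {
        unthread(mem, obj, free);
        let n = mem[obj] & !TAG;
        mem.copy_within(obj..obj + 1 + n, free);
        free += n + 1;
    }
    mem.truncate(free);
    free
}
```

For example, with one root and the heap `[3, header(1), 3, header(2), 8, 3, header(1), 1, header(1), 3]` (two dead objects at words 1 and 6), `compact(&mut mem, 1)` leaves `[1, header(2), 4, 1, header(1), 1]`: both live objects have moved to the front and every pointer to them has been updated without a second space.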

In this algorithm it's difficult to support interior pointers (even if the
offset is a known constant), so BigInt representation is refactored. Data
pointers now point to Blob headers rather than Blob payloads. When calling a
tommath function we stack allocate a mp_int struct with the data pointer
pointing to a blob payload and pass that to the library function. On return we
update BigInt fields with the values in the stack allocated struct.

Bitmap and mark stack implementations are tested using quickcheck, and can be
used in other GC implementations.

Perf test results:

In terms of gas, 2 tests regressed and the mean change is +3.1%.
In terms of size, 2 tests regressed and the mean change is +5.0%.

Some other numbers from the perf tests:

| test | max heap (words) | max bitmap (words) | max mark stack (words) | Wasm pages |
|---|---|---|---|---|
| reversi | 5,372 | 168 | 64 (default) | 3 |
| qr | 13,453,988 | 420,438 | 64 (default) | 850 |

Note that the allocation for bookkeeping is heap_size / 32 (3.12%). This is
mainly the bitmap, as the mark stack never grows above 64 words in the current
tests. When the residency is lower than that, the bitmap allocation exceeds the
size of the data that the current collector copies. In those cases this
collector allocates more.

It turns out this case is very common, as residency is extremely low in most
programs (I think mainly because we only do GC between messages). For example,
in "qr", when the heap size is 13,453,988 words, the live data size is only 442
words. That's extremely low, about 0.003%.

As a result, in the collection described above, the current GC allocates 442
words to copy the live data, but the new collector allocates 420,438 words for
the bitmap. In terms of Wasm pages, the current collector allocates 827 pages
while the new one allocates 850.

In general, whenever residency is smaller than 3.12%, the new collector will
allocate more than the current one.
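
The break-even point can be checked with a few lines of arithmetic. This is only a sketch of the comparison made above: it assumes the copying collector allocates exactly the live data size, the compacting collector allocates only the bitmap (the 64-word mark stack is ignored), and the function names are made up for illustration.

```rust
/// Words allocated for the bitmap: one bit per heap word, 32 bits per word.
pub fn bitmap_words(heap_words: usize) -> usize {
    (heap_words + 31) / 32
}

/// True when the compacting collector allocates more during a collection
/// than a copying collector that allocates `live_words` for the copy.
pub fn compacting_allocates_more(heap_words: usize, live_words: usize) -> bool {
    bitmap_words(heap_words) > live_words
}

/// Residency as a percentage of the heap.
pub fn residency_percent(heap_words: usize, live_words: usize) -> f64 {
    100.0 * live_words as f64 / heap_words as f64
}
```

For the "qr" numbers, `compacting_allocates_more(13_453_988, 442)` is true (420,438 bitmap words against 442 copied words), while a heap of 32,000 words with 1,000 live words sits exactly at the 1/32 break-even point.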

TODOs:

Growing the mark stack doesn't work on CI because the implementation requires the mark stack allocations to be at consecutive heap locations, which is not the case when we use malloc/free-style allocation in the test suite. Not sure how to fix this yet.
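
The consecutive-allocation requirement can be illustrated with a toy bump allocator. This is a sketch, not the RTS code: `Memory`, `alloc_words`, `MarkStack` and the doubling growth policy are assumptions made for the example; only the 64-word initial size comes from the description above.

```rust
/// Initial mark stack capacity in words, as stated in the description.
const INIT_CAP: usize = 64;

/// Toy linear memory with a bump pointer `hp` (hypothetical names).
pub struct Memory {
    pub words: Vec<u32>,
    pub hp: usize,
}

impl Memory {
    /// Bump-allocates `n` words and returns the index of the first one.
    pub fn alloc_words(&mut self, n: usize) -> usize {
        let start = self.hp;
        self.hp += n;
        if self.words.len() < self.hp {
            self.words.resize(self.hp, 0);
        }
        start
    }
}

/// Mark stack living at the end of the bump-allocated region.
pub struct MarkStack {
    base: usize,
    cap: usize,
    len: usize,
}

impl MarkStack {
    pub fn new(mem: &mut Memory) -> Self {
        let base = mem.alloc_words(INIT_CAP);
        MarkStack { base, cap: INIT_CAP, len: 0 }
    }

    pub fn push(&mut self, mem: &mut Memory, obj: u32) {
        if self.len == self.cap {
            // Growing in place only works if the stack is the last allocation,
            // so that the new words directly follow the existing ones.
            assert_eq!(self.base + self.cap, mem.hp, "mark stack is not at the heap end");
            mem.alloc_words(self.cap); // assumption: double the capacity
            self.cap *= 2;
        }
        mem.words[self.base + self.len] = obj;
        self.len += 1;
    }

    pub fn pop(&mut self, mem: &Memory) -> Option<u32> {
        if self.len == 0 {
            return None;
        }
        self.len -= 1;
        Some(mem.words[self.base + self.len])
    }
}
```

With a malloc/free-style allocator the second allocation is not guaranteed to start at `base + cap`, which is the situation the assertion in `push` guards against.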


This also changes how we store the next free index a little bit. The constant `FULL` is gone: we now know the table is full when the next free location is equal to the array length. Also, `FREE_SLOT` is no longer the next free location shifted left; it holds the next free location directly (not shifted).
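
A small sketch of a free-list table using this convention. The `Table` and `Slot` types, the `remember`/`recall` names and the doubling policy are assumptions made for illustration; the point is only that the head of the free list is stored unshifted and that "full" is detected by comparing it with the table length.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Slot {
    /// Free slot holding the index of the next free slot.
    Free(usize),
    /// Occupied slot holding a stored value.
    Used(u32),
}

pub struct Table {
    slots: Vec<Slot>,
    /// Next free location, stored directly; equals `slots.len()` when full.
    free_slot: usize,
}

impl Table {
    pub fn new(cap: usize) -> Self {
        // Slot i links to i + 1, so the last link equals the table length.
        Table { slots: (0..cap).map(|i| Slot::Free(i + 1)).collect(), free_slot: 0 }
    }

    pub fn is_full(&self) -> bool {
        self.free_slot == self.slots.len()
    }

    /// Stores `value` and returns the index it was stored at.
    pub fn remember(&mut self, value: u32) -> usize {
        if self.is_full() {
            let old = self.slots.len();
            let new = (old * 2).max(1);
            self.slots.extend((old..new).map(|i| Slot::Free(i + 1)));
        }
        let idx = self.free_slot;
        let slot = self.slots[idx];
        match slot {
            Slot::Free(next) => self.free_slot = next,
            Slot::Used(_) => unreachable!("free list points at a used slot"),
        }
        self.slots[idx] = Slot::Used(value);
        idx
    }

    /// Removes and returns the value at `idx`, putting the slot back on the free list.
    pub fn recall(&mut self, idx: usize) -> Option<u32> {
        let slot = self.slots[idx];
        match slot {
            Slot::Used(value) => {
                self.slots[idx] = Slot::Free(self.free_slot);
                self.free_slot = idx;
                Some(value)
            }
            Slot::Free(_) => None,
        }
    }
}
```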

- Introduced `as_blah` methods to convert a `SkewedPtr` or `*Obj` to other object types. These methods check the type in debug mode.
- Improved error reporting: we now show details in assertion failures.
- Temporarily link with the debug runtime to enable sanity checking by default for now.
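
As an illustration of the debug-mode type check, a conversion of this shape could look as follows. This is a sketch only: the tag value, the struct layouts and the free function `as_blob` are assumptions made for the example and are not the definitions used in the RTS.

```rust
/// Hypothetical tag value for blobs.
pub const TAG_BLOB: u32 = 1;

/// Common object header (layout assumed for the example).
#[repr(C)]
pub struct Obj {
    pub tag: u32,
}

/// A blob: the common header followed by a length field.
#[repr(C)]
pub struct Blob {
    pub header: Obj,
    pub len: u32,
}

/// Reinterprets an object pointer as a blob pointer.
/// The tag is checked in debug builds only; release builds trust the caller.
///
/// # Safety
/// `obj` must point to a valid object that really is a `Blob`.
pub unsafe fn as_blob(obj: *const Obj) -> *const Blob {
    debug_assert_eq!(unsafe { (*obj).tag }, TAG_BLOB, "as_blob: object is not a Blob");
    obj as *const Blob
}
```

Because the check is a `debug_assert_eq!`, it costs nothing in release builds, which is why linking with the debug runtime turns the sanity checking on.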


nomeata · 2021-01-29T11:04:01Z

This PR both has a GC rewrite and something about Bigint. The latter I have now put in its own PR (#2280, written from scratch), so beware when updating this onto the latest master.

nomeata · 2021-05-06T10:13:45Z

Thanks for picking this up again. Do you want to mark it as draft while you work on it? (If only to help me to make sense of my inbox)

osa1 · 2021-05-06T10:17:48Z

Done.

crusso · 2021-05-06T11:15:54Z

Cool - I guess you've already adapted the BigInt layout?

osa1 · 2021-05-06T11:29:21Z

> Cool - I guess you've already adapted the BigInt layout?

Tbh I don't know how BigInts are implemented now, but I checked the current collector (in the master branch) and adapted the mark-compact collector based on that. I should study the new BigInt implementation.

Related PR: #2522

nomeata · 2021-05-06T11:52:32Z

The current bigint layout needs no special handling from the GC.

osa1 · 2021-06-10T12:44:54Z

This will be merged with #2522.

This merges #2223 and master and enables the new collector with a new moc flag `--compacting-gc`. The goal is to (1) avoid bitrot in #2223 and (2) allow easy testing and benchmarking of different collectors. We include both collectors in the Wasms. Binary sizes grow between 0.8% (CanCan backend) and 1.8% (in simple tests in `run-drun`). Some benchmark results here: #2033 (comment). An improvement to the compacting GC is currently being implemented in branch `osa1/compacting_gc_3`.

osa1 added 30 commits December 21, 2020 10:04

Move alloc_array to Rust

b8aeb64

Native uses of alloc_array are duplicated in the native test files. Those will be removed when the tests are ported to Rust.

Commit debug stuff

5eb1618

Move buf to Rust

822497d

[WIP] Trying to make lib compilable to native or wasi

cc5a954

Intro another Cargo.toml for generating rlib for RTS

8ff636b

Move closure table tests to Rust

61a77c0

Move alloc_blob to Rust, disable native tests for now

48cb96a

Move text functions to Rust [1/n]

f0f068d

Move text_iter_next to Rust

63e2b87

Move rest of text.c to Rust

3d6cb55

Move char to Rust

6db438e

Minor refactor

541c146

Move blob to Rust

1128ee5

Move utf8 stuff to Rust

fd013f2

Move leb128 stuff to Rust

6d214c0

Move version string, crc gen to Rust

bed0bcd

Move (s)leb128 encoding to Rust

19972d9

Move float functions to Rust

496a63e

Comments in alloc_blob

f0aabe0

Move tommath alloc functions to Rust

8cb19e2

Refactor Cargo files

3fc2b54

[WIP] Moving bigint functions to Rust

e7187ff

Fix BigInt GC broken in previous commits

1a4941b

Move rest of the bigint functions to Rust

aae504a

Minor bug fix, refactor

2819827

Moving principal id functions to Rust [1/n]

1eaabbd

Finish principal id stuff, start with idl

3f59326

Refactor trap functions, fix warnings

977aadf

nomeata added 8 commits January 19, 2021 12:22

Undo unrelated changes to shell

4023191

Run the RTS tests in wasmtime/WASI

c06a141

this is easier to build on all platforms, compared to building for 32bits. It also means we are testing things closer to what we really run.

Use llvm-ar?

06fc768

Typo

05fa886

More typo

18d5351

Simplify Makefile

1d09658

Merge pull request #2265 from dfinity/joachim/rts-tests-via-wasm

0773ebc

Run the RTS tests in wasmtime/WASI

Merge branch 'master' into osa1/rts_rust_port

8cf18d5

Base automatically changed from osa1/rts_rust_port to master January 19, 2021 15:13

nomeata added 2 commits January 20, 2021 10:58

Merge remote-tracking branch 'origin/pr/2210' into osa1/compacting_gc_2

6f4a629

Post-squash merge of origin/master

853ac0d

Commit 8ba767c on origin/master has the same tree as commit 8cf18d5.

nomeata removed their request for review March 9, 2021 16:48

osa1 added 4 commits May 5, 2021 18:16

Merge remote-tracking branch 'origin/master' into osa1/compacting_gc_2

9b035e6

Disable mark stack tests, they only work on Wasm

54b9fd9

Clarify why mark_stack tests were disabled

695a2ff

Merge branch 'master' into osa1/compacting_gc_2

6dd58ef

osa1 marked this pull request as draft May 6, 2021 10:17

osa1 mentioned this pull request May 6, 2021

New GC: in-place compaction #2522

Merged

osa1 added 3 commits June 2, 2021 12:00

Merge remote-tracking branch 'origin/master' into osa1/compacting_gc_2

f56e917

Turn an assert_eq into debug_assert_eq

cbba64f

Fix an assertion

f159302

osa1 closed this Jun 10, 2021


New GC: in-place compaction #2223
osa1 wants to merge 145 commits into master from osa1/compacting_gc_2


4 participants
