New GC: in-place compaction #2223

Closed
osa1 wants to merge 145 commits into master from osa1/compacting_gc_2

osa1 commented Jan 5, 2021

This implements the GC algorithm "A fast garbage compaction algorithm" by
Jonkers, also described in The GC Handbook section 3.3.

This algorithm allows fast bump allocation and efficient use of Wasm linear
memory as before, without copying live data to a new space.

The bitmap and mark stack used for marking are allocated after the heap
pointer. The bitmap has a constant size (heap_size / word_size_in_bits), so we
allocate it first. The mark stack grows dynamically (initial size is 64 words).
Right before marking, the heap looks like

  | static data | heap | bitmap | mark stack |

Because we don't allocate anything during marking, it's possible to grow the
mark stack by just bumping the heap pointer.
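
As a rough illustration of the bitmap sizing described above, here is a minimal sketch. It assumes 32-bit Wasm words and one mark bit per heap word; `Bitmap`, `bitmap_size_words` and `iter_set` are hypothetical names for this example, not the names used in the RTS.

```rust
/// Bits per word. Assumption: 32-bit words, as in Wasm32 linear memory.
const WORD_BITS: usize = 32;

/// Words needed for a bitmap that holds one mark bit per heap word.
fn bitmap_size_words(heap_words: usize) -> usize {
    (heap_words + WORD_BITS - 1) / WORD_BITS
}

/// Hypothetical mark bitmap: bit `i` is set when heap word `i` is the
/// start of a live object.
struct Bitmap {
    words: Vec<u32>,
}

impl Bitmap {
    fn new(heap_words: usize) -> Self {
        Bitmap { words: vec![0; bitmap_size_words(heap_words)] }
    }

    fn set(&mut self, idx: usize) {
        self.words[idx / WORD_BITS] |= 1 << (idx % WORD_BITS);
    }

    fn get(&self, idx: usize) -> bool {
        (self.words[idx / WORD_BITS] & (1 << (idx % WORD_BITS))) != 0
    }

    /// Visit set bits in increasing order, the order a compaction pass needs.
    fn iter_set(&self) -> impl Iterator<Item = usize> + '_ {
        (0..self.words.len() * WORD_BITS).filter(move |&i| self.get(i))
    }
}
```

With this sizing, the heap sizes reported in the perf table further down (5,372 and 13,453,988 words) give bitmaps of 168 and 420,438 words.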

After marking, the algorithm makes two passes over the bitmap and moves live
objects to the beginning of the heap. See the references for details.
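
To make the two passes more concrete, here is a toy sketch of threaded compaction in the style of Jonkers. It is not the code in this PR: memory is modelled as a slice of tagged `Word` values instead of raw Wasm words, root slots are assumed to live below `heap_base`, and the sorted `live` list stands in for iterating over the set bits of the bitmap. All names are hypothetical.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Word {
    Header(usize), // object header: number of fields that follow
    Ptr(usize),    // index of an object header (or, while threaded, of a field)
    Scalar(i64),   // non-pointer payload
}

/// Thread the pointer at `field`: push the field onto the chain that hangs
/// off its target's header slot.
fn thread(mem: &mut [Word], field: usize) {
    if let Word::Ptr(target) = mem[field] {
        mem[field] = mem[target];
        mem[target] = Word::Ptr(field);
    }
}

/// Walk the chain at `obj`, store `new_addr` in every field on it, and
/// put the original header back.
fn unthread(mem: &mut [Word], obj: usize, new_addr: usize) {
    let mut cur = mem[obj];
    while let Word::Ptr(field) = cur {
        cur = mem[field];
        mem[field] = Word::Ptr(new_addr);
    }
    mem[obj] = cur;
}

/// Object size in words (header plus fields). The header must be unthreaded.
fn size_of(mem: &[Word], obj: usize) -> usize {
    match mem[obj] {
        Word::Header(n) => n + 1,
        w => panic!("expected header at {}, found {:?}", obj, w),
    }
}

/// Slide live objects down to `heap_base` and return the new free pointer.
/// `roots` are indices of root slots, `live` is sorted in increasing order.
fn compact(mem: &mut [Word], heap_base: usize, roots: &[usize], live: &[usize]) -> usize {
    // Pass 1: resolve references from roots and from lower addresses,
    // then thread each live object's own fields.
    for &r in roots {
        thread(mem, r);
    }
    let mut free = heap_base;
    for &obj in live {
        unthread(mem, obj, free);
        // Read the size before threading fields: a self-reference would
        // thread this object's header again.
        let size = size_of(mem, obj);
        for f in obj + 1..obj + size {
            thread(mem, f);
        }
        free += size;
    }
    // Pass 2: resolve references from higher addresses and move objects.
    let mut free = heap_base;
    for &obj in live {
        unthread(mem, obj, free);
        let size = size_of(mem, obj);
        mem.copy_within(obj..obj + size, free);
        free += size;
    }
    free
}
```

The sketch relies on a header being distinguishable from a pointer, which the enum tag provides here; a real implementation needs some other way to tell the two apart in raw memory.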

In this algorithm it's difficult to support interior pointers (even if the
offset is a known constant), so the BigInt representation is refactored. Data
pointers now point to Blob headers rather than Blob payloads. When calling a
tommath function we stack-allocate an mp_int struct with the data pointer
pointing to a blob payload and pass that to the library function. On return we
update the BigInt fields with the values in the stack-allocated struct.
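
The calling convention described above could look roughly like the sketch below. Everything here is an assumption made for illustration: `MpInt` is assumed to mirror the fields of libtommath's `mp_int` with a `u32` digit type, `Blob` and `BigInt` are simplified stand-ins for the real heap objects, a `Box` stands in for a heap pointer to the Blob header, and `mock_mp_set` is a stand-in for a library call, not a real tommath function. The sketch also assumes the callee does not reallocate the digits.

```rust
/// Assumed to mirror the fields of libtommath's `mp_int`. The digit type
/// is taken to be u32 here; the real `mp_digit` depends on configuration.
#[repr(C)]
struct MpInt {
    used: i32,
    alloc: i32,
    sign: i32,
    dp: *mut u32,
}

/// Hypothetical Blob: a header word followed by the payload.
struct Blob {
    len_words: u32,
    payload: [u32; 4],
}

/// Hypothetical BigInt object: `data` refers to the Blob header, not to
/// the payload, so the GC never sees an interior pointer.
struct BigInt {
    used: i32,
    alloc: i32,
    sign: i32,
    data: Box<Blob>,
}

/// Build a temporary `MpInt` on the stack whose `dp` points at the blob
/// payload, run `f` on it, then copy the scalar fields back.
fn with_mp_int(b: &mut BigInt, f: impl FnOnce(&mut MpInt)) {
    let mut tmp = MpInt {
        used: b.used,
        alloc: b.alloc,
        sign: b.sign,
        dp: b.data.payload.as_mut_ptr(),
    };
    f(&mut tmp);
    b.used = tmp.used;
    b.alloc = tmp.alloc;
    b.sign = tmp.sign;
}

/// Stand-in for a library function: sets the value to a single digit.
fn mock_mp_set(a: &mut MpInt, digit: u32) {
    // SAFETY: the caller guarantees `dp` points to at least one digit.
    unsafe { *a.dp = digit };
    a.used = 1;
    a.sign = 0;
}
```

A version that allows the library to reallocate would also have to map a changed `dp` back to the corresponding Blob header on return.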

Bitmap and mark stack implementations are tested using quickcheck, and can be
used in other GC implementations.

Perf test results:

In terms of gas, 2 tests regressed and the mean change is +3.1%.
In terms of size, 2 tests regressed and the mean change is +5.0%.

Some other numbers from the perf tests:

| test | max heap (words) | max bitmap (words) | max mark stack (words) | Wasm pages |
|---|---|---|---|---|
| reversi | 5,372 | 168 | 64 (default) | 3 |
| qr | 13,453,988 | 420,438 | 64 (default) | 850 |

Note that the allocation for bookkeeping (mainly the bitmap; the mark stack
never grows above 64 words in the current tests) is heap_size / 32 (3.12%).
When the residency is lower than that, the bitmap allocation exceeds the size
of the data the current collector would copy. In those cases this collector
allocates more.

It turns out this case is very common, as residency is extremely low in most
programs (I think mainly because we only do GC between messages). For example,
in "qr", when the heap size is 13,453,988 words, live data size is only 442
words. That's extremely low, 0.003%.

As a result, in the collection described above, the current GC allocates 442
words to copy the live data, but the new collector allocates 420,438 words for
the bitmap. In terms of Wasm pages, the current collector allocates 827 pages
while the new one allocates 850.

In general, whenever residency is smaller than 3.12%, the new collector will
allocate more than the current one.
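
The break-even point can be checked with a small calculation. This is a sketch under the same assumptions as the text (32-bit words, one mark bit per heap word, mark stack ignored); the function names are hypothetical.

```rust
/// Bits per word. Assumption: 32-bit Wasm words.
const WORD_BITS: u64 = 32;

/// Bitmap size in words for a heap of `heap_words` words.
fn bitmap_words(heap_words: u64) -> u64 {
    (heap_words + WORD_BITS - 1) / WORD_BITS
}

/// True when the compacting collector's bitmap is larger than the
/// to-space a copying collector would need for `live_words` of live data.
fn compacting_allocates_more(heap_words: u64, live_words: u64) -> bool {
    bitmap_words(heap_words) > live_words
}
```

For the "qr" row of the table (13,453,988 heap words, 442 live words), `bitmap_words` gives 420,438, far more than the 442 words a copy would need.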


TODOs:

  • Growing the mark stack doesn't work on CI because the implementation requires mark stack allocations to be at consecutive heap locations, which is not the case when we use malloc/free-style allocation in the test suite. Not sure how to fix this yet.

osa1 added 30 commits December 21, 2020 10:04
Native uses of alloc_array are duplicated in the native test files.
Those will be removed when the tests are ported to Rust.
This also changes how we store the next free index a little bit. The
constant `FULL` is gone; when the next free location is equal to the
array length, that's how we know the table is full now. Also, `FREE_SLOT`
is no longer the next free location shifted left; it holds the next free
location directly (not shifted).
- Introduced `as_blah` methods to convert a SkewedPtr or *Obj to other
  object types. These methods check the type in debug mode.

- Improve error reporting, we now show details in assertion failures

- Temporarily link with debug runtime to enable sanity checking by
  default for now
this is easier to build on all platforms, compared to building for 32bits.

It also means we are testing things closer to what we really run.
Base automatically changed from osa1/rts_rust_port to master January 19, 2021 15:13

nomeata commented Jan 29, 2021

This PR contains both a GC rewrite and something about BigInt. The latter I have now put in its own PR (#2280, written from scratch), so beware when updating this to the latest master.

@nomeata nomeata removed their request for review March 9, 2021 16:48

nomeata commented May 6, 2021

Thanks for picking this up again. Do you want to mark it as draft while you work on it? (If only to help me to make sense of my inbox)

@osa1 osa1 marked this pull request as draft May 6, 2021 10:17

osa1 commented May 6, 2021

Done.

@osa1 osa1 mentioned this pull request May 6, 2021

crusso commented May 6, 2021

Cool - I guess you've already adapted the BigInt layout?


osa1 commented May 6, 2021

> Cool - I guess you've already adapted the BigInt layout?

Tbh I don't know how BigInts are implemented now, but I checked the current collector (in the master branch) and adapted the mark-compact collector based on that. I should study the new BigInt implementation.

Related PR: #2522


nomeata commented May 6, 2021

The current BigInt layout needs no special handling from the GC.


osa1 commented Jun 10, 2021

This will be merged with #2522.

@osa1 osa1 closed this Jun 10, 2021
mergify bot pushed a commit that referenced this pull request Jun 15, 2021
This merges #2223 and master and enables the new collector with a new moc flag
`--compacting-gc`. The goal is to (1) avoid bitrot in #2223 (2) allow easy
testing and benchmarking of different collectors.

We include both collectors in the Wasms. Binary sizes grow by between 0.8%
(CanCan backend) and 1.8% (in simple tests in `run-drun`).

Some benchmark results here:
#2033 (comment)

An improvement to the compacting GC is currently being implemented in branch
`osa1/compacting_gc_3`.