Mark phase prefetching. #73375

PeterSolMS · 2022-08-04T14:33:25Z

This adds prefetching to the mark phase.

The idea is that once we have established that an object is in one of the generations we want to collect, we prefetch its memory before we determine whether we have marked it already. This is because the mark bit is in the object itself, and thus requires accessing the object's memory.

As the prefetching will take some time to take effect, we park the object in a queue (see type mark_queue_t below). We then retrieve an older object from the queue, and test whether it has been marked. This should be faster, because we have issued a prefetch for this older object's memory a while back.

In quite a few places we now need to drain the queue to ensure correctness - see calls to drain_mark_queue().
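The queueing scheme described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in gc.cpp: the names toy_object, toy_mark_queue, queue_and_get_older and mark_all, as well as the queue size of 16, are made up for the example, and the mark bit is modeled as a plain bool field instead of a bit in the object header. Only get_next_marked borrows its name from this PR.

```cpp
#include <cstddef>

// Hypothetical stand-in for a heap object. In the real GC the mark bit lives
// in the object itself, which is why testing it touches the object's memory.
struct toy_object
{
    bool marked = false;
};

inline void prefetch(const void* addr)
{
#if defined(__GNUC__) || defined(__clang__)
    __builtin_prefetch(addr);
#else
    (void)addr; // no-op fallback for compilers without the builtin
#endif
}

// Small fixed-size ring of pending objects (size chosen arbitrarily here).
class toy_mark_queue
{
    static constexpr size_t slot_count = 16;
    toy_object* slots[slot_count] = {};
    size_t index = 0;

public:
    // Prefetch o, park it, and test the older object that occupied the slot.
    // Returns the older object if it was newly marked, nullptr otherwise.
    toy_object* queue_and_get_older(toy_object* o)
    {
        prefetch(o);
        toy_object* older = slots[index];
        slots[index] = o;
        index = (index + 1) % slot_count;
        return mark_if_unmarked(older);
    }

    // Retrieve one newly marked object from the queue.
    // Returns nullptr once every slot has been emptied.
    toy_object* get_next_marked()
    {
        for (size_t i = 0; i < slot_count; i++)
        {
            toy_object* o = slots[index];
            slots[index] = nullptr;
            index = (index + 1) % slot_count;
            if (toy_object* result = mark_if_unmarked(o))
                return result;
        }
        return nullptr;
    }

private:
    static toy_object* mark_if_unmarked(toy_object* o)
    {
        if (o == nullptr || o->marked)
            return nullptr;
        o->marked = true;
        return o;
    }
};

// Marks every object in refs and returns how many were newly marked.
size_t mark_all(toy_object** refs, size_t count)
{
    toy_mark_queue queue;
    size_t newly_marked = 0;
    for (size_t i = 0; i < count; i++)
    {
        if (queue.queue_and_get_older(refs[i]) != nullptr)
            newly_marked++; // a real collector would now trace this object's references
    }
    // Drain step: objects still parked in the queue would otherwise stay unmarked.
    while (queue.get_next_marked() != nullptr)
        newly_marked++;
    return newly_marked;
}
```

The final loop in mark_all is the toy equivalent of the drain step: any code that relies on all reachable objects being marked has to empty the queue first.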

ghost · 2022-08-04T14:33:56Z

Tagging subscribers to this area: @dotnet/gc
See info in area-owners.md if you want to be subscribed.

Author:	PeterSolMS
Assignees:	-
Labels:	`area-GC-coreclr`
Milestone:	-

adityamandaleeka · 2022-08-04T16:35:27Z

src/coreclr/gc/gc.cpp

+
+// retrieve a newly marked object from the queue
+// returns nullptr if there is no such object
+uint8_t* mark_queue_t::drain()


nit: Should this be renamed to something that better implies that it just marks one object and returns it (maybe mark_next or something)? drain implies completely emptying it out IMO.

Maoni0 · 2022-08-04

I think of this method as draining; it just needs to return whenever there are still objects to mark. It does drain the slot_table at the end, when all slots become null (and that's the end goal: to have all slots become null).

PeterSolMS · 2022-08-05

How about get_next_marked, a slight variation on Aditya's suggestion?

Maoni0 · 2022-08-05

Then I would probably go with get_next_to_mark, since you are getting an object to do the mark work on.

PeterSolMS · 2022-08-09

Well, the object returned is already marked, so get_next_to_mark doesn't seem entirely right either. I can't think of anything better than get_next_marked.

jkotas · 2022-08-04T22:33:50Z

src/coreclr/gc/gc.cpp

+#endif
+    _mm_prefetch((const char*)addr, _MM_HINT_T0);
+#elif defined(TARGET_ARM64) && defined(TARGET_WINDOWS)
+    __prefetch((const char*)addr);


__builtin_prefetch should work on non-Windows

https://clang.llvm.org/docs/LanguageExtensions.html describes the arguments.

Maoni0 · 2022-08-05

Ah right, on Linux we should use __builtin_prefetch.

PeterSolMS · 2022-08-05

Thanks Jan. It looks like calling it with the default arguments should be just fine for our purposes.
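For readers following this thread, a per-compiler prefetch wrapper along the lines discussed might look like the sketch below. It is an illustration only and not the code that was merged: prefetch_hint and sum_with_prefetch are hypothetical names, and the sketch keys off compiler-defined macros (__GNUC__, __clang__, _MSC_VER, _M_X64, _M_ARM64) because the TARGET_* macros come from the runtime's own build system. Calling __builtin_prefetch with a single argument uses its documented defaults (prefetch for read, highest temporal locality).

```cpp
#include <cstddef>

#if defined(_MSC_VER) && defined(_M_X64)
#include <xmmintrin.h> // _mm_prefetch, _MM_HINT_T0
#elif defined(_MSC_VER) && defined(_M_ARM64)
#include <intrin.h>    // __prefetch
#endif

// Issue a best-effort prefetch hint for addr; compiles to nothing where
// no suitable intrinsic is known.
inline void prefetch_hint(const void* addr)
{
#if defined(__GNUC__) || defined(__clang__)
    __builtin_prefetch(addr); // defaults: rw = 0 (read), locality = 3
#elif defined(_MSC_VER) && defined(_M_X64)
    _mm_prefetch(static_cast<const char*>(addr), _MM_HINT_T0);
#elif defined(_MSC_VER) && defined(_M_ARM64)
    __prefetch(addr);
#else
    (void)addr;
#endif
}

// Example use: sum an array while hinting at data a few elements ahead.
long sum_with_prefetch(const int* data, size_t count)
{
    const size_t distance = 8; // arbitrary look-ahead, for illustration only
    long total = 0;
    for (size_t i = 0; i < count; i++)
    {
        if (i + distance < count)
            prefetch_hint(&data[i + distance]);
        total += data[i];
    }
    return total;
}
```

A prefetch is only a hint, so the wrapper never changes program results; whether it helps depends on how much work happens between issuing the hint and touching the memory.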

Maoni0 · 2022-08-05T06:56:07Z

Some results I measured on a first-party production workload. I had 3 machines: baseline ran without the change, and 2 machines (prefetch0 and prefetch1) ran with it.
The workload runs with Server GC and 48 heaps. samples/mb is the number of CPU samples in mark_phase per MB promoted, so lower is better.

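gen0 GCs

| run | machine | samples/mb | diff % |
|------|-----------|------|--------|
| run0 | baseline0 | 5.49 | |
| run0 | prefetch0 | 5.29 | -3.64% |
| run0 | prefetch1 | 5.21 | -5.10% |
| run1 | baseline1 | 5.75 | |
| run1 | prefetch0 | 5.47 | -4.87% |
| run1 | prefetch1 | 5.52 | -4.00% |

gen1 GCs

| run | machine | samples/mb | diff % |
|------|-----------|------|--------|
| run0 | baseline0 | 3.46 | |
| run0 | prefetch0 | 3.3 | -4.62% |
| run0 | prefetch1 | 3.25 | -6.07% |
| run1 | baseline1 | 3.6 | |
| run1 | prefetch0 | 3.35 | -6.94% |
| run1 | prefetch1 | 3.39 | -5.83% |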

Mark phase prefetching.

a9d0a86

PeterSolMS requested review from cshung, Maoni0, mangod9 and mrsharm August 4, 2022 14:33

dotnet-issue-labeler bot added the area-GC-coreclr label Aug 4, 2022

ghost assigned PeterSolMS Aug 4, 2022

Fix Linux arm64 build issue with __prefetch.

38b4775

adityamandaleeka reviewed Aug 4, 2022


Maoni0 approved these changes Aug 4, 2022


jkotas reviewed Aug 4, 2022


Rework implementation of Prefetch on the different architectures/operating systems. Checkin tests showed issue traced to missing drain_mark_queue() call in WKS version of scan_dependent_handles.

9418dec

This was referenced Aug 8, 2022

Infra improvements for Helix #68176

Closed

GC/API/GC/GetGCMemoryInfo/GetGCMemoryInfo.sh test failing intermittently on CoreCLR Linux ARM32 #73247

Closed

PeterSolMS added 2 commits August 9, 2022 10:51

Address code review feedback: rename mark_queue_t::drain to mark_queue_t::get_next_marked.

37f4808

Merge with main.

63390f8

PeterSolMS merged commit 4570911 into dotnet:main Aug 9, 2022

jkotas mentioned this pull request Aug 10, 2022

slot_table[slot_index] == nullptr - dynamic atexit destructor for 'WKS::gc_heap::mark_queue #73679

Closed

This was referenced Aug 11, 2022

Regression from "Mark phase prefetching" #73782

Closed

[Perf] Windows/x64: 188 Regressions from GC changes #74014

Closed

AndyAyersMS mentioned this pull request Aug 30, 2022

Regressions in System.Xml.Linq.Perf_XName (FullPGO) #64626

Closed

ghost locked as resolved and limited conversation to collaborators Sep 9, 2022
