JIT: Don't use write barriers for frozen objects #76135
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
Issue Details
This PR implements @jkotas's idea #76112 (comment) to skip write barriers for frozen objects, e.g.:
string field;
void Test(string[] array)
{
    array[0] = "str1";
    field = "str";
}
codegen diff: https://www.diffchecker.com/J1FH2GlX
jit-diffs/spmi diffs are promising (up to
TODO:
I decided to file a separate PR since it's not directly related to #76112 and also, it's not possible to run CI diffs in that PR due to JIT-EE changes.
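For readers who want to try the pattern from the description, here is a minimal self-contained sketch of it. The class name `LiteralStores` and the `Field` accessor are hypothetical and only exist to make the snippet compile on its own. Whether the two literals actually end up as frozen objects, and whether the JIT then drops the write barrier, is a runtime implementation detail that this code cannot observe.

```csharp
using System;

// Hypothetical wrapper around the Test method from the PR description.
public sealed class LiteralStores
{
    // Reference-type field: a store here normally goes through a GC write barrier.
    private string field = string.Empty;

    public string Field => field;

    public void Test(string[] array)
    {
        array[0] = "str1"; // array element store of a string literal
        field = "str";     // instance field store of a string literal
    }
}
```

Calling `new LiteralStores().Test(new string[1])` behaves the same with or without the optimization; only the generated machine code for the two stores differs.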
|
Does this have an impact on the memory model @VSadov just documented? |
I assume it's purely an implementation detail, as frozen objects live forever. There is a thread going on where it's being discussed whether to expose it as a concept for users or not (hopefully, nobody will ever rely on string objects always being pinned by default). Or are you asking about this case:
myIntField = 42;
myObjField = myObj;
myIntFieldInited = true;
where the middle assignment is supposed to work as a full memory barrier (as a side effect)? |
Do we have any GC stress runs with |
cc @dotnet/gc |
It is about this paragraph: https://github.com/dotnet/runtime/blob/aaeadfc1e1078139b657037dafd386212e06b94b/docs/design/specs/Memory-model.md#object-assignment I do not think that what's written in this paragraph is going to hold for mutable frozen objects with this optimization. We have mutable frozen objects in native AOT only currently, but we may want to have them in CoreCLR as well. @VSadov Thoughts? |
We need to ensure that the thread which creates an immortal object performs a write fence before making the object available to other threads. The actual write barrier is unnecessary. |
I personally don't see any pipelines but I assume GC team knows better, as for local testing - it always throws |
I thought that the constraint is - no writeable ref fields/elements. Also the part that we do not trace into immortal objects implies that they cannot root regular objects. |
Right. Specifically: "Object assignment to a location potentially accessible by other threads is a release with respect to write operations to the instance’s fields and metadata" |
Good question, technically that statement is violated here, right @VSadov (I assume when JIT optimizes user's code and replaces |
SPMI diffs are ready: |
field = null; is ok to do as ordinary assignment as null is not an object - it can’t miss a field assignment or a method table |
I meant the initial user's code where it's not null, so the user wrote code like this:
o.a = 42;
o.b = GetMyObject();
o.c = true;
where the assignment to b is used as a commit (referring to your doc), but after JIT optimizations it turns out that GetMyObject is inlined as null |
I vaguely remember the shadow heap for write barrier debugging is not implemented for regions, I might be wrong, it has been quite a while. |
Right, it attempts to reserve 256 GB of memory via VirtualAlloc; I'll check with the standalone one |
o.a = 42;
o.b = GetMyObject();
o.c = true;
I am confused - what is returned by
We do not recommend relying on side effects of object assignments for general-purpose ordering, as the compiler might optimize and potentially reorder or coalesce ordinary assignments. if |
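To make that recommendation concrete, here is a hedged sketch of the same three-store pattern with an explicit commit flag instead of relying on the object assignment as a fence. The `Holder` type and its member names are hypothetical and not from the PR. `Volatile.Write` and `Volatile.Read` are the standard `System.Threading` APIs for release and acquire ordering.

```csharp
#nullable enable
using System.Threading;

// Hypothetical type: publishes 'a' and 'b' behind an explicit commit flag.
public sealed class Holder
{
    private int a;
    private object? b;
    private bool committed;

    public void Publish(object? value)
    {
        a = 42;
        b = value; // may be null or a frozen object; no ordering is assumed from this store
        // Release: the stores to 'a' and 'b' become visible before the flag does.
        Volatile.Write(ref committed, true);
    }

    public bool TryRead(out int aValue, out object? bValue)
    {
        // Acquire: pairs with the Volatile.Write in Publish.
        if (Volatile.Read(ref committed))
        {
            aValue = a;
            bValue = b;
            return true;
        }

        aValue = 0;
        bValue = null;
        return false;
    }
}
```

With this shape the ordering no longer depends on what `GetMyObject` returns or on whether the JIT elides a write barrier for the middle store.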
@VSadov thanks for the explanation, this, basically, answers @stephentoub's question that this PR doesn't change anything in the memory model and still satisfies all the statements, am I right? Regarding HEAPVERIFY_BARRIERCHECK - I tried to run several OSS projects I have locally with standalone gc and these: DOTNET_HeapVerify=2 |
Yes, as of now, we do not allow mutable GC references on frozen objects for various reasons. In a test case, I commented on the restriction of having references on Frozen segments. runtime/src/tests/GC/API/Frozen/Frozen.cs Lines 163 to 172 in 2600909
In view of this, we do have an opportunity. With regions only, we have a single immutable reserve range, so we might be able to lift the first constraint with minimal effort. The impact could be significant: imagine someone could put their entire cache object graph into frozen segments and have it basically ignored during GC collections. That could be a huge saving in Gen2 time for those types of applications. |
It looks like the whole string interning deal happens under |
FrozenObjectHeapManager also has its own lock
And CoreCLR currently doesn't allocate objects with mutable GC references on FOH ( |
/azp run runtime-coreclr outerloop, runtime-coreclr gcstress0x3-gcstress0xc, runtime-extra-platforms |
Azure Pipelines successfully started running 3 pipeline(s). |
That may not be helpful if it protects only creation of the string as it would not guarantee order in which object and its content are committed. (so if you read something without a lock you cannot assume that having an instance reference implies the instance is consistent) However |
NativeAOT freezes a readonly graph of frozen objects. They may refer to each other. |
Yes. We are ok here. A string literal becomes shared with other threads when it is added to the literal map - that happens even before the string is exposed to managed code. Other threads may already use the string ref when JIT-ting other methods. We are ok with this as the literal map is accessed under a lock. The first thing the runtime/src/coreclr/vm/stringliteralmap.cpp Line 162 in dc0e2d9
Note: it does not matter if a string was allocated as immortal or ordinary pinned. The same requirements and the same reason why we are ok. |
Prior to this PR, we wouldn't treat those writes into the array as having release semantics? We've never documented our memory model (until your doc in flight now). As a result devs rely on observed behavior as well as correctly or incorrectly conveyed word of mouth. |
Yes it might, because the example relies on implementation details that are not a part of the contract. Many optimizations have the same problem (dependence on implementation details) and we should look carefully at the potential for breakage. |
My point is there is no contract because we never wrote one down. |
I think if the doc existed before, we would still be in the same situation - thinking whether this might break a lot of code. I think we are ok, since we allowed similar optimizations in the past (i.e. assignments of This seems similar to when we made readonly statics not modifiable. It also had potential to break some corner cases, but did we document that as a breaking change? |
I think we are in agreement on main points, but correct me if I am wrong:
|
We did not have a formal breaking change documentation process at that time. We just tagged the breaking change PRs with a label. The readonly statics PR did have that label. |
Added When you commit this breaking change:
Tagging @dotnet/compat for awareness of the breaking change. |
BTW: The GC has this code since forever:
It suggests that this optimization was implemented in some existing runtime that used frozen objects (.NET Framework 2.0 or .NET Native for UWP). |
I would like to see the memory model spec that you are working on updated to be more precise in this area. Another example:
It may be better to write the breaking change for the memory model spec instead of this detail. Say something like that there was a lot of ambiguity around the guarantees here and that the runtimes will only provide the guarantees written down going forward. |
Sounds good. |
I see. We assume that a frozen obj is immutable and thus we can elide the fence, but what if we allow mutations? Also what about doing |
Right. The current change in this PR is not doing that. It will need augmenting to make it work. I think we will need a new method on the JIT interface that returns whether the frozen object is immutable.
These are always full fences. I do not see a problem with these. |
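Assuming the truncated question above is about `Interlocked`-style operations, the following sketch shows why they are unaffected: publication through `Interlocked.CompareExchange` is a full fence regardless of how the plain reference store would be compiled. The `LazyTable` class and its contents are hypothetical and only serve as an illustration.

```csharp
#nullable enable
using System.Threading;

// Hypothetical lazily initialized table published with a full-fence operation.
public static class LazyTable
{
    private static int[]? table;

    public static int[] Get()
    {
        int[]? current = Volatile.Read(ref table);
        if (current != null)
        {
            return current;
        }

        int[] fresh = new int[] { 1, 2, 3 };

        // Full fence: the array contents are visible to any thread that observes the reference.
        // Only the first publisher wins; later callers get the already published instance.
        int[]? prior = Interlocked.CompareExchange(ref table, fresh, null);
        return prior ?? fresh;
    }
}
```

Every caller of `LazyTable.Get()` receives the same array instance once one thread has published it.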
So if I understand it correctly, you want me to add a new method something like PS: Correct me if I'm wrong but currently there are no frozen objects JIT works with which are mutable. |
It would not need to be in this change as we do not have mutable frozen objects yet. Yes, a new API is needed to detect mutables, once we have them. It should only affect the emit - should be stlr on arm64, dmb ishst on arm32, no effect on x64. |
There are no mutable frozen objects exposed to the JIT as of this PR, so this PR is fine. You do not need to introduce the new JIT/EE interface as part of this PR. There will be mutable frozen objects exposed to the JIT after #76112 (comment). For example, you can try #76135 (comment). We will need the new JIT/EE interface method as part of that change. |
CI failures:
|
Merging since #76112 depends on it (to introduce the "is mutable" helper there). I'm going to file the breaking-change note afterwards, as requested. Hopefully, I will be able to attach performance improvements from https://github.com/dotnet/perf-autofiling-issues/issues in a week. |
This particular PR does not need to be tracked as a breaking change, right?
|
This reverts commit 315bdd4.
…en objects" (#76649) This PR reverts #76235 with a few manual modifications: The main issue why the initial PRs (#75573 and #76135) were reverted has just been resolved via #76251: GC could collect some associated (with frozen objects) objects as unreachable, e.g. it could collect a SyncBlock, WeakReferences and Dependent handles associated with frozen objects which could (e.g. for a short period of time) be indeed unreachable but return back to life after. Co-authored-by: Jan Kotas <[email protected]> Co-authored-by: Jakob Botsch Nielsen <[email protected]>
This PR implements @jkotas's idea #76112 (comment) to skip write-barriers for frozen objects, e.g.:
codegen diff: https://www.diffchecker.com/J1FH2GlX
jit-diffs/spmi diffs are promising (up to Total bytes of delta: -719913 (-0.62 % of base)): https://dev.azure.com/dnceng-public/public/_build/results?buildId=28997&view=ms.vss-build-web.run-extensions-tab
TODO:
I decided to file a separate PR since it's not directly related to #76112 and also, it's not possible to run CI diffs in that PR due to JIT-EE changes.