JIT: Use faster write barriers when we know the target address is on the heap #97953
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
Issue Details
Closes #97534
TL;DR: JIT has an optimization that tries to simplify/remove write barriers, but that optimization happens too late to rely on VN/SSA/Assertions, so CSE can easily ruin it by moving addresses or values to locals, which forces that opt to give up and use the checked write barrier. This PR assists that optimization from assertion prop. Example:
public struct Slot
{
public object Item;
public int SequenceNumber;
}
static void Test(Slot[] arr, object o)
{
arr[0].Item = o;
arr[0].SequenceNumber = 1;
}
Codegen for Test:
; Method LdtokenRepro:Test(Slot[],System.Object) (FullOpts)
G_M35662_IG01: ;; offset=0x0000
push rbx
sub rsp, 32
G_M35662_IG02: ;; offset=0x0005
cmp dword ptr [rcx+0x08], 0
jbe SHORT G_M35662_IG04
lea rbx, bword ptr [rcx+0x10]
mov rcx, rbx
- call CORINFO_HELP_CHECKED_ASSIGN_REF
+ call CORINFO_HELP_ASSIGN_REF
mov dword ptr [rbx+0x08], 1
G_M35662_IG03: ;; offset=0x001E
add rsp, 32
pop rbx
ret
G_M35662_IG04: ;; offset=0x0024
call CORINFO_HELP_RNGCHKFAIL
int3
; Total bytes of code: 42
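To make the diff above concrete, here is a minimal sketch of why the unchecked helper is cheaper. It is a toy model, not the runtime's actual barrier: `ToyHeap`, `assignRef`, and `checkedAssignRef` are hypothetical names, the heap is assumed to be one contiguous range, and the store itself plus the generation checks a real barrier performs are left out. The only point is that the checked form pays for a range test the unchecked form skips.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of a GC heap (hypothetical, for illustration only):
// one contiguous address range plus counters for observing barrier work.
struct ToyHeap {
    std::uintptr_t lowest;   // inclusive lower bound of the heap range
    std::uintptr_t highest;  // exclusive upper bound of the heap range
    int cardsMarked;         // number of stores recorded for the GC
    int rangeChecks;         // number of "is dst on the heap?" tests executed

    bool contains(std::uintptr_t addr) const {
        return addr >= lowest && addr < highest;
    }
};

// Models the unchecked barrier: the caller (here, the JIT) has already
// proven that dst is on the heap, so no range test is executed.
inline void assignRef(ToyHeap& heap, std::uintptr_t dst) {
    assert(heap.contains(dst) && "unchecked barrier requires a heap address");
    ++heap.cardsMarked;
}

// Models the checked barrier: dst may be a stack or unmanaged address,
// so the helper has to test it before doing any GC bookkeeping.
inline void checkedAssignRef(ToyHeap& heap, std::uintptr_t dst) {
    ++heap.rangeChecks;
    if (!heap.contains(dst)) {
        return;  // not on the heap: nothing for the GC to track
    }
    ++heap.cardsMarked;
}
```

In the example above, `arr[0].Item` is an array element and therefore always on the heap, so the range test in the checked helper can never fail and only adds cost.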
@AndyAyersMS @jakobbotsch cc @SingleAccretion @dotnet/jit-contrib PTAL. Diffs. Diffs aren't too big because Closes #97534
ping @AndyAyersMS - I presume it should help you with CSE work since I analyzed jit-diff on top of BCL and this PR removed ~2400 checked write barriers.
if (arg1Type != GCInfo::WriteBarrierForm::WBF_BarrierUnknown)
{
    return arg1Type;
}
NOTE: this has an assumption that "addressOfLocal + addressOfHeap" is UB
Made it a bit more conservative (it didn't affect diffs) by requiring the 2nd argument to be a non-handle constant. So, if the address is GT_ADD(op1, op2) and one of the operands is either address-of-local or address-within-heap, the other operand must be a non-handle constant.
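The rule described in this comment can be sketched roughly as follows. This is a standalone toy, not the JIT's code: `Node`, `NodeKind`, `BarrierForm`, and `classifyAddress` are hypothetical stand-ins for the real GenTree and `GCInfo::WriteBarrierForm` types, the "address within heap" fact is assumed to be supplied by an earlier phase, and recursing into nested adds is an assumption of the sketch.

```cpp
#include <cassert>

// Hypothetical stand-ins for the JIT's node kinds and barrier forms.
enum class NodeKind { LocalAddr, HeapAddr, Constant, HandleConstant, Add, Other };
enum class BarrierForm { Unknown, NoBarrier, Unchecked };

struct Node {
    NodeKind    kind;
    const Node* op1;  // only used when kind == Add
    const Node* op2;  // only used when kind == Add
};

// Classify the target address of a store:
//   address of a local     -> NoBarrier (the target is on the stack)
//   address known on heap  -> Unchecked (the range check can be skipped)
//   ADD(base, offset)      -> same as base, but only if the other operand
//                             is a non-handle constant
//   anything else          -> Unknown (caller falls back to the checked barrier)
inline BarrierForm classifyAddress(const Node* addr) {
    assert(addr != nullptr);
    switch (addr->kind) {
        case NodeKind::LocalAddr:
            return BarrierForm::NoBarrier;
        case NodeKind::HeapAddr:
            return BarrierForm::Unchecked;
        case NodeKind::Add: {
            assert(addr->op1 != nullptr && addr->op2 != nullptr);
            // Try both operand orders: (base, offset) and (offset, base).
            if (addr->op2->kind == NodeKind::Constant) {
                BarrierForm form = classifyAddress(addr->op1);
                if (form != BarrierForm::Unknown) {
                    return form;
                }
            }
            if (addr->op1->kind == NodeKind::Constant) {
                BarrierForm form = classifyAddress(addr->op2);
                if (form != BarrierForm::Unknown) {
                    return form;
                }
            }
            return BarrierForm::Unknown;
        }
        default:
            return BarrierForm::Unknown;
    }
}
```

Requiring the other operand to be a plain constant means shapes such as "address of local + address on heap" or "heap address + handle" are never classified, so the sketch does not depend on those sums being undefined behavior.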
LGTM. Thanks!
LGTM. Thanks for fixing this.
We could / should have perf score reflect the cost of helper calls (likewise with gtCostEx).
I agree, I assume currently all unknown calls have the same cost (assuming same arguments)? Do we want to make all helpers cheaper than unknown calls or more expensive? I presume this could result in some diffs.
Failures are known + PR passed all checks before adding debug JITDUMPs
Closes #97534
TL;DR: JIT has an optimization that tries to simplify/remove write barriers, but that optimization happens too late to rely on VN/SSA/Assertions, so CSE can easily ruin it by moving addresses or values to locals and forcing that opt to give up and use the checked write barrier. This PR assists that optimization from assertion prop (#97901 did it for nongc objects as values for such indirs).