Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: don't dead store untracked OSR locals based on ref count #89743

Merged
merged 2 commits into from
Aug 1, 2023

Conversation

AndyAyersMS
Copy link
Member

The local var ref count for OSR locals can be misleading, since some of the appearances may be in the "tier0" parts of the methods and won't be visible once we trim the OSR method down to just the part we're going to compile.

So, we can't rely on ref counts to enable a dead store of an OSR local.

Fixes #89666.

The local var ref count for OSR locals can be misleading, since some
of the appearances may be in the "tier0" parts of the methods and
won't be visible once we trim the OSR method down to just the part
we're going to compile.

So, we can't rely on ref counts to enable a dead store of an OSR local.

Fixes dotnet#89666.
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jul 31, 2023
@ghost ghost assigned AndyAyersMS Jul 31, 2023
@ghost
Copy link

ghost commented Jul 31, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

The local var ref count for OSR locals can be misleading, since some of the appearances may be in the "tier0" parts of the methods and won't be visible once we trim the OSR method down to just the part we're going to compile.

So, we can't rely on ref counts to enable a dead store of an OSR local.

Fixes #89666.

Author: AndyAyersMS
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@AndyAyersMS
Copy link
Member Author

@jakobbotsch PTAL
cc @dotnet/jit-contrib

@AndyAyersMS
Copy link
Member Author

We will want to port this back to 7.0.

@AndyAyersMS
Copy link
Member Author

AndyAyersMS commented Jul 31, 2023

No diffs locally, no diffs in CI, no TP impact.

// For OSR locals we can't rely on ref counts this way, since some of the appearances
// of the local may be in the now-unseen Tier0 portion.
//
if (isDef && compRationalIRForm && (varDsc.lvRefCnt() == 1) && !varDsc.lvPinned && !varDsc.lvIsOSRLocal)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably the problem exists only for address-exposed locals (since otherwise we can see all uses, even in the OSR version). Add it as a condition to make the problem more self-evident? (I had to look at the test to understand why this would be illegal.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The interaction between OSR and flow opts here seems a little bit scary. We very much rely on a local like this one being address exposed in the OSR version, yet that only happens here because we import all IL and nothing optimizes the flow until after local morph. If we had any earlier flow opts (or if we ever tried to be smarter in local morph, e.g. via some RPO walk) then it seems like we would be able to notice that the code that exposes the local is statically unreachable.

It seems like long term it would be more robust to "lie" to the frontend of the JIT and wait with updating the flow until at least local morph has run.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably the problem exists only for address-exposed locals (since otherwise we can see all uses, even in the OSR version). Add it as a condition to make the problem more self-evident? (I had to look at the test to understand why this would be illegal.)

Yeah, I can add that.

Another option is to mark any OSR exposed local as implicitly referenced; we do that for pinned locals where we have similar issues not seeing all of the IR.

if ((corInfoTypeWithMod & CORINFO_TYPE_MOD_PINNED) != 0)
{
if ((corInfoType == CORINFO_TYPE_CLASS) || (corInfoType == CORINFO_TYPE_BYREF))
{
JITDUMP("Setting lvPinned for V%02u\n", varNum);
varDsc->lvPinned = 1;
if (opts.IsOSR())
{
// OSR method may not see any references to the pinned local,
// but must still report it in GC info.
//
varDsc->lvImplicitlyReferenced = 1;
}
}
else
{
JITDUMP("Ignoring pin for non-GC type V%02u\n", varNum);
}
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using lvImplicitlyReferenced seems likely to be a bit more robust so I think I'll change to using that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the code in liveness also need to check for lvImplicitlyReferenced? Or does something guarantee that lvRefCount for implicitly referenced locals is something indeterminately large?

Overall I guess I would assume that all address exposed locals already have to effectively be considered implicitly referenced (whether or not we set that bit), but this seems fine with me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ref count for implicit locals will be one greater than the number of actual occurrences, so will be at least 2 in this case.

@AndyAyersMS
Copy link
Member Author

This revised version will have a few diffs; the implicit weight changes the overall local weights, and this sometimes perturbs LSRA.

@jakobbotsch
Copy link
Member

This revised version will have a few diffs; the implicit weight changes the overall local weights, and this sometimes perturbs LSRA.

I suppose it is a bit conservative/unnecessary to mark these as implicitly referenced; if we prove in the OSR version that it is not address exposed, then that should effectively override the conservative tier0 exposure information, and we can then safely reason about all defs/uses.

@AndyAyersMS
Copy link
Member Author

Errors are known. Diffs

@AndyAyersMS AndyAyersMS merged commit 11126f8 into dotnet:main Aug 1, 2023
132 of 135 checks passed
@AndyAyersMS
Copy link
Member Author

/backport to release/7.0-staging

@github-actions
Copy link
Contributor

github-actions bot commented Aug 1, 2023

Started backporting to release/7.0-staging: https://github.com/dotnet/runtime/actions/runs/5732393141

@github-actions
Copy link
Contributor

github-actions bot commented Aug 1, 2023

@AndyAyersMS backporting to release/7.0-staging failed, the patch most likely resulted in conflicts:

$ git am --3way --ignore-whitespace --keep-non-patch changes.patch

Applying: JIT: don't dead store untracked OSR locals based on ref count
Applying: use implicit reference instead
error: sha1 information is lacking or useless (src/coreclr/jit/liveness.cpp).
error: could not build fake ancestor
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0002 use implicit reference instead
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
Error: The process '/usr/bin/git' failed with exit code 128

Please backport manually!

@github-actions
Copy link
Contributor

github-actions bot commented Aug 1, 2023

@AndyAyersMS an error occurred while backporting to release/7.0-staging, please check the run log for details!

Error: git am failed, most likely due to a merge conflict.

AndyAyersMS added a commit to AndyAyersMS/runtime that referenced this pull request Aug 2, 2023
Backport of dotnet#89743 to 7.0.

The local var ref count for OSR locals can be misleading, since some
of the appearances may be in the "tier0" parts of the methods and
won't be visible once we trim the OSR method down to just the part
we're going to compile.

Fix by giving OSR-exposed locals an implicit ref count.

Fixes dotnet#89666.
AndyAyersMS added a commit to AndyAyersMS/runtime that referenced this pull request Aug 2, 2023
Backport of dotnet#89743 to 7.0.

The local var ref count for OSR locals can be misleading, since some
of the appearances may be in the "tier0" parts of the methods and
won't be visible once we trim the OSR method down to just the part
we're going to compile.

Fix by giving OSR-exposed locals an implicit ref count.

Fixes dotnet#89666.
JulieLeeMSFT pushed a commit that referenced this pull request Aug 10, 2023
Backport of #89743 to 7.0.

The local var ref count for OSR locals can be misleading, since some
of the appearances may be in the "tier0" parts of the methods and
won't be visible once we trim the OSR method down to just the part
we're going to compile.

Fix by giving OSR-exposed locals an implicit ref count.

Fixes #89666.
@ghost ghost locked as resolved and limited conversation to collaborators Sep 1, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Span<byte> from local variable bug in .net 7.x Release
2 participants