Split up "regular" and AOT DSO caches #9117
I think the main thing missing here is some kind of test. Was there a SkiaSharp sample that was crashing?
Is there some permutation of this test that would have made it crash?
    public void SkiaSharpCanvasBasedAppRuns ([Values (true, false)] bool isRelease, [Values (true, false)] bool addResource)
The problem is that it stopped crashing after @daltzctr reinstalled dotnet... :) A test is definitely something we need, but I'd like to wait for @lambdageek's code to land on the runtime side, as it will simplify what we need to process for AOT libraries.
I'll take a look at it.
There were two suggestions in dotnet/runtime#104397 a. dotnet/runtime#104397 (comment) - make mono not crash when it opens a DSO but it doesn't contain a I'm not actively working on either - I was just helping to diagnose the issue for the runtime mobile team. I think (a) can be done in .NET 9, but (b) feels too big.
@jonathanpeppers I tried to make the sample crash, but couldn't :( Considering that the #9031 original repro doesn't appear to crash anymore, I don't think we have a reliable way to test for this condition. Regardless, this PR is a good change since it separates two concerns, eliminating the possibility of a hash clash (making stuff faster at the same time, since the @lambdageek understood :) Runtime changes aren't required for this PR to work, so as much as it would be nice to have |
We implemented the |
    static DSOCacheEntry* find_only_aot_cache_entry (hash_t hash) noexcept
    {
        constexpr bool IsAotCache = true;
        return find_dso_cache_entry_common<IsAotCache> (hash);
I'm not fond of a recurring pattern in this PR:
    constexpr bool IsAotCache = true;
    return find_dso_cache_entry_common<IsAotCache> (hash);
because it's hiding important differences. `find_only_aot_cache_entry()` and `find_only_dso_cache_entry()` both contain the expression `find_dso_cache_entry_common<IsAotCache>(hash)`, but they do "opposite" things, and that isn't clear from the callsite, as it requires "additional context" in the form of the preceding line.
I think it would be clearer if we did one of two things instead:
- "Manually" inline, e.g. `find_dso_cache_entry_common<true>(hash)` vs. `find_dso_cache_entry_common<false>(hash)` makes it immediately obvious that they're different, or
- Use a name that makes it clearer what is going on:
      constexpr bool UseAotCache = true;
      return find_dso_cache_entry_common<UseAotCache> (hash);
      // vs
      constexpr bool DoNotUseAotCache = false;
      return find_dso_cache_entry_common<DoNotUseAotCache> (hash);
Different things should look different at a glance.
I use this pattern to make it obvious what the `<true>` in template instantiation means and the name of the constant reflects the role of the template parameter in question. What you suggest:
    constexpr bool DoNotUseAotCache = false;
Reads to me that the call is not using the AOT cache (if I just look at the name of the constant) but if I notice its value, then it's confusing - because `!DoNotUseAotCache` logically means `use AOT cache`, and that's not what the call would do. `IsAotCache = true` or `IsAotCache = false`, OTOH, is less confusing in that the name + the value are "in agreement".
I agree that the whole pattern is not perfect, but it is trying to provide for "self documenting" code or, at least, something that's easier to understand without having to find the definition of `find_dso_cache_entry_common`.
@@ -112,7 +159,7 @@ namespace xamarin::android::internal

         hash_t name_hash = xxhash::hash (name, strlen (name));
         log_debug (LOG_ASSEMBLY, "monodroid_dlopen: hash for name '%s' is 0x%zx", name, name_hash);
-        DSOCacheEntry *dso = find_dso_cache_entry (name_hash);
+        DSOCacheEntry *dso = use_aot_cache ? find_any_dso_cache_entry (name_hash) : find_only_dso_cache_entry (name_hash);
If `use_aot_cache` is true, shouldn't this invoke `find_only_aot_cache_entry()`?
As per my later comment, this line is probably fine, it's just that the name `use_aot_cache` was misleading/confusing me.
Assuming my later comment is correct, that `use_aot_cache` means "prefer the AOT cache" instead of an either/or interpretation, then I think removing `find_any_dso_cache_entry()` entirely might be useful, then updating this callsite to:
    DSOCacheEntry *dso = nullptr;
    if (prefer_aot_cache) {
        dso = find_only_aot_cache_entry (name_hash);
    }
    if (dso == nullptr) {
        dso = find_only_dso_cache_entry (name_hash);
    }
which would help emphasize this is a "prefer AOT/look there first" vs. an "either/or" scenario.
> If `use_aot_cache` is true, shouldn't this invoke `find_only_aot_cache_entry()`?
No, because the AOT context is less, hmm, defined. The DSO cache entry context is 100% clear - it is always used from within the p/invoke resolution code, so we know the caller (MonoVM) won't be interested in anything else but libraries that contain p/invokable code. When `monodroid_dlopen` is called, OTOH, the runtime might want either the AOT shared libraries or standard shared libraries and we don't know which.
> the runtime might want either the AOT shared libraries or standard shared libraries and we don't know which.
which is part of my confusion: yes, MonoVM wants either/or, but from our perspective it means we need to check both, in a particular order.
Maybe it's just me -- it's probably just me -- but "either/or" in combination with "use_aot_cache" implies to me that either the AOT cache should be used, or the DSO cache, but not both.
Meanwhile, we do want "both"! In a particular preferential order! Hence my suggestion to rename "use_aot_cache" to "prefer_aot_cache".
It's `prefer_aot_cache` now.
@@ -19,7 +19,9 @@ namespace xamarin::android {
         void *lib_handle = dso_handle == nullptr ? nullptr : *dso_handle;

         if (lib_handle == nullptr) {
-            lib_handle = internal::MonodroidDl::monodroid_dlopen (library_name, MONO_DL_LOCAL, nullptr, nullptr);
+            // We're being called as part of the p/invoke mechanism, we don't need to look in the AOT cache
+            constexpr bool USE_AOT_CACHE = false;
Ditto: this should probably be named `DO_NOT_USE_AOT_CACHE=false`.
I changed the pattern by using an enum to parametrize the template instead. Hopefully it's less confusing.
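For readers following the thread, here is a minimal sketch of what parametrizing the lookup template with an enum could look like. The names `CacheKind`, `find_cache_entry`, `linear_find` and the two one-entry stand-in caches are hypothetical and chosen only for illustration; the enum and function names actually used in the PR are not shown in this conversation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// All names below are illustrative assumptions, not the PR's code.
using hash_t = std::uint64_t;

struct DSOCacheEntry {
    hash_t           hash;  // hash of the name used for lookup
    std::string_view name;  // real file name of the shared library
};

// The enum value says which cache is searched, so the call site
// needs no helper constant to explain a bare `true` or `false`.
enum class CacheKind { Aot, Dso };

// Tiny stand-in caches; both deliberately share the same hash.
inline constexpr std::array<DSOCacheEntry, 1> aot_cache {{ {0x1, "libaot-Example.dll.so"} }};
inline constexpr std::array<DSOCacheEntry, 1> dso_cache {{ {0x1, "libExample.so"} }};

template<std::size_t N>
const DSOCacheEntry* linear_find (const std::array<DSOCacheEntry, N> &cache, hash_t hash) noexcept
{
    for (const DSOCacheEntry &entry : cache) {
        if (entry.hash == hash) {
            return &entry;
        }
    }
    return nullptr;
}

// The cache is selected at compile time from the template argument.
template<CacheKind Kind>
const DSOCacheEntry* find_cache_entry (hash_t hash) noexcept
{
    if constexpr (Kind == CacheKind::Aot) {
        return linear_find (aot_cache, hash);
    } else {
        return linear_find (dso_cache, hash);
    }
}
```

With this shape, `find_cache_entry<CacheKind::Aot> (hash)` and `find_cache_entry<CacheKind::Dso> (hash)` look different at the call site, which is the property both reviewers were asking for.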
@@ -103,7 +150,7 @@ namespace xamarin::android::internal

     public:
         [[gnu::flatten]]
-        static void* monodroid_dlopen (const char *name, int flags, char **err, [[maybe_unused]] void *user_data) noexcept
+        static void* monodroid_dlopen (const char *name, int flags, char **err, bool use_aot_cache) noexcept
I think part of my confusion here is thinking that "use aot cache" is an either/or interpretation: either (only) AOT cache is used, or AOT cache is not used. (It's a boolean! Is this not what boolean means?)
While the logic appears to instead be: "when true, check AOT first, then fallback to non-AOT cache."
Consequently, a better name than `use_aot_cache` could be `prefer_aot_cache` or `check_aot_cache_first`.
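The "check AOT first, then fall back" reading can be sketched in isolation as follows. This is an illustration only: `Cache` and `lookup` are hypothetical names, and `std::unordered_map` stands in for whatever data structure the real caches use.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins; the real caches are not std::unordered_map.
using hash_t = std::uint64_t;
using Cache  = std::unordered_map<hash_t, std::string>;

// Returns the library name for `hash`, or nullptr when neither cache has it.
// When `prefer_aot_cache` is true the AOT cache is consulted first and the
// regular DSO cache is the fallback; when false only the DSO cache is used.
const std::string* lookup (const Cache &aot_cache, const Cache &dso_cache,
                           hash_t hash, bool prefer_aot_cache)
{
    if (prefer_aot_cache) {
        if (auto it = aot_cache.find (hash); it != aot_cache.end ()) {
            return &it->second;
        }
    }
    if (auto it = dso_cache.find (hash); it != dso_cache.end ()) {
        return &it->second;
    }
    return nullptr;
}
```

Note that the flag never excludes the DSO cache; it only decides whether the AOT cache gets a first look, which is why "prefer" describes it better than "use".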
AOT cache not used yet, so applications using AOT will crash at runtime at this point, pending a Mono runtime fix.
...and caused the ILStrip tests to fail. Doh.
Fixes: https://github.com/dotnet/android/issues/9081
Context: https://github.com/dotnet/runtime/issues/104397
Context: https://github.com/dotnet/runtime/pull/106026
Context: https://github.com/dotnet/android/issues/9081#issuecomment-2209064439
Context: c227042b6e58565e21dcd5e4ff5ea20fc4e9367c
("Compatibility is fun")
Consider a P/Invoke method declaration:
    [DllImport("libSkiaSharp")]
    static extern void gr_backendrendertarget_delete(IntPtr rendertarget);
Historically, when attempting to resolve this method, Mono would try
loading the following native libraries:
* `libSkiaSharp`
* `libSkiaSharp.so`
* `liblibSkiaSharp`
* `liblibSkiaSharp.so`
.NET for Android would further permute these names, *removing* the
`lib` prefix, for attempted compatibility in case there is a P/Invoke
into `"SkiaSharp"`.
The unfortunate occasional result would be an *ambiguity*: when told
to resolve "SkiaSharp", what should we return? The information for
`libSkiaSharp.so`, or for the *AOT'd image* of the assembly
`SkiaSharp.dll`, by way of `libaot-SkiaSharp.dll.so`?
    %struct.DSOCacheEntry {
        i64 u0x12e73d483788709d, ; from name: SkiaSharp.so
        i64 u0x3cb282562b838c95, ; uint64_t real_name_hash
        i1 false, ; bool ignore
        ptr @.DSOCacheEntry.23_name, ; name: libaot-SkiaSharp.dll.so
        ptr null; void* handle
    }, ; 71
    %struct.DSOCacheEntry {
        i64 u0x12e73d483788709d, ; from name: SkiaSharp.so
        i64 u0x43db119dcc3147fa, ; uint64_t real_name_hash
        i1 false, ; bool ignore
        ptr @.DSOCacheEntry.7_name, ; name: libSkiaSharp.so
        ptr null; void* handle
    }, ; 72
If we return the wrong thing, then the app may crash or otherwise
behave incorrectly.
Fix this by:
* Splitting up the DSO cache into AOT-related `.so` files and
everything else.
* Updating `PinvokeOverride::load_library_symbol()` so that the AOT
files are *not* consulted when resolving P/Invoke libraries.
* Updating `MonodroidDl::monodroid_dlopen()` -- which is called by
MonoVM via `mono_dl_fallback_register()` -- so that the AOT files
*are* consulted *first* when resolving AOT images.
When dotnet/runtime#104397 is fixed, it will make the AOT side of the
split more efficient as we won't have to permute the shared library
name as many times as now.
Fixes: #9081
Context: dotnet/runtime#104397
Splits up the "regular" DSO cache and the AOT DSO cache into separate collections.
This not only speeds up regular DSO usage (e.g. for `p/invoke` calls) but also
disambiguates like-named shared libraries from the two sets.
When dotnet/runtime#104397 is fixed, it will make the AOT side of the split
more efficient as we won't have to permute the shared library name as many
times as now.
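To make the disambiguation concrete, the sketch below reuses the clashing hash from the commit message (`0x12e73d483788709d`) and compares one combined cache with two split ones. `Entry`, `find_entry` and `count_matches` are hypothetical helpers, and the binary search over a hash-sorted array is an assumption about how such a cache could be searched, not a description of the actual implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// Hypothetical types and helpers, for illustration only.
using hash_t = std::uint64_t;

struct Entry {
    hash_t           hash;  // hash of a permuted lookup name
    std::string_view name;  // real file name of the shared library
};

// Binary search; `cache` must be sorted by `hash`.
// If several entries share the hash, the first one in the array wins.
const Entry* find_entry (const std::vector<Entry> &cache, hash_t hash) noexcept
{
    auto it = std::lower_bound (cache.begin (), cache.end (), hash,
        [] (const Entry &entry, hash_t h) { return entry.hash < h; });
    return (it != cache.end () && it->hash == hash) ? &*it : nullptr;
}

// Number of entries sharing `hash`; more than one means the lookup is ambiguous.
std::size_t count_matches (const std::vector<Entry> &cache, hash_t hash) noexcept
{
    return static_cast<std::size_t> (std::count_if (cache.begin (), cache.end (),
        [hash] (const Entry &entry) { return entry.hash == hash; }));
}
```

In a combined cache holding both `libaot-SkiaSharp.dll.so` and `libSkiaSharp.so` under the same hash, `count_matches` reports two candidates and `find_entry` returns whichever happens to be stored first. Once the entries live in separate AOT and DSO vectors, each lookup has exactly one candidate, and each search also runs over a shorter array.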