Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Java.Interop] Add JniMemberInfoLookup #1208

Draft
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

jonpryor
Copy link
Member

Context: c6c487b
Context: 312fbf4
Context: 2197579
Context: dotnet/android#7276

There is a desire to remove the "marshal-ilgen" component from .NET Android, which is responsible for all non-blittable type marshaling within P/Invoke (and related) invocations.

The largest source of such non-blittable parameter marshaling was with string marshaling: JNIEnv::GetFieldID() was "wrapped" by java_interop_jnienv_get_field_id:

JI_API jfieldID java_interop_jnienv_get_field_id (JNIEnv *env, jthrowable *_thrown, jclass type, const char* name, const char* signature);

which was P/Invoked within JniEnvironment.g.cs:

partial class NativeMethods {
    internal static extern unsafe IntPtr java_interop_jnienv_get_field_id (IntPtr jnienv, out IntPtr thrown, jobject type, string name, string signature);
}

and string parameter marshaling is not blittable.

Turns out™ that this particular usage of non-blittable parameter marshaling was fixed and rendered moot by:

  • 312fbf4: C#9 function pointer backend for JNIEnv invocations
  • c6c487b: "Standalone" build config to use C#9 function pointers
  • 2197579: Standalone build config is now the default

That said, this code path felt slightly less than ideal: the "top-level abstraction" for member lookups is an "encoded member", a string containing the name of the member, a ., and the JNI signature of the member, e.g.:

_members.InstanceFields.GetBooleanValue("propogateFinallyBlockExecuted.Z", this)

The "encoded member" would need to be split on ., and with c6c487b the name and signature would be separately passed to Marshal.StringToCoTaskMemUTF8(), which performs a memory allocation and converts the UTF-16 string to UTF-8.

Meanwhile, C# 11 introduced UTF-8 string literals, which allows the compiler to deal with UTF-8 conversion and memory allocation.

Enter `JniMemberInfoLookup``:

public ref struct JniMemberInfoLookup {
    public  string                  EncodedMember   {get;}
    public  ReadOnlySpan<byte>      MemberName      {get;}
    public  ReadOnlySpan<byte>      MemberSignature {get;}

    public JniMemberInfoLookup (string encodedMember, ReadOnlySpan<byte> memberName, ReadOnlySpan<byte> memberSignature);
}

JniMemberInfoLookup removes the need to call
Marshal.StringToCoTaskMemUTF8() entirely, at the cost of a more complicated member invocation:

// Old and busted:
bool value = _members.InstanceFields.GetBooleanValue("propogateFinallyBlockExecuted.Z", this);

// Eventual new hawtness:
var lookup = new JniMemberInfoLookup (
	"propogateFinallyBlockExecuted.Z",
	"propogateFinallyBlockExecuted"u8,
	"Z"u8);
bool value = _members.InstanceFields.GetBooleanValue(lookup, this);

Is It Worth It™? Maybe; see the new
JniFieldLookupTiming.FieldLookupTiming() test, which allocates a new JniPeerMembers instance and invoke
members.InstanceFields.GetFieldInfo(string) and
members.InstanceFields.GetFieldInfo(JniMemberInfoLookup). (A new JniPeerMembers instance is required because GetFieldInfo() caches the field lookup.) Using JniMemberInfoLookup is about 4% faster.

# FieldLookupTiming Timing: looking up JavaTiming.instanceIntField 10000 times
#   .InstanceMethods.GetFieldInfo(string):              00:00:02.2780667
#   .InstanceMethods.GetFieldInfo(JniMemberInfoLookup): 00:00:02.2016146

I'm not sure if this is actually worth it, especially as this will imply an increase in code size.

TODO:

  • Update JniPeerMembers.*.cs to use JniMemberInfoLookup, so that e.g. the above _members.InstanceFields.GetBooleanValue() overload exists.

  • generator changes to use JniMemberInfoLookup

jonpryor and others added 7 commits July 5, 2024 15:01
Context: c6c487b
Context: 312fbf4
Context: 2197579
Context: dotnet/android#7276

There is a desire to remove the "marshal-ilgen" component from
.NET Android, which is responsible for all non-blittable type
marshaling within P/Invoke (and related) invocations.

The largest source of such non-blittable parameter marshaling was
with string marshaling: `JNIEnv::GetFieldID()` was "wrapped" by
`java_interop_jnienv_get_field_id`:

	JI_API jfieldID java_interop_jnienv_get_field_id (JNIEnv *env, jthrowable *_thrown, jclass type, const char* name, const char* signature);

which was P/Invoked within `JniEnvironment.g.cs`:

	partial class NativeMethods {
	    internal static extern unsafe IntPtr java_interop_jnienv_get_field_id (IntPtr jnienv, out IntPtr thrown, jobject type, string name, string signature);
	}

and `string` parameter marshaling is *not* blittable.

Turns out™ that this particular usage of non-blittable parameter
marshaling was fixed and rendered moot by:

  * 312fbf4: C#9 function pointer backend for `JNIEnv` invocations
  * c6c487b: "Standalone" build config to use C#9 function pointers
  * 2197579: Standalone build config is now the default

That said, this code path felt slightly less than ideal: the
"top-level abstraction" for member lookups is an "encoded member",
a string containing the name of the member, a `.`, and the JNI
signature of the member, e.g.:

	_members.InstanceFields.GetBooleanValue("propogateFinallyBlockExecuted.Z", this)

The "encoded member" would need to be split on `.`, and with c6c487b
the name and signature would be separately passed to
`Marshal.StringToCoTaskMemUTF8()`, which performs a memory allocation
and converts the UTF-16 string to UTF-8.

Meanwhile, [C# 11 introduced UTF-8 string literals][0], which allows
the compiler to deal with UTF-8 conversion and memory allocation.

Enter `JniMemberInfoLookup``:

	public ref struct JniMemberInfoLookup {
	    public  string                  EncodedMember   {get;}
	    public  ReadOnlySpan<byte>      MemberName      {get;}
	    public  ReadOnlySpan<byte>      MemberSignature {get;}

	    public JniMemberInfoLookup (string encodedMember, ReadOnlySpan<byte> memberName, ReadOnlySpan<byte> memberSignature);
	}

`JniMemberInfoLookup` removes the need to call
`Marshal.StringToCoTaskMemUTF8()` entirely, at the cost of a more
complicated member invocation:

	// Old and busted:
	bool value = _members.InstanceFields.GetBooleanValue("propogateFinallyBlockExecuted.Z", this);

	// Eventual new hawtness:
	var lookup = new JniMemberInfoLookup (
		"propogateFinallyBlockExecuted.Z",
		"propogateFinallyBlockExecuted"u8,
		"Z"u8);
	bool value = _members.InstanceFields.GetBooleanValue(lookup, this);

Is It Worth It™?  *Maybe*; see the new
`JniFieldLookupTiming.FieldLookupTiming()` test, which allocates a new
`JniPeerMembers` instance and invoke
`members.InstanceFields.GetFieldInfo(string)` and
`members.InstanceFields.GetFieldInfo(JniMemberInfoLookup)`.
(A new `JniPeerMembers` instance is required because `GetFieldInfo()`
caches the field lookup.)  Using `JniMemberInfoLookup` is about
4% faster.

	# FieldLookupTiming Timing: looking up JavaTiming.instanceIntField 10000 times
	#   .InstanceMethods.GetFieldInfo(string):              00:00:02.2780667
	#   .InstanceMethods.GetFieldInfo(JniMemberInfoLookup): 00:00:02.2016146

I'm not sure if this is *actually* worth it, especially as this will
imply an increase in code size.

TODO:

  * Update `JniPeerMembers.*.cs` to use `JniMemberInfoLookup`, so
    that e.g. the above `_members.InstanceFields.GetBooleanValue()`
    overload exists.

  * `generator` changes to use `JniMemberInfoLookup`

[0]: https://learn.microsoft.com/dotnet/csharp/whats-new/csharp-11#utf-8-string-literals
Commit d1a9951 showed that *for just field lookup*, the idea of a
`ref struct JniMemberInfoLookup` *might* be a good idea.

Now that we've expanded `JniMemberInfoLookup` plumbing to include
*method* lookup, we can throw it into `TimingTests.cs` and see how it
compares!

The answer is that `ref struct`s and `Span<T>` are *not* magical
performance sauce with magical JIT support, and this is with CoreCLR!

	Method Lookup + Invoke Timing:
	           Traditional: 00:00:00.0175778
	            No caching: 00:00:00.0202369
	          Dict w/ lock: 00:00:00.0181357
	        ConcurrentDict: 00:00:00.0220411
	        JniPeerMembers: 00:00:00.0209174
	            JPM+Lookup: 00:00:00.0186421

	              (I)I virtual+traditional: 00:00:00.0000600
	           (I)I virtual+JniPeerMembers: 00:00:00.0000588
	               (I)I virtual+JPM+Lookup: 00:00:00.0007137

The new timings are `JPM+Lookup` and `virtual+JPM+Lookup`.

	// JniPeerMembers
	return _members.InstanceMethods.InvokeVirtualObjectMethod("toString.()Ljava/lang/String;", this, null);

	// JPM+Lookup
	var member = new JniMemberInfoLookup("toString.()Ljava/lang/String;", "toString"u8, "()Ljava/lang/String;"u8);
	ReadOnlySpan<JniArgumentValue> args = null;
	return _members.InstanceMethods.InvokeVirtualObjectMethod (member, this, args);

We see that JPM+Lookup is 11% *faster* when no arguments are involved.
Nice!

Throw an argument into the mix:

	// (I)I virtual+JniPeerMembers
	var args = stackalloc JniArgumentValue [1];
	args [0] = new JniArgumentValue (value);
	return _members.InstanceMethods.InvokeVirtualInt32Method ("VirtualIntMethod1Args.(I)I", this, args);

	// (I)I virtual+JPM+Lookup
	var member = new JniMemberInfoLookup (
			"VirtualIntMethod1Args.(I)I",
			"VirtualIntMethod1Args"u8,
			"(I)I"u8
	);
	var args = stackalloc JniArgumentValue [1];
	args [0] = new JniArgumentValue (value);
	return _members.InstanceMethods.InvokeVirtualInt32Method (member, this, new ReadOnlySpan<JniArgumentValue> (args, 1));

and we're now a *whole order of magnitude worse*, taking 12.1x longer.

Which quickly makes this idea as-is unworkable.

Maybe it's the ReadOnlySpan<T> usage, and if I went back to straight
`JniArgumentValue*` values it would be better?
@jonpryor jonpryor force-pushed the dev/jonp/jonp-zero-alloc-member-lookup branch from abfe931 to 840f04a Compare July 9, 2024 17:40
@jonpryor
Copy link
Member Author

jonpryor commented Jul 9, 2024

There isn't much point in continuing this effort, other than to show the idea and -- more importantly -- how it performs; see 840f04a. Trying to use JniMemberInfoLookup for method invocations results in an order of magnitude performance degradation. It's non-viable.

What's here uses JniMemberInfoLookup alongside ReadOnlySpan<JniArgumentValue>. Locally, on the thought the performance loss was due to ReadOnlySpan<JniArgumentValue>, I changed it all to instead be JniArgumentValue* (which is what the other e.g. Invoke*Method() methods use). This didn't help much: performance was still roughly an order of magnitude worse, e.g.

Method Lookup + Invoke Timing:
           Traditional: 00:00:00.0135348
            No caching: 00:00:00.0145592
          Dict w/ lock: 00:00:00.0132410
        ConcurrentDict: 00:00:00.0165738
        JniPeerMembers: 00:00:00.0150361
            JPM+Lookup: 00:00:00.0145659

              (I)I virtual+traditional: 00:00:00.0000447
           (I)I virtual+JniPeerMembers: 00:00:00.0000439
               (I)I virtual+JPM+Lookup: 00:00:00.0004603

0.0000439 to 0.0004603 is a 10.4x increase in time: an order of magnitude.

Without pulling out a profiler, my guess is that this is due to the use of u8 strings, or due to code size increase. Timing_ToString_JniPeerMembers() is 24 bytes, while Timing_ToString_JniPeerMembers_Lookup() is more than double that, at 55 bytes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant