panama-backend - basic integration with jextract #499
tool/src/java/mod.rs
Outdated
let mut lib_file = File::create(&lib_path)?;
for include in include_files {
    writeln!(lib_file, "#include \"{include}\"")?;
jextract suggests loading all headers into a single library file. This gets transformed into an enormous monster of Java.
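A minimal sketch of the single-header idea, assuming the list of header names is already known. The function name `write_umbrella_header` and the header names in the usage note are hypothetical; only the `#include` line format comes from the snippet under review.

```rust
use std::io::{self, Write};

/// Writes one `#include "..."` line per header into `out`, producing a single
/// header that can be handed to jextract.
/// `write_umbrella_header` is a hypothetical name used for illustration only.
pub fn write_umbrella_header<W: Write>(mut out: W, include_files: &[&str]) -> io::Result<()> {
    for include in include_files {
        writeln!(out, "#include \"{include}\"")?;
    }
    Ok(())
}
```

Writing into any `Write` implementor (a `File` in the tool, a `Vec<u8>` in a test) keeps the logic easy to check, for example by calling it with `&["Opaque.h", "MyStruct.h"]` and inspecting the resulting bytes.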
tool/src/lib.rs
Outdated
attr_validator.support.renaming = true;
attr_validator.support.disabling = true;
attr_validator.support.iterators = true;
attr_validator.support.iterables = true;
attr_validator.support.indexing = true;
attr_validator.support.constructors = true;
attr_validator.support.named_constructors = true;
attr_validator.support.memory_sharing = true;
attr_validator.support.accessors = true;
Just enabled everything to generate some code to start off with. Will adjust later.
I have now manually implemented an opaque type with cleanup as well. This will be the model for the template. I also added a benchmark that just instantiates the opaque struct, because I was quite curious what the cost of GC was. With GC on, the benchmark throughput more than halved, going from ~2.7M/s to ~1.2M/s: a slowdown from about 350ns to ~800ns per instantiation, which is just interesting. For comparison, the opaque formatter in the example took ~8000ns, so this could be about a 10x speedup. All in all this seems worth pursuing, and my biggest open questions for this have been resolved.
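As a quick sanity check on the numbers above, throughput and mean latency are reciprocals of each other. The sketch below only does that conversion; the helper name `ns_per_op` is made up for illustration and is not part of the benchmark.

```rust
/// Converts a throughput in operations per second into a mean latency in
/// nanoseconds per operation. `ns_per_op` is a hypothetical helper name.
pub fn ns_per_op(ops_per_sec: f64) -> f64 {
    1e9 / ops_per_sec
}

/// Ratio between two latencies, e.g. the quoted ~8000ns versus ~800ns.
pub fn speedup(old_ns: f64, new_ns: f64) -> f64 {
    old_ns / new_ns
}
```

With the quoted throughputs, 2.7M/s works out to roughly 370ns and 1.2M/s to roughly 833ns, which is in line with the rounded "about 350ns to ~800ns" figures, and 8000ns against 800ns gives the 10x estimate.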
this is neat but i'm kind of surprised it works??
tool/src/java/mod.rs
Outdated
) -> std::io::Result<FileMap> {
    let files = FileMap::default();
    let mut context = c2::CContext::new(tcx, files, false);
    context.run();
Note that after #495, the C2 backend can now be directly invoked per-type in a cleaner fashion, if you decide that's more useful
I'd prefer to implement that as a followup.
I actually implemented that. The biggest problem is that I need the C runtime too, which I made pub(crate).
somelib_h.Opaque_destroy(this.internal);
}

public void assertStruct(MyStruct struct) {
question: wait, jextract generated this from the C header files? How did it know that these C header files correspond to a class with methods?
Sadly no, that would be amazing 😃. That’s just a hand-rolled wrapper for a more ergonomic type. I’m using this to build the templating for opaque types.
Everything in the ntv package was generated using jextract from the C backend.
Aha! Okay, carry on!
import java.nio.charset.StandardCharsets;

public class Opaque {
Although the tool doesn't actually generate code, the class body has now been copied from an insta snapshot.
☝️ no longer valid 😝
import java.lang.ref.Cleaner;

public class Opaque {
Here is the snapshot from where we copied the body
tool/src/java/mod.rs
Outdated
// We can probably do boxed returns by just relying on jna
// public double[] asBoxedSLice() {
//     try (var arena = Arena.ofConfined()) {
//         var nativeVal = somelib_h.Float64Vec_as_boxed_slice(arena, internal);
//         var data = dev.diplomattest.somelib.ntv.DiplomatF64View.data(nativeVal);
//         var len = dev.diplomattest.somelib.ntv.DiplomatF64View.len(nativeVal);
//         var returnVal = data.asSlice(0, len * JAVA_DOUBLE.byteSize()).toArray(JAVA_DOUBLE);
//         Native.free(data.address());
//         return returnVal;
//     }
// }
Here's what a boxed slice would look like using JNA. It should be the only place where we'd need to use it.
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

public class OpaqueBench {
I don't have a shell command to actually run this yet; I only ran it from IntelliJ. I want to keep it, as it'll be nice to compare against the Kotlin backend once the icu4x example can be made to work.
A lot of files seem to be generated by default. It seems to be possible to pass in every single function and struct to jextract to specify which ones should be used to generate code, but I can't see a way to only forbid these standard constructions which are causing a lot of noise. I'd like to move that investigation to a followup.
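One way the "pass in every single function and struct" approach could look from the Rust side is sketched below. It assumes jextract's `--include-function` and `--include-struct` options; the helper name `include_args` is hypothetical, and the symbol names in the usage note are simply taken from snippets elsewhere in this review.

```rust
/// Builds the jextract arguments that allow-list specific functions and
/// structs. `include_args` is a hypothetical helper; the two flag names are
/// assumed to match the installed jextract version.
pub fn include_args(functions: &[&str], structs: &[&str]) -> Vec<String> {
    let mut args = Vec::new();
    for function in functions {
        args.push("--include-function".to_owned());
        args.push((*function).to_owned());
    }
    for strukt in structs {
        args.push("--include-struct".to_owned());
        args.push((*strukt).to_owned());
    }
    args
}
```

The resulting vector can be passed to `std::process::Command::args`, for example `include_args(&["Opaque_destroy"], &["DiplomatF64View"])`. This is an allow-list, so it does not address the wish for a deny-list of the standard constructions.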
@@ -17,6 +17,7 @@ mod ffi {
Box::new(Self(v.to_string()))
}

#[diplomat::attr(supports = memory_sharing, disable)]
@Manishearth I had an issue with passing owned strings in Java, as I then needed to pass an arena around, which felt contradictory to the idea of an owned parameter.
yeah you need to instead call diplomat_alloc and write to that string when you see this kind of thing.
well, you need to do that for all strings, the difference with owned strings is that you don't free it afterwards
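The ownership rule described here can be sketched in plain Rust. This is not the diplomat runtime: `demo_alloc` is a stand-in whose `(size, align)` shape is an assumption, and `copy_into_new_buffer` and `take_owned` are hypothetical names. The point is only that the caller allocates and writes the bytes, and for an owned string the callee takes over the buffer, so the caller must not free it afterwards.

```rust
use std::alloc::{alloc, Layout};

/// Stand-in for the runtime allocator; the `(size, align)` shape is an assumption.
fn demo_alloc(size: usize, align: usize) -> *mut u8 {
    assert!(size > 0, "zero-sized allocations are not handled in this sketch");
    let layout = Layout::from_size_align(size, align).expect("valid layout");
    // SAFETY: the layout has a non-zero size.
    unsafe { alloc(layout) }
}

/// Caller side: copy the string bytes into a freshly allocated buffer.
pub fn copy_into_new_buffer(s: &str) -> (*mut u8, usize) {
    let ptr = demo_alloc(s.len(), 1);
    assert!(!ptr.is_null(), "allocation failed");
    // SAFETY: `ptr` points to `s.len()` writable bytes that do not overlap `s`.
    unsafe { std::ptr::copy_nonoverlapping(s.as_ptr(), ptr, s.len()) };
    (ptr, s.len())
}

/// Callee side for an owned string: takes ownership of the buffer, which is
/// freed when the returned `Box<str>` is dropped.
///
/// # Safety
/// `ptr` must come from `copy_into_new_buffer` with the same `len`, must hold
/// valid UTF-8, and must not be used or freed by the caller afterwards.
pub unsafe fn take_owned(ptr: *mut u8, len: usize) -> Box<str> {
    // SAFETY: guaranteed by the caller, see the function-level contract.
    unsafe {
        let bytes: Box<[u8]> = Box::from_raw(std::ptr::slice_from_raw_parts_mut(ptr, len));
        std::str::from_boxed_utf8_unchecked(bytes)
    }
}
```

For a borrowed string the caller would keep ownership and release the buffer itself after the call; that path is left out of the sketch.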
@@ -33,6 +33,7 @@ pub type DiplomatStr = [u8];
pub type DiplomatStr16 = [u16];

/// Like [`u8`], but interpreted explicitly as a raw byte as opposed to a numerical value.
///
I think this came from a clippy fix, but I'm not sure
@@ -38,15 +38,15 @@ pub(crate) fn attr_support() -> BackendAttrSupport {
a
}

#[derive(askama::Template)]
#[template(path = "c/runtime.h.jinja", escape = "none")]
pub(crate) struct Runtime;
Need access to the C runtime for the Java backend so that jextract can process it.
converted_value: Cow<'cx, str>,
}

mod arena {
I wanted param conversions to not have to be aware of the context wherein they are done, and so introduced a kind of inversion of control, where if a conversion needs an arena it returns a variant that can only be accessed by providing an arena.
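A rough sketch of that inversion of control, not the code in this PR: every name below (`Conversion`, `ArenaBound`, `convert_param`, `render`) is hypothetical, and the generated Java expression is only illustrative. A conversion that needs an arena hands back a value whose expression cannot be read until the caller supplies the arena name.

```rust
/// An expression that can only be rendered once an arena name is supplied.
pub struct ArenaBound(Box<dyn Fn(&str) -> String>);

impl ArenaBound {
    /// The only way to get at the expression: provide the arena.
    pub fn with_arena(&self, arena: &str) -> String {
        (self.0)(arena)
    }
}

/// Result of converting one parameter (hypothetical type).
pub enum Conversion {
    /// Needs no allocation, so the expression is available directly.
    Plain(String),
    /// Needs an arena; the expression is locked behind `with_arena`.
    NeedsArena(ArenaBound),
}

/// The conversion itself knows nothing about where the arena comes from.
pub fn convert_param(name: &str, needs_allocation: bool) -> Conversion {
    if needs_allocation {
        let name = name.to_owned();
        Conversion::NeedsArena(ArenaBound(Box::new(move |arena: &str| {
            // Illustrative Java expression only.
            format!("{arena}.allocateFrom({name})")
        })))
    } else {
        Conversion::Plain(name.to_owned())
    }
}

/// The surrounding context decides which arena variable to use.
pub fn render(conversion: &Conversion, arena: &str) -> String {
    match conversion {
        Conversion::Plain(expr) => expr.clone(),
        Conversion::NeedsArena(bound) => bound.with_arena(arena),
    }
}
```

The design choice is that a caller who forgets the arena gets a type error in the generator rather than broken generated code; the cost is the extra indirection.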
impl<'a, 'cx> TyGenContext<'a, 'cx> {
    #[allow(unused)]
    fn java_to_native<T: TyPosition>(
As per your comment last review @Manishearth I tried to harmonise the param conversion to just a language conversion, so it can be reused in e.g. the struct conversion. It feels a lot cleaner but I admit the control inversion above is a little weird.
Okay @Manishearth I think this could do with another review. I substantially redid the native <-> Java conversion code. I also added the CI steps. I'm still missing a CI step that actually runs the tests instead of skipping them, but I'd like to wrap this up and focus on smaller pieces at a time.
oops, never posted this
@@ -0,0 +1,47 @@
// Generated by jextract
nit: we shouldn't need FILE
@@ -0,0 +1,219 @@
// Generated by jextract
nit: we don't need any of these either
this comment was supposed to be on the threading files below this
// /path/to/mylib/include/mylib.h

let package = format!("{domain}.{lib_name}.ntv");
let mut command = std::process::Command::new("jextract");
thought: would be nice to accept the jextract binary location from the CLI/config
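A small sketch of what that could look like, assuming the location arrives as an optional path from the CLI or a config file. The function names are hypothetical and no such option exists in the PR as shown; the fallback keeps the current behaviour of resolving `jextract` via `PATH`.

```rust
use std::path::{Path, PathBuf};
use std::process::Command;

/// Picks the jextract binary: an explicitly configured path wins, otherwise
/// fall back to looking up `jextract` on `PATH`.
/// `resolve_jextract` is a hypothetical name used for illustration.
pub fn resolve_jextract(configured: Option<&Path>) -> PathBuf {
    match configured {
        Some(path) => path.to_path_buf(),
        None => PathBuf::from("jextract"),
    }
}

/// Builds the command with the resolved binary; arguments are added by the caller.
pub fn jextract_command(configured: Option<&Path>) -> Command {
    Command::new(resolve_jextract(configured))
}
```

Keeping the resolution in a pure function makes it testable without jextract installed.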
Alright, so my biggest concern with this is making the JDK a part of the [...]
I would like for the Java backend to be a part of this repo, however. We have a couple routes in front of us, and I'd like to know what you think:
The last one is my preferred end state, honestly: when we were considering Panama we largely intended for the JEP 454 [...]
A hybrid route I find somewhat appealing is to take option 4 or 5 above ("don't check in the full bindings"), and over time move to a diplomat that doesn't need jextract. A particularly cool option would be for diplomat to generate an [...]
@jcrist1 what do you think? Ultimately you're the one doing the work here, so I don't want to dictate anything, but I want to try and resolve the tradeoff with the impact on other backends, and try to understand your own preferences here.
@Manishearth thanks for the feedback. I'll probably just implement a filter to remove the extraneous generated files from jextract ... depending on the conclusion of what to do with this backend
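Such a filter could be as simple as keeping only the generated files whose names correspond to symbols the backend knows about. The sketch below is an assumption about how that might work, not the eventual implementation; `filter_generated` is a hypothetical name, and the file names in the usage note are illustrative.

```rust
use std::collections::BTreeSet;

/// Keeps only the generated `.java` files whose stem is in `keep_stems`.
/// `filter_generated` is a hypothetical helper used for illustration.
pub fn filter_generated<'a>(generated: &[&'a str], keep_stems: &BTreeSet<&str>) -> Vec<&'a str> {
    generated
        .iter()
        .copied()
        .filter(|file| {
            // Compare on the file stem so `Foo.java` matches the stem `Foo`.
            let stem = file.strip_suffix(".java").unwrap_or(*file);
            keep_stems.contains(stem)
        })
        .collect()
}
```

For instance, with `somelib_h` and `DiplomatF64View` in the keep set, a listing of `somelib_h.java`, `FILE.java` and `DiplomatF64View.java` would drop `FILE.java`.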
write once run... something something ... I forget the rest 😛
Honestly the jextract thing bothers me from a Rust devx perspective as well as an end-user perspective. I agree that generating the JEP 454 binding code will probably be the way to go longer term, but I gotta be honest, I'm not super enthusiastic about doing that myself, not right now at least. While jextract generates a lot of superfluous stuff and is a bit of a pain to set up, it also seems like the easiest way to get functional bindings.
So moving the Java backend to its own repo in the diplomat project is a nice compromise. We could get a working backend and slowly transition to diplomat-generated JEP 454 code. Having a repo in the rust-diplomat project would give the Panama backend a bit more legitimacy (and hopefully direct other contributors to it) while signalling that the support level is lower, and it would also allow a more customised dev setup. We could definitely open an issue around setting up the JEP 454 generation, but it's not the most motivating kind of coding for me, so progress there will be slow.
I would like to pin the diplomat dependency to a specific commit and maybe set up an alert for when the latest main breaks it. That way people can still have a working tool, but I could get notified to address incompatible changes to the diplomat dependency, so maintenance wouldn't fall on your shoulders.
@Manishearth If we're agreed I can try and set up a standalone bin project tomorrow, if you can create the [...]
Closing in favor of rust-diplomat/diplomat-java#1
I've implemented a basic integration with jextract for a pure Java backend. This is just a baseline to start building on: it generates the C2 backend and uses that with jextract to create the Java interface to the native code. There's no code generation yet for ergonomic interfaces. I feel like enums and structs will be relatively straightforward, so once again I will start off by building the wrappers for opaque types and cleanup. Also, I don't actually know any Java yet 🤪, but no time like the present to learn. This is as discussed in #144.
Edit (2024-07-11):
I've been struggling a bit to make time for this lately as other priorities have taken precedence, but I've managed another big chunk today.
I'm still missing the actual complete codegen pipeline. For the time being I'm just copy-pasting generated code from the insta snapshots into the project to test that it works, as I don't want to implement all of the features yet, and it's more convenient to just panic. Otherwise, this is starting to come together, so I'm going to add the checklist again to keep track of what I still need to do.
owned slices: I think I would need to integrate with JNA again to support these, as I can't find a way to free memory in Panama (it seems to happen with closeables and a dedicated scoped allocator). I am doing some stuff with owned slices but it feels a little bit dangerous. It feels like we could accidentally get a use after free somehow. But I haven't thought too deeply about it.