-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Several questions about using Cranelift as a backend for a programming language #5141
Comments
Yes, you can jump to functions defined in C.
You should probably be using
You can use the cranelift-frontend crate to generate SSA form on the fly.
For line info if you use cranelift-frontend you can use set_srcloc with a unique value for every source location. You can then map the SourceLoc you passed in back to the source locations after compilation and use gimli to produce the line tables. See for example https://github.com/bjorn3/rustc_codegen_cranelift/tree/master/src/debuginfo. As for getting debuginfo for local variables there is technically support for it, but not very good.
You probably have to detect it manually for now. There are instructions like iadd_cout, but they are not supported for all integer sizes on all backends. I would like to see it implemented universally myself.
These heaps are mostly for use in the translation from wasm to clif ir. You don't have to use them. The load and store instructions can access the entire address space just fine.
No, Cranelift will always put a prologue first before doing anything. One option I guess would be to insert raw bytes representing the call before the function. You can use
https://bytecodealliance.zulipchat.com/#narrow/stream/217117-cranelift might be a better place. |
Hi @yorickpeterse -- thanks for the questions and interest! Apologies for the state of our documentation: we need to do an update pass, and probably develop more tutorial-like material, but we haven't had the time to do that. In any case, we're happy to answer questions here (or on our Zulip in the Cranelift stream), so please keep them coming if you have any more. (On preview, I see that bjorn3 has answered a bunch so I'll just add to some of my thoughts here)
We currently don't have a way of injecting assembly at all -- #1041 goes over some of the reasons why. Basically, while it seems conceptually simple ("just emit the assembly I give you!"), it actually is enormously complex for all the reasons listed there (see especially this comment). When you say "jump to a function", not directly; we can call one though, which is actually probably what you'd want. (Otherwise, how does this snippet know where to jump back to?) Perhaps you were thinking the jump would somehow inline the snippet (we don't have this)? Popping up a level (in the X-Y problem sense): the goal here is stack-switching and/or segmented stacks, goroutine-style, no? If so, I'll say that we've had good luck in Wasmtime (built on top of Cranelift) doing fiber-style stack switching for async function calls, and that didn't need anything special in Cranelift. (At a yield point in the runtime when called from Cranelift-generated code, call the "switch back to host stack" helper; it pushes registers, switches SP, and returns on the other stack.) We also have explicit support in Cranelft for stack bounds checks. Combining the two, I could see a small modification to the Cranelift prologue generator that would say "if limit reached, call this runtime function" and that function would essentially allocate a new stack, switch to it, place a little trampoline on the stack to return back to the previous segment on return, and then jump back to the return address in the function that hit the limit. I'd be happy to help design this feature if you're interested.
In general, no; we just added some opcodes to help us with lowering Wasm heap checks, but for maximal flexibility I'd suggest inserting checks with explicit IR. We have a set of trap instructions you can use for this:
This gets back to the inline-assembly question (see above); prologues (and epilogues) in particular are even more special because the compiler generates them to set up its stack frame in a careful way, so even if you could insert instructions somehow, we'd prefer that not to be the case (it would create a brittle coupling between our stackframe and prologue design and your inserted code). We actually had a version of this problem earlier with Cranelift's embedding in SpiderMonkey -- we had a mode to generate code without pro/epilogues and the embedder built them and glued things together, but it was not maintainable. In general I like the philosophy of adding useful primitives/building blocks so that they come into the world of the compiler IR/settings and we can understand the interactions with the rest of our codegen. So, as above, I think we can design something for the "new stack segment if hit limit" case. |
@bjorn3 Thanks for the answers so far!
In the documentation of cranelift_codegen the closest I could find is https://docs.rs/cranelift-codegen/latest/cranelift_codegen/ir/trait.InstBuilder.html#method.jump, but this jumps to a block and not a random memory address. Should I perhaps look in a different crate for this?
Looking at the example on https://docs.rs/cranelift-frontend/latest/cranelift_frontend/#example, it seems to use jumps but leaves out the arguments. Am I correct in thinking that if you use
Does Cranelift construct the graph such that it passes
I suspect this would probably get a bit tricky, and maybe break across versions. If I just insert a function call as the first instruction after the prologue, is there anything I can do to prevent Cranelift from moving it around, i.e. is there something like a compiler barrier? If so I think that should be good enough.
I thought about the Zulip, but it's too easy to lose track of things in a chat channel unfortunately. But if desired I can ask things over there instead 😃 |
Correct. Cranelift-frontend automatically appends basic block arguments as necessary to handle SSA form. |
I had indeed seen the issue, and I figured it wouldn't be as easy as just dumping assembly in the middle of the IR 😄
Ah yeah good point, I forgot about actually having to jump back; heh. Either way, the function call cost isn't really something to worry about so calling a helper function should be fine.
Pretty much. I was planning on taking the approach outlined in https://graphitemaster.github.io/fibers/ where I'd just define a bunch of routines in assembly (instead of using setjmp/longjmp), then call those in the appropriate places.
I can see this being useful, though I currently don't know enough about Cranelift or how exactly I would implement the routine in a cross-platform way. I think it would probably be best to first implement this in my language, then maybe come back to Cranelift to see how/if it makes sense for it to provide a generalised solution. |
For segmented stacks following the same abi as https://gcc.gnu.org/wiki/SplitStacks may work. LLVM also implements this and Rust itself used to use it back when it still used green threads. |
As other folks have said, welcome! Although Cranelift is co-developed with Wasmtime which gives Wasmtime a little bit of a privileged position, we explicitly want to support projects like yours. So we're excited to hear from you! I especially want to emphasize that we're open to making changes to Cranelift to support new use cases. We have to balance that against making sure we have the resources for ongoing maintenance of every change we merge. But if you find that it's impossible to do something you need using Cranelift, just keep in mind that fixing Cranelift is an option.
To clarify, "as necessary" is doing double-duty here. Cranelift allows you to use a value in one block that was defined in another block, as long as the definition dominates the use. So
I believe, and @cfallin can correct me if I'm wrong, that currently 1) Cranelift won't move side-effecting instructions, and 2) it treats all function calls as side-effecting. It may move pure instructions past effectful ones, but I think it'll never move a pure instruction any earlier in the function. So I think the current state is that if a function call is the first instruction, it will still be the first instruction after all optimizations, and only the prologue will precede it in the final machine code. Whether you can rely on that behavior in the future is another question, of course. I don't entirely understand what you're trying to do, but I'm curious whether the |
@jameysharp Thanks for the reply!
Essentially I'd need routines for the following:
This is basically what https://graphitemaster.github.io/fibers/#user-space-context-switching discusses, but in Rust/Cranelift instead of C. |
On second though, if Cranelift were to provide this it would be able to only save the registers actually in use at that point, rather that pessimistically having to assume any of them may be used, and thus having to save all of them. The inverse would also be true: if we have known wake-up points, Cranelift would know which registers are in use afterwards, and only restore those. This may be less useful though, as based on the register allocator and the code that follows it might just be easier to restore all of them. |
That linked code only saves callee saved registers which have to be saved anyway for ABI reasons. |
I'm looking into using Cranelift as a backend for a programming language that I'm working on. Other options I have been looking into are compiling to C or using LLVM, but both come with their own set of trade-offs I'm not a fan of. For example, LLVM is generally quite slow, while C compilers tend to not give you enough control (i.e. they may reorder things unless you carefully insert compiler barriers). Cranelift is promising because of the amount of control it gives, and because it's written in Rust.
Unfortunately, the Cranelift documentation is limited, to a point where I'm overwhelmed/confused about where to even start. In this issue I'll be asking several questions to hopefully clear this up. If there's a better place for asking long-form questions like this, please let me know 😃
To set the stage: Inko is a language similar to Go/Erlang in that it has lightweight processes, and these can suspend at more or less any place (technically these spots are known, e.g. when performing IO, but to the user it may as well be "anywhere"). This means I need to be able to maintain and swap the stacks, like stackful coroutines. I also need to be able to grow the stack, and maybe shrink it (though figuring out when to do that is really tricky). Bare-metal performance is not a main goal, rather I want something that compiles fast and performs better than a well optimised interpreter (without a JIT), something I think one should be able to achieve with Cranelift.
Which brings me to the questions I have, in no particular order:
Cranelift offers no way of injecting assembly directly into the IR. Instead it seems you have to define a C/assembly function then bind to that. Is Cranelift able to jump to a naked function defined in C and exposed using
#[no_mangle]
andextern "C"
? The idea here is to make these stack fiddling routines naked functions so they don't introduce extra assembly overhead, and to give you full control. By jumping to them directly they essentially act as assembly macros, avoiding the function call overhead.Defining function names in Cranelift is a bit confusing.
UserFuncName
basically takes two arbitraryu32
values, and it's not clear what you'd use those for. Then there'sModule.declare_function
which takes a&str
, and looking at some existing Cranelift projects this seems to be what they use. So which one should I use if I want to expose my function names (potentially mangled)?EBBs take variables as arguments, but it's not clear how you'd determine what variables to pass down to sub blocks. For example, if I have a graph where blocks
A
andB
both jump toC
, andC
depends on a result produced by these blocks, how would I know what variable to pass? My own IR is already a graph but not in SSA, thus blocks can just use variables defined in some earlier block. My graph does ensure that in the above case the blocksA
andB
write to any variable needed byC
where its value can differ per branch.How would one associate debugger information with their generated code? For example, I would like for a debugger to know about my source code, line numbers, etc. The Cranelift API doesn't seem to offer anything to do things like "map these instructions to lines X, Y, Z in the source code". Do I have to fiddle with the likes of e.g. gimli (and whatever the equivalent is for Windows)?
Does Cranelift have any routines for detecting integer overflows? Instructions such as
iadd
seem to wrap around on overflow, so detecting it manually is doable if needed; I just don't want to reinvent this if already provided by Cranelift.Cranelift seems to have this notion of heaps for functions, and the ability to restrict functions to a certain portion of the heap. For my language there are no such restrictions (i.e. processes don't use fixed size heaps. How would I express this in Cranelift?
Related to that, I'm seeing various hints about Cranelift performing heap bounds checking in various places. It's not clear to me if this would be needed in my case, and if not if there's a way of opting out (i.e. I don't want redundant heap bounds checking in my generated code). Is there more information on this somewhere?
Related to swapping the stacks: if I insert an assembly routine in a function prologue (i.e. the function just starts with a call/jump to such a routine), is there a way to guarantee Cranelift won't move/inject anything before the call/jump? The assembly that might be needed for the system's calling convention should be fine as I can just add some padding to my stacks to account for that, but I don't want Cranelift to think my routine has no side effects and what not and just move it elsewhere.
Thanks for answering any of these questions, and again let me know if it's better to post this elsewhere! 😃
The text was updated successfully, but these errors were encountered: