-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cranelift: Remove support for implicitly adding a return area pointer #9510
Comments
@bjorn3 and I talked about this a bit offline -- I was initially skeptical for the reason that supporting arbitrary return values is ergonomic, fits well with the CLIF abstraction level in general, and we have it already. However, if we're facing some brokenness in our implementation already with respect to what the ABI says should happen, it may not be a bad idea to say that the Cranelift embedder (Wasmtime or cg_clif or other) needs to build their own mechanism for multiple return values. After all, "N args, 1 return" is also a perfectly consistent abstraction level and one that we live with in many other languages. Practically speaking in this repo, this means a good amount of code removal in the ABI code as bjorn3 mentions, and then building the equivalent in What do others think? We can talk about this in the next CL meeting if needed, too. |
I've opened #9511 to gate the support behind an option. This should help with removing it in the future and reduces likelihood of accidentally hitting the ABI issue in the short term. |
Clarification: Who is creating this stack slot? The caller of any function that returns multiple arguments? And then for tail calls, the caller would forward the return pointer it had to have been given for this tail call to type check? Idea: we could also stuff a buffer within the vmctx and avoid burning a register on the return pointer. It does mean we would need to eagerly reload values out of the buffer before any subsequent calls, since the buffer would presumably be too small to use as a full stack, but I think that's what you were thinking we'd do anyways. |
Yes
That would work provided that you know an upper bound on the size of the return area. You could also put the return area pointer in the vmctx instead and keep the return area on the stack. That would avoid expensive copies entirely at the cost of an extra indirection. That said, I don't think multi value return with more values than fit in registers is common enough to warrant avoiding a return area pointer register. The register only needs to be reserved if a return area is necessary at all. For the Tail and Winch call convs we could add more return registers than the native call conv to reduce the likelihood of needing a return area. |
Very true, and this is definitely the first thing we should do if we find ourselves optimizing the return of many values.
We can do that, since the Wasm spec allows implementations to impose limits on many things including number of results. We try to match the JS/Web limits for consistency and portability, and the JS limit on number of results is 1000. In general, I'd prefer embedding the return area space within the vmctx over pointing to the return area space from the vmctx just so we could avoid the indirection. If we get to the point where it makes sense for us to optimize this stuff by saving a register, we might as well go all the way and avoid the indirection as well. FWIW, the use case where I could imagine these kinds of calling convention micro optimizations actually mattering is when we are dealing with basic blocks that have been compiled down to branchless functions that all tail call each other, and so every function/block in the program takes (say) 6 arguments and generates 6 returns and function calls are frequent. |
Feature
When there are more returns than available return registers, the return value has to be returned through the introduction of a return area pointer. This can be done both explicitly using a
ArgumentPurpose::StructReturn
parameter (as cg_clif does) or implicitly by Cranelift. The support for implicitly adding it adds a lot of complexity to the ABI handling code in Cranelift (which is already way too complex) and has two ABI incompatibilities:Given that there is a lot of complexity in Cranelift caused by this and it is effectively broken anyway for native calling conventions, I propose removing support for implicit return area pointers, requiring the frontend to use explicit return area pointers instead.
Benefit
The text was updated successfully, but these errors were encountered: