[Prototype & Benchmark] Support directly writing to output block builder for scalar functions #9638
wenleix wants to merge 2 commits into prestodb:master
Conversation
|
Nice! Alternatively for scalars the returned object could be passed as |
|
Thank you @sopel39 ! :)
|
That's correct. Specifically I was thinking about fast decimal, which is represented by |
|
@sopel39 : With this new return convention, we can also support writing a slice into the output block :). So for the outermost function that returns a long decimal, it can avoid allocating a new Slice and get all the savings (and even avoid the copying). For a function in the middle of chained calls (e.g. the |
|
This pull request has been automatically marked as stale because it has not had recent activity. If you'd still like this PR merged, please comment on the task, make sure you've addressed reviewer comments, and rebase on the latest master. Thank you for your contributions! |
Introduction
Today the return convention for a scalar function is always to return on stack. The caller then either appends the returned value to the result BlockBuilder (for the outermost function call) or uses it to invoke other functions (for inner/nested function calls like `f(g(x))`).
While this return convention works well for primitive types, it's not optimal for structural types, since the result block always has to be copied.
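To make the extra copy concrete, here is a minimal sketch contrasting the two conventions for an identity function over arrays. `ToyBlockBuilder`, `identityOnStack`, and `identityDirect` are hypothetical names invented for illustration; they are not Presto's `BlockBuilder` API or code from this PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReturnConventionSketch {
    /** Toy output buffer; a hypothetical stand-in, not Presto's BlockBuilder. */
    public static final class ToyBlockBuilder {
        private long[] values = new long[16];
        private int size;
        private final List<Integer> endOffsets = new ArrayList<>();

        public void writeLong(long v) {
            if (size == values.length) {
                values = Arrays.copyOf(values, size * 2);
            }
            values[size++] = v;
        }

        /** Marks the end of the current array entry. */
        public void closeEntry() {
            endOffsets.add(size);
        }

        /** Returns a copy of the i-th array entry written so far. */
        public long[] entry(int i) {
            int start = (i == 0) ? 0 : endOffsets.get(i - 1);
            return Arrays.copyOfRange(values, start, endOffsets.get(i));
        }
    }

    // Return on stack: the function materializes its result as a fresh object.
    public static long[] identityOnStack(long[] input) {
        return input.clone(); // copy #1: build the intermediate result
    }

    // The caller then appends the returned value to the output.
    public static void appendOnStackResult(long[] input, ToyBlockBuilder out) {
        long[] result = identityOnStack(input);
        for (long v : result) {
            out.writeLong(v); // copy #2: move the result into the output
        }
        out.closeEntry();
    }

    // Direct output write: the function writes into the output it is handed,
    // so no intermediate result object is allocated.
    public static void identityDirect(long[] input, ToyBlockBuilder out) {
        for (long v : input) {
            out.writeLong(v); // the only copy
        }
        out.closeEntry();
    }

    public static void main(String[] args) {
        long[] input = {1, 2, 3};
        ToyBlockBuilder viaStack = new ToyBlockBuilder();
        ToyBlockBuilder direct = new ToyBlockBuilder();
        appendOnStackResult(input, viaStack);
        identityDirect(input, direct);
        System.out.println(Arrays.equals(viaStack.entry(0), direct.entry(0))); // prints true
    }
}
```

Both paths produce the same output entry; the difference is that the return-on-stack path allocates and fills an intermediate array before the caller copies it again.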
Proposed Solution
To address this inefficiency, one idea is to introduce a new return convention that writes directly to the output block builder. This requires two parts of work:

1. We need to support compiling a `RowExpression` to write to the output block builder directly when one is provided. A prototype is done in [WIP] Compile RowExpression to directly write to output block builder #8747.
   Update: This new return convention is supported via Implement PROVIDED_BLOCKBUILDER return place convention for scalar function #12166.
2. Since the caller and callee may expect different return conventions (`return on stack` vs. `direct output write`), we need to be able to adapt between them. While adapting from `return on stack` to `direct output write` is trivial, the other direction is more involved, as many functions leverage `CachedInstanceBinder` to maintain function state and to avoid repeatedly allocating memory for the result (see [Prototype & Benchmark] Support directly writing to output block builder for scalar functions #9638 (comment)).
   A preliminary proof-of-concept of the `InvocationAdapter` can be found in commit wenleix@d0fb108.

Benchmark Result
In this PR we prototyped preliminary support for writing directly to the output block builder and benchmarked the potential performance gain. This implementation is not full support, since adapting a caller that expects `return on stack` behavior to a callee that provides `directly write to block` behavior requires more work.

To see the potential performance gain, we add a fake `array_identity` function which copies the array (see the second commit), and benchmark its performance.

We see about a 2x performance gain from directly writing to the output block builder (instead of putting the result block on the stack and copying it to the final block).
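The adaptation between the two conventions mentioned in this description can be sketched roughly as follows. This is an illustrative toy, not the `InvocationAdapter` from the linked commit: `ToyBuilder`, `StackFunction`, `DirectWriteFunction`, `toDirectWrite`, and `toStack` are all hypothetical names, and the reused scratch buffer is only one assumed way of handling the harder direction.

```java
import java.util.Arrays;

public class ConventionAdapterSketch {
    /** Toy single-entry output buffer; a hypothetical stand-in, not Presto's BlockBuilder. */
    public static final class ToyBuilder {
        private long[] values = new long[8];
        private int size;

        public void writeLong(long v) {
            if (size == values.length) {
                values = Arrays.copyOf(values, size * 2);
            }
            values[size++] = v;
        }

        public long[] toArray() {
            return Arrays.copyOf(values, size);
        }

        public void reset() {
            size = 0;
        }
    }

    /** Callee that returns its result to the caller ("return on stack"). */
    public interface StackFunction {
        long[] apply(long[] input);
    }

    /** Callee that writes its result into a provided builder ("direct output write"). */
    public interface DirectWriteFunction {
        void apply(long[] input, ToyBuilder out);
    }

    // Easy direction: call the function, then append whatever it returned.
    public static DirectWriteFunction toDirectWrite(StackFunction f) {
        return (input, out) -> {
            for (long v : f.apply(input)) {
                out.writeLong(v);
            }
        };
    }

    // Harder direction: the adapter needs a scratch builder to capture the
    // output. Here it is held as adapter state and reused across calls, so
    // this sketch is not thread-safe.
    public static StackFunction toStack(DirectWriteFunction f) {
        ToyBuilder scratch = new ToyBuilder();
        return input -> {
            scratch.reset();
            f.apply(input, scratch);
            return scratch.toArray();
        };
    }

    public static void main(String[] args) {
        DirectWriteFunction doubled = (input, out) -> {
            for (long v : input) {
                out.writeLong(v * 2);
            }
        };
        StackFunction adapted = toStack(doubled);
        System.out.println(Arrays.toString(adapted.apply(new long[] {1, 2, 3}))); // prints [2, 4, 6]
    }
}
```

The second adapter is where the extra work shows up: it has to own a buffer and extract a value from it, which reintroduces an allocation and a copy unless that state is managed carefully.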
This is based on #8747.