Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make make_scalar_function() result candidate for inlining, by removing the Arc #12477

Conversation

findepi
Copy link
Member

@findepi findepi commented Sep 15, 2024

make_scalar_function serves as a template to implement functions, abstracting away processing of scalar values. While it's recommended to implement ScalarUDFImpl directly, a few functions still use the make_scalar_function helper, perhaps because it reduces code verbosity.

While make_scalar_function is a template function, it's not eligible for effective inlining because of the Arc<Fn> return type.

This commit removes Arc and unlocks inlining. The change impact was
verified manually using cargo rustc -p datafusion-functions-nested -- --emit asm -C opt-level=2 and comparing the generated for
ArrayDistance::invoke exemplary function that uses
make_scalar_function. Only after the changes can
ArrayDistance::invoke call the array_distance_inner directly.

The change saves some small overhead on each invocation of a UDF, but doesn't improve per-row performance.

`make_scalar_function` serves as a template to implement functions,
abstracting away processing of scalar values. While it's recommended to
implement `ScalarUDFImpl` directly, a few functions still use the
`make_scalar_function` helper, perhaps because it reduces code
verbosity.

While `make_scalar_function` is a template function, it's not eligible
for effective inlining because of the `Arc<Fn>` return type.

This commit removes `Arc` and unlocks inlining. The change impact was
verified manually using `cargo rustc -p datafusion-functions-nested   --
--emit asm -C opt-level=2` and comparing the generated for
`ArrayDistance::invoke` exemplary function that uses
`make_scalar_function`. Only after the changes can
`ArrayDistance::invoke` call the `array_distance_inner` directly.

The change saves some small overhead on each invocation of a UDF, but
doesn't improve per-row performance.
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me -- thank you @findepi

What do you think about removing ScalarFunctionImplementation entirely? Or is it still important to have a typedef around the Arc 🤔

@alamb alamb added the api change Changes the API exposed to users of the crate label Sep 16, 2024
@alamb alamb changed the title Make make_scalar_function() result candidate for inlining Make make_scalar_function() result candidate for inlining, by removing the Arc Sep 16, 2024
Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks @findepi

@comphead
Copy link
Contributor

Makes sense to me -- thank you @findepi

What do you think about removing ScalarFunctionImplementation entirely? Or is it still important to have a typedef around the Arc 🤔

I think there some more usages of this type

@findepi
Copy link
Member Author

findepi commented Sep 17, 2024

I think we might be able to get there eventually, but @comphead is right, we're not there yet.
Some more stuff coming such as #12505, #12506.

@alamb alamb merged commit 16a0eae into apache:main Sep 17, 2024
25 checks passed
@alamb
Copy link
Contributor

alamb commented Sep 17, 2024

Thanks again @findepi and @comphead

@findepi findepi deleted the findepi/make-make-scalar-function-result-candidate-for-inlining-bcdb1f branch September 17, 2024 18:47
@pacak
Copy link

pacak commented Sep 28, 2024

The change impact was verified manually using cargo rustc -p datafusion-functions-nested -- --emit asm -C opt-level=2 and comparing the generated for

FWIW with cargo-show-asm you could type something like this

cargo asm -p datafusion-functions-nested ArrayDistance::invoke
cargo asm -p datafusion-functions-nested make_scalar_function

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api change Changes the API exposed to users of the crate functions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants