Merged
46 changes: 46 additions & 0 deletions datafusion/functions/benches/signum.rs
@@ -23,6 +23,7 @@ use arrow::{
util::bench_util::create_primitive_array,
};
use criterion::{Criterion, criterion_group, criterion_main};
use datafusion_common::ScalarValue;
use datafusion_common::config::ConfigOptions;
use datafusion_expr::{ColumnarValue, ScalarFunctionArgs};
use datafusion_functions::math::signum;
@@ -88,6 +89,51 @@ fn criterion_benchmark(c: &mut Criterion) {
                )
            })
        });

        // Scalar benchmarks (the optimization we added)
        let scalar_f32_args =
            vec![ColumnarValue::Scalar(ScalarValue::Float32(Some(-42.5)))];
        let scalar_f32_arg_fields =
            vec![Field::new("a", DataType::Float32, false).into()];
        let return_field_f32 = Field::new("f", DataType::Float32, false).into();

        c.bench_function(&format!("signum f32 scalar: {size}"), |b| {
            b.iter(|| {
                black_box(
                    signum
                        .invoke_with_args(ScalarFunctionArgs {
                            args: scalar_f32_args.clone(),
                            arg_fields: scalar_f32_arg_fields.clone(),
                            number_rows: 1,
                            return_field: Arc::clone(&return_field_f32),
                            config_options: Arc::clone(&config_options),
                        })
                        .unwrap(),
                )
            })
        });

        let scalar_f64_args =
            vec![ColumnarValue::Scalar(ScalarValue::Float64(Some(-42.5)))];
        let scalar_f64_arg_fields =
            vec![Field::new("a", DataType::Float64, false).into()];
        let return_field_f64 = Field::new("f", DataType::Float64, false).into();

        c.bench_function(&format!("signum f64 scalar: {size}"), |b| {
            b.iter(|| {
                black_box(
                    signum
                        .invoke_with_args(ScalarFunctionArgs {
                            args: scalar_f64_args.clone(),
                            arg_fields: scalar_f64_arg_fields.clone(),
                            number_rows: 1,
                            return_field: Arc::clone(&return_field_f64),
                            config_options: Arc::clone(&config_options),
                        })
                        .unwrap(),
                )
            })
        });
    }
}

30 changes: 29 additions & 1 deletion datafusion/functions/src/math/signum.rs
@@ -22,7 +22,7 @@ use arrow::array::{ArrayRef, AsArray};
use arrow::datatypes::DataType::{Float32, Float64};
use arrow::datatypes::{DataType, Float32Type, Float64Type};

-use datafusion_common::{Result, exec_err};
+use datafusion_common::{Result, ScalarValue, exec_err, internal_err};
use datafusion_expr::sort_properties::{ExprProperties, SortProperties};
use datafusion_expr::{
ColumnarValue, Documentation, ScalarFunctionArgs, ScalarUDFImpl, Signature,
@@ -98,6 +98,34 @@ impl ScalarUDFImpl for SignumFunc {
    }

    fn invoke_with_args(&self, args: ScalarFunctionArgs) -> Result<ColumnarValue> {
        let arg = &args.args[0];
Comment thread (outdated): kumarUjjawal marked this conversation as resolved.

        // Scalar fast path for float types - avoid array conversion overhead
        if let ColumnarValue::Scalar(scalar) = arg {
Contributor

We don't need to repeat this comment about fast paths each time (not to mention specifying it for "float types" is confusing considering the function signature already limits the inputs to float types). So it can actually be a bit misleading as it might imply we omit fast path for non-float types. We're better off removing the comment.

Contributor Author

@kumarUjjawal, Jan 19, 2026

Thanks for pointing that out.

            if scalar.is_null() {
                return ColumnarValue::Scalar(ScalarValue::Null)
                    .cast_to(args.return_type(), None);
            }

            match scalar {
                ScalarValue::Float64(Some(v)) => {
                    let result = if *v == 0.0 { 0.0 } else { v.signum() };
                    return Ok(ColumnarValue::Scalar(ScalarValue::Float64(Some(result))));
                }
                ScalarValue::Float32(Some(v)) => {
                    let result = if *v == 0.0 { 0.0 } else { v.signum() };
                    return Ok(ColumnarValue::Scalar(ScalarValue::Float32(Some(result))));
                }
                _ => {
                    return internal_err!(
                        "Unexpected scalar type for signum: {:?}",
                        scalar.data_type()
                    );
                }
            }
        }

        // Array path
Contributor

We might as well change the if let to a match statement, and inline the contents of signum here to avoid use of make_scalar_function to simplify the code
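
A minimal sketch of the shape suggested above: one `match` covering both the scalar and the array case, with no wrapper function in between. The types `Scalar` and `Columnar` and the function `signum_dispatch` are hypothetical stand-ins invented for illustration; they are not DataFusion's `ScalarValue` or `ColumnarValue`, and null handling is simplified to `Option`.

```rust
/// Hypothetical stand-in for a scalar value, reduced to two float variants.
#[derive(Debug, Clone, PartialEq)]
enum Scalar {
    Float32(Option<f32>),
    Float64(Option<f64>),
}

/// Hypothetical stand-in for a columnar value: one scalar or a column of f64.
#[derive(Debug, Clone, PartialEq)]
enum Columnar {
    Scalar(Scalar),
    Float64Array(Vec<Option<f64>>),
}

/// Sign of `v`, mapping zero to 0.0 as the scalar arms in the diff do.
fn signum_f64(v: f64) -> f64 {
    if v == 0.0 { 0.0 } else { v.signum() }
}

/// Same as `signum_f64`, for f32.
fn signum_f32(v: f32) -> f32 {
    if v == 0.0 { 0.0 } else { v.signum() }
}

/// A single match handles every input shape; nulls (None) pass through.
fn signum_dispatch(arg: &Columnar) -> Columnar {
    match arg {
        Columnar::Scalar(Scalar::Float64(v)) => {
            Columnar::Scalar(Scalar::Float64(v.map(signum_f64)))
        }
        Columnar::Scalar(Scalar::Float32(v)) => {
            Columnar::Scalar(Scalar::Float32(v.map(signum_f32)))
        }
        Columnar::Float64Array(values) => {
            Columnar::Float64Array(values.iter().map(|v| v.map(signum_f64)).collect())
        }
    }
}
```

Because the match is exhaustive over the stand-in enum, there is no early `return` and no separate "array path" fallthrough; whether that reads better in the real function depends on how many array types it has to cover.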

        make_scalar_function(signum, vec![])(&args.args)
Member

this should have handled that optimization, no?

Contributor Author

If I understand correctly, you are asking whether the scalar optimization should be added inside make_scalar_function. To do that we would need to change its signature to also accept a scalar function, which would be a larger refactor. If you meant instead "doesn't make_scalar_function already handle the scalar optimization?", then no: with it we still need to convert scalars to arrays first. We have used the inline path in other parts of this optimization work too.
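
To make the trade-off in this reply concrete, here is a rough sketch of the two routes a scalar input can take. Everything in it is a hypothetical stand-in: `signum_array` plays the role of an array kernel, and `signum_scalar_via_array` only mimics the "convert the scalar to an array first" behaviour described in the reply. It is not the actual implementation of make_scalar_function.

```rust
/// Sign of `v`, mapping zero to 0.0.
fn signum_f64(v: f64) -> f64 {
    if v == 0.0 { 0.0 } else { v.signum() }
}

/// Stand-in array kernel: processes a whole column and allocates the output.
fn signum_array(values: &[Option<f64>]) -> Vec<Option<f64>> {
    values.iter().map(|v| v.map(signum_f64)).collect()
}

/// Hypothetical adapter route: promote the scalar to a one-row array,
/// run the array kernel, then pull the single result back out.
/// This costs two small allocations per call.
fn signum_scalar_via_array(v: Option<f64>) -> Option<f64> {
    let one_row = vec![v];
    let out = signum_array(&one_row);
    out[0]
}

/// Inline scalar route: compute on the value directly, no allocation.
fn signum_scalar_direct(v: Option<f64>) -> Option<f64> {
    v.map(signum_f64)
}
```

Both routes return the same value for the same input; the difference is only the per-call work, which is what the scalar benchmarks in this PR are meant to measure.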

Contributor

I think we can technically use make_scalar_function with the correct hints, but we might be trying to move away from that function, see:

    }
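
A side note on the `*v == 0.0` check in the Float64 and Float32 arms of the hunk above: Rust's standard `f64::signum` returns `1.0` for `+0.0` and `-1.0` for `-0.0`, so calling it directly would never produce zero. The sketch below isolates that check; the function name `signum_zero_aware` is made up for illustration and does not appear in the PR.

```rust
/// Sign of `v` with zero mapped to 0.0, mirroring the expression used in
/// the scalar arms: `if *v == 0.0 { 0.0 } else { v.signum() }`.
fn signum_zero_aware(v: f64) -> f64 {
    // `-0.0 == 0.0` is true for IEEE 754 floats, so both zeros take this branch.
    if v == 0.0 {
        0.0
    } else {
        // NaN is not equal to 0.0, so it lands here and `signum` returns NaN.
        v.signum()
    }
}
```

With this guard, `signum_zero_aware(0.0)` and `signum_zero_aware(-0.0)` both give `0.0`, while NaN inputs still yield NaN and all other values yield `1.0` or `-1.0`.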
