Contributor

@dharanad dharanad commented Jun 26, 2025

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

Implemented ArrayMin Scalar function

Are these changes tested?

Yes, logic tests were added as well.

Are there any user-facing changes?

@github-actions github-actions bot added sqllogictest SQL Logic Tests (.slt) functions Changes to functions implementation labels Jun 26, 2025
@dharanad dharanad marked this pull request as ready for review June 26, 2025 16:00
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Jun 26, 2025
Contributor

@alamb alamb left a comment

Thank you @dharanad -- this looks very nice.

I suspect this function could be substantially sped up with a specialized implementation but we can defer that to when someone is interested in doing so

// specific language governing permissions and limitations
// under the License.

use crate::utils::make_scalar_function;
Contributor

I recommend putting this in the same file as https://github.com/apache/datafusion/blob/main/datafusion/functions-nested/src/max.rs

perhaps call it https://github.com/apache/datafusion/blob/main/datafusion/functions-nested/src/min_max.rs 🤔

You may be able to share some code, but more importantly I think it will be clearer that min and max are basically the same function
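
One way to picture the suggestion above is a single helper parameterized by direction, with min and max as thin wrappers. The following is a minimal sketch in plain Rust, not the code from this PR: the `Extreme` enum, the `ListRow` alias, and `array_extreme` are hypothetical names, lists are modeled with `Vec` instead of Arrow arrays, and the NULL handling (NULL elements skipped, NULL or empty lists yield NULL) is an assumption rather than something this conversation confirms.

```rust
use std::cmp::Ordering;

/// Which extreme to pick from each list (hypothetical helper type).
#[derive(Clone, Copy)]
enum Extreme {
    Min,
    Max,
}

/// One list row: outer `None` is a NULL list, inner `None` is a NULL element.
type ListRow = Option<Vec<Option<i64>>>;

/// Shared kernel that returns one optional value per row.
/// Assumed semantics: NULL elements are skipped, and NULL or empty
/// lists produce `None`.
fn array_extreme(rows: &[ListRow], which: Extreme) -> Vec<Option<i64>> {
    // The only difference between min and max is which ordering "wins".
    let wins = match which {
        Extreme::Min => Ordering::Less,
        Extreme::Max => Ordering::Greater,
    };
    rows.iter()
        .map(|row| {
            row.as_ref()? // NULL list -> None
                .iter()
                .flatten() // skip NULL elements
                .copied()
                .reduce(|best, v| if v.cmp(&best) == wins { v } else { best })
        })
        .collect()
}

/// Thin wrapper: per-row minimum.
fn array_min(rows: &[ListRow]) -> Vec<Option<i64>> {
    array_extreme(rows, Extreme::Min)
}

/// Thin wrapper: per-row maximum.
fn array_max(rows: &[ListRow]) -> Vec<Option<i64>> {
    array_extreme(rows, Extreme::Max)
}
```

With both wrappers next to each other in one file, it is easy to see that they differ only in the `Extreme` value they pass.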

Contributor Author

Sure, should be quick. Let me refactor and move it.

@dharanad
Contributor Author

> Thank you @dharanad -- this looks very nice.
>
> I suspect this function could be substantially sped up with a specialized implementation but we can defer that to when someone is interested in doing so

"Specialized implementation" sounds interesting. Can you please elaborate? I can try working on it in a separate issue.

@alamb
Contributor

alamb commented Jun 26, 2025

> > Thank you @dharanad -- this looks very nice.
> > I suspect this function could be substantially sped up with a specialized implementation but we can defer that to when someone is interested in doing so
>
> "Specialized implementation" sounds interesting. Can you please elaborate? I can try working on it in a separate issue.

Well, the first thing would be to make a benchmark that calls array_min on different types of ListArrays (like List and List) whose list elements have different lengths.
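
The shape of such a benchmark can be sketched with the standard library alone. This is an illustration under stated assumptions, not DataFusion code: `ListColumn`, `make_column`, `row_mins`, `time_row_mins`, and `run_sketch` are hypothetical names, a pair of `Vec`s stands in for an Arrow ListArray, `i64` and `String` are arbitrary example element types, and `std::time::Instant` replaces a real benchmark harness. A real benchmark would call array_min itself on actual ListArrays.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Stand-in for a list column: row `i` is `values[offsets[i]..offsets[i + 1]]`.
struct ListColumn<T> {
    offsets: Vec<usize>,
    values: Vec<T>,
}

/// Build `rows` lists of `len` elements each from a deterministic generator.
fn make_column<T>(rows: usize, len: usize, make_value: impl Fn(u64) -> T) -> ListColumn<T> {
    let mut state: u64 = 42;
    let mut values = Vec::with_capacity(rows * len);
    for _ in 0..rows * len {
        // Simple linear congruential generator, so runs are repeatable.
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        values.push(make_value(state >> 33));
    }
    let offsets = (0..=rows).map(|i| i * len).collect();
    ListColumn { offsets, values }
}

/// Per-row minimum; an empty row yields `None`.
fn row_mins<T: Ord>(col: &ListColumn<T>) -> Vec<Option<&T>> {
    col.offsets
        .windows(2)
        .map(|w| col.values[w[0]..w[1]].iter().min())
        .collect()
}

/// Wall-clock time for `iters` passes of `row_mins` over one column.
fn time_row_mins<T: Ord>(col: &ListColumn<T>, iters: u32) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        // black_box keeps the optimizer from discarding the work.
        black_box(row_mins(black_box(col)));
    }
    start.elapsed()
}

/// Time two element types across several list lengths, with the total
/// number of values held constant.
fn run_sketch() -> Vec<(String, Duration)> {
    const TOTAL: usize = 1 << 16;
    let mut report = Vec::new();
    for len in [1, 8, 64, 512] {
        let ints = make_column(TOTAL / len, len, |x| x as i64);
        report.push((format!("i64/len={len}"), time_row_mins(&ints, 20)));
        let strs = make_column(TOTAL / len, len, |x| x.to_string());
        report.push((format!("String/len={len}"), time_row_mins(&strs, 20)));
    }
    report
}
```

Calling `run_sketch()` from `main` and printing each `(label, duration)` pair gives a rough view of how cost varies with element type and list length. The timings are only indicative, since this loop has none of the warm-up or statistics a benchmark harness provides.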

Then profile with these instructions: https://datafusion.apache.org/library-user-guide/profiling.html#profile-the-benchmark

My guess (needs to be verified by profiling):

@dharanad
Contributor Author

Thanks for the details, @alamb. I will try this out and share my findings.

@alamb alamb merged commit 3839736 into apache:main Jun 27, 2025
50 of 51 checks passed
@alamb
Contributor

alamb commented Jun 27, 2025

Thanks again @dharanad

Development

Successfully merging this pull request may close these issues.

Support array_min scalar function

2 participants