Skip to content

array_min/array_max throw if first element is an inner Array with a null#23323

Merged
xiaoxmeng merged 1 commit intoprestodb:masterfrom
kevinwilfong:array_min
Aug 15, 2024
Merged

array_min/array_max throw if first element is an inner Array with a null#23323
xiaoxmeng merged 1 commit intoprestodb:masterfrom
kevinwilfong:array_min

Conversation

@kevinwilfong
Copy link
Contributor

@kevinwilfong kevinwilfong commented Jul 29, 2024

Description

array_min and array_max will always throw an exception if the first element is an inner array with a null.

Motivation and Context

array_min and array_max always compare the first element with itself. Besides being unnecessary this produces inconsistent behavior where if the first element is an array (or there is an array recursively in a complex type) that contains a null the function will throw "ARRAY comparison not supported for arrays with null elements".

This is because the element is of course equal to itself, but everything in it needs to be compared to verify this.

For other elements, the comparison will exit early as soon as it can be confirmed that it is less than/greater than the current candidate, so it may not throw even if it contains an array with a null element.

Impact

This should slightly improve performance by avoiding an unnecessary comparison, but will also cause the function to throw only if it needs to compare a null element of an inner array in the first element with the corresponding value in a later element.

Test Plan

Added unit tests to verify a null in the first element does not necessarily cause the function to throw, but it does throw if it is necessary to compare that null with another value.

Contributor checklist

  • Please make sure your submission complies with our development, formatting, commit message, and attribution guidelines.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

== RELEASE NOTES ==

General Changes
* Fix array_min/array_max to not necessarily throw if the first element recursively contains an inner array with a null element. :pr:`23323`

@kevinwilfong kevinwilfong requested a review from a team as a code owner July 29, 2024 19:40
@kevinwilfong kevinwilfong requested a review from presto-oss July 29, 2024 19:40
@kevinwilfong kevinwilfong marked this pull request as draft July 29, 2024 21:09
@kevinwilfong kevinwilfong marked this pull request as ready for review July 30, 2024 01:00
Copy link
Contributor

@elharo elharo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these functions documented somewhere? If so, does that need to be updated?

assertFunction("ARRAY_MIN(ARRAY ['puppies', 'kittens'])", createVarcharType(7), "kittens");
assertFunction("ARRAY_MIN(ARRAY [TRUE, FALSE])", BOOLEAN, false);
assertFunction("ARRAY_MIN(ARRAY [NULL, FALSE])", BOOLEAN, null);
assertFunction("ARRAY_MIN(ARRAY [ARRAY[1, NULL], ARRAY[2, NULL]])", new ArrayType(INTEGER), Lists.newArrayList(1, null));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have new test methods instead please? Makes it easier to debug failures.

@kevinwilfong
Copy link
Contributor Author

@elharo Thanks for the review!

The functions are documented here

.. function:: array_max(x) -> x
Returns the maximum value of input array.
.. function:: array_min(x) -> x
Returns the minimum value of input array.

But it does not say anything about this behavior that would need to be updated.

@kevinwilfong
Copy link
Contributor Author

The failing check appears to be some TestHudiIntegration tests that are broken in trunk #22031

@kevinwilfong kevinwilfong requested a review from elharo July 31, 2024 17:36
@amitkdutta amitkdutta requested a review from rschlussel August 5, 2024 16:39
}

@Test
public void testArrayMaxWithNulls()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be four separate test methods so they can pass or fail independently.

rschlussel
rschlussel previously approved these changes Aug 13, 2024
elharo
elharo previously approved these changes Aug 13, 2024
@xiaoxmeng xiaoxmeng merged commit 330fa47 into prestodb:master Aug 15, 2024
@tdcmeehan tdcmeehan mentioned this pull request Aug 23, 2024
34 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants