Skip to content

Add warning for IGNORE NULL with Window agggregates#23325

Merged
auden-woolfson merged 1 commit intoprestodb:masterfrom
auden-woolfson:window_agg_error
Aug 9, 2024
Merged

Add warning for IGNORE NULL with Window agggregates#23325
auden-woolfson merged 1 commit intoprestodb:masterfrom
auden-woolfson:window_agg_error

Conversation

@auden-woolfson
Copy link
Contributor

@auden-woolfson auden-woolfson commented Jul 29, 2024

Description

Added a warning when an IGNORE NULL clause is used on any non lag, lead, first or last value function. Presto will not throw an error in other cases, but it also will not ignore null values. See issue #21304

== RELEASE NOTES ==

General Changes

  • Add a warning when an IGNORE NULL clause is used on any non lag, lead, first, last, or nth value function. :pr:23325

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Jul 29, 2024

CLA Signed

  • ✅login: auden-woolfson / (64cb28b)

The committers listed above are authorized under a signed CLA.

@auden-woolfson auden-woolfson linked an issue Jul 29, 2024 that may be closed by this pull request
@auden-woolfson
Copy link
Contributor Author

Easy CLA should be resolved once I make/push more changes. Had to set local git config

Copy link
Contributor

@tdcmeehan tdcmeehan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! Some suggestions.

}

if (node.isIgnoreNulls()) {
ImmutableList<String> allowedIgnoreNullFunctionNames = ImmutableList.of("lead", "lag", "first_value", "last_value");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not hardcode this. Instead, let's create a new method in FunctionResolution that identifies a window value function. I think it will be OK to hardcode it, but inside there, make it a constant. Also, I think you're missing one, Nth value. Double check this list by looking at all subclasses of ValueWindowFunction.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback. Made all the updates you suggested. I added the list of window value functions to FunctionResolution. Is this the right place for it to live?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's good. Next let's look for a place to put some tests. We should be able to construct some simple queries that show a warning is emitted for functions which are not window value functions, and no warning is emitted for window value functions.

if (node.isIgnoreNulls()) {
ImmutableList<String> allowedIgnoreNullFunctionNames = ImmutableList.of("lead", "lag", "first_value", "last_value");
if (!allowedIgnoreNullFunctionNames.contains(node.getName().toString())) {
warningCollector.add(new PrestoWarning(SEMANTIC_WARNING, "IGNORE NULLS does not work for aggregate and ranking window functions"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also mention that it will start failing in the future.

implements StandardFunctionResolution
{
private final FunctionAndTypeResolver functionAndTypeResolver;
private final ImmutableList<QualifiedObjectName> windowValueFunctions = ImmutableList.of(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private final ImmutableList<QualifiedObjectName> windowValueFunctions = ImmutableList.of(
private final List<QualifiedObjectName> windowValueFunctions = ImmutableList.of(

}

if (node.isIgnoreNulls()) {
FunctionResolution functionResolution = new FunctionResolution(functionAndTypeResolver);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make the resolution a member variable of the class

"(ORDER BY b)\n FROM (VALUES (1, 1, 3), (1, 2, null), (1, 4, 2)) AS t(a, b, c)"));
}

assertHasWarning(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add more tests with various aggregate functions and those in this list https://prestodb.io/docs/current/functions/window.html#ranking-functions

}
}

if (node.isIgnoreNulls()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the window function semantic checks happen here https://github.com/prestodb/presto/blob/master/presto-main/src/main/java/com/facebook/presto/sql/analyzer/ExpressionAnalyzer.java#L981. Any particular reason we didn't add this check after the frame validation in that block ?

@tdcmeehan

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There wasn't any reason before that I had placed it on that exact line, but it needs to be after the function variable has been initialized. I moved it back to line 1122.

@aditi-pandit
Copy link
Contributor

@auden-woolfson : Thanks for working on this issue. Please can you merge all your commits into a single one.

@aditi-pandit aditi-pandit changed the title Window agg error Add warning for IGNORE NULL with Window agggregates Jul 30, 2024
Copy link
Contributor

@tdcmeehan tdcmeehan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM % nits


if (node.isIgnoreNulls()) {
if (!functionResolution.isWindowValueFunction(function)) {
String warningMessage = createWarningMessage(node, "IGNORE NULLS does not work for aggregate and ranking window functions, this will cause queries to fail in future versions");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
String warningMessage = createWarningMessage(node, "IGNORE NULLS does not work for aggregate and ranking window functions, this will cause queries to fail in future versions");
String warningMessage = createWarningMessage(node, "IGNORE NULLS is not used for aggregate and ranking window functions. This will cause queries to fail in future versions.");

}

@Test
void testIgnoreNullWarning()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
void testIgnoreNullWarning()
public void testIgnoreNullWarning()

@tdcmeehan
Copy link
Contributor

Code looks good. Let's reword the commit, something like

Warn when a non-window value function is specified to ignore nulls

Copy link
Member

@jaystarshot jaystarshot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this code path executed only once? Can there be multiple warnings for a single query

@auden-woolfson
Copy link
Contributor Author

Is this code path executed only once? Can there be multiple warnings for a single query

I just tested this with the following query....

SELECT a, array_agg(c) IGNORE NULLS OVER ( PARTITION BY a ORDER BY b ROWS BETWEEN 1 PRECEDING AND CURRENT ROW ) AS agg_1, array_agg(c) IGNORE NULLS OVER ( PARTITION BY a ORDER BY b ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) AS agg_2 FROM (VALUES (1, 1, 3), (1, 2, NULL), (1, 4, 2)) AS t(a, b, c);

I did get two errors, it will error out for each individual "invalid" ignore null.

@tdcmeehan tdcmeehan self-assigned this Aug 2, 2024
@auden-woolfson auden-woolfson force-pushed the window_agg_error branch 2 times, most recently from 796bf82 to a7f070a Compare August 2, 2024 20:43
Copy link
Contributor

@aditi-pandit aditi-pandit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@tdcmeehan tdcmeehan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add an issue to follow up and convert this warning into a failure after the next release has been cut.

@tdcmeehan
Copy link
Contributor

@auden-woolfson please add a release note. See other PRs and our contributing guide for examples.

Copy link
Contributor

@steveburnett steveburnett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed text of warning, looks good to me!

@steveburnett
Copy link
Contributor

For information on writing and formatting a release note entry, see the Release Notes Guidelines. As a starting point, maybe something like this following draft:

== RELEASE NOTES ==

General Changes
* Add a warning when an IGNORE NULL clause is used on any non lag, lead, first or last value function.  :pr:`23325`

@auden-woolfson auden-woolfson merged commit 23477ad into prestodb:master Aug 9, 2024
@tdcmeehan tdcmeehan mentioned this pull request Aug 23, 2024
34 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

IGNORE NULLs in Window aggregates doesn't seem to work

5 participants