Skip to content

Avoid pushdown IS (NOT) NULL expression on numeric types to Glue#13138

Merged
ebyhr merged 2 commits intomasterfrom
ebi/hive-glue-null
Jul 12, 2022
Merged

Avoid pushdown IS (NOT) NULL expression on numeric types to Glue#13138
ebyhr merged 2 commits intomasterfrom
ebi/hive-glue-null

Conversation

@ebyhr
Copy link
Copy Markdown
Member

@ebyhr ebyhr commented Jul 11, 2022

Description

Avoid pushdown IS (NOT) NULL expression on numeric types to Glue
Fixes #13124

Documentation

(x) No documentation is needed.

Release notes

(x) Release notes entries required with the following suggested text:

# Hive, Iceberg, Delta Lake
* Fix failure when pushing `IS NULL` and `IS NOT NULL` predicates on numeric types in Glue. ({issue}`13124`)

@ebyhr ebyhr force-pushed the ebi/hive-glue-null branch from 593c944 to 89c29f1 Compare July 11, 2022 07:49
@ebyhr ebyhr requested a review from raunaqmorarka July 11, 2022 08:22
}

// Glue throws an exception (e.g. input string: "__HIVE_D" is not an integer)
// for column <> '__HIVE_DEFAULT_PARTITION__' or column = '__HIVE_DEFAULT_PARTITION__' expression on numeric types
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there is any specific reason on not using NOT or IS NULL logical operator provided by Glue ? It was mentioned here - but any reason on not using it https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-catalog-partitions.html#aws-glue-api-catalog-partitions-GetPartitions

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@findinpath and I confirmed the expression returns incorrect result. Let me leave a code comment.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar behavior is shown even for numeric type ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it behaves same regardless of the types. (We already shared this issue with AWS engineers)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Praveen2112 Added a comment. Please take another look.

Copy link
Copy Markdown
Member

@Praveen2112 Praveen2112 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally can we track the issue i.e pushdown is not being applied when we try to fetch the partitions for IS (NOT) NULL on numeric types. So once AWS fixes the issue we could use NULL operator.

@ebyhr
Copy link
Copy Markdown
Member Author

ebyhr commented Jul 12, 2022

Filed #13150

@ebyhr ebyhr merged commit e28f639 into master Jul 12, 2022
@ebyhr ebyhr deleted the ebi/hive-glue-null branch July 12, 2022 09:47
@ebyhr ebyhr mentioned this pull request Jul 12, 2022
@github-actions github-actions bot added this to the 390 milestone Jul 12, 2022
// https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-catalog-partitions.html#aws-glue-api-catalog-partitions-GetPartitions
if ((domain.getValues().isAll() || domain.getValues().isNone()) && !isQuotedType(domain.getType())) {
return Optional.empty();
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code below doesn't handle the IS NULL OR IS some value case correctly.
-- It's apparent from the fact it "recognizes" IS NULL case by inspecting valueSet.isNone() (which is quite implicit), instead of checking domain.isNullAllowed(). Thus, it doesn't discern "some values" and "some values OR NULL" cases.

#13214 should make it clear (it doesn't solve the problem just yet)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your explanation. I was going to handle IS NULL OR IS some value case in #13122 after this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

Glue partition predicates with IS (NOT) NULL fail on some types when assume-canonical-partition-keys is true

4 participants