Skip to content

Reintroduce fix for dynamic pruning for with null keys in hive partition#16261

Merged
pettyjamesm merged 2 commits intoprestodb:masterfrom
pettyjamesm:fix-dynamic-filtering-nulls-again
Jun 21, 2021
Merged

Reintroduce fix for dynamic pruning for with null keys in hive partition#16261
pettyjamesm merged 2 commits intoprestodb:masterfrom
pettyjamesm:fix-dynamic-filtering-nulls-again

Conversation

@pettyjamesm
Copy link
Contributor

@pettyjamesm pettyjamesm commented Jun 14, 2021

Fixes dynamic pruning when null values are present by reintroducing changes from #15470 (previously reverted in #16061).

#16061 appears to have chosen to revert the previous fix because partitions with a non-null value of "\N" in the name could accidentally be interpreted as null which they should not have been.

This change avoids converting "__HIVE_DEFAULT_PARTITION__" to "\N" in the HivePartitionKey constructor and propagates Optional<String> as the value instead. Callers must now perform a context-appropriate conversion for null strings at their usage site.

== RELEASE NOTES ==

Hive Changes
* Fix dynamic pruning for null keys in hive partition

@pettyjamesm pettyjamesm force-pushed the fix-dynamic-filtering-nulls-again branch from b64316d to df62610 Compare June 14, 2021 19:12
Copy link
Contributor

@NikhilCollooru NikhilCollooru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

When encoded as partition names, null column values in Hive use a
magic string: __HIVE_DEFAULT_PARTITION__. In other places though,
null values are represented with the sequence \N. This creates an
unfortunate ambiguity whereby a non-null partition value of "\N"
could subsequently be interpreted as null in some value contexts
such as prefilled values or for the purposes of dynamic filtering.

Instead of converting "__HIVE_DEFAULT_PARTITION__" to "\N" in its
constructor, HivePartitionKey will instead carry Optional<String>
for its value and rely on callers to use the appropriate null
string for their context.
@pettyjamesm pettyjamesm force-pushed the fix-dynamic-filtering-nulls-again branch from df62610 to dd8e73b Compare June 17, 2021 18:59
@pettyjamesm pettyjamesm merged commit a95a2e6 into prestodb:master Jun 21, 2021
@pettyjamesm pettyjamesm deleted the fix-dynamic-filtering-nulls-again branch June 21, 2021 13:19
@ajaygeorge ajaygeorge mentioned this pull request Jul 7, 2021
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants