Reintroduce fix for dynamic pruning with null keys in hive partition #16261
Merged
pettyjamesm merged 2 commits into prestodb:master on Jun 21, 2021
b64316d to df62610
rschlussel approved these changes Jun 17, 2021
This reverts commit 72a7afc.
When encoded as partition names, null column values in Hive use a magic string: __HIVE_DEFAULT_PARTITION__. In other places, though, null values are represented with the sequence \N. This creates an unfortunate ambiguity whereby a non-null partition value of "\N" could subsequently be interpreted as null in some value contexts, such as prefilled values or dynamic filtering. Instead of converting "__HIVE_DEFAULT_PARTITION__" to "\N" in its constructor, HivePartitionKey now carries an Optional<String> for its value and relies on callers to use the appropriate null string for their context.
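The ambiguity can be made concrete with a minimal sketch. Everything here is hypothetical and for illustration only: the class `NullKeyAmbiguitySketch` and the methods `eagerValue` and `optionalValue` are not taken from the Presto codebase, and only the two marker strings come from the description above.

```java
import java.util.Optional;

// Hypothetical illustration; this is NOT the actual HivePartitionKey code.
final class NullKeyAmbiguitySketch
{
    static final String HIVE_DEFAULT_PARTITION = "__HIVE_DEFAULT_PARTITION__";
    static final String NULL_SEQUENCE = "\\N"; // the two characters \ and N

    private NullKeyAmbiguitySketch() {}

    // Eager conversion: the null marker is rewritten to \N right away,
    // so it can no longer be told apart from a literal "\N" value.
    static String eagerValue(String partitionNameValue)
    {
        return HIVE_DEFAULT_PARTITION.equals(partitionNameValue) ? NULL_SEQUENCE : partitionNameValue;
    }

    // Optional-based: a null partition becomes empty, and every other
    // value (including a literal "\N") is kept verbatim.
    static Optional<String> optionalValue(String partitionNameValue)
    {
        return HIVE_DEFAULT_PARTITION.equals(partitionNameValue)
                ? Optional.empty()
                : Optional.of(partitionNameValue);
    }

    public static void main(String[] args)
    {
        // With eager conversion, the null partition and a literal "\N" collide.
        System.out.println(eagerValue(HIVE_DEFAULT_PARTITION).equals(eagerValue("\\N"))); // prints "true"

        // With Optional, the two stay distinct.
        System.out.println(optionalValue(HIVE_DEFAULT_PARTITION).equals(optionalValue("\\N"))); // prints "false"
    }
}
```

The second form defers the choice of null spelling to whoever consumes the value, which is the design the commit message describes.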
df62610 to dd8e73b
fgwang7w approved these changes Jun 18, 2021
highker approved these changes Jun 21, 2021
Fixes dynamic pruning when null values are present by reintroducing changes from #15470 (previously reverted in #16061).
#16061 appears to have reverted the previous fix because partitions with a non-null value of "\N" in the name could accidentally be interpreted as null, which is incorrect.
This change avoids converting "__HIVE_DEFAULT_PARTITION__" to "\N" in the HivePartitionKey constructor and propagates Optional<String> as the value instead. Callers must now perform a context-appropriate conversion for null strings at their usage site.
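As a rough sketch of what "context-appropriate conversion at the usage site" might look like, the following shows three hypothetical callers consuming the same `Optional<String>` value. The class `NullStringAtCallSiteSketch` and its method names are assumptions made for illustration and do not correspond to actual Presto call sites; real partition names also escape special characters, which is omitted here.

```java
import java.util.Optional;

// Hypothetical call sites; NOT the actual Presto code paths.
final class NullStringAtCallSiteSketch
{
    static final String HIVE_DEFAULT_PARTITION = "__HIVE_DEFAULT_PARTITION__";
    static final String NULL_SEQUENCE = "\\N"; // the two characters \ and N

    private NullStringAtCallSiteSketch() {}

    // Context 1: building a partition name segment, where null is
    // spelled with the Hive magic string (escaping omitted for brevity).
    static String toPartitionNameSegment(String column, Optional<String> value)
    {
        return column + "=" + value.orElse(HIVE_DEFAULT_PARTITION);
    }

    // Context 2: a consumer that expects null to be spelled as \N.
    static String toNullSequenceForm(Optional<String> value)
    {
        return value.orElse(NULL_SEQUENCE);
    }

    // Context 3: a consumer that must distinguish a real null from a
    // literal "\N", for example when deciding whether a key is null.
    static boolean isNullKey(Optional<String> value)
    {
        return !value.isPresent();
    }

    public static void main(String[] args)
    {
        Optional<String> nullKey = Optional.empty();
        Optional<String> literalKey = Optional.of("\\N");

        System.out.println(toPartitionNameSegment("ds", nullKey)); // prints "ds=__HIVE_DEFAULT_PARTITION__"
        System.out.println(isNullKey(nullKey));                     // prints "true"
        System.out.println(isNullKey(literalKey));                  // prints "false"
    }
}
```

Only the second caller collapses the two cases into the same string, and it does so knowingly at its own boundary rather than in the key's constructor.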