[SPARK-28054][SQL] Fix error when insert Hive partitioned table dynamically where partition name is upper case #24886
Closed
viirya wants to merge 1 commit into apache:master
Conversation
…ame is upper case.
LGTM
Test build #106559 has finished for PR 24886 at commit
Member
Author
cc @cloud-fan
Member
Looks correct to me.
cloud-fan
approved these changes
Jun 22, 2019
Member
Merged to master.
Member
Author
Thanks @HyukjinKwon @cloud-fan
gatorsmile
reviewed
Jun 26, 2019
test("SPARK-28054: Unable to insert partitioned table when partition name is upper case") {
  withTable("spark_28054_test") {
    sql("set hive.exec.dynamic.partition.mode=nonstrict")
Member
Should we set it back?
Use withSQLConf?
Also, do we need to set the case sensitivity conf?
Member
Author
This set follows the other tests in the same suite. Using withSQLConf is good, yes.
The case sensitivity conf has no effect on this, so I think it is fine.
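To illustrate the "set it back" concern raised above, here is a minimal, self-contained sketch of the save-and-restore pattern that a helper such as withSQLConf provides. It does not use Spark: the conf map and the withConf helper are hypothetical stand-ins for a session's SQL configuration and are only meant to show the idea.

```scala
import scala.collection.mutable

object ConfScopeSketch {
  // Hypothetical stand-in for a session's SQL configuration (not Spark's SQLConf).
  val conf: mutable.Map[String, String] = mutable.Map.empty

  // Set the given keys, run the body, then restore the previous values
  // (removing keys that were previously unset), even if the body throws.
  def withConf[T](pairs: (String, String)*)(body: => T): T = {
    val previous = pairs.map { case (k, _) => k -> conf.get(k) }
    pairs.foreach { case (k, v) => conf(k) = v }
    try body
    finally previous.foreach {
      case (k, Some(old)) => conf(k) = old
      case (k, None)      => conf.remove(k)
    }
  }

  def main(args: Array[String]): Unit = {
    val key = "hive.exec.dynamic.partition.mode"
    withConf(key -> "nonstrict") {
      // Inside the block the setting is visible.
      println(conf.get(key))
    }
    // After the block the key is back to its earlier state (unset here).
    println(conf.get(key))
  }
}
```

With a plain set statement, the nonstrict mode would stay in effect for any test that runs later in the same session; scoping it to a block avoids that leak.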
gatorsmile
reviewed
Jun 26, 2019
// we also need to lowercase the column names in written partition paths.
// scalastyle:off caselocale
val hiveCompatiblePartitionColumns = partitionAttributes.map { attr =>
  attr.withName(attr.name.toLowerCase)
Member
Author
oops..will fix in a follow-up. Thanks.
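As a side note on the scalastyle:off caselocale line in the snippet above: the sketch below shows, under the assumption that the rule exists to flag locale-sensitive case conversion, why toLowerCase without an explicit locale can be surprising. It is only an illustration and is not necessarily what the follow-up addresses; the object name is hypothetical.

```scala
import java.util.Locale

object CaseLocaleSketch {
  // Locale-independent lower-casing.
  def lowerRoot(s: String): String = s.toLowerCase(Locale.ROOT)

  // Lower-casing under Turkish rules, where 'I' maps to dotless 'ı' (U+0131).
  def lowerTurkish(s: String): String = s.toLowerCase(Locale.forLanguageTag("tr"))

  def main(args: Array[String]): Unit = {
    println(lowerRoot("ID"))    // ASCII "id"
    println(lowerTurkish("ID")) // starts with dotless 'ı', not ASCII 'i'
  }
}
```

A bare toLowerCase uses the JVM default locale, so the same column name could map to different partition paths on differently configured machines.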
What changes were proposed in this pull request?

When a Hive table uses an upper case partition name (for example, a partition column named DS), an insert into table query with dynamic partitions doesn't work.

As the Hive metastore is not case preserving and keeps partition columns with lower cased names, we lowercase column names in the partition spec before passing it to the Hive client. But we write upper case column names in partition paths.

However, when calling loadDynamicPartitions to do insert into table for dynamic partitions, Hive calculates the full partition spec from the partition paths. So it calculates a partition spec like {ds=, DS=1} in the above case and fails partition column validation. This patch fixes the issue by lowercasing the column names in written partition paths for Hive partitioned tables.

This fix touches the saveAsHiveFile method, which is used in the InsertIntoHiveDirCommand and InsertIntoHiveTable commands. Among them, only InsertIntoHiveTable passes the partitionAttributes parameter. So I think this change only affects the InsertIntoHiveTable command.

How was this patch tested?

Added test.
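The {ds=, DS=1} collision from the description can be reproduced with a small model. This is a sketch only: parsePath, fullSpec and lowerCasedPath are hypothetical helpers written for illustration and are not Hive's or Spark's actual code. It assumes the spec sent to the metastore has lower-cased names with empty values for dynamic partitions, and that the full spec is built by adding the columns found in the written path.

```scala
import java.util.Locale

object PartitionSpecSketch {
  // Parse a partition path such as "DS=1" into (column, value) pairs.
  def parsePath(path: String): Seq[(String, String)] =
    path.split("/").toSeq.filter(_.contains("=")).map { segment =>
      val idx = segment.indexOf('=')
      segment.substring(0, idx) -> segment.substring(idx + 1)
    }

  // Merge the metastore-side spec with the columns found in the written path.
  // Keys are compared case-sensitively, so "ds" and "DS" do not merge.
  def fullSpec(spec: Map[String, String], path: String): Map[String, String] =
    spec ++ parsePath(path)

  // The approach described in the PR, modeled on paths: lower-case the
  // column names before they are written.
  def lowerCasedPath(path: String): String =
    parsePath(path)
      .map { case (name, value) => s"${name.toLowerCase(Locale.ROOT)}=$value" }
      .mkString("/")

  def main(args: Array[String]): Unit = {
    val dynamicSpec = Map("ds" -> "")
    // Upper-case path: the spec ends up with two keys, ds and DS.
    println(fullSpec(dynamicSpec, "DS=1"))
    // Lower-cased path: the value fills in the single ds key.
    println(fullSpec(dynamicSpec, lowerCasedPath("DS=1")))
  }
}
```

In this model the extra DS key is what a partition column validation step would reject, which matches the failure described above.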