Commit 717ec5e

turboFei authored and cloud-fan committed
[SPARK-29295][SQL][FOLLOWUP] Dynamic partition map parsed from partition path should be case insensitive
### What changes were proposed in this pull request?

This is a follow-up of #25979.

When we insert overwrite into an external Hive partitioned table with an upper case dynamic partition key, an exception is thrown, like:

```
org.apache.spark.SparkException: Dynamic partition key P1 is not among written partition paths.
```

The root cause is that the Hive metastore is not case preserving and keeps partition columns with lower cased names; see details in:

https://github.com/apache/spark/blob/ddd8d5f5a0b6db17babc201ba4b73f7df91df1a3/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala#L895-L901
https://github.com/apache/spark/blob/e28914095aa1fa7a4680b5e4fcf69e3ef64b3dbc/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala#L228-L234

In this PR, we convert the dynamic partition map to a case insensitive map.

### Why are the changes needed?

To fix the issue when inserting overwrite into an external Hive partitioned table with an upper case dynamic partition key.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

UT.

Closes #28765 from turboFei/SPARK-29295-follow-up.

Authored-by: turbofei <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
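To make the failure mode concrete, here is a minimal standalone sketch (not the Spark code itself) of how a key parsed from a written partition path can miss a lookup made with the user's original casing. The object name, the `parsePartitionPath` helper, and the sample path are assumptions made for illustration; the sketch also assumes the written path carries the lower-cased column name and skips path unescaping.

```scala
object DynamicPartitionLookupSketch {
  // Hypothetical helper: turns a written partition path such as
  // "p1=caseSensitive/p2=x" into a key -> value map. No unescaping is done here.
  def parsePartitionPath(path: String): Map[String, String] =
    path.split("/").map { part =>
      val kv = part.split("=", 2)
      require(kv.length == 2, s"Malformed partition segment: $part")
      kv(0) -> kv(1)
    }.toMap

  def main(args: Array[String]): Unit = {
    // Assumption: the path was written with the lower-cased column name "p1".
    val dpMap = parsePartitionPath("p1=caseSensitive")
    // The partition spec from the INSERT statement keeps the user's casing, "P1",
    // so an ordinary case-sensitive Map lookup does not find it.
    println(dpMap.contains("P1"))
    println(dpMap.contains("p1"))
  }
}
```

A plain Scala `Map[String, String]` compares keys exactly, which is why the mismatch surfaces as the "not among written partition paths" exception rather than as a silent fallback.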
1 parent de91915 commit 717ec5e

2 files changed: +18 -1 lines changed


sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala

Lines changed: 5 additions & 1 deletion
@@ -29,6 +29,7 @@ import org.apache.spark.sql.{AnalysisException, Row, SparkSession}
 import org.apache.spark.sql.catalyst.catalog._
 import org.apache.spark.sql.catalyst.expressions.Attribute
 import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.catalyst.util.CaseInsensitiveMap
 import org.apache.spark.sql.execution.SparkPlan
 import org.apache.spark.sql.execution.command.CommandUtils
 import org.apache.spark.sql.hive.HiveExternalCatalog
@@ -225,9 +226,12 @@ case class InsertIntoHiveTable(
             ExternalCatalogUtils.unescapePathName(splitPart(1))
         }.toMap
 
+        val caseInsensitiveDpMap = CaseInsensitiveMap(dpMap)
+
         val updatedPartitionSpec = partition.map {
           case (key, Some(value)) => key -> value
-          case (key, None) if dpMap.contains(key) => key -> dpMap(key)
+          case (key, None) if caseInsensitiveDpMap.contains(key) =>
+            key -> caseInsensitiveDpMap(key)
           case (key, _) =>
             throw new SparkException(s"Dynamic partition key $key is not among " +
               "written partition paths.")
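
The effect of the change can be approximated outside Spark with a small sketch. It does not use Spark's `CaseInsensitiveMap`; the `LowerCasedMap` class, the `resolveSpec` function, and the use of `IllegalArgumentException` in place of `SparkException` are hypothetical stand-ins chosen so the example runs with the Scala standard library alone.

```scala
import java.util.Locale

object CaseInsensitiveSpecSketch {
  // Hypothetical stand-in for a case-insensitive map: keys are lower-cased
  // once at construction and again on every lookup.
  final class LowerCasedMap(original: Map[String, String]) {
    private val normalized: Map[String, String] =
      original.map { case (k, v) => k.toLowerCase(Locale.ROOT) -> v }

    def get(key: String): Option[String] =
      normalized.get(key.toLowerCase(Locale.ROOT))
  }

  // Fills in dynamic partition values (None) from the map parsed out of the
  // written partition path, keeping the key casing of the partition spec.
  def resolveSpec(
      partition: Map[String, Option[String]],
      dpMap: Map[String, String]): Map[String, String] = {
    val lookup = new LowerCasedMap(dpMap)
    partition.map {
      case (key, Some(value)) => key -> value
      case (key, None) =>
        key -> lookup.get(key).getOrElse(
          throw new IllegalArgumentException(
            s"Dynamic partition key $key is not among written partition paths."))
    }
  }

  def main(args: Array[String]): Unit = {
    // Spec key "P1" (user casing) resolves against path key "p1" (lower-cased).
    println(resolveSpec(Map("P1" -> None), Map("p1" -> "caseSensitive")))
  }
}
```

As in the diff, the resolved spec keeps the key as written in the partition spec; only the lookup into the parsed path map ignores case.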

sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala

Lines changed: 13 additions & 0 deletions
@@ -2544,6 +2544,19 @@ abstract class SQLQuerySuiteBase extends QueryTest with SQLTestUtils with TestHi
       assert(e.getMessage.contains("Cannot modify the value of a static config"))
     }
   }
+
+  test("SPARK-29295: dynamic partition map parsed from partition path should be case insensitive") {
+    withTable("t") {
+      withSQLConf("hive.exec.dynamic.partition" -> "true",
+        "hive.exec.dynamic.partition.mode" -> "nonstrict") {
+        withTempDir { loc =>
+          sql(s"CREATE TABLE t(c1 INT) PARTITIONED BY(P1 STRING) LOCATION '${loc.getAbsolutePath}'")
+          sql("INSERT OVERWRITE TABLE t PARTITION(P1) VALUES(1, 'caseSensitive')")
+          checkAnswer(sql("select * from t"), Row(1, "caseSensitive"))
+        }
+      }
+    }
+  }
 }
 
 class SQLQuerySuite extends SQLQuerySuiteBase with DisableAdaptiveExecutionSuite
