Bug Description
What happened:
Calling df.write ... saveAsTable in Append mode on a MANAGED table fails with "Can not create the managed table('$catalogTableName'). The associated location('$tableLocation') already exists." The message comes from:
https://github.com/apache/hudi/blob/master/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/catalyst/catalog/HoodieCatalogTable.scala#L280-L281
What you expected:
Append mode should work for managed tables.
Steps to reproduce:
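import org.apache.hudi.DataSourceWriteOptions
import org.apache.spark.sql.SaveMode
// Needed for toDF outside spark-shell; assumes a SparkSession named `spark`.
import spark.implicits._

// Two sample events; event_id is the record key and event_ts the precombine field.
val df1 = Seq(
  ("100", "2015-01-01", "event_name_900", "2015-01-01T13:51:39.340396Z", "type1"),
  ("101", "2015-01-01", "event_name_546", "2015-01-01T12:14:58.597216Z", "type2")
).toDF("event_id", "event_date", "event_name", "event_ts", "event_type")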
val tableName = "<table_name>"
val databaseName = "<db_name>"
df1.write.format("hudi")
  .option("hoodie.metadata.enable", "true")
  .option("hoodie.table.name", tableName)
  .option("hoodie.database.name", databaseName)
  .option("hoodie.datasource.write.operation", "upsert")
  .option("hoodie.datasource.write.table.type", "COPY_ON_WRITE")
  .option("hoodie.datasource.write.recordkey.field", "event_id")
  .option("hoodie.datasource.write.precombine.field", "event_ts")
  .option("hoodie.datasource.write.keygenerator.class", "org.apache.hudi.keygen.NonpartitionedKeyGenerator")
  .option("hoodie.datasource.hive_sync.enable", "true")
  .option("hoodie.datasource.meta.sync.enable", "true")
  .option("hoodie.datasource.hive_sync.mode", "hms")
  .option("hoodie.datasource.hive_sync.database", databaseName)
  .option("hoodie.datasource.hive_sync.table", tableName)
  .mode(SaveMode.Append)
  .saveAsTable(s"$databaseName.$tableName")
Environment
Hudi version: 1.0.2
Query engine: Spark
Relevant configs:
Logs and Stack Trace
org.apache.spark.sql.AnalysisException: Can not create the managed table('spark_catalog.<db_name>.<table_name>'). The associated location('<PATH>/<table_name>') already exists.
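To make the suspected behavior concrete, here is a minimal Spark-free sketch of the kind of guard that could produce this message. It assumes the check looks only at the table type and whether the location already exists, and never at the save mode. All names here (TableType, ManagedTableGuard, check) are hypothetical and are not Hudi's API; only the message text is taken from the error above.

```scala
// Hypothetical model of the guard; not the Hudi source.
sealed trait TableType
case object Managed extends TableType
case object External extends TableType

object ManagedTableGuard {
  // Returns Left(message) when creation would be rejected, Right(()) otherwise.
  // Assumption: the save mode is not an input, so Append is rejected like any other mode.
  def check(
      tableType: TableType,
      locationExists: Boolean,
      catalogTableName: String,
      tableLocation: String): Either[String, Unit] =
    (tableType, locationExists) match {
      case (Managed, true) =>
        Left(s"Can not create the managed table('$catalogTableName'). " +
          s"The associated location('$tableLocation') already exists.")
      case _ =>
        Right(())
    }
}
```

Under that assumption, any write that goes through the table-creation path for a managed table whose directory already exists would fail, which matches what is reported here for Append mode.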