@dongkelun dongkelun (Contributor) commented Nov 4, 2022

The following write worked in version 0.9.0 but now fails:

import org.apache.hudi.DataSourceWriteOptions._
import org.apache.hudi.config.HoodieWriteConfig
import org.apache.hudi.keygen.SimpleKeyGenerator
import org.apache.spark.sql.SaveMode
import spark.implicits._

val partitionValue = "2022-11-05"
val df = Seq((1, "a1", 10, 1000, partitionValue)).toDF("id", "name", "value", "ts", "dt")
val tableName = "test_hudi_table"
// Write a table from a Spark DataFrame.
df.write.format("hudi")
  .option(HoodieWriteConfig.TBL_NAME.key, tableName)
  .option(TABLE_TYPE.key, MOR_TABLE_TYPE_OPT_VAL)
  // .option(HoodieTableConfig.TYPE.key(), MOR_TABLE_TYPE_OPT_VAL)
  .option(RECORDKEY_FIELD.key, "id")
  .option(PRECOMBINE_FIELD.key, "ts")
  .option(PARTITIONPATH_FIELD.key, "dt")
  .option(KEYGENERATOR_CLASS_NAME.key, classOf[SimpleKeyGenerator].getName)
  .option(HoodieWriteConfig.INSERT_PARALLELISM_VALUE.key, "1")
  .option(HoodieWriteConfig.UPSERT_PARALLELISM_VALUE.key, "1")
  .partitionBy("dt")
  .mode(SaveMode.Overwrite)
  .saveAsTable(tableName)

The write fails with the following exception:

java.lang.IllegalArgumentException: Can't find primaryKey `uuid` in root
 |-- _hoodie_commit_time: string (nullable = true)
 |-- _hoodie_commit_seqno: string (nullable = true)
 |-- _hoodie_record_key: string (nullable = true)
 |-- _hoodie_partition_path: string (nullable = true)
 |-- _hoodie_file_name: string (nullable = true)
 |-- id: integer (nullable = false)
 |-- name: string (nullable = true)
 |-- value: integer (nullable = false)
 |-- ts: integer (nullable = false)
 |-- dt: string (nullable = true)
.
    at org.apache.hudi.common.util.ValidationUtils.checkArgument(ValidationUtils.java:40)
    at org.apache.spark.sql.hudi.HoodieOptionConfig$$anonfun$validateTable$1.apply(HoodieOptionConfig.scala:201)
    at org.apache.spark.sql.hudi.HoodieOptionConfig$$anonfun$validateTable$1.apply(HoodieOptionConfig.scala:200)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at org.apache.spark.sql.hudi.HoodieOptionConfig$.validateTable(HoodieOptionConfig.scala:200)
    at org.apache.spark.sql.catalyst.catalog.HoodieCatalogTable.parseSchemaAndConfigs(HoodieCatalogTable.scala:256)
    at org.apache.spark.sql.catalyst.catalog.HoodieCatalogTable.initHoodieTable(HoodieCatalogTable.scala:171)
    at org.apache.spark.sql.hudi.command.CreateHoodieTableAsSelectCommand.run(CreateHoodieTableAsSelectCommand.scala:99)
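
The trace above shows the failure coming from a primary-key check in `HoodieOptionConfig.validateTable`. The following is a minimal, self-contained sketch of that kind of check, not the actual Hudi code. The object name, the method name, and the `uuid` fallback are assumptions for illustration; the fallback is inferred only from the error message, which complains about `uuid` even though the write set the record key to `id`.

```scala
// Hypothetical stand-in for the validation step; not Hudi's implementation.
object PrimaryKeyValidationSketch {
  // Assumed default, inferred from the error message in the report.
  val DefaultPrimaryKey = "uuid"

  // Fails with IllegalArgumentException if any primary key is not a schema field.
  def validatePrimaryKey(sqlOptions: Map[String, String], fieldNames: Seq[String]): Unit = {
    val primaryKeys = sqlOptions
      .getOrElse("primaryKey", DefaultPrimaryKey)
      .split(",")
      .map(_.trim)
      .filter(_.nonEmpty)
    primaryKeys.foreach { key =>
      require(fieldNames.contains(key),
        s"Can't find primaryKey `$key` in ${fieldNames.mkString(", ")}")
    }
  }
}
```

Under these assumptions, calling `validatePrimaryKey(Map.empty, Seq("id", "name", "value", "ts", "dt"))` throws, while passing `Map("primaryKey" -> "id")` succeeds. This mirrors the report: the record key given through the data source option never reaches the table options that the check reads.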

Change Logs

[HUDI-5160] Spark df saveAsTable failed with CTAS

Impact

[HUDI-5160] Spark df saveAsTable failed with CTAS

Risk level (write none, low, medium or high below)

none

@nsivabalan nsivabalan added engine:spark Spark integration priority:high Significant impact; potential bugs labels Nov 7, 2022
@hudi-bot hudi-bot (Collaborator) commented Nov 9, 2022

CI report:

@xushiyan xushiyan self-assigned this Nov 11, 2022
@xushiyan xushiyan added area:sql SQL interfaces and removed engine:spark Spark integration labels Nov 11, 2022
@nsivabalan nsivabalan added priority:blocker Production down; release blocker release-0.12.2 Patches targetted for 0.12.2 and removed priority:high Significant impact; potential bugs labels Dec 6, 2022
@xushiyan xushiyan removed priority:blocker Production down; release blocker release-0.12.2 Patches targetted for 0.12.2 labels Dec 13, 2022
@xushiyan xushiyan (Member) commented

@dongkelun thanks for making the patch. The root cause is that we did not support converting data source write configs into table configs when using saveAsTable(). At that point the table has not been created yet, so the Hudi catalog table should handle this conversion. I made patch #7448 for this.
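
To illustrate the kind of conversion described here, below is a minimal sketch that translates data source write options into SQL-style table options before a table is validated. It is not the code from #7448. The object and method names are hypothetical, the key mapping is an illustrative subset, and values are passed through unchanged as a simplification.

```scala
// Hypothetical sketch of write-config to table-config conversion; not the actual patch.
object WriteConfigToTableConfigSketch {
  // Data source write option key -> SQL table option key (illustrative subset).
  private val keyMapping: Map[String, String] = Map(
    "hoodie.datasource.write.recordkey.field"  -> "primaryKey",
    "hoodie.datasource.write.precombine.field" -> "preCombineField",
    "hoodie.datasource.write.table.type"       -> "type"
  )

  // Keeps only the options that have a table-level counterpart and renames their keys.
  def toTableOptions(writeOptions: Map[String, String]): Map[String, String] =
    writeOptions.collect {
      case (key, value) if keyMapping.contains(key) => keyMapping(key) -> value
    }
}
```

With this sketch, `toTableOptions(Map("hoodie.datasource.write.recordkey.field" -> "id"))` yields `Map("primaryKey" -> "id")`, so a later primary-key check would see `id` rather than a default.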

@xushiyan xushiyan (Member) commented

Closing in favor of #7448.

@xushiyan xushiyan closed this Dec 29, 2022