Labels: area:writer (Write client and core write operations), priority:high (Significant impact; potential bugs), release-0.12.2 (Patches targeted for 0.12.2)
Description
Related JIRA ticket: https://issues.apache.org/jira/browse/HUDI-5271
Describe the problem you faced
When using Spark to create a Hudi table with the following configuration:
- INMEMORY or CONSISTENT_HASHING BUCKET index
- Decimal data type in the schema
- MOR table type
- Spark catalog
querying the table triggers an exception:
ERROR org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader [] - Got exception when reading log file
org.apache.avro.AvroTypeException: Found hoodie.test_mor_tab.test_mor_tab_record.new_test_col.fixed, expecting union
at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:292) ~[avro-1.8.2.jar:1.8.2]
at org.apache.avro.io.parsing.Parser.advance(Parser.java:88) ~[avro-1.8.2.jar:1.8.2]
at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:267) ~[avro-1.8.2.jar:1.8.2]
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179) ~[avro-1.8.2.jar:1.8.2]
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153) ~[avro-1.8.2.jar:1.8.2]
at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:232) ~[avro-1.8.2.jar:1.8.2]
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:222) ~[avro-1.8.2.jar:1.8.2]
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175) ~[avro-1.8.2.jar:1.8.2]
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153) ~[avro-1.8.2.jar:1.8.2]
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:145) ~[avro-1.8.2.jar:1.8.2]
at org.apache.hudi.common.table.log.block.HoodieAvroDataBlock$RecordIterator.next(HoodieAvroDataBlock.java:209)
...
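The Avro message points at a writer/reader schema mismatch on the decimal column. The exact schemas Hudi generated are not shown in this report, so the pair below is hypothetical; it is only a minimal, standalone sketch of one way this exact message can arise: the writer serializes new_test_col as a bare fixed decimal whose full name matches no branch of the reader's nullable union, so Avro schema resolution fails at read time.

import java.io.ByteArrayOutputStream

import org.apache.avro.Schema
import org.apache.avro.generic.{GenericData, GenericDatumReader, GenericDatumWriter}
import org.apache.avro.io.{DecoderFactory, EncoderFactory}

// Standalone sketch (hypothetical schemas, NOT the ones Hudi produces):
// reproduces the same "Found ...fixed, expecting union" shape of failure.
object DecimalFixedUnionMismatch extends App {
  // Writer schema: new_test_col is a non-nullable fixed decimal whose full name
  // mirrors the one in the error message.
  val writerSchema: Schema = new Schema.Parser().parse(
    """{"type": "record", "name": "test_mor_tab_record", "namespace": "hoodie.test_mor_tab",
      | "fields": [{"name": "new_test_col", "type":
      |   {"type": "fixed", "name": "fixed",
      |    "namespace": "hoodie.test_mor_tab.test_mor_tab_record.new_test_col",
      |    "size": 11, "logicalType": "decimal", "precision": 25, "scale": 4}}]}""".stripMargin)

  // Reader schema: a nullable union whose fixed branch has a different full name,
  // so union-branch resolution (which matches named types by full name) fails.
  val readerSchema: Schema = new Schema.Parser().parse(
    """{"type": "record", "name": "test_mor_tab_record", "namespace": "hoodie.test_mor_tab",
      | "fields": [{"name": "new_test_col", "type": ["null",
      |   {"type": "fixed", "name": "fixed",
      |    "namespace": "hoodie.test_mor_tab.test_mor_tab_record",
      |    "size": 11, "logicalType": "decimal", "precision": 25, "scale": 4}],
      |  "default": null}]}""".stripMargin)

  // Encode one record with the writer schema.
  val record = new GenericData.Record(writerSchema)
  val fixedSchema = writerSchema.getField("new_test_col").schema()
  record.put("new_test_col", new GenericData.Fixed(fixedSchema, new Array[Byte](11)))
  val out = new ByteArrayOutputStream()
  val encoder = EncoderFactory.get().binaryEncoder(out, null)
  new GenericDatumWriter[GenericData.Record](writerSchema).write(record, encoder)
  encoder.flush()

  // Decode with the reader schema: throws org.apache.avro.AvroTypeException:
  //   Found hoodie.test_mor_tab.test_mor_tab_record.new_test_col.fixed, expecting union
  val decoder = DecoderFactory.get().binaryDecoder(out.toByteArray, null)
  new GenericDatumReader[GenericData.Record](writerSchema, readerSchema).read(null, decoder)
}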
To Reproduce
Steps to reproduce the behavior:
class TestInsertTable extends HoodieSparkSqlTestBase {
  test("Test Insert Into MOR table") {
    withTempDir { tmp =>
      val tableName = "test_mor_tab"
      // Create a partitioned table
      spark.sql(
        s"""
           |create table $tableName (
           |  id int,
           |  dt string,
           |  name string,
           |  price double,
           |  ts long,
           |  new_test_col decimal(25, 4) comment 'a column for test decimal type'
           |) using hudi
           |options
           |(
           |  type = 'mor'
           |  ,primaryKey = 'id'
           |  ,hoodie.index.type = 'INMEMORY'
           |)
           | tblproperties (primaryKey = 'id')
           | partitioned by (dt)
           | location '${tmp.getCanonicalPath}'
       """.stripMargin)
      // Note: no column list is specified, so the partition field (dt) must be placed last in the values.
      spark.sql(
        s"""
           | insert into $tableName values
           | (1, 'a1', 10, 1000, 1.0, "2021-01-05"),
           | (2, 'a2', 20, 2000, 2.0, "2021-01-06"),
           | (3, 'a3', 30, 3000, 3.0, "2021-01-07")
       """.stripMargin)
      spark.sql(s"select id, name, price, ts, dt from $tableName").show(false)
    }
  }
}

Add this test case to the module hudi-spark-datasource/hudi-spark, in the test class org.apache.spark.sql.hudi.TestInsertTable.
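The same failure is reported for the consistent-hashing bucket index as well. Assuming the standard Hudi config keys hoodie.index.type and hoodie.index.bucket.engine (the report itself only names the index types, so treat this as a sketch), the options block in the test above would become:

options
(
  type = 'mor'
  ,primaryKey = 'id'
  ,hoodie.index.type = 'BUCKET'
  ,hoodie.index.bucket.engine = 'CONSISTENT_HASHING'
)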
Expected behavior
This test case should run properly without any exception
Environment Description
- Hudi version : latest master branch, commit 3109d890f13b1b29e5796a9f34ab28fa898ec23c
- Spark version : tried Spark 2.4/3.1, all have the same issue
- Hive version : N/A
- Hadoop version : N/A
- Storage (HDFS/S3/GCS..) : HDFS
- Running on Docker? (yes/no) : no
Stacktrace
The full error stack of the above test case:
19963 [ScalaTest-run-running-TestInsertTable] INFO org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator [] - Code generated in 20.563541 ms
20015 [ScalaTest-run-running-TestInsertTable] INFO org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator [] - Code generated in 18.67177 ms
20036 [ScalaTest-run-running-TestInsertTable] INFO org.apache.spark.SparkContext [] - Starting job: apply at OutcomeOf.scala:85
20036 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler [] - Got job 24 (apply at OutcomeOf.scala:85) with 1 output partitions
20036 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler [] - Final stage: ResultStage 35 (apply at OutcomeOf.scala:85)
20036 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler [] - Parents of final stage: List()
20036 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler [] - Missing parents: List()
20037 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler [] - Submitting ResultStage 35 (MapPartitionsRDD[71] at apply at OutcomeOf.scala:85), which has no missing parents
20051 [dag-scheduler-event-loop] INFO org.apache.spark.storage.memory.MemoryStore [] - Block broadcast_33 stored as values in memory (estimated size 15.4 KB, free 2002.3 MB)
20060 [dag-scheduler-event-loop] INFO org.apache.spark.storage.memory.MemoryStore [] - Block broadcast_33_piece0 stored as bytes in memory (estimated size 7.4 KB, free 2002.3 MB)
20060 [dispatcher-event-loop-0] INFO org.apache.spark.storage.BlockManagerInfo [] - Added broadcast_33_piece0 in memory on 10.2.175.58:53317 (size: 7.4 KB, free: 2004.2 MB)
20061 [dag-scheduler-event-loop] INFO org.apache.spark.SparkContext [] - Created broadcast 33 from broadcast at DAGScheduler.scala:1161
20061 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler [] - Submitting 1 missing tasks from ResultStage 35 (MapPartitionsRDD[71] at apply at OutcomeOf.scala:85) (first 15 tasks are for partitions Vector(0))
20061 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.TaskSchedulerImpl [] - Adding task set 35.0 with 1 tasks
20064 [dispatcher-event-loop-1] INFO org.apache.spark.scheduler.TaskSetManager [] - Starting task 0.0 in stage 35.0 (TID 44, localhost, executor driver, partition 0, PROCESS_LOCAL, 8249 bytes)
20064 [Executor task launch worker for task 44] INFO org.apache.spark.executor.Executor [] - Running task 0.0 in stage 35.0 (TID 44)
20080 [Executor task launch worker for task 44] INFO org.apache.hudi.common.table.HoodieTableMetaClient [] - Loading HoodieTableMetaClient from file:/private/var/folders/n7/7v_cwpdn79lc75czxd84bdd8mwzd9l/T/spark-cdd1a67d-c0be-4c46-826a-445e29dfa751
20081 [Executor task launch worker for task 44] INFO org.apache.hudi.common.table.HoodieTableConfig [] - Loading table properties from file:/private/var/folders/n7/7v_cwpdn79lc75czxd84bdd8mwzd9l/T/spark-cdd1a67d-c0be-4c46-826a-445e29dfa751/.hoodie/hoodie.properties
20081 [Executor task launch worker for task 44] INFO org.apache.hudi.common.table.HoodieTableMetaClient [] - Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from file:/private/var/folders/n7/7v_cwpdn79lc75czxd84bdd8mwzd9l/T/spark-cdd1a67d-c0be-4c46-826a-445e29dfa751
20083 [Executor task launch worker for task 44] INFO org.apache.hudi.common.table.timeline.HoodieActiveTimeline [] - Loaded instants upto : Option{val=[20221123105000131__deltacommit__COMPLETED]}
20083 [Executor task launch worker for task 44] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader [] - Scanning log file HoodieLogFile{pathStr='file:/private/var/folders/n7/7v_cwpdn79lc75czxd84bdd8mwzd9l/T/spark-cdd1a67d-c0be-4c46-826a-445e29dfa751/dt=2021-01-05/.04aba946-8423-4ddd-9d04-fbbd91ba37a2-0_20221123105000131.log.1_0-17-17', fileLen=-1}
20084 [Executor task launch worker for task 44] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader [] - Reading a data block from file file:/private/var/folders/n7/7v_cwpdn79lc75czxd84bdd8mwzd9l/T/spark-cdd1a67d-c0be-4c46-826a-445e29dfa751/dt=2021-01-05/.04aba946-8423-4ddd-9d04-fbbd91ba37a2-0_20221123105000131.log.1_0-17-17 at instant 20221123105000131
20084 [Executor task launch worker for task 44] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader [] - Merging the final data blocks
20084 [Executor task launch worker for task 44] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader [] - Number of remaining logblocks to merge 1
20086 [Executor task launch worker for task 44] ERROR org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader [] - Got exception when reading log file
org.apache.avro.AvroTypeException: Found hoodie.test_mor_tab.test_mor_tab_record.new_test_col.fixed, expecting union
at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:292) ~[avro-1.8.2.jar:1.8.2]
at org.apache.avro.io.parsing.Parser.advance(Parser.java:88) ~[avro-1.8.2.jar:1.8.2]
at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:267) ~[avro-1.8.2.jar:1.8.2]
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179) ~[avro-1.8.2.jar:1.8.2]
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153) ~[avro-1.8.2.jar:1.8.2]
at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:232) ~[avro-1.8.2.jar:1.8.2]
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:222) ~[avro-1.8.2.jar:1.8.2]
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175) ~[avro-1.8.2.jar:1.8.2]
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153) ~[avro-1.8.2.jar:1.8.2]
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:145) ~[avro-1.8.2.jar:1.8.2]
at org.apache.hudi.common.table.log.block.HoodieAvroDataBlock$RecordIterator.next(HoodieAvroDataBlock.java:207) ~[classes/:?]
at org.apache.hudi.common.table.log.block.HoodieAvroDataBlock$RecordIterator.next(HoodieAvroDataBlock.java:144) ~[classes/:?]
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.processDataBlock(AbstractHoodieLogRecordReader.java:633) ~[classes/:?]
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.processQueuedBlocksForInstant(AbstractHoodieLogRecordReader.java:715) ~[classes/:?]
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:368) ~[classes/:?]
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:220) ~[classes/:?]
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scan(AbstractHoodieLogRecordReader.java:209) ~[classes/:?]
at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:112) ~[classes/:?]
at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:105) ~[classes/:?]
at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:343) ~[classes/:?]
at org.apache.hudi.LogFileIterator$.scanLog(LogFileIterator.scala:305) ~[classes/:?]
at org.apache.hudi.LogFileIterator.<init>(LogFileIterator.scala:88) ~[classes/:?]
at org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:96) ~[classes/:?]
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.scheduler.Task.run(Task.scala:123) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_345]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_345]
at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_345]
20095 [Executor task launch worker for task 44] ERROR org.apache.spark.executor.Executor [] - Exception in task 0.0 in stage 35.0 (TID 44)
org.apache.hudi.exception.HoodieException: Exception when reading log file
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:377) ~[classes/:?]
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:220) ~[classes/:?]
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scan(AbstractHoodieLogRecordReader.java:209) ~[classes/:?]
at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:112) ~[classes/:?]
at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:105) ~[classes/:?]
at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:343) ~[classes/:?]
at org.apache.hudi.LogFileIterator$.scanLog(LogFileIterator.scala:305) ~[classes/:?]
at org.apache.hudi.LogFileIterator.<init>(LogFileIterator.scala:88) ~[classes/:?]
at org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:96) ~[classes/:?]
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.scheduler.Task.run(Task.scala:123) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414) ~[spark-core_2.11-2.4.4.jar:2.4.4]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_345]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_345]
at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_345]
Caused by: org.apache.avro.AvroTypeException: Found hoodie.test_mor_tab.test_mor_tab_record.new_test_col.fixed, expecting union
at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:292) ~[avro-1.8.2.jar:1.8.2]
at org.apache.avro.io.parsing.Parser.advance(Parser.java:88) ~[avro-1.8.2.jar:1.8.2]
at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:267) ~[avro-1.8.2.jar:1.8.2]
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179) ~[avro-1.8.2.jar:1.8.2]
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153) ~[avro-1.8.2.jar:1.8.2]
at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:232) ~[avro-1.8.2.jar:1.8.2]
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:222) ~[avro-1.8.2.jar:1.8.2]
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175) ~[avro-1.8.2.jar:1.8.2]
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153) ~[avro-1.8.2.jar:1.8.2]
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:145) ~[avro-1.8.2.jar:1.8.2]
at org.apache.hudi.common.table.log.block.HoodieAvroDataBlock$RecordIterator.next(HoodieAvroDataBlock.java:207) ~[classes/:?]
at org.apache.hudi.common.table.log.block.HoodieAvroDataBlock$RecordIterator.next(HoodieAvroDataBlock.java:144) ~[classes/:?]
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.processDataBlock(AbstractHoodieLogRecordReader.java:633) ~[classes/:?]
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.processQueuedBlocksForInstant(AbstractHoodieLogRecordReader.java:715) ~[classes/:?]
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:368) ~[classes/:?]
... 27 more
20111 [task-result-getter-0] WARN org.apache.spark.scheduler.TaskSetManager [] - Lost task 0.0 in stage 35.0 (TID 44, localhost, executor driver): org.apache.hudi.exception.HoodieException: Exception when reading log file
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:377)
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:220)
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scan(AbstractHoodieLogRecordReader.java:209)
at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:112)
at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:105)
at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:343)
at org.apache.hudi.LogFileIterator$.scanLog(LogFileIterator.scala:305)
at org.apache.hudi.LogFileIterator.<init>(LogFileIterator.scala:88)
at org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:96)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:123)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.avro.AvroTypeException: Found hoodie.test_mor_tab.test_mor_tab_record.new_test_col.fixed, expecting union
at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:292)
at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:267)
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:232)
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:222)
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:145)
at org.apache.hudi.common.table.log.block.HoodieAvroDataBlock$RecordIterator.next(HoodieAvroDataBlock.java:207)
at org.apache.hudi.common.table.log.block.HoodieAvroDataBlock$RecordIterator.next(HoodieAvroDataBlock.java:144)
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.processDataBlock(AbstractHoodieLogRecordReader.java:633)
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.processQueuedBlocksForInstant(AbstractHoodieLogRecordReader.java:715)
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:368)
... 27 more
20113 [task-result-getter-0] ERROR org.apache.spark.scheduler.TaskSetManager [] - Task 0 in stage 35.0 failed 1 times; aborting job
20113 [task-result-getter-0] INFO org.apache.spark.scheduler.TaskSchedulerImpl [] - Removed TaskSet 35.0, whose tasks have all completed, from pool
20116 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.TaskSchedulerImpl [] - Cancelling stage 35
20117 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.TaskSchedulerImpl [] - Killing all running tasks in stage 35: Stage cancelled
20118 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler [] - ResultStage 35 (apply at OutcomeOf.scala:85) failed in 0.081 s due to Job aborted due to stage failure: Task 0 in stage 35.0 failed 1 times, most recent failure: Lost task 0.0 in stage 35.0 (TID 44, localhost, executor driver): org.apache.hudi.exception.HoodieException: Exception when reading log file
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:377)
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:220)
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scan(AbstractHoodieLogRecordReader.java:209)
at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:112)
at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:105)
at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:343)
at org.apache.hudi.LogFileIterator$.scanLog(LogFileIterator.scala:305)
at org.apache.hudi.LogFileIterator.<init>(LogFileIterator.scala:88)
at org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:96)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:123)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.avro.AvroTypeException: Found hoodie.test_mor_tab.test_mor_tab_record.new_test_col.fixed, expecting union
at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:292)
at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:267)
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:232)
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:222)
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:145)
at org.apache.hudi.common.table.log.block.HoodieAvroDataBlock$RecordIterator.next(HoodieAvroDataBlock.java:207)
at org.apache.hudi.common.table.log.block.HoodieAvroDataBlock$RecordIterator.next(HoodieAvroDataBlock.java:144)
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.processDataBlock(AbstractHoodieLogRecordReader.java:633)
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.processQueuedBlocksForInstant(AbstractHoodieLogRecordReader.java:715)
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:368)
... 27 more