
Conversation

@manojpec
Contributor

What is the purpose of the pull request

When HoodieBackedTableMetadataWriter is instantiated for a writer client, it checks whether the
metadata table needs to be bootstrapped for any un-synced instants. When the writer initiates
a rollback action after the first commit (the one and only commit so far), the metadata table
writer instantiation assumes bootstrapping is needed because it cannot find its latest committed
instant in the data table. While performing the bootstrap, it finds a pending action in the data
table timeline, fails the operation, and the error bubbles back to the writer client performing
the rollback. That pending action on the data table timeline is in fact the very rollback the
writer is attempting.

Brief change log

The fix makes the metadata table writer creation aware of the currently inflight action, so that it
can make an informed decision about whether bootstrapping is needed for the table and whether
any pending action on the data timeline can be ignored.
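
As a rough illustration of that idea (not the actual Hudi code: the class and method names below, such as InflightAwareBootstrapCheck and canBootstrap, are hypothetical), the bootstrap check can skip the pending data-timeline instant that belongs to the action currently in flight and only let genuinely unrelated pending actions block the bootstrap:

// Illustrative sketch only, not the actual Hudi implementation.
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class InflightAwareBootstrapCheck {

  // Decide whether the metadata table bootstrap can proceed, given the pending
  // actions on the data timeline and the action the current writer is performing.
  static boolean canBootstrap(List<String> pendingDataTimelineActions,
                              Optional<String> currentInflightAction) {
    // Ignore the pending instant that belongs to the action currently in flight
    // (e.g. the rollback that triggered this writer's creation).
    List<String> blockingActions = pendingDataTimelineActions.stream()
        .filter(action -> !currentInflightAction.map(action::equals).orElse(false))
        .collect(Collectors.toList());
    // Only genuinely unrelated pending actions should block the bootstrap.
    return blockingActions.isEmpty();
  }

  public static void main(String[] args) {
    // The only pending action is the rollback the writer itself is performing,
    // so bootstrapping is allowed instead of failing as it did before the fix.
    System.out.println(canBootstrap(List.of("rollback"), Optional.of("rollback"))); // true
    // An unrelated pending action would still block the bootstrap.
    System.out.println(canBootstrap(List.of("compaction"), Optional.of("rollback"))); // false
  }
}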

Verify this pull request

TestHBaseIndex::testEnsureTagLocationUsesCommitTimeline and ::testSimpleTagLocationAndUpdateWithRollback
are updated to include the metadata table.

A new test case, TestHoodieBackedMetadata::testFirstCommitRollback, has been added to verify the fix.
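
For context, the scenario that test exercises roughly follows the outline below; the helper names (createWithMetadataEnabled, writeFirstCommit) are made up for illustration and do not reflect the real test harness or Hudi write client API.

// Hypothetical outline of the first-commit-rollback scenario; all names are
// placeholders, not the actual test code.
class FirstCommitRollbackScenario {

  public static void main(String[] args) {
    // 1. Start with a fresh table that has the metadata table enabled.
    Table table = Table.createWithMetadataEnabled();

    // 2. Perform the first (and only) commit on the data table.
    String firstCommit = table.writeFirstCommit();

    // 3. Roll back that commit. Before the fix, instantiating the metadata table
    //    writer here saw its own pending rollback on the data timeline, failed
    //    the bootstrap, and the error bubbled back to the rollback call.
    table.rollback(firstCommit);

    // 4. After the fix the rollback completes, and the metadata table remains
    //    consistent with the (now empty) data table timeline.
  }

  // Minimal stand-ins so the outline is self-contained; the real test uses
  // Hudi's write client and test utilities instead.
  static class Table {
    static Table createWithMetadataEnabled() { return new Table(); }
    String writeFirstCommit() { return "001"; }
    void rollback(String instant) { /* no-op in this sketch */ }
  }
}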

Committer checklist

  • Has a corresponding JIRA in PR title & commit

  • Commit message is descriptive of the change

  • CI is green

  • Necessary doc changes done or have another open PR

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

@hudi-bot
Collaborator

hudi-bot commented Oct 22, 2021

CI report:

Bot commands: @hudi-bot supports the following commands:
  • @hudi-bot run travis re-run the last Travis build
  • @hudi-bot run azure re-run the last Azure build

 - Fixing the HoodieBackedTableMetadataWriter conditional check for
   determining whether bootstrap is needed for the metadata table when
   the inflight action is Rollback.
*
* @return instance of {@link HoodieTableMetadataWriter}
*/
public final Option<HoodieTableMetadataWriter> getMetadataWriter() {
Contributor

One minor question: should we pass actionMetadata only for rollback? If we are going to pass it for every operation, why do we need this method?

Contributor Author

Except for CommitMetadata, all other actions extend SpecificRecordBase and are good to be moved over. So, the getMetadataWriter() version has to stay around till CommitMetadata is solved. I can file a new ticket to take this on if you are ok with it.
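
To illustrate the distinction being discussed: an action-metadata-aware overload can be bounded to Avro's SpecificRecordBase, which rollback/clean/restore metadata extend but commit metadata (a plain POJO) does not, so a no-argument variant has to remain for the commit path. The signatures below are only a sketch, not the exact Hudi API; the stand-in types are declared locally so the snippet is self-contained.

// Illustrative sketch only; stand-in types below replace the real Avro/Hudi classes.
import java.util.Optional;

class MetadataWriterFactorySketch {

  // Placeholder stand-ins for org.apache.avro.specific.SpecificRecordBase and
  // the Hudi metadata writer / rollback metadata types.
  static class SpecificRecordBase {}
  static class HoodieRollbackMetadata extends SpecificRecordBase {}
  static class HoodieTableMetadataWriter {}

  // Action-metadata-aware variant: only actions whose metadata extends
  // SpecificRecordBase (rollback, clean, restore, ...) can use this overload,
  // letting the writer recognize its own pending instant during bootstrap.
  static <T extends SpecificRecordBase> Optional<HoodieTableMetadataWriter>
      getMetadataWriter(Optional<T> actionMetadata) {
    return Optional.of(new HoodieTableMetadataWriter());
  }

  // No-argument variant kept around because commit metadata is a plain POJO
  // rather than an Avro SpecificRecordBase, so commits cannot use the bounded
  // overload until that is addressed.
  static Optional<HoodieTableMetadataWriter> getMetadataWriter() {
    return getMetadataWriter(Optional.<HoodieRollbackMetadata>empty());
  }
}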

@manojpec
Contributor Author

@hudi-bot run azure

@nsivabalan merged commit c9d641c into apache:master Oct 23, 2021
if (!isMetadataAvailabilityUpdated) {
  // This code assumes that if metadata availability is updated once it will not change.
  // Please revisit this logic if that's not the case. This is done to avoid repeated calls to fs.exists().
Contributor

Where is the passed-in actionMetadata used?
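
As an aside on the comment in the hunk above: the pattern it describes is a simple compute-once cache of the metadata table's availability, sketched below with made-up names (checkMetadataTablePath stands in for the fs.exists() call).

// Minimal sketch of the availability-caching pattern described in the comment;
// names here are illustrative, not Hudi's.
class MetadataAvailabilityCache {

  private boolean isMetadataAvailabilityUpdated = false;
  private boolean isMetadataTableAvailable = false;

  boolean isAvailable() {
    if (!isMetadataAvailabilityUpdated) {
      // Assumes that once availability is determined it will not change;
      // caching the result avoids repeated, relatively expensive fs.exists() calls.
      isMetadataTableAvailable = checkMetadataTablePath();
      isMetadataAvailabilityUpdated = true;
    }
    return isMetadataTableAvailable;
  }

  // Stand-in for the actual filesystem check (fs.exists(metadataTableBasePath)).
  private boolean checkMetadataTablePath() {
    return true;
  }
}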


@dongkelun
Contributor

dongkelun commented Oct 29, 2021

Hello, with a jar compiled from the latest code on the master branch, the following exception is thrown when running Spark SQL. I don't know whether it has anything to do with this PR; is there a missing jar package?

Copying hbase-server-1.2.3.jar to the SPARK_HOME/jars path does not solve it. If I set hoodie.metadata.enable=false, the exception is not thrown. How can this be solved with the default settings?

21/10/29 09:53:50 INFO AbstractHoodieLogRecordScanner: Scanning log file HoodieLogFile{pathStr='hdfs://cluster1/warehouse/tablespace/managed/hive/test_hudi_table/.hoodie/metadata/files/.files-0000_0000000000000.log.1_0-0-0', fileLen=0}
21/10/29 09:53:50 INFO AbstractHoodieLogRecordScanner: Reading a delete block from file hdfs://cluster1/warehouse/tablespace/managed/hive/test_hudi_table/.hoodie/metadata/files/.files-0000_0000000000000.log.1_0-0-0
21/10/29 09:53:50 INFO HoodieLogFormatReader: Moving to the next reader for logfile HoodieLogFile{pathStr='hdfs://cluster1/warehouse/tablespace/managed/hive/test_hudi_table/.hoodie/metadata/files/.files-0000_0000000000000.log.1_0-6-6', fileLen=0}
21/10/29 09:53:50 INFO AbstractHoodieLogRecordScanner: Scanning log file HoodieLogFile{pathStr='hdfs://cluster1/warehouse/tablespace/managed/hive/test_hudi_table/.hoodie/metadata/files/.files-0000_0000000000000.log.1_0-6-6', fileLen=0}
21/10/29 09:53:50 INFO AbstractHoodieLogRecordScanner: Reading a data block from file hdfs://cluster1/warehouse/tablespace/managed/hive/test_hudi_table/.hoodie/metadata/files/.files-0000_0000000000000.log.1_0-6-6 at instant 0000000000000
21/10/29 09:53:50 INFO AbstractHoodieLogRecordScanner: Scanning log file HoodieLogFile{pathStr='hdfs://cluster1/warehouse/tablespace/managed/hive/test_hudi_table/.hoodie/metadata/files/.files-0000_0000000000000.log.1_0-6-6', fileLen=0}
21/10/29 09:53:50 INFO AbstractHoodieLogRecordScanner: Reading a data block from file hdfs://cluster1/warehouse/tablespace/managed/hive/test_hudi_table/.hoodie/metadata/files/.files-0000_0000000000000.log.1_0-6-6 at instant 20211029095323
21/10/29 09:53:50 INFO AbstractHoodieLogRecordScanner: Number of remaining logblocks to merge 2
21/10/29 09:53:50 INFO AbstractHoodieLogRecordScanner: Number of remaining logblocks to merge 1
21/10/29 09:53:50 INFO CacheConfig: Created cacheConfig: blockCache=LruBlockCache{blockCount=0, currentSize=274.14 KB, freeSize=363.93 MB, maxSize=364.20 MB, heapSize=274.14 KB, minSize=345.99 MB, minFactor=0.95, multiSize=173.00 MB, multiFactor=0.5, singleSize=86.50 MB, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
21/10/29 09:53:50 ERROR SparkSQLDriver: Failed in [insert into test_hudi_table values (1,'hudi',10,100,'2021-05-05'),(2,'hudi',10,100,'2021-05-05')]
java.lang.NoSuchMethodError: org.apache.hadoop.hbase.io.hfile.HFile.createReader(Lorg/apache/hadoop/fs/FileSystem;Lorg/apache/hadoop/fs/Path;Lorg/apache/hadoop/hbase/io/FSDataInputStreamWrapper;JLorg/apache/hadoop/hbase/io/hfile/CacheConfig;Lorg/apache/hadoop/conf/Configuration;)Lorg/apache/hadoop/hbase/io/hfile/HFile$Reader;
        at org.apache.hudi.io.storage.HoodieHFileReader.<init>(HoodieHFileReader.java:80)
        at org.apache.hudi.common.table.log.block.HoodieHFileDataBlock.deserializeRecords(HoodieHFileDataBlock.java:155)
        at org.apache.hudi.common.table.log.block.HoodieDataBlock.createRecordsFromContentBytes(HoodieDataBlock.java:128)
        at org.apache.hudi.common.table.log.block.HoodieDataBlock.getRecords(HoodieDataBlock.java:106)
        at org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner.processDataBlock(AbstractHoodieLogRecordScanner.java:313)
        at org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner.processQueuedBlocksForInstant(AbstractHoodieLogRecordScanner.java:355)
        at org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner.scan(AbstractHoodieLogRecordScanner.java:191)
        at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:99)
        at org.apache.hudi.metadata.HoodieMetadataMergedLogRecordScanner.<init>(HoodieMetadataMergedLogRecordScanner.java:54)
        at org.apache.hudi.metadata.HoodieMetadataMergedLogRecordScanner.<init>(HoodieMetadataMergedLogRecordScanner.java:40)
        at org.apache.hudi.metadata.HoodieMetadataMergedLogRecordScanner$Builder.build(HoodieMetadataMergedLogRecordScanner.java:176)
        at org.apache.hudi.metadata.HoodieBackedTableMetadata.getLogRecordScanner(HoodieBackedTableMetadata.java:276)
        at org.apache.hudi.metadata.HoodieBackedTableMetadata.lambda$openReadersIfNeeded$2(HoodieBackedTableMetadata.java:195)
        at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1660)
        at org.apache.hudi.metadata.HoodieBackedTableMetadata.openReadersIfNeeded(HoodieBackedTableMetadata.java:176)
        at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordByKeyFromMetadata(HoodieBackedTableMetadata.java:124)
        at org.apache.hudi.metadata.BaseTableMetadata.fetchAllPartitionPaths(BaseTableMetadata.java:153)
        at org.apache.hudi.metadata.BaseTableMetadata.getAllPartitionPaths(BaseTableMetadata.java:94)
        at org.apache.hudi.common.fs.FSUtils.getAllPartitionPaths(FSUtils.java:281)
        at org.apache.hudi.sync.common.AbstractSyncHoodieClient.getPartitionsWrittenToSince(AbstractSyncHoodieClient.java:157)
        at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:191)
        at org.apache.hudi.hive.HiveSyncTool.doSync(HiveSyncTool.java:131)
        at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:117)
        at org.apache.hudi.HoodieSparkSqlWriter$.org$apache$hudi$HoodieSparkSqlWriter$$syncHive(HoodieSparkSqlWriter.scala:520)
        at org.apache.hudi.HoodieSparkSqlWriter$$anonfun$metaSync$2.apply(HoodieSparkSqlWriter.scala:576)
        at org.apache.hudi.HoodieSparkSqlWriter$$anonfun$metaSync$2.apply(HoodieSparkSqlWriter.scala:572)
        at scala.collection.mutable.HashSet.foreach(HashSet.scala:78)
        at org.apache.hudi.HoodieSparkSqlWriter$.metaSync(HoodieSparkSqlWriter.scala:572)
        at org.apache.hudi.HoodieSparkSqlWriter$.commitAndPerformPostOperations(HoodieSparkSqlWriter.scala:645)
        at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:272)
        at org.apache.spark.sql.hudi.command.InsertIntoHoodieTableCommand$.run(InsertIntoHoodieTableCommand.scala:103)
        at org.apache.spark.sql.hudi.command.InsertIntoHoodieTableCommand.run(InsertIntoHoodieTableCommand.scala:59)
        at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
        at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
        at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
        at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
        at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
        at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3370)
        at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:80)
        at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:127)
        at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:75)
        at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3369)
        at org.apache.spark.sql.Dataset.<init>(Dataset.scala:194)
        at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79)
        at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
        at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:62)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:371)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:274)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
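
The NoSuchMethodError above points to an HBase version conflict on the Spark classpath (the HFile reader used by the metadata table). Until that is resolved, the workaround the commenter already tried, disabling the metadata table via hoodie.metadata.enable=false, can also be applied per write through the datasource API. The sketch below is only illustrative: paths and field names are placeholders, and it does not fix the underlying classpath conflict.

// Hedged sketch of the workaround mentioned above: disabling the metadata table
// for a single datasource write. Paths and field names are placeholders.
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class DisableMetadataTableWrite {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder()
        .appName("hudi-metadata-disable-sketch")
        .master("local[1]")
        .getOrCreate();

    // Placeholder input; any DataFrame with the expected columns would do.
    Dataset<Row> df = spark.read().json("/tmp/input.json");

    df.write().format("hudi")
        .option("hoodie.table.name", "test_hudi_table")
        .option("hoodie.datasource.write.recordkey.field", "id")   // placeholder key field
        .option("hoodie.datasource.write.precombine.field", "ts")  // placeholder precombine field
        .option("hoodie.metadata.enable", "false")                 // the workaround in question
        .mode(SaveMode.Append)
        .save("/tmp/test_hudi_table");                             // placeholder base path

    spark.stop();
  }
}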
