[HUDI-3826] Make truncate partition use delete_partition operation by XuQianJin-Stars · Pull Request #5272 · apache/hudi

XuQianJin-Stars · 2022-04-09T13:32:24Z

Tips

Thank you very much for contributing to Apache Hudi.
Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.

What is the purpose of the pull request

(For example: This pull request adds quick-start document.)

Brief change log

(for example:)

Modify AnnotationLocation checkstyle rule in checkstyle.xml

Verify this pull request

(Please pick either of the following options)

This pull request is a trivial rework / code cleanup without any test coverage.

(or)

This pull request is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

Added integration tests for end-to-end.
Added HoodieClientWriteTest to verify the change.
Manually verified the change by running a job locally.

Committer checklist

Has a corresponding JIRA in PR title & commit
Commit message is descriptive of the change
CI is green
Necessary doc changes done or have another open PR
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

xushiyan · 2022-04-12T18:03:21Z

...source/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/HoodieSqlCommonUtils.scala

+    val partitionsToTruncate = normalizedSpec.map { spec =>
+      hoodieCatalogTable.partitionFields.map { partitionColumn =>
+        if (enableEncodeUrl) {
+          partitionColumn + "=" + "\"" + spec(partitionColumn) + "\""


this encode case did not call PartitionPathEncodeUtils.escapePathName?

Yeah, not sure if he is confusing this with Hive-style partitioning. Don't we need to consider both, i.e. URL encoding and Hive-style partitioning?

snippet from KeyGenUtils

    if (encodePartitionPath) {
      partitionPath = PartitionPathEncodeUtils.escapePathName(partitionPath);
    }
    if (hiveStylePartitioning) {
      partitionPath = partitionPathField + "=" + partitionPath;
    }
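The ordering in the KeyGenUtils snippet (escape the value first, then add the Hive-style field= prefix) can be sketched in isolation. This is a minimal sketch rather than Hudi's code: PartitionPathSketch, escape, and partitionPath are hypothetical names, and java.net.URLEncoder is only a stand-in for PartitionPathEncodeUtils.escapePathName, whose exact escaping rules may differ.

```scala
import java.net.URLEncoder
import java.nio.charset.StandardCharsets

object PartitionPathSketch {
  // Hypothetical stand-in for PartitionPathEncodeUtils.escapePathName.
  // Assumption: the real Hudi utility may escape a different set of characters.
  def escape(value: String): String =
    URLEncoder.encode(value, StandardCharsets.UTF_8.name())

  // Applies URL encoding first, then the Hive-style "field=" prefix,
  // mirroring the order of the two if-blocks in the quoted snippet.
  def partitionPath(field: String,
                    value: String,
                    encodePartitionPath: Boolean,
                    hiveStylePartitioning: Boolean): String = {
    val encoded = if (encodePartitionPath) escape(value) else value
    if (hiveStylePartitioning) s"$field=$encoded" else encoded
  }
}
```

Since the two flags are independent, a partition path to delete would need to be built with the same combination of flags the table was written with, which is the point of the "consider both" question.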

Re: this encode case did not call PartitionPathEncodeUtils.escapePathName?

When a URL-encoded character appears in the partition value, the partition cannot be deleted if the value is wrapped in single quotation marks, so double quotation marks are used after URL decoding.

Re: hive style partitioning

Hive-style partitioning is handled as well; this code mainly constructs the WHERE condition of the delete SQL.

I see that with the latest commit, all changes in HoodieSqlCommonUtils are reverted. So where exactly do we process URL decoding?

guess getPartitionPathToDrop() does that.

xushiyan · 2022-04-12T18:05:30Z

...spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestTruncateTable.scala

        df.write.format("hudi")
          .option(HoodieWriteConfig.TBL_NAME.key, tableName)
-          .option(TABLE_TYPE.key, MOR_TABLE_TYPE_OPT_VAL)
+          .option(TABLE_TYPE.key, COW_TABLE_TYPE_OPT_VAL)


why this test case change?

I wonder how the tests are succeeding, because with the latest master, delete partition is lazy: until the cleaner gets a chance to clean up, the deleted partition may still show up when we call getAllPartitions(). Can you check the assertions in the tests? Or are we asserting that the record count in the deleted partitions is 0?

Re: why this test case change?

This case ran into #5282, which was not covered by unit tests before.
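The lazy-deletion concern raised in this thread can be illustrated with a toy model. Everything below is hypothetical (ToyTable, FileGroup, and the method names are not Hudi APIs); it only shows why a test that asserts on the partition listing can disagree with a test that asserts on the record count until a cleaner has run.

```scala
object LazyDeletePartitionSketch {
  // Toy model only: none of these names are Hudi APIs.
  final case class FileGroup(partition: String, records: Int)

  final class ToyTable(initial: Seq[FileGroup]) {
    private var onStorage: Seq[FileGroup] = initial
    private var replaced: Set[FileGroup] = Set.empty

    // Marks every file group in the partition as replaced; files stay on storage.
    def deletePartition(partition: String): Unit =
      replaced = replaced ++ onStorage.filter(_.partition == partition)

    // Physically removes replaced file groups, as a cleaner would do later.
    def clean(): Unit = {
      onStorage = onStorage.filterNot(replaced.contains)
      replaced = Set.empty
    }

    // Storage-level listing: still sees partitions whose files were not cleaned yet.
    def listPartitionsOnStorage: Set[String] = onStorage.map(_.partition).toSet

    // Query-level view: skips file groups that were marked as replaced.
    def recordCount(partition: String): Int =
      onStorage
        .filter(fg => fg.partition == partition && !replaced.contains(fg))
        .map(_.records)
        .sum
  }
}
```

In this model, asserting that the record count of the deleted partition is 0 holds immediately after the delete, while asserting that the partition is absent from the listing only holds after clean() has run.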

...source/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/ProvidesHoodieConfig.scala

xushiyan · 2022-04-12T18:35:37Z

@XuQianJin-Stars in org.apache.spark.sql.hudi.command.AlterHoodieTableDropPartitionCommand we should also fix the if (purge) case to use the delete API. Can you include that change in this PR? Also, please make the PR title more descriptive. Thanks.
cc @alexeykudinkin @nsivabalan

nsivabalan · 2022-04-12T18:52:16Z

Yes, let's try to fix AlterHoodieTableDropPartitionCommand in the same patch as well.

vinothchandar

For DROP TABLE and TRUNCATE TABLE (especially the latter), we probably want to do a simple fs.delete of the entire thing.

For partition-level drop/truncate, we can use a DELETE_PARTITION write operation.
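For the partition-level case, a DELETE_PARTITION write is typically requested through datasource write options. The sketch below only builds such an options map. DeletePartitionOptionsSketch and deletePartitionOptions are hypothetical names, and the option keys are an assumption based on Hudi's Spark datasource write options; they should be checked against the Hudi version in use.

```scala
object DeletePartitionOptionsSketch {
  // Builds write options that ask Hudi to run a delete_partition operation.
  // Assumption: these keys match Hudi's Spark datasource write options.
  def deletePartitionOptions(tableName: String,
                             partitions: Seq[String]): Map[String, String] =
    Map(
      "hoodie.table.name" -> tableName,
      "hoodie.datasource.write.operation" -> "delete_partition",
      // Partition paths to delete, passed as a comma-separated list.
      "hoodie.datasource.write.partitions.to.delete" -> partitions.mkString(",")
    )
}
```

Under those assumptions, the map would be handed to a Hudi write in append mode, for example via df.write.format("hudi").options(...), with the partition paths spelled exactly as they exist in the table (including any Hive-style prefix or URL encoding).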

...ark-common/src/main/scala/org/apache/spark/sql/hudi/command/TruncateHoodieTableCommand.scala

...source/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestAlterTableDropPartition.scala

alexeykudinkin · 2022-04-13T22:04:40Z

@XuQianJin-Stars please update PR description to follow the format

nsivabalan

LGTM

.../src/main/scala/org/apache/spark/sql/hudi/command/AlterHoodieTableDropPartitionCommand.scala

hudi-bot · 2022-04-14T03:52:22Z

CI report:

de50fc1 UNKNOWN
9d25e9a UNKNOWN
f3e859e UNKNOWN
8bd6171 Azure: SUCCESS

Bot commands

@hudi-bot supports the following commands:

@hudi-bot run azure re-run the last Azure build

…5272) Make truncate partition and drop partition behave as drop partition with purge, which deletes all records via Hudi DELETE_PARTITION; the partition is removed from the metastore

XuQianJin-Stars added 2 commits April 9, 2022 21:31

[HUDI-3826] Commands deleting partitions do so incorrectly

76a2f42

[HUDI-3826] Commands deleting partitions do so incorrectly

1cb81a9

XuQianJin-Stars changed the title ~~[WIP][HUDI-3826] Commands deleting partitions do so incorrectly~~ [HUDI-3826] Commands deleting partitions do so incorrectly Apr 10, 2022

xushiyan self-assigned this Apr 12, 2022

xushiyan added the priority:blocker label (Production down; release blocker) Apr 12, 2022

xushiyan reviewed Apr 12, 2022

XuQianJin-Stars added 2 commits April 13, 2022 09:22

fix comments

0d6e6af

fix comments

477b349

vinothchandar requested changes Apr 13, 2022

XuQianJin-Stars added 6 commits April 13, 2022 16:52

fix comments

48161e5

fix comments

de50fc1

fix comments

9d25e9a

fix comments

e42f8de

fix comments

f3e859e

fix comments

bc65d33

xushiyan reviewed Apr 13, 2022

fix comments

08f5f51

xushiyan reviewed Apr 13, 2022

nsivabalan approved these changes Apr 13, 2022

alexeykudinkin reviewed Apr 13, 2022

fix comments

8bd6171

xushiyan approved these changes Apr 14, 2022

xushiyan changed the title ~~[HUDI-3826] Commands deleting partitions do so incorrectly~~ [HUDI-3826] Make truncate partition use delete_partition operation Apr 14, 2022

xushiyan merged commit 44b3630 into apache:master Apr 14, 2022

6 participants
