@zhuanshenbsj1
Contributor

@zhuanshenbsj1 zhuanshenbsj1 commented Apr 20, 2023

Change Logs

Adjust the cleaning operation in the Spark offline compaction/clustering jobs: when ASYNC_CLEAN is true, the job starts asynchronous cleaning in pre-write and waits for the async clean to complete; otherwise, it runs a synchronous clean after the clustering/compaction.
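
The control flow described above can be sketched roughly as follows. This is an illustrative, self-contained model only, not the code in this PR: the class `OfflineCleanSketch`, the method `run`, and the event names are hypothetical, and a `CompletableFuture` stands in for whatever async cleaner service the job actually uses.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical model of the offline job; none of these names are Hudi APIs.
class OfflineCleanSketch {

    // Runs one simulated offline compaction/clustering job and returns the
    // recorded steps. In async mode the relative order of the two steps is
    // not deterministic, because cleaning runs on another thread.
    static List<String> run(boolean asyncClean) {
        List<String> events = Collections.synchronizedList(new ArrayList<>());

        // Pre-write: start cleaning in the background only when async clean is on.
        CompletableFuture<Void> cleaner = asyncClean
                ? CompletableFuture.runAsync(() -> events.add("clean"))
                : null;

        // The table service itself (compaction or clustering).
        events.add("compact-or-cluster");

        if (asyncClean) {
            // Block until the background clean has finished before the job exits.
            cleaner.join();
        } else {
            // Synchronous clean, run only after the table service has completed.
            events.add("clean");
        }
        return new ArrayList<>(events);
    }

    public static void main(String[] args) {
        System.out.println("sync:  " + run(false));
        System.out.println("async: " + run(true));
    }
}
```

In both modes the job does not return until a clean has completed; the modes differ only in whether the clean may overlap with the compaction/clustering work.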

Impact

none

Risk level (write none, low medium or high below)

none

Documentation Update

Describe any necessary documentation update if there is any new feature, config, or user-facing change

  • The config description must be updated if new configs are added or the default value of the configs are changed
  • Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the
    ticket number here and follow the instruction to make
    changes to the website.

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

@zhuanshenbsj1 zhuanshenbsj1 changed the title from "Spark offline compaction/Clustering Job will do clean like Flink job" to "[HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job" Apr 20, 2023
@danny0405 danny0405 self-assigned this Apr 20, 2023
@danny0405 danny0405 added the area:table-service (Table services) and engine:spark (Spark integration) labels Apr 20, 2023
@zhuanshenbsj1 zhuanshenbsj1 force-pushed the addCleanInSparkOfflineJob branch 3 times, most recently from 425b764 to 5effbf3 Compare April 24, 2023 04:05
@zhuanshenbsj1 zhuanshenbsj1 force-pushed the addCleanInSparkOfflineJob branch 6 times, most recently from 442430f to 25c1856 Compare April 24, 2023 13:42
@zhuanshenbsj1 zhuanshenbsj1 force-pushed the addCleanInSparkOfflineJob branch from 7cf8111 to 4dad96b Compare April 26, 2023 07:09
@zhuanshenbsj1 zhuanshenbsj1 force-pushed the addCleanInSparkOfflineJob branch from 0eeee7d to 8539f13 Compare April 26, 2023 14:40
@zhuanshenbsj1 zhuanshenbsj1 force-pushed the addCleanInSparkOfflineJob branch from 8539f13 to 4fc5fb7 Compare May 5, 2023 04:03
@zhuanshenbsj1 zhuanshenbsj1 force-pushed the addCleanInSparkOfflineJob branch from e69e498 to fbd3411 Compare May 5, 2023 14:01
@zhuanshenbsj1 zhuanshenbsj1 force-pushed the addCleanInSparkOfflineJob branch from fbd3411 to 74c7784 Compare May 6, 2023 07:01
@danny0405
Contributor

6106.patch.zip
Thanks for the contribution, I have reviewed and created a patch~

@zhuanshenbsj1
Contributor Author

> 6106.patch.zip
> Thanks for the contribution, I have reviewed and created a patch~

Done.

@zhuanshenbsj1
Contributor Author

@hudi-bot run azure

Contributor

@danny0405 danny0405 left a comment

+1, thanks for the contribution, @zhuanshenbsj1 ~

@danny0405 danny0405 closed this May 12, 2023
@danny0405 danny0405 reopened this May 12, 2023
@danny0405
Contributor

@hudi-bot run azure

@danny0405
Contributor

@zhuanshenbsj1 Hi, can you rebase onto the latest master and re-trigger the Azure CI tests?

@zhuanshenbsj1 zhuanshenbsj1 force-pushed the addCleanInSparkOfflineJob branch from 3ff0e0a to 091fe00 Compare May 13, 2023 13:00
@zhuanshenbsj1
Contributor Author

@hudi-bot run azure

@hudi-bot
Collaborator

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

@danny0405 danny0405 merged commit 81b4aca into apache:master May 14, 2023
@zhuanshenbsj1 zhuanshenbsj1 deleted the addCleanInSparkOfflineJob branch January 8, 2024 07:42
Labels

area:table-service (Table services), engine:spark (Spark integration)

3 participants