[HUDI-3007] Fix issues in HoodieRepairTool #4564

Merged
yihua merged 8 commits into apache:master from yihua:HUDI-3007-repair-utility on Jan 12, 2022
yihua commented Jan 11, 2022

What is the purpose of the pull request

This PR fixes several issues in HoodieRepairTool and adds unit and functional tests to verify that the repair utility works as intended.

Brief change log

  • Fixes the file listing for the backup path in UNDO mode so that the correct list of files is restored.
  • Uses a HoodieEngineContext instance to generalize parallelized operations instead of hardcoding JavaSparkContext and Spark logic.
  • Fixes serialization issues in deleteFiles().
  • Adds return status for various operations to check whether an operation is successful.
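The last three items can be illustrated with a minimal sketch. It is not the code from this PR: ParallelEngine and LocalEngine are hypothetical stand-ins for an engine abstraction (they are not Hudi's HoodieEngineContext API), and the deleteFiles() shown here only mirrors the general idea of running deletions through an engine-agnostic interface, passing plain strings as work items, and returning a status the caller can check.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

class RepairStatusSketch {

    /** Hypothetical engine abstraction; an assumption for illustration, not Hudi's API. */
    interface ParallelEngine {
        <I, O> List<O> map(List<I> input, Function<I, O> fn);
    }

    /** Local implementation backed by Java parallel streams. */
    static final class LocalEngine implements ParallelEngine {
        @Override
        public <I, O> List<O> map(List<I> input, Function<I, O> fn) {
            return input.parallelStream().map(fn).collect(Collectors.toList());
        }
    }

    /**
     * Deletes the given files and reports whether every deletion succeeded.
     * Paths are passed as plain strings so each work item is trivially serializable.
     * In this sketch, a file that does not exist counts as a failed deletion.
     */
    static boolean deleteFiles(ParallelEngine engine, List<String> filePaths) {
        List<Boolean> results = engine.map(filePaths, pathStr -> {
            try {
                return Files.deleteIfExists(Paths.get(pathStr));
            } catch (IOException e) {
                // Report the failure through the return status instead of throwing.
                return false;
            }
        });
        return results.stream().allMatch(Boolean::booleanValue);
    }
}
```

A caller would check the returned boolean and stop or report an error when it is false, rather than assuming the operation went through.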

Verify this pull request

This change adds new tests:

  • Adds unit and functional tests in TestRepairUtil, TestHoodieRepairTool, and TestFSUtils.
  • Manually verifies the change by running a Spark job with HoodieRepairTool locally.

Committer checklist

  • Has a corresponding JIRA in PR title & commit

  • Commit message is descriptive of the change

  • CI is green

  • Necessary doc changes done or have another open PR

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

nsivabalan left a comment

A few minor comments. I will merge this in and pull it into 0.10.1. You can address any feedback in a follow-up patch.

Comment thread hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieRepairTool.java Outdated
nsivabalan commented

@yihua: it looks like there are some test failures in the repair tool tests. Can you check them out, please?

hudi-bot commented

CI report:

Bot commands: @hudi-bot supports the following commands:
  • @hudi-bot run azure: re-run the last Azure build

yihua commented Jan 12, 2022

> @yihua: it looks like there are some test failures in the repair tool tests. Can you check them out, please?

It's all good now. I'm going to merge the PR.

yihua merged commit 397795c into apache:master on Jan 12, 2022
vinishjail97 mentioned this pull request on Jan 24, 2022
vingov pushed a commit to vingov/hudi that referenced this pull request Jan 26, 2022
liusenhua pushed a commit to liusenhua/hudi that referenced this pull request Mar 1, 2022
vingov pushed a commit to vingov/hudi that referenced this pull request Apr 3, 2022