@swuferhong
Contributor

…batch compaction

Tips

What is the purpose of the pull request

This PR adds an explicit partition compaction strategy for Flink batch compaction. PR #3046 added Flink batch compaction; building on that, this PR lets users specify which partitions to compact.
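
For readers without the diff at hand, the idea can be sketched roughly as below. This is a self-contained illustration, not the code from this PR: the class `ExplicitPartitionFilterSketch`, the `Op` type, the helper names, and the comma-separated option format are all assumptions made for the example, and nothing here calls the Hudi or Flink APIs. The only behavior taken from this conversation is that, when no partition is given, all partitions are compacted.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class ExplicitPartitionFilterSketch {

  /** Hypothetical stand-in for a pending compaction operation; not a Hudi class. */
  static final class Op {
    final String partitionPath;
    final long logFileBytes;

    Op(String partitionPath, long logFileBytes) {
      this.partitionPath = partitionPath;
      this.logFileBytes = logFileBytes;
    }
  }

  /**
   * Parses the option value into a set of partition paths.
   * Assumption: partitions are comma-separated; null or blank means "none specified".
   */
  static Set<String> parsePartitions(String optionValue) {
    if (optionValue == null || optionValue.trim().isEmpty()) {
      return Collections.emptySet();
    }
    return Arrays.stream(optionValue.split(","))
        .map(String::trim)
        .filter(s -> !s.isEmpty())
        .collect(Collectors.toCollection(LinkedHashSet::new));
  }

  /** Keeps only operations in the requested partitions; an empty set keeps everything. */
  static List<Op> filter(List<Op> ops, Set<String> partitions) {
    if (partitions.isEmpty()) {
      return ops; // default: compact all partitions
    }
    return ops.stream()
        .filter(op -> partitions.contains(op.partitionPath))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<Op> ops = Arrays.asList(new Op("2021/06/17", 1024L), new Op("2021/06/18", 2048L));
    List<Op> selected = filter(ops, parsePartitions("2021/06/18"));
    System.out.println(selected.size() + " operation(s) selected");
  }
}
```

In a real strategy the filtered list would then go through whatever ordering or size limits the base strategy applies; that part is left out of the sketch.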

Brief change log

(for example:)

  • Modify AnnotationLocation checkstyle rule in checkstyle.xml

Verify this pull request

(Please pick either of the following options)

This pull request is a trivial rework / code cleanup without any test coverage.

(or)

This pull request is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end.
  • Added HoodieClientWriteTest to verify the change.
  • Manually verified the change by running a job locally.

Committer checklist

  • Has a corresponding JIRA in PR title & commit

  • Commit message is descriptive of the change

  • CI is green

  • Necessary doc changes done or have another open PR

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

@codecov-commenter
codecov-commenter commented Jun 17, 2021

Codecov Report

Merging #3100 (ac71607) into master (e99a6b0) will decrease coverage by 0.01%.
The diff coverage is 26.92%.

@@             Coverage Diff              @@
##             master    #3100      +/-   ##
============================================
- Coverage     46.26%   46.25%   -0.02%     
  Complexity     5362     5362              
============================================
  Files           920      921       +1     
  Lines         39824    39850      +26     
  Branches       4289     4293       +4     
============================================
+ Hits          18425    18431       +6     
- Misses        19518    19535      +17     
- Partials       1881     1884       +3     
Flag                   Coverage Δ
hudicli                39.95% <ø> (ø)
hudiclient             30.43% <0.00%> (-0.03%) ⬇️
hudicommon             47.58% <ø> (ø)
hudiflink              61.44% <46.66%> (-0.06%) ⬇️
hudihadoopmr           51.29% <ø> (ø)
hudisparkdatasource    67.06% <ø> (ø)
hudisync               54.05% <ø> (ø)
huditimelineservice    64.36% <ø> (ø)
hudiutilities          58.23% <ø> (-0.04%) ⬇️

Impacted Files                                           Coverage Δ
.../strategy/ExplicitPartitionCompactionStrategy.java    0.00% <0.00%> (ø)
...pache/hudi/sink/compact/FlinkCompactionConfig.java    0.00% <0.00%> (ø)
...c/main/java/org/apache/hudi/util/StreamerUtil.java    60.71% <25.00%> (-2.75%) ⬇️
...va/org/apache/hudi/configuration/FlinkOptions.java    96.42% <100.00%> (+0.04%) ⬆️
...apache/hudi/utilities/deltastreamer/DeltaSync.java    70.84% <0.00%> (-0.34%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update e99a6b0...ac71607.

Contributor

Compaction strategy with explicit partitions, based on the {@link LogFileSizeBasedCompactionStrategy}.

Contributor

Explicit partitions to compact; by default, all partitions are compacted.

Contributor

Explicit partitions to compact; by default, all partitions are compacted.

Contributor

/**
 * Set the compaction strategy.
 */

@swuferhong swuferhong force-pushed the HUDI-2034 branch 2 times, most recently from 58e32ac to c3215f7 Compare June 18, 2021 02:26
Contributor

Use FlinkOptions.COMPACTION_PARTITION.key()

Contributor

Modify the logic to be synced with HoodieFlinkCompactor.

@swuferhong swuferhong force-pushed the HUDI-2034 branch 2 times, most recently from 9d7f1d1 to 8a9fed1 Compare June 18, 2021 09:56
@swuferhong swuferhong closed this Jun 21, 2021
@swuferhong swuferhong reopened this Jun 21, 2021
@swuferhong swuferhong closed this Jun 21, 2021
@swuferhong swuferhong reopened this Jun 21, 2021
@swuferhong swuferhong closed this Jun 23, 2021
@swuferhong swuferhong reopened this Jun 23, 2021
@hudi-bot
Collaborator

hudi-bot commented Nov 5, 2021

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

@xushiyan
Member

xushiyan commented Jul 5, 2022

@swuferhong can we resume the work pls? or are we good to close this? cc @danny0405

@danny0405
Contributor

Close it because the author is not active.

@danny0405 danny0405 closed this Jul 5, 2022
6 participants