
Conversation

@scxwhite
Contributor

Brief change log

  • Improve compaction

I found that when the compaction plan is generated, the delta log files under each file group are arranged in the natural (ascending) order of instant time. In the majority of cases we can assume that the latest data is in the latest delta log file, so we instead sort the files from largest to smallest instant time. This largely avoids rewriting data during compaction and therefore shortens compaction time.
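For illustration, the descending sort could look roughly like the sketch below. This is a minimal standalone example; DeltaLogFile and its instantTime field are hypothetical stand-ins, not the actual Hudi classes.

import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class DeltaLogFile {
  // Hypothetical stand-in: the commit instant that produced this log file.
  final String instantTime;
  DeltaLogFile(String instantTime) { this.instantTime = instantTime; }
}

class CompactionPlanSketch {
  // Sort delta log files from the newest instant to the oldest, so the
  // records most likely to win the merge are read first.
  static List<DeltaLogFile> newestFirst(List<DeltaLogFile> logFiles) {
    return logFiles.stream()
        .sorted(Comparator.comparing((DeltaLogFile f) -> f.instantTime).reversed())
        .collect(Collectors.toList());
  }
}

Since Hudi instant times are fixed-width timestamp strings, the lexicographic comparison above matches chronological order.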

In addition, when reading the delta log files, we compare the data in the ExternalSpillableMap with the delta log data. If the old record is selected, there is no need to rewrite the entry in the ExternalSpillableMap; rewriting the data wastes a lot of resources when the map has spilled to disk.
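The idea can be sketched as follows (a minimal example with hypothetical names; the real code uses Hudi's ExternalSpillableMap and HoodieRecordPayload#preCombine):

import java.util.HashMap;
import java.util.Map;

class MergeSketch {
  // Stand-in for the ExternalSpillableMap, keyed by record key.
  final Map<String, String> records = new HashMap<>();

  // Merge an incoming delta-log value against what is already in the map.
  void merge(String key, String newValue) {
    String oldValue = records.get(key);
    String combinedValue = preCombine(oldValue, newValue);
    // Write back only when the winner differs from the stored value; a
    // redundant put would trigger costly re-serialization once the map
    // has spilled to disk.
    if (!combinedValue.equals(oldValue)) {
      records.put(key, combinedValue);
    }
  }

  // Toy placeholder for HoodieRecordPayload#preCombine semantics: keep the
  // record that is already present.
  private String preCombine(String oldValue, String newValue) {
    return oldValue == null ? newValue : oldValue;
  }
}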

This pull request is already covered by existing tests, such as (please describe tests).

Committer checklist

  • [*] Has a corresponding JIRA in PR title & commit

  • [*] Commit message is descriptive of the change

  • [ ] CI is green

  • [ ] Necessary doc changes done or have another open PR

  • [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

@yihua yihua self-assigned this Dec 20, 2021

@vinothchandar (Member) left a comment

Have a clarification on the first fix. Could you add some UTs for this?

.getLatestFileSlices(partitionPath)
.filter(slice -> !fgIdsInPendingCompactionAndClustering.contains(slice.getFileGroupId()))
.map(s -> {
// We can think that the latest data is in the latest delta log file, so we sort it from large to small by instant time
Member

I think you are assuming that later writes in the log always overwrite the earlier ones? That is not always true.

Contributor Author

You're right, but in most cases the new data is in the latest delta log, so we sort from largest to smallest instant time. The program then avoids updating the data in the ExternalSpillableMap, which saves compaction time. What do you think?

Contributor Author

> Have a clarification on the first fix. Could you add some UTs for this?

OK, I'll try to add some UTs.

Contributor Author

> I think you are assuming that later writes in the log always overwrite the earlier ones? That is not always true.

In the compaction plan generation phase, I only changed the order in which delta log files are read. We have used this approach in our internal production environment for a month, and no data anomalies have occurred. However, I'm not sure how I should test this. Can you give me some suggestions?
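For illustration, one possible check is that merging the same records in ascending and descending order produces identical results whenever a strict ordering field decides the winner (a generic, self-contained sketch with hypothetical names, not the Hudi test harness):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OrderInvarianceSketch {
  // Merge key/orderingVal pairs in the given order; the larger orderingVal
  // wins for each key, mimicking preCombine with a strict ordering field.
  static Map<String, Long> merge(List<Map.Entry<String, Long>> updates) {
    Map<String, Long> out = new HashMap<>();
    for (Map.Entry<String, Long> u : updates) {
      out.merge(u.getKey(), u.getValue(), Math::max);
    }
    return out;
  }

  public static void main(String[] args) {
    List<Map.Entry<String, Long>> asc = Arrays.asList(
        Map.entry("k1", 1L), Map.entry("k1", 2L), Map.entry("k2", 7L));
    List<Map.Entry<String, Long>> desc = new ArrayList<>(asc);
    Collections.reverse(desc);
    // The merge result must not depend on the read order of the log files.
    if (!merge(asc).equals(merge(desc))) {
      throw new AssertionError("merge result depends on read order");
    }
    System.out.println("order-invariant: " + merge(asc));
  }
}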

Contributor Author

In addition, I changed the reading order of the delta logs to avoid data rewriting as much as possible. HoodieRecordPayload#preCombine will still execute and select the correct data.

HoodieOperation operation = choosePrev ? oldRecord.getOperation() : hoodieRecord.getOperation();
records.put(key, new HoodieRecord<>(new HoodieKey(key, hoodieRecord.getPartitionPath()), combinedValue, operation));
// If combinedValue is oldValue, no need to re-put oldRecord
if (!combinedValue.equals(oldValue)) {
Member

This feels like a valid optimization.

@nsivabalan
Contributor

@yihua: Can you follow up on the review, please?

loukey-lj and others added 25 commits February 18, 2022 13:31
* [HUDI-3389] fix ColumnarArrayData ClassCastException issue

* [HUDI-3389] remove MapColumnVector.java, RowColumnVector.java, and add test case for array<int> field
…pache#4837)

* [HUDI-3446] Supports batch Reader in BootstrapOperator#loadRecords
* Fixing restore with metadata enabled

* Fixing test failures
…n operations are present using a config. (apache#4212)


Co-authored-by: sivabalan <[email protected]>
…ot be reused (apache#4861)

* Before the patch, the Flink streaming reader cached the meta client and thus the archived timeline; when fetching instant details from the reused timeline, an exception was thrown
* Add a method in HoodieTableMetaClient to return a fresh archived timeline each time
zhangyue19921010 and others added 22 commits February 25, 2022 16:46
ParquetColumnarRowSplitReader#batchSize is 2048, so changing MINI_BATCH_SIZE to 2048 will reduce the memory cache.
*  Use iterator to avoid eager materialization and be memory friendly
@scxwhite
Contributor Author

scxwhite commented Mar 2, 2022

I found that after modifying the reading order of the delta log, HoodieRecordPayload#preCombine may have problems during compaction (when the orderingVal of two records is the same, the most recently committed data will not be selected). I will submit a separate PR later to fix this issue.
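To see why ties are order-sensitive, consider a payload whose preCombine keeps `this` when ordering values are equal; reversing the read order then reverses which record survives (a sketch of typical payload semantics, not the exact Hudi implementation):

class PayloadSketch {
  final long orderingVal;
  final String data;

  PayloadSketch(long orderingVal, String data) {
    this.orderingVal = orderingVal;
    this.data = data;
  }

  // Mirrors the shape of HoodieRecordPayload#preCombine: choose between two
  // payloads for the same key. On a tie it keeps `this`, so whichever record
  // happens to be the incoming one wins.
  PayloadSketch preCombine(PayloadSketch other) {
    return other.orderingVal > this.orderingVal ? other : this;
  }

  public static void main(String[] args) {
    PayloadSketch older = new PayloadSketch(5L, "committed earlier");
    PayloadSketch newer = new PayloadSketch(5L, "committed later");
    // Ascending read order: the later commit is the incoming record and
    // wins the tie.
    System.out.println(newer.preCombine(older).data); // committed later
    // Descending read order: the earlier commit is the incoming record and
    // now wins the tie, dropping the latest data.
    System.out.println(older.preCombine(newer).data); // committed earlier
  }
}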

So this PR just optimizes the code.

@hudi-bot
Collaborator

hudi-bot commented Mar 2, 2022

CI report:

Bot commands
@hudi-bot supports the following commands:
  • @hudi-bot run azure: re-run the last Azure build

@scxwhite
Contributor Author

scxwhite commented Mar 2, 2022

Very sorry, my fault; there was a problem with the merge. I will split it into two PRs and resubmit.

