Skip to content

Conversation

@Mulavar
Copy link
Member

@Mulavar Mulavar commented Apr 5, 2023

Change Logs

Stop writing and reading compaction plans from .aux folder
Describe context and summary for this change. Highlight if any code was copied.

Impact

Reduce I/O when schedule compaction from 2 to 1.
Describe any public API or user-facing feature change or any performance impact.

Risk level (write none, low medium or high below)

None.
If medium or high, explain what verification was done to mitigate the risks.

Documentation Update

None
Describe any necessary documentation update if there is any new feature, config, or user-facing change

  • The config description must be updated if new configs are added or the default value of the configs are changed
  • Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the
    ticket number here and follow the instruction to make
    changes to the website.

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

Copy link
Contributor

@danny0405 danny0405 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, nice catch ~

public void saveToCompactionRequested(HoodieInstant instant, Option<byte[]> content, boolean overwrite) {
ValidationUtils.checkArgument(instant.getAction().equals(HoodieTimeline.COMPACTION_ACTION));
// Write workload to auxiliary folder
createFileInAuxiliaryFolder(instant, content);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @bvaradar for taking over this, generally looks good.

@Mulavar
Copy link
Member Author

Mulavar commented Apr 7, 2023

@bvaradar Hi, could u take the time to review this pr, Is there anything I need to modify?

@bvaradar bvaradar self-assigned this Apr 7, 2023
Copy link
Contributor

@bvaradar bvaradar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mulavar : This requires a table version change also as we need to create .aux files when we downgrade to older version. Can you add relevant UpgradeDowngrade handlers for this.

@Mulavar
Copy link
Member Author

Mulavar commented Apr 11, 2023

@bvaradar Thanks for your review. I think your suggestion is very valuable, and I need to read that piece of code first because I'm not very familiar with the code of table version downgrade.

@Mulavar
Copy link
Member Author

Mulavar commented Apr 11, 2023

@bvaradar hi bvaradar, could u take some time to review the pr again?

@bvaradar
Copy link
Contributor

@Mulavar : We need to create a higher version 6 and write upgrade/downgrade to handle transition from current version (5). You can look at #6248 as example of how to do this.

@Mulavar
Copy link
Member Author

Mulavar commented Apr 12, 2023

@Mulavar : This requires a table version change also as we need to create .aux files when we downgrade to older version. Can you add relevant UpgradeDowngrade handlers for this.

@bvaradar thanks, done.

Copy link
Contributor

@bvaradar bvaradar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Realized there are more places to cleanup. Please take a look.

compactionTimeline.getInstants().stream().forEach(instant -> {
String fileName = instant.getFileName();
try {
metaClient.getFs().delete(new Path(metaClient.getMetaAuxiliaryPath(), fileName));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mulavar : We need to basically look at all caller of metaClient.getMetaAuxiliaryPath() and remove the usage of metaAuxiliaryPath since we should not no longer write to them. Also in HoodieTableMetaClient.initTableAndGetMetaClient, can you remove the creation of .aux folder

Copy link
Member Author

@Mulavar Mulavar Apr 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found that some other feature like bootstrap index also use the .aux folder, refer to org.apache.hudi.common.table.HoodieTableMetaClient#BOOTSTRAP_INDEX_ROOT_FOLDER_PATH and we can not just remove the creation of .aux folder. Maybe we should remove the caller of metaClient.getMetaAuxiliaryPath() related to compaction logic?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, Good point. We should only cleanup compaction plans in .aux folder and remove compaction only callers

@Mulavar Mulavar force-pushed the HUDI-6040 branch 2 times, most recently from 54fbf42 to 6fd073a Compare April 13, 2023 03:00
compactionTimeline.getInstants().stream().forEach(instant -> {
String fileName = instant.getFileName();
try {
metaClient.getFs().delete(new Path(metaClient.getMetaAuxiliaryPath(), fileName));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, Good point. We should only cleanup compaction plans in .aux folder and remove compaction only callers

Copy link
Contributor

@bvaradar bvaradar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mulavar : Thanks for cleaning this. Functionality wise, it looks good to me. Can you add detailed tests for TestFiveSixUpgradeDowngrader. We need to cover cases when (a) aux folder is empty (b) aux folder is non-empty with incomplete compactions and (c) case b along with metadata bootstrap done. In all 3 cases, can you run subsequent compaction and check compaction succeeded followed by a write-schedule_compaction-run_compaction. Similar cases needed for downgrade. This should cover all cases.

@danny0405 : There is some cleanup in Flink related code. Can you kindly take a look and give a thumbs up if it looks good.

@Mulavar
Copy link
Member Author

Mulavar commented Apr 17, 2023

@bvaradar Yeah bvaradar, I agreed with you that we need to add some test cases to verify the upgrader/downgrader, but I'm not sure if there's a need to run subsequent compaction, cause we do not change anything about the compaction plan content, so maybe we should better verify if we can still read the compaction plan from .hoodie folder when after deleting compaction.requested from .aux folder. Since the upgrade verification has been covered by org.apache.hudi.common.table.timeline.TestHoodieActiveTimeline#testLoadingInstantsFromFiles and we could add some tests to verify SixToFiveDowngradeHandler. WDYT?

@Mulavar Mulavar force-pushed the HUDI-6040 branch 2 times, most recently from ddb5108 to 1cd0db6 Compare April 17, 2023 02:44
@Mulavar
Copy link
Member Author

Mulavar commented Apr 18, 2023

@hudi-bot run azure

@bvaradar
Copy link
Contributor

@Mulavar : W.r.t running compaction after upgrade/downgrade, this will help prove that the upgrade will be safer if we can ensure compaction did not fail in any place after upgrade/downgrade.

@Mulavar
Copy link
Member Author

Mulavar commented Apr 20, 2023

@bvaradar Yeah, but I'm not very familiar with the compaction test, could you give me some tips on which example should I look at?

@Mulavar
Copy link
Member Author

Mulavar commented Apr 20, 2023

@hudi-bot run azure

@Mulavar Mulavar force-pushed the HUDI-6040 branch 2 times, most recently from 9149994 to 9659d42 Compare April 23, 2023 16:01
Copy link
Contributor

@nsivabalan nsivabalan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we completely remove all usages of "aux" folder with this patch?
image

I don't see any other usages after this patch.

Also, @bvaradar : can you help me understand what will happen in this case.
if a compaction plan was written to aux but failed to add to ".hoodie" just before upgrade.
after upgrade, if we delete the entire "aux" folder, we should be good right. there won't be any other traces of that failed compaction schedule ?

HoodieTableMetaClient metaClient = table.getMetaClient();
// delete compaction file from .aux
HoodieTimeline compactionTimeline = metaClient.getActiveTimeline().filterPendingCompactionTimeline()
.filter(instant -> instant.getState() == HoodieInstant.State.REQUESTED);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the usages of aux folder and its used only incase of compaction plans and may be flink clustering.
why can't we delete entire aux folder ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"I found that some other features like bootstrap index also use the .aux folder, refer to org.apache.hudi.common.table.HoodieTableMetaClient#BOOTSTRAP_INDEX_ROOT_FOLDER_PATH and we can not just remove the creation of .aux folder. Maybe we should remove the caller of metaClient.getMetaAuxiliaryPath() related to compaction logic?"
This patch only cares about the compaction, and there is some other file/folder in the .aux folder, we should only remove the files about compaction. If we want to confirm if the entire .aux folder could be deleted, maybe better if we should open another issue to discuss it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sg.

HoodieTable table = upgradeDowngradeHelper.getTable(config, context);
HoodieTableMetaClient metaClient = table.getMetaClient();
// sync compaction requested file to .aux
HoodieTimeline compactionTimeline = metaClient.getActiveTimeline().filterPendingCompactionTimeline()
Copy link
Contributor

@nsivabalan nsivabalan Apr 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could return instants either in inflight or in requested. if its inflight, we need to fetch the filename by constructing the requested instant as well

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the code snippet has been followed by the filter which does the thing you said.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what I meant is, for a compaction that has both requested and inflight in ".hoodie", only INFLIGHT will be returned with this call (metaClient.getActiveTimeline().filterPendingCompactionTimeline()) and in subsequent like you are filtering only for requested and hence we might miss.
can you check please.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, nice catch! fixed.

@nsivabalan
Copy link
Contributor

we should also https://issues.apache.org/jira/browse/HUDI-546 along with 6040.

@bvaradar
Copy link
Contributor

Should we completely remove all usages of "aux" folder with this patch? image

I don't see any other usages after this patch.

Also, @bvaradar : can you help me understand what will happen in this case. if a compaction plan was written to aux but failed to add to ".hoodie" just before upgrade. after upgrade, if we delete the entire "aux" folder, we should be good right. there won't be any other traces of that failed compaction schedule ?

@nsivabalan : It is still used for bootstrap. Bootstrap folder is under aux
Regarding the partial write case, it should be fine since we rely on timeline (.hoodie/) to figure out pending compactions. In both upgrade and normal case, this would be as if no compaction is pending.

@danny0405
Copy link
Contributor

It is still used for bootstrap. Bootstrap folder is under aux
Regarding the partial write case, it should be fine since we rely on timeline (.hoodie/) to figure out pending compactions. In both upgrade and normal case, this would be as if no compaction is pending.

Flink actually uses the .aux folder for some configurations like 1. fs view storage conf and 2. checkpoint metadata, so we may need to keep this folder, just drop the file creation for compaction should be fine.

@Mulavar
Copy link
Member Author

Mulavar commented Apr 25, 2023

@nsivabalan Hi nsivabalan, I've fixed the comment, could u check this patch again~

@hudi-bot
Copy link
Collaborator

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

@bvaradar
Copy link
Contributor

@nsivabalan : Are you ok with this change ? If so, can you approve it ?

@danny0405 danny0405 merged commit 8db07a8 into apache:master May 9, 2023
@Mulavar Mulavar deleted the HUDI-6040 branch May 10, 2023 03:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

5 participants