Skip to content

[HUDI-9332] Pluggable Table Format Support with native Integration#13216

Merged
danny0405 merged 11 commits intoapache:masterfrom
bvaradar:pluggable-tf-interface
Jul 3, 2025
Merged

[HUDI-9332] Pluggable Table Format Support with native Integration#13216
danny0405 merged 11 commits intoapache:masterfrom
bvaradar:pluggable-tf-interface

Conversation

@bvaradar
Copy link
Contributor

@bvaradar bvaradar commented Apr 24, 2025

Change Logs

[HUDI-9332] Pluggable Table Format Support with native Integration

  1. Includes base interface for pluggable table format
  2. Includes native hudi integration for pluggable table format
  3. Includes the MDT side integration for puggability.

Impact

Initial Implementation of RFC - 93

Risk level (write none, low medium or high below)

none

Documentation Update

none

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

@vinothchandar vinothchandar self-assigned this Apr 25, 2025
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements pluggable table format support with native integration, enabling the system to use a configurable table format for various timeline operations instead of the fixed timeline layout. Key changes include:

  • Replacing direct usage of timeline layout with the pluggable table format interface across multiple modules.
  • Introducing new methods and fields (e.g. in HoodieTableMetaClient and HoodieTableConfig) to support native table format integration.
  • Updating tests and client code to exercise the new table format behavior.

Reviewed Changes

Copilot reviewed 43 out of 43 changed files in this pull request and generated no comments.

Show a summary per file
File Description
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/BaseHoodieTimeline.java Added helper method to retrieve instants via the file system using the metaClient.
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java Integrated pluggable table format support and introduced a native active timeline accessor.
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableConfig.java Added configuration properties and a method to instantiate a pluggable table format.
hudi-common/src/main/java/org/apache/hudi/common/PluggableTableFormat.java & HudiPluggableTableFormat.java Defined and implemented the pluggable table format interface.
Various client modules and tests Replaced timeline layout factory calls with table format factory calls to support the new integration.
Comments suppressed due to low confidence (2)

hudi-common/src/main/java/org/apache/hudi/common/HoodieTableConfig.java:739

  • The ReflectionUtils.loadClass call currently does not capture or use the loaded class instance, as the method always returns a new HudiPluggableTableFormat. Consider instantiating and returning the loaded class instance if a supplementary table format class is provided.
ReflectionUtils.loadClass(getSupplementaryTableFormatClassName(), new Class[] {TimelineLayoutVersion.class}, new Object[] {layoutVersion});

hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/timeline/versioning/v2/TimelineArchiverV2.java:121

  • [nitpick] Consider simplifying the lambda expression by using flatMap directly on the streams from each action to collect archived instants. This will enhance code readability and reduce unnecessary intermediate collection steps.
List<HoodieInstant> archivedInstants = instantsToArchive.stream()
            .map(action -> Stream.concat(action.getCompletedInstants().stream(), action.getPendingInstants().stream()).collect(Collectors.toList()))
            .flatMap(Collection::stream).collect(Collectors.toList());

@github-actions github-actions bot added the size:L PR with lines of changes in (300, 1000] label Apr 25, 2025
Copy link
Member

@vinothchandar vinothchandar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made a pass. Have a directional comment which I feel may be cross cutting on the PR

@geserdugarov
Copy link
Contributor

geserdugarov commented May 30, 2025

I don't understand costs in terms of performance, that we should pay for this change. There are a lot of work that was already done, and there is still a lot to do down the road. From my point of view, performance is a critical parameter for users when we are talking about Big Data processing.

Is there any benchmark results for this change?

@geserdugarov
Copy link
Contributor

Another question, what will happen, if we wrote Hudi metadata successfully, but failed during writing of Iceberg or Delta metadata? How we will process this case, when Hudi table is fine, but interoperability support is broken at some commit?

@github-actions github-actions bot added size:XL PR with lines of changes > 1000 and removed size:L PR with lines of changes in (300, 1000] labels Jun 6, 2025
Copy link
Member

@vinothchandar vinothchandar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments (some nts: note to self) and some legit to check.

If you can rebase, address as much of these as possible - then I can take another full pass.

Please feel free to resolve code review comments that are addressed.

@balaji-varadarajan-ai balaji-varadarajan-ai force-pushed the pluggable-tf-interface branch 5 times, most recently from a7b6ae7 to a3477dc Compare June 18, 2025 00:37
Copy link
Member

@vinothchandar vinothchandar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Abstractions look good. Left a bunch of comments for me check, shepherd and land. I can make those myself.

table.getMetaClient().reloadActiveTimeline();
}

// If instant is inflight but marked as completed in native format, delete the completed instant from storage.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nts: to review closely

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a bug fix? how could this happen?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When committing -> we commit to native timeline first (A), then plugged-in tableformat (B)

instantToRollback.isInflight() can be true, if A happened, but we failed before B. This is fixing that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ll make this limited to cases where something else is plugged in

Copy link
Contributor

@danny0405 danny0405 Jul 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, let's avoid listing the timeline multiple times for reqular workflow when the external table format is not there.

*/
HoodieInstant transitionClusterInflightToComplete(boolean shouldLock, HoodieInstant inflightInstant, HoodieReplaceCommitMetadata metadata);

HoodieInstant transitionClusterInflightToComplete(boolean shouldLock, HoodieInstant inflightInstant, HoodieReplaceCommitMetadata metadata, TableFormatCompletionAction tableFormatCompletionAction);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should try and avoid the overload?

@vinothchandar vinothchandar force-pushed the pluggable-tf-interface branch from a3477dc to 62e911a Compare June 25, 2025 23:45
metaClient.getStorage().exists(getInstantFileNamePath(fromInstantFileName)),
"File " + getInstantFileNamePath(fromInstantFileName) + " does not exist!");
createCompleteFileInMetaPath(shouldLock, toInstant, metadata);
String completionTime = HoodieInstantTimeGenerator.formatDateBasedOnTimeZone(new Date(createCompleteFileInMetaPath(shouldLock, toInstant, metadata)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nts: need to ensure this is consistent with our approach to time generation

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine. keeping this open.. and tracking.

Will get @the-other-tim-brown 's eyes on this next week.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic is correct but redundant, I will fix this.

Copy link
Contributor

@danny0405 danny0405 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

block on my final review.

@vinothchandar vinothchandar force-pushed the pluggable-tf-interface branch 2 times, most recently from 1b4151b to c146e82 Compare June 28, 2025 00:40
@vinothchandar vinothchandar force-pushed the pluggable-tf-interface branch from c146e82 to 3fc8cbf Compare June 28, 2025 17:18
@vinothchandar
Copy link
Member

@hudi-bot run azure

1 similar comment
@vinothchandar
Copy link
Member

@hudi-bot run azure

@vinothchandar
Copy link
Member

@geserdugarov sorry. missed your comments there.. does the RFC help address some of those

@danny0405 danny0405 dismissed their stale review July 1, 2025 04:14

unblock it from myside first.


private void initMetaClient() {
if (this.metaClient == null) {
this.metaClient = StreamerUtil.createMetaClient(conf);
Copy link
Member

@vinothchandar vinothchandar Jul 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danny0405 general question for you. is StreamerUtil the right place for such reusable helper methods. I would have expected sth like FlinkTableUtils or sth generic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, for most of the the utilities method we put it here. we also have other utilities, you can check all the other utility classes under the same package.

bvaradar and others added 9 commits July 1, 2025 08:13
Address review comments
Avoid local timezone conversion for completion time
Handle rollback and savepoint. Add TableFormatCompletionAction functional interface. Fix bug in saveAsComplete. Use Consumer instead
Add functional tests for TableFormat
Fix check-style
Fix renames
Address comments from Vinoth
Fix tests and improve test coverage
Refactor FsUtils to take metaClient and remove overloaded methods
@vinothchandar vinothchandar force-pushed the pluggable-tf-interface branch from 84c72fd to 59cf106 Compare July 1, 2025 22:37
Copy link
Member

@vinothchandar vinothchandar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed one last change.. Tests pass, then I am good with the PR.

@danny0405 to make a pass and shepherd/land this.

cc @bvaradar @vinishjail97

@hudi-bot
Copy link
Collaborator

hudi-bot commented Jul 2, 2025

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

@geserdugarov
Copy link
Contributor

geserdugarov commented Jul 2, 2025

@geserdugarov sorry. missed your comments there.. does the RFC help address some of those

@vinothchandar, thanks for your reply. Unfortunately, the RFC doesn't address mentioned questions. My main concern is performance, I really hope we won’t lose it after these changes. Since there are no performance tests in the project, we’ll have to verify it manually.
Another issue is the increasing code complexity. It keeps growing, and it’s becoming harder and harder to provide a stable solution. I hope some progress will be done under HUDI-9054.

@danny0405
Copy link
Contributor

@geserdugarov what is the performance concern that it may induces in this PR? We should definitely fix it IMO, can you guide me the lines in this patch?

In general, all the changes in this path should not bring in performance issues, some hook/empty invocations shoud be fine and the cost could be negligible.

I agree we should limit the code complexity, it's the way we are evolving the project, we welcome new features always if it is valuable, then try to make the code in good shape as much as we can, but yeah, sometimes it is hard to give a elegant solution based on the amout of work it could take.

Appreciate your points.

@geserdugarov
Copy link
Contributor

geserdugarov commented Jul 2, 2025

@geserdugarov what is the performance concern that it may induces in this PR? We should definitely fix it IMO, can you guide me the lines in this patch?

In general, all the changes in this path should not bring in performance issues, some hook/empty invocations shoud be fine and the cost could be negligible.

I agree we should limit the code complexity, it's the way we are evolving the project, we welcome new features always if it is valuable, then try to make the code in good shape as much as we can, but yeah, sometimes it is hard to give a elegant solution based on the amout of work it could take.

Appreciate your points.

@danny0405 , from quick checking of this patch, you're right, looks like there should be no performance issues. I only see that we started to pass HoodieTableMetaClient way more, and it's a Serializable class, which contains a lot of not transient objects. If there are any frequently called place in the code, where we will start to use Kryo SerDe of this class, (which I don't actually see) it could potentially decrease performance. But again, it's just a guessing, and it would be great to have some continuous performance monitoring to be sure in all big changes. I also will try to think about it in my spare time.

@danny0405
Copy link
Contributor

danny0405 commented Jul 2, 2025

looks like there should be no performance issues. I only see that we started to pass HoodieTableMetaClient way more

@geserdugarov I also noticed this change, in most of the changes, the context already contains a meta client, and we just need it's light-weight member like "table format", the other logic are almost the same before the patch.

@vinothchandar
Copy link
Member

it's just a guessing, and it would be great to have some continuous performance monitoring to be sure in all big changes.

+1. this is a great idea in general.. for this PR, we had a bunch of conversations around this. and @bvaradar explicitly assured me and ensured that we ll not incur overheads on native path.

@danny0405 I plugged the one issue I saw around extra timeline calls. CI is green. I'll land this in few hours, until I hear otherwise.

Copy link
Contributor

@danny0405 danny0405 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@danny0405 danny0405 merged commit 8a5163b into apache:master Jul 3, 2025
120 of 122 checks passed
alexr17 pushed a commit to alexr17/hudi that referenced this pull request Aug 25, 2025
…pache#13216)

* introduces a new table option 'hoodie.table.format';
* hoodie's native format just triggers nothing on core actions like #commit, #archive, #rollback, #clean, etc.

---------

Co-authored-by: vinoth chandar <vinoth@apache.org>
Co-authored-by: danny0405 <yuzhao.cyz@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size:XL PR with lines of changes > 1000

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants