Conversation
codope
left a comment
There was a problem hiding this comment.
@willzgw Thanks for adding the MOR snapshot query support. This would be very helpful for the community. Please edit the PR description accordingly and point to issue#14432.
Also, if you could add integ tests for MOR snapshot query that would be great. You can extend the existing setup. Note that you will have to add an upsert operation (folowwing after insert) in TpchHudiTablesInitializer#load.
plugin/trino-hudi/src/main/java/io/trino/plugin/hudi/HudiSplit.java
Outdated
Show resolved
Hide resolved
plugin/trino-hudi/src/main/java/io/trino/plugin/hudi/HudiPageSourceProvider.java
Outdated
Show resolved
Hide resolved
|
I am trying to add some unit test cases for this PR. |
plugin/trino-hudi/src/main/java/io/trino/plugin/hudi/HudiTableType.java
Outdated
Show resolved
Hide resolved
plugin/trino-hudi/src/main/java/io/trino/plugin/hudi/partition/HiveHudiPartitionInfo.java
Outdated
Show resolved
Hide resolved
plugin/trino-hudi/src/main/java/io/trino/plugin/hudi/HudiFile.java
Outdated
Show resolved
Hide resolved
plugin/trino-hudi/src/main/java/io/trino/plugin/hudi/HudiModule.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
HudiSplit already carry configuration. See if we can pass from the split in page source provider.
There was a problem hiding this comment.
Just for my understanding, what's the intention of setting compression codecs here?
plugin/trino-hudi/src/main/java/io/trino/plugin/hudi/HudiSplitSource.java
Outdated
Show resolved
Hide resolved
plugin/trino-hudi/src/main/java/io/trino/plugin/hudi/query/HudiDirectoryLister.java
Outdated
Show resolved
Hide resolved
|
Could you rebase on upstream to resolve conflicts and add product tests? |
|
@willzgw I have a PR for async split processing #16034 that improves the Hudi connector performance. On top of it, I have MoR support in my fork - https://github.com/codope/trino/tree/mor-snapshot |
Hi @codope Let's continue with this PR, I will rebase this branch with #16034. |
28b7b67 to
92ec871
Compare
@electrum The current version of |
|
This pr is very helpful to me, but when we use 433 version and I tried to cherry-pick this pr found that there are many conflicts after this pr 【#18241 】,especially the RecordCursor is not used after this pr. |
|
👋 @willzgw - this PR has become inactive. We hope you are still interested in working on it. Please let us know, and we can try to get reviewers to help with that. We're working on closing out old and inactive PRs, so if you're too busy or this has too many merge conflicts to be worth picking back up, we'll be making another pass to close it out in a few weeks. |
Hi @mosabua |
|
Great .. feel free to reach out to us if you need help. @codope can help on the Hudi side.. |
|
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Guowei Zhang.
|
|
Looks like github still thinks you have a lot of conflicts to resolve @willzgw |
Yeah, still in progress. |
|
I am not sure if it is worth refactoring this PR. Currently @yihua and @codope are working towards a new Hudi release that removes the Hadoop dependencies. Once that is out they will work on Hudi connector updates that will probably change the codebase quite a bit. So maybe its worth waiting to see if MOR snapshot support will come out of these PRs or just be easier to implement from scratch then. |
@mosabua |
|
This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua |
|
Closing this pull request, as it has been stale for six weeks. Feel free to re-open at any time. |
@yihua @codope Did Hudi 0.15.0 remove all Hadoop dependencies? I see https://issues.apache.org/jira/browse/HUDI-6497 contains 0.15.0 as a fix version. |
|
Hey @pravin1406, I am not actively contributing. For MOR query support, I would recommend that you can check Onehouse's fork of Trino -- https://github.com/onehouseinc/trino. It does have MOR support and also integrates with some of the indexes. |
|
Hi @pravin1406 , as @codope mentioned, this Trino fork has the latest support on Hudi MOR table https://github.com/onehouseinc/trino . We're going to upstream the changes to Trino OSS. Stay tuned. |
|
Hi @pravin1406 ,
The Trino fork (https://github.com/onehouseinc/trino) works on all table versions, including the Hudi tables created by Hudi 0.14.1.
It is based on top of OSS Trino 472. |
Description
Non-technical explanation
Release notes
( ) Support Hudi MOR Tables
( ) Support MOR snapshot querying for Hudi tables