Support PathFilter when fetching splits for Hudi tables #13818
arhimondr merged 1 commit into prestodb:master
@arhimondr please take a look when you can. Thanks much!
shixuan-fan left a comment
Quickly glanced through. Flush some comments. Looks good overall.
presto-hive/src/main/java/com/facebook/presto/hive/BackgroundHiveSplitLoader.java
presto-hive/src/main/java/com/facebook/presto/hive/DefaultPathFilterProvider.java
presto-hive/src/test/java/com/facebook/presto/hive/HiveBenchmarkQueryRunner.java
Force-pushed from 89d58da to cb2631d.
@bhasudha: Assume this is ready for review, I will remove the
@wenleix I kept it as WIP since I am still testing it. I have some related questions. When I bring up a local Presto server, I believe presto-hudi tries to register two connectors, hudi and hive (since it extends HiveHadoop2Plugin). I see an error saying the hive connector is already registered. I am wondering whether this has to do with the way I implemented the presto-hudi connector. Should I change HudiPlugin to extend HivePlugin, as HiveHadoop2Plugin does, instead of extending HiveHadoop2Plugin? Do you have any recommendations?
presto-hive/src/main/java/com/facebook/presto/hive/DefaultPathFilterProvider.java
@bhasudha In our internal extension of the Hive connector, we didn't extend
presto-hive-hadoop2/src/main/java/com/facebook/presto/hive/HiveHadoop2Plugin.java
presto-hive/src/main/java/com/facebook/presto/hive/DefaultPathFilterProvider.java
presto-hive/src/main/java/com/facebook/presto/hive/PathFilterProvider.java
presto-hive/src/main/java/com/facebook/presto/hive/BackgroundHiveSplitLoader.java
Force-pushed from cb2631d to eded1a3.
presto-hive/src/main/java/com/facebook/presto/hive/BackgroundHiveSplitLoader.java
[UPDATE] After further discussion, we decided to move this implementation into presto-hive instead of a separate connector for now. presto-hive can eventually support HoodieParquetInputFormat as a first-class citizen. Depending on future requirements, moving Hudi to a separate connector can be considered. I will update this PR and send out the changes soon. @arhimondr @wenleix @highker
Force-pushed from eded1a3 to 6799609.
@arhimondr I updated the PR based on our last discussion. Please review when possible. I noticed that the Travis build failed in the presto-cassandra module. It looks like a transient connection issue. Would you be able to re-kick the build? I can't see an option for restarting the build in Travis. Thanks much!
presto-hive/src/main/java/com/facebook/presto/hive/BackgroundHiveSplitLoader.java
presto-hive/src/main/java/com/facebook/presto/hive/util/HiveFileIterator.java
presto-hive/src/test/java/com/facebook/presto/hive/util/TestHiveFileIterator.java
Force-pushed from 6799609 to 090b5f4.
- Introduce PathFilter support in DirectoryLister interface and HiveFileIterator
- Unit test HiveFileIterator using PathFilter
- Plug specific PathFilter implementation for HoodieParquetInputFormat
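
For readers skimming the thread, here is a minimal sketch of the idea in the first two bullets: a file iterator that only yields paths accepted by a filter. It is not the code from this PR. `SimplePathFilter`, `FilteringFileIterator`, and `PathFilterSketch` are hypothetical names, plain strings stand in for Hadoop's `Path`, and the dot-file filter is an arbitrary example rather than the Hudi-specific filter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for org.apache.hadoop.fs.PathFilter, which exposes a
// single accept(Path) method. Plain strings replace Hadoop's Path here.
interface SimplePathFilter
{
    boolean accept(String path);
}

// Hypothetical iterator that skips listed files rejected by the filter.
// Assumes the underlying listing contains no null entries, because null is
// used internally to mark "no more elements".
final class FilteringFileIterator
        implements Iterator<String>
{
    private final Iterator<String> delegate;
    private final SimplePathFilter filter;
    private String next;

    FilteringFileIterator(Iterator<String> delegate, SimplePathFilter filter)
    {
        this.delegate = delegate;
        this.filter = filter;
        advance();
    }

    // Move to the next accepted path, or set next to null when exhausted.
    private void advance()
    {
        next = null;
        while (delegate.hasNext()) {
            String candidate = delegate.next();
            if (filter.accept(candidate)) {
                next = candidate;
                return;
            }
        }
    }

    @Override
    public boolean hasNext()
    {
        return next != null;
    }

    @Override
    public String next()
    {
        if (next == null) {
            throw new NoSuchElementException();
        }
        String result = next;
        advance();
        return result;
    }
}

final class PathFilterSketch
{
    private PathFilterSketch() {}

    // Collect every path from the listing that the filter accepts, in order.
    static List<String> listFiltered(List<String> listing, SimplePathFilter filter)
    {
        List<String> accepted = new ArrayList<>();
        new FilteringFileIterator(listing.iterator(), filter).forEachRemaining(accepted::add);
        return accepted;
    }

    public static void main(String[] args)
    {
        List<String> listing = Arrays.asList(
                "/warehouse/t/part-0001.parquet",
                "/warehouse/t/.hidden_marker",
                "/warehouse/t/part-0002.parquet");
        // Example filter (an assumption for illustration): drop dot-files.
        SimplePathFilter noDotFiles =
                path -> !path.substring(path.lastIndexOf('/') + 1).startsWith(".");
        System.out.println(listFiltered(listing, noDotFiles));
    }
}
```

Applying the filter inside the iterator means callers that generate splits never see rejected files, so a format-specific filter can be swapped in without changing the listing code.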
Force-pushed from 090b5f4 to f2e5fa8.
Summary:
- Introduce PathFilter support in DirectoryLister interface and HiveFileIterator
- Unit test HiveFileIterator using PathFilter
- Plug specific PathFilter implementation for HoodieParquetInputFormat
Combines following PRs into one:
prestodb#13818
prestodb#14085
prestodb#14088
Reviewers: bhasudha
Differential Revision: https://code.uberinternal.com/D4168579

This PR corresponds to issue #13511.