Support listing of Hudi files through its metadata by umehrot2 · Pull Request #1 · umehrot2/presto

umehrot2 · 2020-12-16T02:17:04Z

This implements changes to support fetching the list of Hudi files through the metadata table maintained inside Hudi. This is part of the feature described in https://cwiki.apache.org/confluence/display/HUDI/RFC+-+15%3A+HUDI+File+Listing+and+Query+Planning+Improvements, where we are adding support for maintaining file listing metadata within Hudi tables for faster listings, especially when working with S3.

This is based on apache/hudi#2326, where I introduced a new createInMemoryFileSystemView method inside FileSystemViewManager that returns a view according to the user's configuration, depending on whether they want to list using the metadata or not.

vinothchandar

One high-level concern. Seems simple enough.

vinothchandar · 2020-12-17T20:56:04Z

presto-hive/src/main/java/com/facebook/presto/hive/HiveClientConfig.java

Maybe just hive.use.hudi.metadata.to.list.files?

I don't think the community will accept this, as the convention they follow is separation by -. In addition, I used the word prefer so that it is along the lines of https://github.com/prestodb/presto/blob/master/presto-hive/src/main/java/com/facebook/presto/hive/HiveClientConfig.java#L1530
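
For illustration, a hyphen-separated property in the style described here might be declared as in the minimal sketch below. The property name, the class name, and the nested `Config` annotation are assumptions made for this example: the annotation is only a local stand-in for the Airlift-style `@Config` annotation used in HiveClientConfig, so that the snippet compiles on its own, and the name finally adopted by the PR may differ.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

final class HudiListingConfigSketch {
    // Local stand-in for the Airlift-style @Config annotation (assumption,
    // declared here only so that the sketch is self-contained).
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Config {
        String value();
    }

    private boolean preferMetadataToListHudiFiles;

    public boolean isPreferMetadataToListHudiFiles() {
        return preferMetadataToListHudiFiles;
    }

    // Hypothetical property name: dot-separated namespace, hyphen-separated words.
    @Config("hive.prefer-metadata-to-list-hudi-files")
    public HudiListingConfigSketch setPreferMetadataToListHudiFiles(boolean value) {
        this.preferMetadataToListHudiFiles = value;
        return this;
    }

    // Reads the property name bound to a boolean setter via reflection.
    static String propertyName(String setterName) {
        try {
            return HudiListingConfigSketch.class
                    .getDeclaredMethod(setterName, boolean.class)
                    .getAnnotation(Config.class)
                    .value();
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such setter: " + setterName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(propertyName("setPreferMetadataToListHudiFiles"));
    }
}
```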

vinothchandar · 2020-12-17T20:56:41Z

presto-hive/src/main/java/com/facebook/presto/hive/HiveClientConfig.java

hive.verify.hudi.metadata

Same as above. This is not the naming convention the Presto community follows.

vinothchandar · 2020-12-17T20:57:10Z

presto-hive/src/main/java/com/facebook/presto/hive/HiveSessionProperties.java

bit more guidance on when this should be turned on and off?

added more details in the description

vinothchandar · 2020-12-17T21:00:21Z

presto-hive/src/main/java/com/facebook/presto/hive/HudiDirectoryLister.java

So this will do one additional RPC per base file? If so, wouldn't this actually be bad for listing performance?

@umehrot2 what I meant was if we can simply do something like

LocatedFileStatus hoodieFileStatus = new LocatedFileStatus(fileStatus, new BlockLocation[] {new BlockLocation(name, host, 0, file.getLen())});

If so, would this work for HDFS?

@umehrot2 do you have an easy test environment for HDFS?

Uber and Facebook both use hdfs a lot. So this will come up in the review upstream for sure.

Yeah, I will make that change and do some testing on an EMR cluster. EMR clusters all have HDFS, so I should be able to test.

umehrot2 · 2021-01-23T01:28:04Z

Scale Testing

The patch has been tested on a 1.5 TB Hudi table. In general, the patch offers much better performance for two reasons:

1. Use of the directory lister, instead of the path filter approach
2. Use of metadata-based listing

Based on my investigation, #1 appears to be the cause of the major improvement over the previous path filter approach. With the path filter approach, for example, HoodieTableMetaClient is instantiated for every partition. With the directory lister approach it is instantiated only once per thread used to load splits concurrently. By default this number is 3-4.
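
The difference between building an expensive object per partition and building it once per split-loading thread can be sketched as below. This is a simplified model, not the code in this PR: `ExpensiveClient` is a hypothetical stand-in for something like HoodieTableMetaClient, and the thread pool and `ThreadLocal` cache are assumptions chosen only to make the instantiation counts observable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

final class PerThreadListingSketch {
    // Lists every partition on a fixed pool and returns how many clients were built.
    static int countInstances(int partitions, int threads, boolean oncePerThread) {
        ExpensiveClient.INSTANCES.set(0);
        // Each worker thread lazily builds and then reuses its own client.
        ThreadLocal<ExpensiveClient> cached = ThreadLocal.withInitial(ExpensiveClient::new);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> results = new ArrayList<>();
            for (int i = 0; i < partitions; i++) {
                String partition = "partition=" + i;
                results.add(pool.submit(() -> {
                    ExpensiveClient client = oncePerThread ? cached.get() : new ExpensiveClient();
                    return client.list(partition);
                }));
            }
            for (Future<String> result : results) {
                result.get(); // wait for every listing to finish
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return ExpensiveClient.INSTANCES.get();
    }

    public static void main(String[] args) {
        System.out.println("per partition: " + countInstances(1000, 4, false));
        System.out.println("per thread:    " + countInstances(1000, 4, true));
    }
}

// Hypothetical stand-in for an object that is costly to construct.
final class ExpensiveClient {
    static final AtomicInteger INSTANCES = new AtomicInteger();

    ExpensiveClient() {
        INSTANCES.incrementAndGet();
    }

    String list(String partition) {
        return partition + "/files";
    }
}
```

In this model the per-partition variant builds one client per partition, while the per-thread variant builds at most as many clients as there are worker threads, regardless of the partition count.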

Even from #1 to #2 I see a further performance improvement, especially with S3, as listing one partition is slightly faster using metadata than using the file system. With metadata I see mean times of 20-50 ms to load a partition's file listing, whereas with file system listing I see anywhere from 60-200 ms. With a large number of partitions, as in my case, this time adds up to a decent improvement. For a count(*) query over the table I see a 10-15 second improvement in query performance with the metadata table enabled.
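
As a rough back-of-the-envelope model of how per-partition savings add up, the sketch below divides the total per-partition difference evenly across the split-loading threads. The partition count of 1,000 is purely hypothetical (the actual count for the test table is not stated here), and the even-spread assumption ignores every other cost in the query, so the output is only an order-of-magnitude illustration and not a measurement.

```java
final class ListingSavingsEstimate {
    // Wall-clock milliseconds saved, assuming listings are spread evenly across threads.
    static long savedMillis(long partitions, long fileSystemMillis, long metadataMillis, int threads) {
        return partitions * (fileSystemMillis - metadataMillis) / threads;
    }

    public static void main(String[] args) {
        long partitions = 1_000; // hypothetical; not taken from the test above
        int threads = 4;         // default split-loading concurrency mentioned in this PR

        // Smallest gap between the reported ranges: 60 ms (file system) vs 50 ms (metadata).
        System.out.println("low estimate:  " + savedMillis(partitions, 60, 50, threads) + " ms");
        // Largest gap between the reported ranges: 200 ms (file system) vs 20 ms (metadata).
        System.out.println("high estimate: " + savedMillis(partitions, 200, 20, threads) + " ms");
    }
}
```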

I did the same testing with HDFS too.

umehrot2 · 2021-01-23T01:34:30Z

Concerns regarding caching

In addition, the concern we had about caching the file system view is not really a concern in the case of Presto. Unlike in Spark, where the input format is used directly to do the file listing and is created for each partition, here we directly use the file system view to do the listing for us. The directory lister is instantiated only once per thread that loads splits. By default this concurrency is 4, so no matter the number of partitions, the directory lister is instantiated only 4 times, and so is the FileSystemView.

vinothchandar · 2021-01-23T20:00:44Z

The directory lister is instantiated only once per thread that loads the split.

Yes, it should be fine. Presto has a clean place for us to init once and do the fetching.

Thanks for the tests @umehrot2! cc @n3nash can we get someone from the Presto team at Uber involved here?

vinothchandar reviewed Dec 17, 2020

umehrot2 force-pushed the uditme_hudi_rfc15 branch 2 times, most recently from 5800c3e to 3959ddd, on January 23, 2021 01:07

Support listing of Hudi files through its metadata

193cb46

umehrot2 force-pushed the uditme_hudi_rfc15 branch from 3959ddd to 193cb46 on January 23, 2021 01:15
