Skip to content

Support listing of Hudi files through its metadata#1

Open
umehrot2 wants to merge 1 commit intomasterfrom
uditme_hudi_rfc15
Open

Support listing of Hudi files through its metadata#1
umehrot2 wants to merge 1 commit intomasterfrom
uditme_hudi_rfc15

Conversation

@umehrot2
Copy link
Owner

This implements changes to support fetching the list of Hudi files through the metadata table maintained inside Hudi. This is part of feature https://cwiki.apache.org/confluence/display/HUDI/RFC+-+15%3A+HUDI+File+Listing+and+Query+Planning+Improvements where we are adding support for maintaining file listing metadata within the Hudi tables for faster listings when working with S3 specially.

This is based on apache/hudi#2326 where I introduced a new createInMemoryFileSystemView method inside FileSystemViewManager which returns a view according to users configuration, depending on whether they want to list using the metadata or not.

Copy link

@vinothchandar vinothchandar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One high level concern. Seems simple enough .

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may be just, hive.use.hudi.metadata.to.list.files

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think community will accept this, as the convention they follow is sepration by -. In addition, I used the work prefer so it along the lines of https://github.com/prestodb/presto/blob/master/presto-hive/src/main/java/com/facebook/presto/hive/HiveClientConfig.java#L1530

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hive.verify.hudi.metadata

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above. This is not the naming convention presto community follows.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bit more guidance on when this should be turned on and off?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added more details in the description

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this will do one additional RPC per base file? if so, would n't this be bad actually for listing performance?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@umehrot2 what I meant was if we can simply do something like

LocatedFileStatus hoodieFileStatus = new LocatedFileStatus(fileStatus, 
   new BlockLocation[] {new BlockLocation(name, host, 0, file.getLen())});

if so, would this work for hdfs.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@umehrot2 do you have an easy test environment for HDFS?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uber and Facebook both use hdfs a lot. So this will come up in the review upstream for sure.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I will make that change, and do some testing on EMR cluster. EMR clusters all have HDFS so I should be able to test.

@umehrot2 umehrot2 force-pushed the uditme_hudi_rfc15 branch 2 times, most recently from 5800c3e to 3959ddd Compare January 23, 2021 01:07
@umehrot2
Copy link
Owner Author

umehrot2 commented Jan 23, 2021

Scale Testing

The patch has been tested on a 1.5 TB Hudi table. The patch in general is offering much better performance for two reason:

  1. Use of directory lister, instead of the path filter approach
  2. Use of metadata based listing

Based on my investigate #1 appears to be the cause of major improvements over the previous path filter approach. With the directory lister approach for example HoodieTableMetaClient is instantiated for every partition. With the directory lister approach it is instantiated only once per the number of thread used to load splits concurrently. By default this number is 3-4.

Even from #1 to #2 I see further performance improvement with S3 specially as the time taken to list one partition is slightly faster using metadata as compared to using file system. With metadata I see mean times of 20-50ms to load a partitions file listing, whereas with file system listing I see anywhere from 60-200ms time to list. When we have large number of partitions like in my case, this time can get added up to provide decent increase. For a count(*) query over the table I see 10-15 seconds improvement in query performance with metadata table enabled.

I did the same testing with HDFS too.

@umehrot2
Copy link
Owner Author

umehrot2 commented Jan 23, 2021

Concerns regarding caching

In addition, the concern we had about caching the file system view in case of presto is not really a concern. Unlike in spark, where input format is directly used to do the file listing, here we directly use the file system view to do the listing for us. The directory lister is instantiated only once per thread that loads the split. By default this concurrency is 4 and hence no matter the number of partitions the directory lister is instantiated only 4 times and so is the FileSystemView. This is unlike spark, that uses the input format directly for listing and creates it for each partition.

@vinothchandar
Copy link

The directory lister is instantiated only once per thread that loads the split.

yes. it should be fine. Presto has a clean place for us to init once and do the fetching.

Thanks for the tests @umehrot2 ! cc @n3nash can we get someone from the presto team at uber involved here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants