-
Notifications
You must be signed in to change notification settings - Fork 2.5k
[HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups #4352
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups #4352
Conversation
235981a to
9ee3e62
Compare
hudi-common/src/main/java/org/apache/hudi/common/util/ParquetUtils.java
Outdated
Show resolved
Hide resolved
9ee3e62 to
83f8b8c
Compare
nsivabalan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done with 1 pass over source code.
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java
Outdated
Show resolved
Hide resolved
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bloom/HoodieBloomIndex.java
Outdated
Show resolved
Hide resolved
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bloom/HoodieBloomIndex.java
Outdated
Show resolved
Hide resolved
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bloom/HoodieBloomIndex.java
Outdated
Show resolved
Hide resolved
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bloom/HoodieBloomIndex.java
Outdated
Show resolved
Hide resolved
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataPayload.java
Outdated
Show resolved
Hide resolved
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataPayload.java
Outdated
Show resolved
Hide resolved
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataPayload.java
Show resolved
Hide resolved
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataPayload.java
Show resolved
Hide resolved
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataPayload.java
Outdated
Show resolved
Hide resolved
vinothchandar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to review the full stats/bloom filter write/read path still
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bloom/HoodieBloomIndex.java
Outdated
Show resolved
Hide resolved
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bloom/HoodieBloomIndex.java
Outdated
Show resolved
Hide resolved
| final String keyField = hoodieTable.getMetaClient().getTableConfig().getRecordKeyFieldProp(); | ||
| return context.flatMap(partitions, new SerializableFunction<String, Stream<Pair<String, BloomIndexFileInfo>>>() { | ||
| @Override | ||
| public Stream<Pair<String, BloomIndexFileInfo>> apply(String partitionName) throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we discussed reading all of this from the driver correct? like fetch the entire list of stats for a key column alone?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With millions of files, loading all from the driver might not be a good idea. Will explore this more as part of the new DAG PR.
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bloom/HoodieBloomIndex.java
Outdated
Show resolved
Hide resolved
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bloom/HoodieBloomIndex.java
Outdated
Show resolved
Hide resolved
...link-client/src/main/java/org/apache/hudi/metadata/FlinkHoodieBackedTableMetadataWriter.java
Show resolved
Hide resolved
| .sinceVersion("0.10.0") | ||
| .withDocumentation("Enable full scanning of log files while reading log records. If disabled, hudi does look up of only interested entries."); | ||
|
|
||
| public static final ConfigProperty<Boolean> ENABLE_META_INDEX_BLOOM_FILTER = ConfigProperty |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are assuming its always the key field that the bloom filter points to. We need to also take another config where user can specify list of columns/fields to track bloom filters for
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Taking up this in https://issues.apache.org/jira/browse/HUDI-3327
4aea835 to
46d4587
Compare
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndexUtils.java
Show resolved
Hide resolved
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bloom/HoodieBloomIndex.java
Outdated
Show resolved
Hide resolved
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bloom/HoodieBloomIndex.java
Outdated
Show resolved
Hide resolved
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieKeyLookupHandle.java
Outdated
Show resolved
Hide resolved
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieKeyLookupHandle.java
Outdated
Show resolved
Hide resolved
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataPayload.java
Show resolved
Hide resolved
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java
Show resolved
Hide resolved
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java
Show resolved
Hide resolved
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java
Show resolved
Hide resolved
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java
Show resolved
Hide resolved
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bloom/HoodieBloomIndex.java
Outdated
Show resolved
Hide resolved
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bloom/HoodieBloomIndex.java
Outdated
Show resolved
Hide resolved
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bloom/HoodieBloomIndex.java
Show resolved
Hide resolved
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndexUtils.java
Show resolved
Hide resolved
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieKeyLookupHandle.java
Outdated
Show resolved
Hide resolved
| final String partitionPath = entry._2.getPartitionPath(); | ||
| final String fileId = entry._1; | ||
| if (!fileIDBaseFileMap.containsKey(fileId)) { | ||
| Option<HoodieBaseFile> baseFile = hoodieTable.getBaseFileOnlyView().getLatestBaseFile(partitionPath, fileId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible at any point that there are no base files in the table? What happens then? Like for example, the MOR table due to kakfka-connect creates only log files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible for iterator to have a fileId multiple times in the same task?
The caller passes in list of tuples of file id/name to key. So, when many keys fall in the same file, we can see the same fileid repeating in the input list. Here we are constructing the base file for the file id only once.
Is it possible at any point that there are no base files in the table? What happens then?
The bloom filter and column range info are built from base files footer details. When there are no base files, we don't have index for them.
Is it possible at any point that there are no base files in the table? What happens then?
Generally, even the MOR user table, starts off with a base file. With no indexes or index lookup miss, upserts will choose the insert code path and there by forcing the base file creation. But, I need to explore more on the kafka-connect case creating log files only. This is an open item.
...-client/src/main/java/org/apache/hudi/index/bloom/HoodieBloomMetaIndexLazyCheckFunction.java
Outdated
Show resolved
Hide resolved
hudi-common/src/main/java/org/apache/hudi/common/table/log/AbstractHoodieLogRecordReader.java
Show resolved
Hide resolved
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieFileReader.java
Outdated
Show resolved
Hide resolved
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java
Outdated
Show resolved
Hide resolved
vinothchandar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made one pass. But lets address the perf issues and simplify all these different code paths
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java
Outdated
Show resolved
Hide resolved
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java
Outdated
Show resolved
Hide resolved
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bloom/HoodieBloomIndex.java
Outdated
Show resolved
Hide resolved
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bloom/HoodieBloomIndex.java
Show resolved
Hide resolved
|
|
||
| Collections.sort(columnStatKeys); | ||
| Map<Pair<String, String>, HoodieColumnStats> fileToColumnStatMap = hoodieTable | ||
| .getMetadataTable().getColumnStats(columnStatKeys, keyField); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Pair<String, String> is a less than ideal API for partitionPath and file or something. lets go with getColumnStats(Option<String> partitionPath, String fileName)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getColumnStats() and getBloomFilters() are built to work for a list of partition-file paired keys. Same partition and multiple files can be requested. So, i need a combo of partition and file names here. I can revisit this after the performance/dag work.
...park-client/src/main/java/org/apache/hudi/metadata/SparkHoodieBackedTableMetadataWriter.java
Outdated
Show resolved
Hide resolved
...park-client/src/main/java/org/apache/hudi/metadata/SparkHoodieBackedTableMetadataWriter.java
Outdated
Show resolved
Hide resolved
...-client/src/main/java/org/apache/hudi/index/bloom/HoodieBloomMetaIndexLazyCheckFunction.java
Outdated
Show resolved
Hide resolved
...-client/src/main/java/org/apache/hudi/index/bloom/HoodieBloomMetaIndexLazyCheckFunction.java
Outdated
Show resolved
Hide resolved
hudi-common/src/main/java/org/apache/hudi/metadata/MetadataPartitionType.java
Outdated
Show resolved
Hide resolved
5d5925e to
b3d8632
Compare
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java
Outdated
Show resolved
Hide resolved
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java
Outdated
Show resolved
Hide resolved
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndexUtils.java
Outdated
Show resolved
Hide resolved
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bloom/HoodieBloomIndex.java
Outdated
Show resolved
Hide resolved
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bloom/HoodieBloomIndex.java
Outdated
Show resolved
Hide resolved
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieFileReader.java
Outdated
Show resolved
Hide resolved
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieHFileReader.java
Outdated
Show resolved
Hide resolved
hudi-common/src/main/java/org/apache/hudi/metadata/BaseTableMetadata.java
Outdated
Show resolved
Hide resolved
hudi-common/src/main/java/org/apache/hudi/metadata/BaseTableMetadata.java
Outdated
Show resolved
Hide resolved
hudi-common/src/main/java/org/apache/hudi/metadata/BaseTableMetadata.java
Show resolved
Hide resolved
b3d8632 to
69f3357
Compare
83e8c78 to
176c05c
Compare
|
@nsivabalan @codope CI test failure in TestHoodieDeltaStreamerWithMultiWriter is fixed by #4704 |
nsivabalan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. just one nit.
...di-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java
Show resolved
Hide resolved
...di-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java
Outdated
Show resolved
Hide resolved
...di-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java
Outdated
Show resolved
Hide resolved
...-client/src/main/java/org/apache/hudi/index/bloom/HoodieMetadataBloomIndexCheckFunction.java
Outdated
Show resolved
Hide resolved
hudi-common/src/main/java/org/apache/hudi/metadata/MetadataPartitionType.java
Outdated
Show resolved
Hide resolved
2541391 to
d78e61c
Compare
|
@codope @hudi-bot run azure |
…peed up index lookups - Today, base files have bloom filter at their footers and index lookups have to load the base file to perform any bloom lookups. Though we have interval tree based file purging, we still end up in significant amount of base file read for the bloom filter for the end index lookups for the keys. This index lookup operation can be made more performant by having all the bloom filters in a new metadata partition and doing pointed lookups based on keys.
…peed up index lookups - Adding indexing support for clean, restore and rollback operations. Each of these operations will now be converted to index records for bloom filter and column stats additionally.
…peed up index lookups - Making hoodie key consistent for both column stats and bloom index by including fileId instead of fileName, in both read and write paths. - Performance optimization for looking up records in the metadata table. - Avoiding multi column sorting needed for HoodieBloomMetaIndexBatchCheckFunction
…peed up index lookups - HoodieBloomMetaIndexBatchCheckFunction cleanup to remove unused classes - Base file checking before reading the file footer for bloom or column stats
…peed up index lookups - Updating the bloom index and column stats index to have full file name included in the key instead of just file id. - Minor test fixes.
…peed up index lookups - Fixed flink commit method to handle metadata table all partition update records - TestBloomIndex fixes
…peed up index lookups - SparkHoodieBloomIndexHelper code simplification for various config modes - Signature change for getBloomFilters() and getColumnStats(). Callers can just pass in interested partition and file names, the index key is then constructed internally based on the passed in parameters. - KeyLookupHandle and KeyLookupResults code refactoring - Metadata schema changes - removed the reserved field
…peed up index lookups - Removing HoodieColumnStatsMetadata and using HoodieColumnRangeMetadata instead. Fixed the users of the the removed class.
…peed up index lookups - Extending meta index test to cover deletes, compactions, clean and restore table operations. Also, fixed the getBloomFilters() and getColumnStats() to account for deleted entries.
…peed up index lookups - Addressing review comments - java doc for new classes, keys sorting for lookup, index methods renaming.
…peed up index lookups - Consolidated the bloom filter checking for keys in to one HoodieMetadataBloomIndexCheckFunction instead of a spearate batch and lazy mode. Removed all the configs around it. - Made the metadata table partition file group count configurable. - Fixed the HoodieKeyLookupHandle to have auto closable file reader when checking bloom filter and range keys. - Config property renames. Test fixes.
…peed up index lookups - Enabling column stats indexing for all columns by default - Handling column stat generation errors and test update
…peed up index lookups - Metadata table partition file group count taken from the slices when the table is bootstrapped. - Prep records for the commit refactored to the base class - HoodieFileReader interface changes for filtering keys - Multi column and data types support for colums stats index
…peed up index lookups - rebase to latest master and merge fixes for the build and test failures
…peed up index lookups - Extending the metadata column stats type payload schema to include more statistics about the column ranges to help query integration.
…peed up index lookups - Addressing review comments
d78e61c to
e489045
Compare
codope
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Please resolve the conflicts and this should be ready to land.
What is the purpose of the pull request
have to load the base file to perform any bloom lookups. Though we have
interval tree based file purging, we still end up in significant amount
of base file read for the bloom filter for the end index lookups for the
keys. This index lookup operation can be made more performant by having
all the bloom filters in a new metadata partition and doing pointed
lookups based on keys.
Brief change log
RFC-37 #3989 Implementation
Write path will now additionally persist bloom filters from all the newly added
base files from the inflight commit to the metadata table bloom index, and
column range metadata to the metadata table column stats index.
Read path during tagLocation() lookupIndex() will look at the metadata table
indices instead of base files. Final verification of the incoming keys will continue
to happen with the respective base files.
Tests
Total Index lookup time taken:
Table: COW
Operation: Upsert
Spark executors, cores: 25, 4
Verify this pull request
(Please pick either of the following options)
This pull request is a trivial rework / code cleanup without any test coverage.
(or)
This pull request is already covered by existing tests, such as (please describe tests).
(or)
This change added tests and can be verified as follows:
(example:)
Committer checklist
Has a corresponding JIRA in PR title & commit
Commit message is descriptive of the change
CI is green
Necessary doc changes done or have another open PR
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.