@bhat-vinay
Contributor

@bhat-vinay bhat-vinay commented Mar 1, 2024

There are several issues in how functional indexes are managed.

  1. HoodieSparkFunctionalIndexClient::create(...) failed to register a functional index if a (different) functional index had already been created. The check now looks up the index name in the FunctionalIndexMetadata.
  2. HoodieTableConfig TABLE_METADATA_PARTITIONS and TABLE_METADATA_PARTITIONS_INFLIGHT should actually store the metadata partition path. For most indexes the path is contained in the MetadataPartitionType, but that does not hold for functional indexes: MetadataPartitionType.FUNCTIONAL_INDEX only stores the prefix (i.e. func_index_). The actual partition path needs to be extracted from the index name.
  3. Because of 2, most of the helper methods that operate on metadata partitions should take a partition path (and not a partition type).
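
For illustration only, the name-based check from item 1 could look roughly like the sketch below. `IndexRegistrySketch` and its methods are hypothetical stand-ins, not the Hudi classes touched by this PR; the map merely plays the role of the definitions tracked in FunctionalIndexMetadata.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for the registry of functional index definitions. */
final class IndexRegistrySketch {
  // index name -> indexed expression (assumed shape, for illustration only)
  private final Map<String, String> definitions = new LinkedHashMap<>();

  /**
   * Registers an index definition. Only a duplicate index name is rejected;
   * the presence of other functional indexes does not block registration.
   */
  boolean register(String indexName, String expression) {
    if (definitions.containsKey(indexName)) {
      return false;
    }
    definitions.put(indexName, expression);
    return true;
  }

  /** Number of registered index definitions. */
  int size() {
    return definitions.size();
  }

  public static void main(String[] args) {
    IndexRegistrySketch registry = new IndexRegistrySketch();
    System.out.println(registry.register("idx_ts", "from_unixtime(ts)"));
    System.out.println(registry.register("idx_city", "lower(city)"));
    System.out.println(registry.register("idx_ts", "from_unixtime(ts)"));
  }
}
```

The point of the sketch is that the lookup is keyed by index name, so a second index with a different name registers normally and only a repeated name is refused.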

This PR addresses the problems listed above. This fix is required to add SQL support for secondary indexes (the configs for which will be based on functional-index-config).
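
As a rough sketch of items 2 and 3, the snippet below derives a partition path from an index name and keys a helper on that path. Everything except the func_index_ prefix is an assumption: `FunctionalIndexPathSketch`, `partitionPath` and `isPartitionAvailable` are hypothetical names, and the rule "prefix followed by the index name" is a guess at how the path is extracted, not a confirmed detail of the change.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helpers showing why partition path, not partition type, is the key. */
final class FunctionalIndexPathSketch {
  // Prefix named in the description above.
  static final String FUNCTIONAL_INDEX_PREFIX = "func_index_";

  /** Assumed derivation: the path is the prefix plus the index name (idempotent if already prefixed). */
  static String partitionPath(String indexName) {
    return indexName.startsWith(FUNCTIONAL_INDEX_PREFIX)
        ? indexName
        : FUNCTIONAL_INDEX_PREFIX + indexName;
  }

  /** Helper keyed by partition path, so two functional indexes remain distinct entries. */
  static boolean isPartitionAvailable(Set<String> completedPartitionPaths, String partitionPath) {
    return completedPartitionPaths.contains(partitionPath);
  }

  public static void main(String[] args) {
    // Stands in for the paths stored under TABLE_METADATA_PARTITIONS.
    Set<String> completed = new LinkedHashSet<>();
    completed.add(partitionPath("idx_ts"));

    System.out.println(isPartitionAvailable(completed, partitionPath("idx_ts")));
    System.out.println(isPartitionAvailable(completed, partitionPath("idx_city")));
  }
}
```

If the helper took only the partition type, both lookups would resolve to the same prefix and the second index would wrongly appear to be available already.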

Note that there are still issues with some functional-index operations (like drop index / delete partition) because of the issues listed here. Those will be fixed in a subsequent PR.

Change Logs

Same as the description above.

Impact

None

Risk level (write none, low, medium or high below)

Low

Documentation Update

NA

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

@bhat-vinay bhat-vinay changed the title from "[HUDI-7458] Allow multiple functional index to be created" to "[HUDI-7458] Fix bug with functional index to be creation" Mar 1, 2024
@github-actions github-actions bot added the size:M label (PR with lines of changes in (100, 300]) Mar 1, 2024
@bhat-vinay
Contributor Author

cc: @codope

@bhat-vinay bhat-vinay changed the title from "[HUDI-7458] Fix bug with functional index to be creation" to "[HUDI-7458] Fix bug with functional index creation" Mar 1, 2024
@bhat-vinay bhat-vinay force-pushed the fix-functional-index-def branch from a805b9f to b8058f7 on March 1, 2024 10:44
@bhat-vinay bhat-vinay force-pushed the fix-functional-index-def branch from b8058f7 to cfc8573 on March 1, 2024 10:59
@hudi-bot
Collaborator

hudi-bot commented Mar 1, 2024

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

Member

@codope codope left a comment

Thanks for fixing it.

@codope codope merged commit d23abd3 into apache:master Mar 4, 2024