[HUDI-7563] Add support to drop index using sql #11951
5fc8434 to 0ea02b9
nsivabalan
left a comment
Left a few minor comments.
nsivabalan
left a comment
A few minor nits.
0ea02b9 to 235756d
    if (!indexExists(metaClient, indexName)) {
      if (ignoreIfNotExists) {
        return;
      } else {
        throw new HoodieFunctionalIndexException("Index does not exist: " + indexName);
      }
    }

    LOG.info("Dropping index {}", indexName);
    HoodieIndexDefinition indexDefinition = metaClient.getIndexMetadata().get().getIndexDefinitions().get(indexName);
    try (SparkRDDWriteClient writeClient = HoodieCLIUtils.createHoodieWriteClient(
        sparkSession, metaClient.getBasePath().toString(), mapAsScalaImmutableMap(buildWriteConfig(metaClient, indexDefinition)), toScalaOption(Option.empty()))) {
      writeClient.dropIndex(Collections.singletonList(indexName));
For future reference, this logic is not specific to any engine, so it can be abstracted into the index client by plugging in the engine-specific write client.
Don't we need the engine-specific write client to call the base API BaseHoodieWriteClient.dropIndex?
BaseHoodieWriteClient already abstracts the write logic. The engine-specific implementation only needs to instantiate the engine-specific write client. The index client should only care about calling APIs in BaseHoodieWriteClient, which is the case here (HoodieCLIUtils.createHoodieWriteClient needs to be generalized too).
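To make the suggestion above concrete, here is a minimal sketch of how the drop logic could live in an engine-agnostic base class, with each engine supplying only its write client. This is not the Hudi implementation: `WriteClient`, `BaseIndexClient`, `InMemoryIndexClient`, `createWriteClient`, and the use of `IllegalStateException` are hypothetical stand-ins chosen so the example compiles without Hudi on the classpath. Only the control flow (existence check, `ignoreIfNotExists`, `dropIndex` on a single-element list) mirrors the diff under review.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for BaseHoodieWriteClient: only dropIndex matters here.
interface WriteClient extends AutoCloseable {
  void dropIndex(List<String> indexNames);

  @Override
  void close(); // narrowed so try-with-resources needs no checked-exception handling
}

// Engine-agnostic drop logic; subclasses only supply the write client.
abstract class BaseIndexClient {
  // Hypothetical hook: each engine instantiates its own write client.
  protected abstract WriteClient createWriteClient();

  // Hypothetical hook standing in for the indexExists(metaClient, indexName) check.
  protected abstract boolean indexExists(String indexName);

  public void drop(String indexName, boolean ignoreIfNotExists) {
    if (!indexExists(indexName)) {
      if (ignoreIfNotExists) {
        return;
      }
      // Stand-in for HoodieFunctionalIndexException in the real code.
      throw new IllegalStateException("Index does not exist: " + indexName);
    }
    try (WriteClient writeClient = createWriteClient()) {
      writeClient.dropIndex(Collections.singletonList(indexName));
    }
  }
}

// Toy "engine" backed by an in-memory set, used only to exercise the base class.
class InMemoryIndexClient extends BaseIndexClient {
  final Set<String> indexes = new HashSet<>();
  final List<String> dropped = new ArrayList<>();

  @Override
  protected boolean indexExists(String indexName) {
    return indexes.contains(indexName);
  }

  @Override
  protected WriteClient createWriteClient() {
    return new WriteClient() {
      @Override
      public void dropIndex(List<String> indexNames) {
        indexes.removeAll(indexNames);
        dropped.addAll(indexNames);
      }

      @Override
      public void close() {
        // Nothing to release in the in-memory sketch.
      }
    };
  }
}

class DropIndexSketch {
  public static void main(String[] args) {
    InMemoryIndexClient client = new InMemoryIndexClient();
    client.indexes.add("idx_ts");
    client.drop("idx_ts", false); // existing index: dropped through the write client
    client.drop("idx_ts", true);  // already gone: silently ignored
    System.out.println("dropped = " + client.dropped);
  }
}
```

With this shape, a Spark-specific client would presumably override only `createWriteClient` (and whatever metadata lookup backs `indexExists`), which is the generalization the comment asks for.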
      HoodieSparkIndexClient.getInstance(sparkSession).drop(metaClient, indexName, ignoreIfNotExists)
    } catch {
      case _: IllegalArgumentException =>
        SecondaryIndexManager.getInstance().drop(metaClient, indexName, ignoreIfNotExists)
Why drop here again?
This is legacy code left over from the incomplete RFC-52. SecondaryIndexManager was introduced in RFC-52, but it is just wrapper code and does not actually manage any index underneath. According to the RFC, it is supposed to support indexes built on third-party libraries such as Lucene, but we have not added any such support so far. In my opinion, we should remove all that code. I am keeping it here only to make some tests pass. If you agree, I can take the cleanup as a follow-up later.
Got it. Let's file a JIRA to track that.
235756d to e8a0631
yihua
left a comment
LGTM
Change Logs
Impact
Users can now drop an index using SQL.
Risk level (write none, low, medium or high below)
low
Documentation Update
Describe any necessary documentation update if there is any new feature, config, or user-facing change. If not, put "none".
Contributor's checklist