[SPARK-33623][SQL] Add canDeleteWhere to SupportsDelete #30562
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -28,8 +28,30 @@ | |
| */ | ||
| @Evolving | ||
| public interface SupportsDelete { | ||
|
|
||
| /** | ||
| * Checks whether it is possible to delete data from a data source table that matches filter | ||
| * expressions. | ||
| * <p> | ||
| * Rows should be deleted from the data source iff all of the filter expressions match. | ||
| * That is, the expressions must be interpreted as a set of filters that are ANDed together. | ||
| * <p> | ||
| * Spark will call this method at planning time to check whether {@link #deleteWhere(Filter[])} | ||
| * would reject the delete operation because it requires significant effort. If this method | ||
| * returns false, Spark will not call {@link #deleteWhere(Filter[])} and will try to rewrite | ||
| * the delete operation and produce row-level changes if the data source table supports deleting | ||
| * individual records. | ||
| * | ||
| * @param filters filter expressions, used to select rows to delete when all expressions match | ||
| * @return true if the delete operation can be performed | ||
| */ | ||
| default boolean canDeleteWhere(Filter[] filters) { | ||
| return true; | ||
|
Member
Shall we have
Member
I'm wondering if there is a breaking change which we should have
Contributor
Unfortunately, this would change the assumptions for existing implementations. Right now, if this interface is implemented, Spark will call The original idea was to try to delete using
Member
Got it, @rdblue.
Contributor
Author
That's correct and the method returns |
||
| } | ||
|
|
||
|
Member
If
Contributor
I think that would be the case. In our use case, we can tell whether a delete is aligned with partitioning for this check. But, we can also scan through data to determine whether files themselves are fully matched (or not matched) by the filter. We would do the partitioning check here and the more expensive stats-based check in |
||
| /** | ||
| * Delete data from a data source table that matches filter expressions. | ||
| * Delete data from a data source table that matches filter expressions. Note that this method | ||
| * will be invoked only if {@link #canDeleteWhere(Filter[])} returns true. | ||
| * <p> | ||
| * Rows are deleted from the data source iff all of the filter expressions match. That is, the | ||
| * expressions must be interpreted as a set of filters that are ANDed together. | ||
|
|
||
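To make the partitioning check from the review thread above concrete, here is a minimal Java sketch of a cheap, metadata-only `canDeleteWhere`. It is not Spark code and imports nothing from Spark: `EqualsFilterSketch` and `PartitionedTableSketch` are hypothetical stand-ins, and the rule "every filter must reference a partition column" is an assumed definition of "aligned with partitioning", not something this PR prescribes.

```java
import java.util.Arrays;
import java.util.Set;

// Hypothetical stand-in for a pushed-down source filter: an equality on one column.
// Spark's real Filter classes are deliberately not used so the sketch is self-contained.
final class EqualsFilterSketch {
    final String column;
    final Object value;

    EqualsFilterSketch(String column, Object value) {
        this.column = column;
        this.value = value;
    }
}

// Hypothetical table that can drop whole partitions without touching data files.
final class PartitionedTableSketch {
    private final Set<String> partitionColumns;

    PartitionedTableSketch(Set<String> partitionColumns) {
        this.partitionColumns = partitionColumns;
    }

    // Cheap planning-time check: only table metadata is consulted, no data is scanned.
    // Assumption: a delete is "aligned with partitioning" when every ANDed filter
    // references a partition column.
    boolean canDeleteWhere(EqualsFilterSketch[] filters) {
        return Arrays.stream(filters)
            .allMatch(f -> partitionColumns.contains(f.column));
    }
}
```

In this sketch an empty filter array passes the check, because `allMatch` is vacuously true. Any more expensive, data-dependent validation would stay outside this method, as the thread suggests.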
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -221,6 +221,12 @@ class DataSourceV2Strategy(session: SparkSession) extends Strategy with Predicat | |
| throw new AnalysisException(s"Exec update failed:" + | ||
| s" cannot translate expression to source filter: $f")) | ||
| }).toArray | ||
|
|
||
| if (!table.asDeletable.canDeleteWhere(filters)) { | ||
| throw new AnalysisException( | ||
|
Member
Is this exception handled later? Is the rewrite part for row deletion still TBD?
Contributor
The rewrite would happen earlier. This just throws a good error message if
Contributor
Author
The rewrite part is yet to be done. This PR just adds a way to have more info at planning time. Specifically, we will know if the rewrite is needed.
Member
I see, thanks. So this method will be called in an earlier place and before rewrite once the rewrite part is ready, is that right?
Contributor
Author
It is going to be called at planning time to check if we should apply the rewrite or just pass filters down. |
||
| s"Cannot delete from table ${table.name} where ${filters.mkString("[", ", ", "]")}") | ||
| } | ||
|
|
||
| DeleteFromTableExec(table.asDeletable, filters) :: Nil | ||
| case _ => | ||
| throw new AnalysisException("DELETE is only supported with v2 tables.") | ||
|
|
||
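The two points raised in review (the default keeps existing implementations working, and the planner fails early when the check returns false) can be sketched in plain Java. This is a simplified illustration under stated assumptions, not the Spark implementation: `DeletableSketch` and `DeletePlanningSketch` are hypothetical names, filters are plain strings, `IllegalStateException` stands in for `AnalysisException`, and planning and execution are collapsed into one call, whereas Spark plans `DeleteFromTableExec` and runs it later.

```java
import java.util.Arrays;

// Hypothetical, simplified mirror of the interface in this PR.
interface DeletableSketch {
    // Defaulting to true preserves the old contract: implementations that do not
    // override this method are still asked to delete directly.
    default boolean canDeleteWhere(String[] filters) {
        return true;
    }

    void deleteWhere(String[] filters);
}

final class DeletePlanningSketch {
    // Mirrors the shape of the check added to the planner: reject the delete up front
    // instead of calling deleteWhere when the table says it cannot handle the filters.
    static void planAndRun(DeletableSketch table, String tableName, String[] filters) {
        if (!table.canDeleteWhere(filters)) {
            throw new IllegalStateException(
                "Cannot delete from table " + tableName + " where " + Arrays.toString(filters));
        }
        table.deleteWhere(filters);
    }
}
```

Once the row-level rewrite mentioned in the thread exists, the `throw` branch is where a planner could choose the rewrite instead; that part is not covered by this PR or by the sketch.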