[SPARK-28974][SQL] centralize the Data Source V2 table capability checks #25679
cloud-fan wants to merge 2 commits into apache:master
OverwritePartitionsDynamicExec(r.table.asWritable, r.options, planLater(query)) :: Nil

case DeleteFromTable(r: DataSourceV2Relation, condition) =>
  if (SubqueryExpression.hasSubquery(condition)) {
We think the subquery check reflects a limitation of the current implementation and should not be treated as a capability check. It's better to keep it near the implementation and update it when the implementation improves.
  override def apply(plan: LogicalPlan): Unit = {
-   plan.foreach {
+   plan foreach {
+     case r: DataSourceV2Relation if !r.table.supports(BATCH_READ) =>
This is where I add the batch scan check. A table can implement SupportsRead without reporting the BATCH_READ capability, for example a streaming table that doesn't support batch scan. We must check the BATCH_READ capability here instead of relying on the .isInstanceOf[SupportsRead] check on the planner side.
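To make the distinction concrete, here is a minimal sketch using simplified stand-in types. Everything in it (`BatchReadSketch`, `StreamingOnlyTable`, `passesTypeCheck`, `checkBatchRead`, and the message text) is hypothetical and only mirrors the idea; it is not Spark's actual API.

```scala
// Simplified stand-ins for illustration only; these are NOT Spark's classes.
object BatchReadSketch {
  sealed trait Capability
  case object BatchRead extends Capability
  case object MicroBatchRead extends Capability

  trait Table {
    def name: String
    def capabilities: Set[Capability]
    def supports(c: Capability): Boolean = capabilities.contains(c)
  }

  // Marker trait: the table can be read, but it says nothing about the mode.
  trait SupportsRead extends Table

  // A streaming-only source: readable, but not in batch mode.
  final class StreamingOnlyTable extends SupportsRead {
    val name: String = "events"
    val capabilities: Set[Capability] = Set(MicroBatchRead)
  }

  // Type-based check: passes for any SupportsRead, including streaming-only tables.
  def passesTypeCheck(t: Table): Boolean = t.isInstanceOf[SupportsRead]

  // Capability-based check: returns an error message when batch scan is unsupported.
  def checkBatchRead(t: Table): Option[String] =
    if (t.supports(BatchRead)) None
    else Some(s"Table ${t.name} does not support batch scan.")
}
```

In this sketch the streaming-only table passes the type-based check but fails the capability-based one, which is the gap the new check is meant to close.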
- assert(exc.getMessage.contains(
-   "does not support overwrite expression (`x` = 5) in batch mode"))
+ assert(exc.getMessage.contains("does not support overwrite by filter in batch mode"))
I updated the error message slightly. The previous message was misleading because it implied that only the specific expression was unsupported.
Test build #110124 has finished for PR 25679 at commit
Looks fine to me, once tests are passing.
Test build #110163 has finished for PR 25679 at commit
retest this please
Test build #110168 has finished for PR 25679 at commit
Thanks for the review, merging to master!
### What changes were proposed in this pull request?

Merge `V2WriteSupportCheck` and `V2StreamingScanSupportCheck` into one rule: `TableCapabilityCheck`.

### Why are the changes needed?

It's a little confusing to have two rules that check DS v2 table capability, where one says it checks writes and the other says it checks streaming scans. The rule names alone make it clear that the batch scan check is missing.

It's better to have a centralized place for this check, with a name that clearly says it checks table capability.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Existing tests

Closes apache#25679 from cloud-fan/dsv2-check.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
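As a rough illustration of why a single rule helps, the sketch below puts batch scan, streaming scan, and write checks in one place. The plan nodes, capability names, and messages are hypothetical simplifications and are not Spark's real classes or the actual `TableCapabilityCheck` code.

```scala
// Hypothetical, simplified model for illustration; NOT Spark's actual classes.
object CentralizedCheckSketch {
  sealed trait Capability
  case object BatchRead extends Capability
  case object MicroBatchRead extends Capability
  case object BatchWrite extends Capability

  final case class Table(name: String, capabilities: Set[Capability]) {
    def supports(c: Capability): Boolean = capabilities.contains(c)
  }

  // A flat list of nodes stands in for a logical plan tree.
  sealed trait PlanNode
  final case class BatchScan(table: Table) extends PlanNode
  final case class StreamingScan(table: Table) extends PlanNode
  final case class Append(table: Table) extends PlanNode

  // One rule covers every capability check, so a missing case is easy to spot.
  def check(plan: Seq[PlanNode]): Seq[String] = plan.collect {
    case BatchScan(t) if !t.supports(BatchRead) =>
      s"Table ${t.name} does not support batch scan."
    case StreamingScan(t) if !t.supports(MicroBatchRead) =>
      s"Table ${t.name} does not support micro-batch scan."
    case Append(t) if !t.supports(BatchWrite) =>
      s"Table ${t.name} does not support append in batch mode."
  }
}
```

With all cases side by side, the absence of a batch scan case would be visible at a glance, which is harder to notice when the checks are split across rules named after only writes or only streaming scans.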