-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-47265][SQL][TESTS] Replace createTable(..., schema: StructType, ...) with createTable(..., columns: Array[Column], ...) in UT
#45368
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…dDataSourceSuite`
FileBasedDataSourceSuiteFileBasedDataSourceSuite
sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala
Show resolved
Hide resolved
…..)` instead of `createTable(..., columns: Array[Column], ...)` in UT
…FileBasedDataSourceSuite`" This reverts commit 3618b30.
FileBasedDataSourceSuitecreateTable(..., schema: StructType, ...) instead of createTable(..., columns: Array[Column], ...) in UT
|
hmm.... Is the substitution relationship written in reverse... |
createTable(..., schema: StructType, ...) instead of createTable(..., columns: Array[Column], ...) in UTcreateTable(..., schema: StructType, ...) with createTable(..., columns: Array[Column], ...) in UT
|
createTable(..., schema: StructType, ...) with createTable(..., columns: Array[Column], ...) in UTcreateTable(..., schema: StructType, ...) with createTable(..., columns: Array[Column], ...) in UT
|
cc @cloud-fan |
|
|
||
| override def createTable( | ||
| ident: Identifier, | ||
| schema: StructType, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we remove this overload now if no one is calling it?
|
|
||
| override def createTable( | ||
| ident: Identifier, | ||
| schema: StructType, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
| } | ||
|
|
||
| // TODO: remove it when no tests calling this deprecated method. | ||
| override def createTable( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will move the specific implementation of createTable from createTable(..., StructType, ...) to createTable(..., Array[Column], ...), so that when we delete the deprecated method createTable(..., StructType, ...) in TableCatalog, we can directly delete the method createTable(..., StructType, ...) here.
| } | ||
|
|
||
| // TODO: remove it when no tests calling this deprecated method. | ||
| // TODO: remove it when the deprecated method `createTable(..., StructType, ...)` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can remove it now? Spark won't call this method, only tests will
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to delete this now, then we need to remove the method TableCatalog#createTable(..., StructType, ...),
I am a bit worried that although TableCatalog is marked with @Evolving, third parties (eg: iceberg) relying on this interface will immediately feel the break changes, such as:
https://github.com/apache/iceberg/blob/f0b3733e0e7071894bdf63133bcc8c00754a1080/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java#L172-L182

Do we really need to do this now?
At present, Spark can indeed directly delete it internally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wait, so we only keep it to pass compilation? Then we can just throw an internal exception in this method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, very good suggestion! I will try it out, thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simply from the definition of @Evolving, it is indeed possible to make this change in 4.0.
I'm not sure if we need to pay attention to direct calls to this method from non-Spark internal code(s there really such a possibility?). If so, throwing an exception only delays the problem from compile time to runtime...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not about callers, but about existing TableCatalog implementations and it's better to not break them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| null // Return null to save the `loadTable` call for CREATE TABLE without AS SELECT. | ||
| } | ||
|
|
||
| // TODO: remove it when the deprecated method `createTable(..., StructType, ...)` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
| } | ||
|
|
||
| // TODO: remove it when no tests calling this deprecated method. | ||
| // TODO: remove it when the deprecated method `createTable(..., StructType, ...)` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we remove this TODO? Isn't the current solution the final one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm actually a bit hesitant, because if we eventually decide to completely remove this in a future version, keeping it now may be a good reminder, and I want to hear everyone's opinions. @cloud-fan
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can remove the TODO now. We should never remove the API and implementation will have to provide a fake implementation that just fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest adding some new comments to explain the reason for retaining this API and throwing exceptions by default while removing the TODO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can remove the TODO now. We should never remove the API and implementation will have to provide a fake implementation that just fails.
Done.
| null // Return null to save the `loadTable` call for CREATE TABLE without AS SELECT. | ||
| } | ||
|
|
||
| // TODO: remove it when the deprecated method `createTable(..., StructType, ...)` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
|
thanks, merging to master! |
…fail ### What changes were proposed in this pull request? This is a followup of #45368 . `DelegatingCatalogExtension` can invoke any method in `V2SessionCatalog`, and we should make sure the two overloads of `createTable` in `V2SessionCatalog` both work. ### Why are the changes needed? Avoid breaking custom catalogs that extend `DelegatingCatalogExtension`. ### Does this PR introduce _any_ user-facing change? no ### How was this patch tested? N/A ### Was this patch authored or co-authored using generative AI tooling? no Closes #47755 from cloud-fan/create-table. Authored-by: Wenchen Fan <[email protected]> Signed-off-by: Kent Yao <[email protected]>
What changes were proposed in this pull request?
The pr aims to use
createTable(..., schema: StructType, ...)instead ofcreateTable(..., columns: Array[Column], ...)in UT.Why are the changes needed?
Because the
TableCatalog#createTable(..., schema: StructType, ...)method has been marked asdeprecatedafter version3.4.0.Does this PR introduce any user-facing change?
No.
How was this patch tested?
Was this patch authored or co-authored using generative AI tooling?
No.