-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-37636][SQL] Migrate CREATE NAMESPACE to use V2 command by default #35113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| if isSessionCatalog(catalog) => | ||
| if (name.length != 1) { | ||
| throw QueryCompilationErrors.invalidDatabaseNameError(name.quoted) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check is handled in DatabaseNameInSessionCatalog.
| with command.TestsV1AndV2Commands { | ||
| override def namespace: String = "db" | ||
| override def notFoundMsgPrefix: String = "Database" | ||
| override def notFoundMsgPrefix: String = if (runningV1Command) "Database" else "Namespace" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to update the test now that v1 catalogs are run against both v1 and v2 commands (after adding if conf.useV1Command) .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we move this to the base class as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
cc @cloud-fan |
| super.test(testName, testTags: _*) { | ||
| // Need to set command version inside this test function so that | ||
| // the correct command version is available in each test. | ||
| setCommandVersion() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry I'm confused. We set the version right before super.test(....), do you mean it's being set again during super.test(....)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This def test doesn't run testFun when it's invoked; it only registers the test). So by the time, testFunc is actually run, _version will always be set to "V2" (useV1Command == false). So we need to capture this inside the lambda that's passed into super.test.
Also, note that we need to call setCommandVersion() before calling super.test because super.test being called is utilizing the commandVersion to set the right test name:
spark/sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLCommandTestUtils.scala
Lines 56 to 57 in 527e842
| val testNamePrefix = s"$command using $catalogVersion catalog $commandVersion command" | |
| super.test(s"$testNamePrefix: $testName", testTags: _*)(testFun) |
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/command/CreateNamespaceSuite.scala
Outdated
Show resolved
Hide resolved
sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala
Show resolved
Hide resolved
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala
Outdated
Show resolved
Hide resolved
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/command/CreateNamespaceSuite.scala
Show resolved
Hide resolved
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala
Outdated
Show resolved
Hide resolved
| "org.apache.hadoop.hive.metastore.api.AlreadyExistsException") | ||
| && exception.getMessage.contains( | ||
| s"Database ${dbDefinition.name} already exists")) { | ||
| Some(new DatabaseAlreadyExistsException(dbDefinition.name)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we test it in the VersionsSuite, by updating the existing $version: createDatabase test? We need to make sure this wrap exception logic works for all hive versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the test. Thanks!
|
thanks, merging to master! |
### What changes were proposed in this pull request? This PR proposes to use V2 commands as default as outlined in [SPARK-36588](https://issues.apache.org/jira/browse/SPARK-36588), and this PR migrates `CREATE NAMESPACE` to use v2 command by default. Note that the work to tests covering both v1/v2 were done in apache#35093. ### Why are the changes needed? It's been a while since we introduced the v2 commands, and it seems reasonable to use v2 commands by default even for the session catalog, with a legacy config to fall back to the v1 commands. ### Does this PR introduce _any_ user-facing change? The error message will be slightly different if namespace already exists when v2 command is run against v1 catalog: v1 command: `Database 'db' already exists"` vs. v2 command: `Namespace 'db' already exists"` Also, the error message will be slightly different if namespace already exists when v1 command is run against Hive catalog: Before: `Database db already exists"` vs. After: `Database 'db' already exists"` ### How was this patch tested? Existing *CreateNamespaceSuite tests cover this PR's change. Closes apache#35113 from imback82/migrate_create_namespace. Authored-by: Terry Kim <[email protected]> Signed-off-by: Wenchen Fan <[email protected]>
…e/drop database to HiveClientImpl ### What changes were proposed in this pull request? #35113 introduced `HiveExternalCatalog.withClientWrappingException` to to wrap a Hive exception to a Spark exception. However, there was a limitation of testing it against different Hive versions as outlined in #35173 (comment). This PR proposes to revert `HiveExternalCatalog.withClientWrappingException` and move wrapping logic to `HiveClientImpl`. ### Why are the changes needed? 1. To make testing against different Hive versions better. 2. For Hive 0.12, the wrapping logic was not working correctly for `dropDatabase` because the message was not matching. ### Does this PR introduce _any_ user-facing change? Yes, for Hive 0.12, the exception message will now be consistent with other versions. Before: `InvalidOperationException(message:Database db is not empty)` After: `Cannot drop a non-empty database: db. Use CASCADE option to drop a non-empty database` ### How was this patch tested? Added a new test coverage. Closes #35173 from imback82/drop_db_withClientWrappingException. Authored-by: Terry Kim <[email protected]> Signed-off-by: Wenchen Fan <[email protected]>
…e/drop database to HiveClientImpl ### What changes were proposed in this pull request? apache#35113 introduced `HiveExternalCatalog.withClientWrappingException` to to wrap a Hive exception to a Spark exception. However, there was a limitation of testing it against different Hive versions as outlined in apache#35173 (comment). This PR proposes to revert `HiveExternalCatalog.withClientWrappingException` and move wrapping logic to `HiveClientImpl`. ### Why are the changes needed? 1. To make testing against different Hive versions better. 2. For Hive 0.12, the wrapping logic was not working correctly for `dropDatabase` because the message was not matching. ### Does this PR introduce _any_ user-facing change? Yes, for Hive 0.12, the exception message will now be consistent with other versions. Before: `InvalidOperationException(message:Database db is not empty)` After: `Cannot drop a non-empty database: db. Use CASCADE option to drop a non-empty database` ### How was this patch tested? Added a new test coverage. Closes apache#35173 from imback82/drop_db_withClientWrappingException. Authored-by: Terry Kim <[email protected]> Signed-off-by: Wenchen Fan <[email protected]>
What changes were proposed in this pull request?
This PR proposes to use V2 commands as default as outlined in SPARK-36588, and this PR migrates
CREATE NAMESPACEto use v2 command by default.Note that the work to tests covering both v1/v2 were done in #35093.
Why are the changes needed?
It's been a while since we introduced the v2 commands, and it seems reasonable to use v2 commands by default even for the session catalog, with a legacy config to fall back to the v1 commands.
Does this PR introduce any user-facing change?
The error message will be slightly different if namespace already exists when v2 command is run against v1 catalog:
v1 command:
Database 'db' already exists"vs.
v2 command:
Namespace 'db' already exists"Also, the error message will be slightly different if namespace already exists when v1 command is run against Hive catalog:
Before:
Database db already exists"vs.
After:
Database 'db' already exists"How was this patch tested?
Existing *CreateNamespaceSuite tests cover this PR's change.