Skip to content

Conversation

@imback82
Copy link
Contributor

@imback82 imback82 commented Jan 4, 2022

What changes were proposed in this pull request?

  1. Move CREATE NAMESPACE parsing tests to CreateNamespaceParserSuite.
  2. Put common CREATE NAMESPACE tests into one trait org.apache.spark.sql.execution.command.CreateNamespaceSuiteBase, and put datasource specific tests to the v1.CreateNamespaceSuite and v2.CreateNamespaceSuite.

The changes follow the approach of #30287.

Why are the changes needed?

  1. The unification will allow to run common CREATE NAMESPACE tests for both DSv1/Hive DSv1 and DSv2
  2. We can detect missing features and differences between DSv1 and DSv2 implementations.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing unit tests and new tests.

@github-actions github-actions bot added the SQL label Jan 4, 2022
}
}

test("SPARK-37456: Location in CreateNamespace should be qualified") {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be covered in test("namespace with location") for all catalogs.

sql(s"CREATE NAMESPACE IF NOT EXISTS $ns")

// The namespace already exists, so this should fail.
// TODO: non-Hive catalogs throw NamespaceAlreadyExistsException.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a new TODO similar to // TODO: HiveExternalCatalog throws DatabaseAlreadyExistsException. Basically, hive catalog throws a different exception (and a slightly different message, requiring alreadyExistErrorMessage).

@imback82
Copy link
Contributor Author

imback82 commented Jan 4, 2022

cc @cloud-fan @MaxGekk. Happy New Year!

*/
class CreateNamespaceSuite extends v1.CreateNamespaceSuiteBase with CommandSuiteBase {
override def commandVersion: String = super[CreateNamespaceSuiteBase].commandVersion
override def alreadyExistErrorMessage: String = s"$notFoundMsgPrefix $namespace already exists"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Hive, there is no ' ' around the namespace.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this exception thrown by hive? or by Spark?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is thrown by Hive: org.apache.hadoop.hive.metastore.api.AlreadyExistsException: Database db already exists and wrapped to AnalysisException here:

throw new AnalysisException(
e.getClass.getCanonicalName + ": " + e.getMessage, cause = Some(e))

override def namespace: String = "db"
override def notFoundMsgPrefix: String = "Database"

test("Create namespace using default warehouse path") {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only difference b/w v1 and v2 catalog.

trait CreateNamespaceSuiteBase extends QueryTest with DDLCommandTestUtils {
import org.apache.spark.sql.connector.catalog.CatalogV2Implicits._

override val command = "Create NAMESPACE"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
override val command = "Create NAMESPACE"
override val command = "CREATE NAMESPACE"

assert(!path.startsWith("file:/"))

val e = intercept[IllegalArgumentException] {
sql(s"CREATE NAMESPACE $ns Location ''")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
sql(s"CREATE NAMESPACE $ns Location ''")
sql(s"CREATE NAMESPACE $ns LOCATION ''")

assert(e.getMessage.contains("Can not create a Path from an empty string"))

val uri = new Path(path).toUri
sql(s"CREATE NAMESPACE $ns Location '$uri'")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
sql(s"CREATE NAMESPACE $ns Location '$uri'")
sql(s"CREATE NAMESPACE $ns LOCATION '$uri'")

}
}

test("test handling of 'IF NOT EXIST'") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we merge this test case to the one above? It's just adding sql(s"CREATE NAMESPACE IF NOT EXISTS $ns") to the end of the above test case

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in 33bdfba Jan 5, 2022
@imback82 imback82 deleted the v2_create_namespace branch January 7, 2022 06:33
cloud-fan pushed a commit that referenced this pull request Jan 11, 2022
### What changes were proposed in this pull request?

This PR proposes to use V2 commands as default as outlined in [SPARK-36588](https://issues.apache.org/jira/browse/SPARK-36588), and this PR migrates `CREATE NAMESPACE` to use v2 command by default.

Note that the work to tests covering both v1/v2 were done in #35093.

### Why are the changes needed?

It's been a while since we introduced the v2 commands,  and it seems reasonable to use v2 commands by default even for the session catalog, with a legacy config to fall back to the v1 commands.

### Does this PR introduce _any_ user-facing change?

The error message will be slightly different if namespace already exists when v2 command is run against v1 catalog:
v1 command: `Database 'db' already exists"`
vs.
v2 command: `Namespace 'db' already exists"`

Also, the error message will be slightly different if namespace already exists when v1 command is run against Hive catalog:
Before: `Database db already exists"`
vs.
After: `Database 'db' already exists"`

### How was this patch tested?

Existing *CreateNamespaceSuite tests cover this PR's change.

Closes #35113 from imback82/migrate_create_namespace.

Authored-by: Terry Kim <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
dchvn pushed a commit to dchvn/spark that referenced this pull request Jan 19, 2022
### What changes were proposed in this pull request?

This PR proposes to use V2 commands as default as outlined in [SPARK-36588](https://issues.apache.org/jira/browse/SPARK-36588), and this PR migrates `CREATE NAMESPACE` to use v2 command by default.

Note that the work to tests covering both v1/v2 were done in apache#35093.

### Why are the changes needed?

It's been a while since we introduced the v2 commands,  and it seems reasonable to use v2 commands by default even for the session catalog, with a legacy config to fall back to the v1 commands.

### Does this PR introduce _any_ user-facing change?

The error message will be slightly different if namespace already exists when v2 command is run against v1 catalog:
v1 command: `Database 'db' already exists"`
vs.
v2 command: `Namespace 'db' already exists"`

Also, the error message will be slightly different if namespace already exists when v1 command is run against Hive catalog:
Before: `Database db already exists"`
vs.
After: `Database 'db' already exists"`

### How was this patch tested?

Existing *CreateNamespaceSuite tests cover this PR's change.

Closes apache#35113 from imback82/migrate_create_namespace.

Authored-by: Terry Kim <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants