Closed
15 changes: 15 additions & 0 deletions sql/core/src/main/scala/org/apache/spark/sql/catalog/Catalog.scala
@@ -101,6 +101,9 @@ abstract class Catalog {
/**
* Returns a list of columns for the given table/view in the specified database.
*
* This API does not support 3 layer namespace since 3.4.0. To use 3 layer namespace,
Contributor

3 layer namespace is a bit confusing, how about

This API does not support specifying the catalog name. To specify the catalog name, please use
`listColumns(qualifiedTableNameWithCatalog)` instead.

Member

@dongjoon-hyun Jul 20, 2022

Just a question to @cloud-fan because I agree with you that the naming is confusing.

> 3 layer namespace is a bit confusing

I've monitored many commits in the community.

a2c1038031 [SPARK-39579][SQL][PYTHON][R] Make ListFunctions/getFunction/functionExists compatible with 3 layer namespace
6e7a571532 [SPARK-39649][PYTHON] Make listDatabases / getDatabase / listColumns / refreshTable in PySpark support 3-layer-namespace
cbb4e7da69 [SPARK-39646][SQL] Make setCurrentDatabase compatible with 3 layer namespace
b0d297c6d1 [SPARK-39645][SQL] Make getDatabase and listDatabases compatible with 3 layer namespace
8c02823b49 [SPARK-39583][SQL] Make RefreshTable be compatible with 3 layer namespace
ed1a3402d2 [SPARK-39598][PYTHON] Make *cache*, *catalog* in the python side support 3-layer-namespace
c1106fbe22 [SPARK-39597][PYTHON] Make GetTable, TableExists and DatabaseExists in the python side support 3-layer-namespace
1f15f2c6ad [SPARK-39615][SQL] Make listColumns be compatible with 3 layer namespace
b2d249b1aa [SPARK-39555][PYTHON] Make createTable and listTables in the python side support 3-layer-namespace
ca5f7e6c35 [SPARK-39263][SQL] Make GetTable, TableExists and DatabaseExists be compatible with 3 layer namespace
cb55efadea [SPARK-39236][SQL] Make CreateTable and ListTables be compatible with 3 layer namespace

Comments like this.

multi-layer-namespace identifier, then try to ``tableName`` as a normal table

Even in function naming like this.

private def getTable3LNamespace(tableName: String): Table = {

I believe we need a naming rule for this to promote new naming or demote it by preventing further usage. Which way do you prefer, @cloud-fan ?

Contributor

A 3-layer name is more common in traditional databases, where an identifier has three parts: catalog.schema.name. Spark is more general, and an identifier has n parts: catalog.ns1.ns2....name.

I think qualifiedNameWithCatalog is more accurate.
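
To make the n-part point concrete, here is a minimal sketch of how a qualified name with an optional catalog could be split into catalog, namespace, and name. It is not Spark's parser: `QualifiedName`, `IdentifierParser`, and the `knownCatalogs` parameter are hypothetical names invented for illustration, and the sketch assumes plain dot-separated parts with no backtick quoting.

```scala
// Hypothetical model of an n-part identifier (not Spark's internal type).
final case class QualifiedName(
    catalog: Option[String],
    namespace: Seq[String],
    name: String)

object IdentifierParser {
  // Naive parser: splits on '.', so backtick-quoted parts are NOT supported.
  // Assumption: the first part is a catalog only if it is a registered catalog
  // name and at least one more part follows it.
  def parse(qualified: String, knownCatalogs: Set[String]): QualifiedName = {
    // limit = -1 keeps trailing empty strings, so "a.b." is rejected below.
    val parts = qualified.split("\\.", -1).toList
    require(parts.forall(_.nonEmpty), s"malformed identifier: $qualified")
    parts match {
      case head :: rest if rest.nonEmpty && knownCatalogs.contains(head) =>
        QualifiedName(Some(head), rest.init, rest.last)
      case _ =>
        QualifiedName(None, parts.init, parts.last)
    }
  }
}
```

With `knownCatalogs = Set("cat")`, the input `cat.ns1.ns2.tbl` yields a catalog, a two-level namespace, and a name, while `default.t` yields no catalog and a one-level namespace. The 3-part case is one instance of the general shape, which is why a name like `qualifiedNameWithCatalog` describes the argument better than "3 layer namespace".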

Contributor

Let me clean up the naming in a follow-up PR.

Member

Thank you so much, @cloud-fan !

Contributor Author

Thank you for this discussion!

Contributor

I'm working on it: #37287

Member

Thank you, @cloud-fan .

* use listColumns(tableName) instead.
*
* @param dbName is a name that designates a database.
* @param tableName is an unqualified name that designates a table/view.
* @since 2.0.0
@@ -133,6 +136,9 @@ abstract class Catalog {
* Get the table or view with the specified name in the specified database. This throws an
* AnalysisException when no Table can be found.
*
* This API does not support 3 layer namespace since 3.4.0. To use 3 layer namespace,
* use getTable(tableName) instead.
*
* @since 2.1.0
*/
@throws[AnalysisException]("database or table does not exist")
@@ -154,6 +160,9 @@ abstract class Catalog {
* Get the function with the specified name. This throws an AnalysisException when the function
* cannot be found.
*
* This API does not support 3 layer namespace since 3.4.0. To use 3 layer namespace,
* use getFunction(functionName) instead.
*
* @param dbName is a name that designates a database.
* @param functionName is an unqualified name that designates a function in the specified database
* @since 2.1.0
@@ -182,6 +191,9 @@ abstract class Catalog {
/**
* Check if the table or view with the specified name exists in the specified database.
*
* This API does not support 3 layer namespace since 3.4.0. To use 3 layer namespace,
* use tableExists(tableName) instead.
*
* @param dbName is a name that designates a database.
* @param tableName is an unqualified name that designates a table.
* @since 2.1.0
@@ -202,6 +214,9 @@ abstract class Catalog {
/**
* Check if the function with the specified name exists in the specified database.
*
* This API does not support 3 layer namespace since 3.4.0. To use 3 layer namespace,
* use functionExists(functionName) instead.
*
* @param dbName is a name that designates a database.
* @param functionName is an unqualified name that designates a function.
* @since 2.1.0