@liancheng
Contributor

@liancheng liancheng commented May 12, 2016

What changes were proposed in this pull request?

This is a follow-up of #12781. It adds native SHOW CREATE TABLE support for Hive tables and views. A new field hasUnsupportedFeatures is added to CatalogTable to indicate whether all table metadata retrieved from the concrete underlying external catalog (i.e., the Hive metastore in this case) can be mapped to fields in CatalogTable. This flag is useful when the target Hive table contains structures that Spark SQL can't handle, e.g., skewed columns and storage handlers.

How was this patch tested?

New test cases are added in ShowCreateTableSuite to do round-trip tests.

@liancheng liancheng changed the title [SPARK-14346][SQL] Native SHOW CREATE TABLE for Hive tables/views [SPARK-14346][SQL][WIP] Native SHOW CREATE TABLE for Hive tables/views May 12, 2016
@liancheng liancheng changed the title [SPARK-14346][SQL][WIP] Native SHOW CREATE TABLE for Hive tables/views [SPARK-14346][SQL] Native SHOW CREATE TABLE for Hive tables/views May 12, 2016
@liancheng
Contributor Author

cc @yhuai @cloud-fan

@yhuai
Contributor

yhuai commented May 12, 2016

test this please

}
builder ++= metadata.viewOriginalText.mkString(" AS\n", "", "\n")
} else {
showHiveTableDataColumns(metadata, builder)
Contributor

@xwu0226 xwu0226 May 12, 2016

Should a table comment clause (COMMENT '.....') be generated here, after the data column list?

Contributor Author

Thanks.
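
For readers following this thread, here is a minimal sketch of what emitting a table-level COMMENT clause could look like. It is not the code from this PR: `TableCommentSketch`, `escapeSingleQuoted`, and `appendTableComment` are hypothetical names, and the escaping rule (backslash-escaping backslashes and single quotes) is an assumption made for illustration.

```scala
object TableCommentSketch {
  // Hypothetical helper: escapes backslashes and single quotes so the
  // comment can be placed inside a single-quoted SQL string literal.
  def escapeSingleQuoted(s: String): String = {
    val sb = new StringBuilder
    s.foreach {
      case '\\' => sb ++= "\\\\"
      case '\'' => sb ++= "\\'"
      case c    => sb += c
    }
    sb.toString
  }

  // Appends a COMMENT clause only when the table actually has a comment;
  // a table without a comment leaves the builder untouched.
  def appendTableComment(comment: Option[String], builder: StringBuilder): Unit =
    comment.foreach { c =>
      builder ++= "COMMENT '" + escapeSingleQuoted(c) + "'\n"
    }
}
```

In a DDL generator of the shape shown in the diff above, such a helper would be called right after the data column list is written to the builder.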

@SparkQA

SparkQA commented May 12, 2016

Test build #58504 has finished for PR 13079 at commit 2020d77.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

* Note that Hive's metastore also tracks skewed columns. We should consider adding that in the
* future once we have a better understanding of how we want to handle skewed columns.
*
* Field `fullyMapped` is used to indicate whether all table metadata entries retrieved from the
Contributor

You can use @param here.

@SparkQA

SparkQA commented May 17, 2016

Test build #58697 has finished for PR 13079 at commit 74fd5d3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

throw new UnsupportedOperationException(
s"Failed to execute SHOW CREATE TABLE against table ${metadata.identifier.quotedString}, " +
"because it contains table structure(s) (e.g. skewed columns) that Spark SQL doesn't " +
"support yet."
Contributor

I think we need to explicitly say that the table was created by Hive with keywords that we do not support.

Contributor Author

Do you mean we should mention the exact keywords, or just say that the table was created by Hive?

Contributor Author

Replaced hasUnsupportedFeatures: Boolean with unsupportedFeatures: Seq[String], which holds string descriptions of unsupported features, so that we can list them in the exception message.
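
A rough sketch of the idea, under stated assumptions: `TableMeta` is a hypothetical, heavily simplified stand-in for `CatalogTable` that carries only a name and the `unsupportedFeatures` list, `ShowCreateTableSketch` is an illustrative name, and the message wording and the returned DDL placeholder are not taken from Spark.

```scala
// Hypothetical, simplified stand-in for CatalogTable: only the fields
// needed to illustrate the unsupportedFeatures check.
final case class TableMeta(
    name: String,
    unsupportedFeatures: Seq[String] = Seq.empty)

object ShowCreateTableSketch {
  // Returns placeholder DDL text, or throws when the table uses Hive
  // features that could not be mapped. Every unsupported feature is
  // listed in the exception message, one per line.
  def showCreateTable(meta: TableMeta): String = {
    if (meta.unsupportedFeatures.nonEmpty) {
      throw new UnsupportedOperationException(
        s"Failed to execute SHOW CREATE TABLE against table ${meta.name}, " +
          "which was created by Hive and uses unsupported feature(s):\n" +
          meta.unsupportedFeatures.map(" - " + _).mkString("\n"))
    }
    // Placeholder only; real DDL generation is out of scope for this sketch.
    s"CREATE TABLE ${meta.name} (...)"
  }
}
```

Compared with a Boolean flag, the list lets the caller see exactly which features blocked the command, at the cost of the catalog layer having to describe each one as a string.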

@yhuai
Contributor

yhuai commented May 17, 2016

LGTM. I am merging this to master and 2.0. Let's change the error message and exception class in a separate pr.

asfgit pushed a commit that referenced this pull request May 17, 2016
## What changes were proposed in this pull request?

This is a follow-up of #12781. It adds native `SHOW CREATE TABLE` support for Hive tables and views. A new field `hasUnsupportedFeatures` is added to `CatalogTable` to indicate whether all table metadata retrieved from the concrete underlying external catalog (i.e., the Hive metastore in this case) can be mapped to fields in `CatalogTable`. This flag is useful when the target Hive table contains structures that Spark SQL can't handle, e.g., skewed columns and storage handlers.

## How was this patch tested?

New test cases are added in `ShowCreateTableSuite` to do round-trip tests.

Author: Cheng Lian <[email protected]>

Closes #13079 from liancheng/spark-14346-show-create-table-for-hive-tables.

(cherry picked from commit b674e67)
Signed-off-by: Yin Huai <[email protected]>
@asfgit asfgit closed this in b674e67 May 17, 2016

require(metadata.identifier.database == Some(databaseName),
require(
metadata.identifier.database.contains(databaseName),
Contributor

contains is not available in Scala 2.10. Let me fix the build.
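
For context, `Option.contains` was added in Scala 2.11, so code that must also compile on 2.10 needs another way to express the same check. The sketch below shows two equivalents; `OptionContainsCompat` and its method names are hypothetical and exist only for illustration.

```scala
object OptionContainsCompat {
  // Compiles on Scala 2.10 and later: compare against a wrapped value.
  // This is the form the original line used.
  def sameDatabase(actual: Option[String], expected: String): Boolean =
    actual == Some(expected)

  // Also compiles on 2.10: true only when a value is present and equal.
  def sameDatabaseViaExists(actual: Option[String], expected: String): Boolean =
    actual.exists(_ == expected)
}
```

Both forms return false for `None`, which matches what `contains` does on 2.11 and later.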

asfgit pushed a commit that referenced this pull request May 18, 2016
## What changes were proposed in this pull request?
The Scala 2.10 build was broken by #13079. I am reverting the change to that line.

Author: Yin Huai <[email protected]>

Closes #13157 from yhuai/SPARK-14346-fix-scala2.10.

(cherry picked from commit 2a5db9c)
Signed-off-by: Yin Huai <[email protected]>
@liancheng liancheng deleted the spark-14346-show-create-table-for-hive-tables branch May 18, 2016 01:45
asfgit pushed a commit that referenced this pull request May 19, 2016
…LE output

## What changes were proposed in this pull request?

This PR is a follow-up of #13079. It replaces `hasUnsupportedFeatures: Boolean` in `CatalogTable` with `unsupportedFeatures: Seq[String]`, which contains unsupported Hive features of the underlying Hive table. In this way, we can accurately report all unsupported Hive features in the exception message.

## How was this patch tested?

Updated existing test case to check exception message.

Author: Cheng Lian <[email protected]>

Closes #13173 from liancheng/spark-14346-follow-up.

(cherry picked from commit 6ac1c3a)
Signed-off-by: Andrew Or <[email protected]>