Closed
@@ -697,7 +697,7 @@ case class ShowCreateTableCommand(table: TableIdentifier) extends RunnableComman
   private def showCreateHiveTable(metadata: CatalogTable): String = {
     def reportUnsupportedError(features: Seq[String]): Unit = {
       throw new AnalysisException(
-        s"Failed to execute SHOW CREATE TABLE against table ${metadata.identifier.quotedString}, " +
+        s"Failed to execute SHOW CREATE TABLE against table/view ${metadata.identifier}, " +
           "which is created by Hive and uses the following unsupported feature(s)\n" +
           features.map(" - " + _).mkString("\n")
       )
@@ -376,6 +376,10 @@ private[hive] class HiveClientImpl(
         unsupportedFeatures += "bucketing"
       }

+      if (h.getTableType == HiveTableType.VIRTUAL_VIEW && partCols.nonEmpty) {
+        unsupportedFeatures += "partitioned view"
Contributor

Can we read a partitioned view in Spark SQL? What does partitioning mean for a view?

Member Author

Yes, a partitioned view is a partition-aware view. Users can add or drop partitions after creation. For more details, see the Hive design doc:
https://cwiki.apache.org/confluence/display/Hive/PartitionedViews

Let me check whether reading a partitioned view is partition-aware in Spark SQL.

Thanks!

Member Author

@gatorsmile Sep 26, 2016

After digging deeper, I really doubt that the original motivation for partitioned views makes sense.

First, see the Hive design link: https://cwiki.apache.org/confluence/display/Hive/ViewDev

Update 30-Dec-2009: Prasad pointed out that even without supporting materialized views, it may be necessary to provide users with metadata about data dependencies between views and underlying table partitions so that users can avoid seeing inconsistent results during the window when not all partitions have been refreshed with the latest data. One option is to attempt to derive this information automatically (using an overconservative guess in cases where the dependency analysis can't be made smart enough); another is to allow view creators to declare the dependency rules in some fashion as part of the view definition. Based on a design review meeting, we will probably go with the automatic analysis approach once dependency tracking is implemented. The analysis will be performed on-demand, perhaps as part of describing the view or submitting a query job against it. Until this becomes available, users may be able to do their own analysis either via empirical lineage tools or via view->table dependency tracking metadata once it is implemented. See HIVE-1079.
Update 1-Feb-2011: For the latest on this, see PartitionedViews.

Basically, this feature affects only the metadata of views. It does not affect query execution.

To add partition information to a view, users have to issue the SQL manually:

ALTER VIEW view_name ADD [IF NOT EXISTS] partition_spec partition_spec ...
ALTER VIEW view_name DROP [IF EXISTS] partition_spec, partition_spec, ...

I read the code changes and test cases in the Hive JIRA: https://issues.apache.org/jira/browse/HIVE-1079. I think we do not need to worry about this Hive-specific feature, since its usage scenario is very limited. The code changes in the existing PR are probably enough.

If you think we should support it, we might also need code changes in SHOW PARTITIONS and DESC table PARTITION. We would then need to change the fromHivePartition function, because getSD will be null for partitioned views; otherwise, we will get a NullPointerException.
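
To make the null-safety concern above concrete, here is a minimal sketch of guarding a nullable storage descriptor with `Option`. `StorageDesc`, `RawPartition`, `CatalogPartition`, and `toCatalogPartition` are hypothetical stand-ins invented for illustration; they are not the Hive or Spark classes, and the real fromHivePartition may need more than this.

```scala
object NullSafePartitionSketch {
  // Hypothetical stand-ins for illustration only; not Hive or Spark classes.
  final case class StorageDesc(location: String)

  final class RawPartition(val values: Seq[String], sd: StorageDesc) {
    // Mimics a Java-style getter that returns null when the partition
    // has no storage descriptor (assumed here for partitioned views).
    def getSD: StorageDesc = sd
  }

  final case class CatalogPartition(values: Seq[String], location: Option[String])

  // Option(...) turns a null reference into None, so a missing storage
  // descriptor yields an empty location instead of a NullPointerException.
  def toCatalogPartition(p: RawPartition): CatalogPartition =
    CatalogPartition(
      p.values,
      Option(p.getSD).flatMap(sd => Option(sd.location)))

  def main(args: Array[String]): Unit = {
    val tablePart = new RawPartition(Seq("1", "a"), StorageDesc("/warehouse/t1/p1=1/p2=a"))
    val viewPart = new RawPartition(Seq("1", "a"), null)
    println(toCatalogPartition(tablePart))
    println(toCatalogPartition(viewPart))
  }
}
```

Dereferencing `p.getSD.location` directly would throw for `viewPart`, which is the failure mode described in the comment.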

+      }
+
       val properties = Option(h.getParameters).map(_.asScala.toMap).orNull

       CatalogTable(
@@ -265,6 +265,34 @@ class ShowCreateTableSuite extends QueryTest with SQLTestUtils with TestHiveSing
     }
   }

+  test("hive partitioned view is not supported") {
+    withTable("t1") {
+      withView("v1") {
+        sql(
+          s"""
+            |CREATE TABLE t1 (c1 INT, c2 STRING)
+            |PARTITIONED BY (
+            | p1 BIGINT COMMENT 'bla',
+            | p2 STRING )
+          """.stripMargin)
+
+        createRawHiveTable(
+          s"""
+            |CREATE VIEW v1
+            |PARTITIONED ON (p1, p2)
+            |AS SELECT * from t1
+          """.stripMargin
+        )
+
+        val cause = intercept[AnalysisException] {
+          sql("SHOW CREATE TABLE v1")
+        }
+
+        assert(cause.getMessage.contains(" - partitioned view"))
+      }
+    }
+  }
+
   private def createRawHiveTable(ddl: String): Unit = {
     hiveContext.sharedState.externalCatalog.asInstanceOf[HiveExternalCatalog].client.runSqlHive(ddl)
   }
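
The test above relies on how the unsupported-feature labels are collected and then rendered into the error message. A minimal standalone sketch of that flow follows, so the `" - partitioned view"` assertion is easier to follow. `RawTable` and its fields are hypothetical simplifications, and the bucketing condition is an assumption, since the diff does not show it. Only the label strings and the `" - "` bullet formatting are taken from the diff.

```scala
import scala.collection.mutable.ArrayBuffer

object UnsupportedFeatureSketch {
  // Hypothetical, simplified stand-in for the Hive table metadata that
  // the client inspects; not a Spark or Hive class.
  final case class RawTable(
      name: String,
      isView: Boolean,
      partitionColumns: Seq[String],
      bucketColumns: Seq[String])

  // Append one label per detected feature, as the diff does.
  // The bucketing condition is assumed for illustration.
  def unsupportedFeatures(t: RawTable): Seq[String] = {
    val features = ArrayBuffer.empty[String]
    if (t.bucketColumns.nonEmpty) features += "bucketing"
    if (t.isView && t.partitionColumns.nonEmpty) features += "partitioned view"
    features.toSeq
  }

  // Same message layout as reportUnsupportedError in the diff:
  // one " - <feature>" bullet per line after the header.
  def errorMessage(t: RawTable): String =
    s"Failed to execute SHOW CREATE TABLE against table/view ${t.name}, " +
      "which is created by Hive and uses the following unsupported feature(s)\n" +
      unsupportedFeatures(t).map(" - " + _).mkString("\n")

  def main(args: Array[String]): Unit = {
    val v1 = RawTable("v1", isView = true, Seq("p1", "p2"), Nil)
    println(errorMessage(v1))
  }
}
```

Because each feature is rendered on its own bullet line, a substring match on `" - partitioned view"` is enough to confirm the check fired without pinning the test to the full message text.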