
@tejasapatil
Contributor

What changes were proposed in this pull request?

Jira link : https://issues.apache.org/jira/browse/SPARK-15275

For bucketed tables in Hive, one can also declare that columns are sorted, along with the sort direction of each column.
As per the spec in [0], the CREATE TABLE statement allows a SORT ordering as well:

[CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]

[0] : https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTable

Currently, CatalogTable stores only the names of the sorted columns and no information about the sort direction. This PR adds CatalogSortOrder to hold the sorted column name and its sort direction. This information is not yet used in query execution, but it can be used as more support for bucketing is added. One possible advantage is the ability to skip rows while performing predicate matching.
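
To make the idea concrete, here is a minimal sketch of what pairing a column name with a sort direction could look like, and how a known ascending order might let a scan stop early. This is not the code from this PR: the field names, the `SortDirection` type, the `SortOrderSketch` helper object, and the default of ASC when no direction is given are all assumptions made for illustration.

```scala
// Hypothetical sketch: names and fields are assumptions, not the PR's actual code.
sealed trait SortDirection
case object Ascending extends SortDirection
case object Descending extends SortDirection

// Pairs a sorted column with its direction, as the PR description outlines.
final case class CatalogSortOrder(columnName: String, direction: SortDirection)

object SortOrderSketch {
  // Parses one entry of a SORTED BY clause such as "ts DESC".
  // Assumption: a missing direction means ascending.
  def parse(entry: String): CatalogSortOrder = {
    val parts = entry.trim.split("\\s+")
    val direction = parts.lift(1).map(_.toUpperCase) match {
      case Some("DESC") => Descending
      case Some("ASC") | None => Ascending
      case Some(other) =>
        throw new IllegalArgumentException(s"Unknown sort direction: $other")
    }
    CatalogSortOrder(parts(0), direction)
  }

  // For keys known to be sorted ascending, a predicate `key <= upperBound`
  // can stop at the first key above the bound instead of reading every row.
  def scanUpTo(sortedKeys: Seq[Int], upperBound: Int): Seq[Int] =
    sortedKeys.takeWhile(_ <= upperBound)
}
```

With only column names in the catalog, a reader could not tell whether `scanUpTo` is safe to apply; recording the direction is what would make that kind of early termination possible.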

How was this patch tested?

Currently, trunk does support creating bucketed Hive tables. I am relying on existing tests.

@SparkQA

SparkQA commented May 11, 2016

Test build #58408 has finished for PR 13059 at commit c2f81ca.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class CatalogSortOrder(

@tejasapatil
Contributor Author

Came across #12759 and realised that DESC ordering is not inherently supported.

@rxin
Contributor

rxin commented Jun 1, 2016

Is it actually useful to support desc order?

@tejasapatil
Contributor Author

I initially did not know that DESC order is inherently not supported in Spark, so I had worked on this PR. One could add that in the engine, but that's not my priority right now. I am working on improving #13231, which will include some parts of this PR. Closing this PR.

@tejasapatil tejasapatil closed this Jun 1, 2016