[SPARK-14127][SQL][WIP] Describe table #12460
Conversation
@andrewor14 Looking for some early feedback on this, as I was thinking to do the same for show table extended. I did have a brief discussion with @gatorsmile on this.

Please resolve the conflicts. : )
Force-pushed from 5b349da to cfb0eeb.
@gatorsmile Thank you. I have resolved the conflicts.
cc @hvanhovell
It is a bit more complicated than I thought. We allow strings here because Hive allows us to use the '$elem$', '$key$' and '$value$' 'keywords'. That is why I added strings to the rule. I am not sure if we should support this. What do you guys think?
This is what I found in the Hive manual:
DESCRIBE [EXTENDED|FORMATTED] [db_name.]table_name [col_name ( [.field_name] | [.'$elem$'] | [.'$key$'] | [.'$value$'] )* ];
See also: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Describe
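For concreteness, here is a hedged sketch of what each form looks like as a full statement, written in the dotted style used later in this thread (the table and column names are invented for illustration; orders.address is assumed to be a struct, orders.items a map, and orders.tags an array):

describe orders;                          -- basic table description
describe extended orders;                 -- adds storage/serde details
describe orders.address.city;             -- drill into a nested struct field
describe extended orders.items.'$key$';   -- key type of a map column
describe extended orders.items.'$value$'; -- value type of a map column
describe extended orders.tags.'$elem$';   -- element type of an array column

Whether the quotes around the pseudo-keywords are actually required is exactly what the thread goes on to discuss.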
Yeah Herman. Not supporting it would certainly simplify things. FYI, I checked that the unit test case describe_xpath.q, which exercises this syntax, is not run in HiveCompatibilitySuite.
OK, let's remove this from the grammar as well, and just use a dot-separated list of identifiers. Actually, are we currently able to deal with nested columns?
@hvanhovell Hi Herman, I tried very simple scenarios of using nested columns and it seems to work ok. Let me paste the output here.
Map:
create table mp_t1 (a map <int, string>, b string) row format delimited collection items terminated by '$' map keys terminated by '#';
load data local inpath '/data/mapfile' overwrite into table mp_t1;
select * from mp_t1;
a   b
{100:"spark"}   ABC
describe extended mp_t1.a.$key$;
Result
======
$key$   int   from deserializer

Struct:
create table ct_t (a struct<n1: string, n2: string>, b string) stored as textfile;
insert into ct_t values (('abc', 'efg'), 'ABC');
spark-sql> select * from ct_t;
{"n1":"abb","n2":"efg"} ABC
spark-sql> describe extended ct_t.a.n1;
OK
n1   string   from deserializer

Herman, based on the Hive syntax diagram, I was expecting the following command to work:
describe extended mp_t1.a.'$key$';
However, I get a parse exception, and when I remove the quotes it works, as follows:
describe extended mp_t1.a.$key$
Given this, what kind of changes do we need to make to the grammar if we want to support this? Please let me know your thoughts.
@hvanhovell Let me work on the grammar change. I will introduce a rule colPathIdentifier, which is basically a regular identifier or one of the key/value/elem keywords.
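For illustration, a hedged sketch of the statements such a rule would need to accept, reusing the tables from the examples above (whether the pseudo-keywords end up quoted or bare in the final grammar is the open question here):

describe extended ct_t.a.n1;       -- ordinary dot-separated identifiers
describe extended mp_t1.a.$key$;   -- pseudo-keyword, not a regular identifier
describe extended mp_t1.a.$value$; -- likewise, for the map's value type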
@dilipbiswal Do you plan on supporting the key/value/elem keywords and nested elements? That would be cool.
@hvanhovell Yeah. I have attempted to support the key/value/elem keywords. Could you please check to see if there are any issues? I am also trying to test this a bit more in parallel.
Test build #56071 has finished for PR 12460 at commit
Test build #56333 has finished for PR 12460 at commit
Test build #56335 has finished for PR 12460 at commit
Test build #56371 has finished for PR 12460 at commit
Force-pushed from 41cf12d to eb1c30e.
Rebased.
Test build #56394 has finished for PR 12460 at commit
@liancheng Hi Lian, can you please look over this PR and give some comments? Thanks!
Force-pushed from eb1c30e to 83c2875.
Test build #57062 has finished for PR 12460 at commit
Force-pushed from 83c2875 to 34f6d32.
Test build #57106 has finished for PR 12460 at commit
@dilipbiswal One purpose of re-implementing all DDL as native Spark SQL commands is to minimize the dependency on Hive, so that we can move Hive into a separate data source some day. That said, we really don't want to make these new DDL commands rely on classes like …
@liancheng Thank you for your comment. Actually, initially I started with the idea of serving the describe command solely from …
Force-pushed from 34f6d32 to 319d45b.
Test build #57400 has finished for PR 12460 at commit
It looks like this can be closed because #12844 was merged.

@liancheng Hi Lian, in this PR I had implemented "describe table partition" and "describe column". @viirya FYI.
What changes were proposed in this pull request?
This PR adds support for describing partitions and columns. Support for describing tables was already in place. The PR moves the code to SessionCatalog/HiveSessionCatalog.
Command Syntax:
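A hedged sketch of the two forms this PR adds, reconstructed from the discussion above (the exact shape of the merged grammar may differ):

-- describe a specific partition of a table
DESCRIBE [EXTENDED|FORMATTED] [db_name.]table_name PARTITION (partition_spec);
-- describe a single column, including nested fields and the map/array pseudo-keywords
DESCRIBE [EXTENDED|FORMATTED] [db_name.]table_name column_name[.field_name | .'$key$' | .'$value$' | .'$elem$'];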
How was this patch tested?
Added test cases to DDLCommandSuite to verify the plan. Added some error tests to HiveCommandSuite. The rest of the coverage should come from existing test cases.