[SPARK-31502][SQL][DOCS] Document identifier in SQL Reference #28277

huaxingao · 2020-04-21T04:24:28Z

What changes were proposed in this pull request?

Document identifier in SQL Reference

Why are the changes needed?

make SQL Reference complete

Does this PR introduce any user-facing change?

Yes

How was this patch tested?

Manually build and check

SparkQA · 2020-04-21T04:44:32Z

Test build #121561 has finished for PR 28277 at commit 2dc2910.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

gatorsmile · 2020-04-21T05:08:31Z

cc @cloud-fan

huaxingao · 2020-04-21T21:44:42Z

also cc @maropu

SparkQA · 2020-04-23T07:15:33Z

Test build #121657 has finished for PR 28277 at commit b0b1262.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

huaxingao · 2020-04-23T07:17:43Z

@cloud-fan @maropu
One more to review. Thank you very much!

maropu · 2020-04-22T00:20:31Z

docs/sql-ref-identifier.md

+
+### Description
+
+An identifier is a string used to identify a database object such as a table, view, schema, column, etc. Spark SQL has regular identifiers and delimited identifiers, which are enclosed within backticks. Both regular identifiers and delimited identifiers are case insensitive.


Both regular identifiers and delimited identifiers are case insensitive.

This behaviour depends on spark.sql.caseSensitive?

Are regular identifiers and delimited identifiers common terms in database-like systems? Actually, is the latter one a table (or relation) identifier? Anyway, I think it's better to use consistent wording across the SQL docs. For example, it seems the ANSI document just uses identifiers like ...as identifiers for table, view, column, function, alias, etc.
https://github.com/apache/spark/blob/master/docs/sql-ref-ansi-compliance.md#sql-keywords

Also, I think we need some examples there like regular identifiers (e.g., alias names, xxx, ...)

Regular identifiers and delimited identifiers are common terms. By the SQL standard, delimited identifiers are case sensitive.
https://en.wikibooks.org/wiki/SQL_Dialects_Reference/Data_structure_definition/Delimited_identifiers

maropu · 2020-04-22T00:34:40Z

docs/sql-ref-identifier.md

+{% highlight sql %}
+{ letter | digit | '_' } [ , ... ]
+{% endhighlight %}
+Note: If `spark.sql.ansi.enabled` is set to true, ANSI SQL reserved keywords cannot be used as identifiers. If `spark.sql.ansi.enabled` is set to false (this is the default), strict-non-reserved keywords cannot be used as table aliases. Please refer to [ANSI Compliance](sql-ref-ansi-compliance.html) for a complete list of the keywords.


How about just saying it like "Note: If spark.sql.ansi.enabled is set to true, ANSI SQL reserved keywords cannot be used as identifiers. For more details, please refer to ANSI Compliance for a complete list of the keywords."? That's because we don't define what strict-non-reserved is in this page.
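For readers following the grammar quoted in the diff above, a minimal sketch of the regular-identifier rule might look like the following. It assumes "letter" means ASCII a-z/A-Z and "digit" means 0-9, and it ignores the reserved-keyword rule from the note entirely. The names `is_regular_identifier` and `needs_backticks` are hypothetical helpers for illustration, not Spark APIs, and this is not Spark's parser.

```python
import re

# Assumption: "letter" is ASCII a-z/A-Z and "digit" is 0-9, following the
# grammar quoted in the diff. Reserved keywords are NOT checked here.
_REGULAR = re.compile(r"[A-Za-z0-9_]+")


def is_regular_identifier(name: str) -> bool:
    """Return True if name consists only of letters, digits, and underscores."""
    return _REGULAR.fullmatch(name) is not None


def needs_backticks(name: str) -> bool:
    """A name that is not a regular identifier has to be written delimited."""
    return not is_regular_identifier(name)
```

Under these assumptions, `a_b1` is a regular identifier, while `a.b` is not and would have to be written as a delimited identifier.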

maropu · 2020-04-22T00:36:16Z

docs/sql-ref-identifier.md

+<dl>
+  <dt><code><em>c</em></code></dt>
+  <dd>
+    Any character from the character set. Use <code>`</code> to escape <code>`</code>.


How about using the same words in https://github.com/apache/spark/pull/28237/files#diff-3b65d142b02e13e9889a0f558c0e7bc8R46 ?

maropu · 2020-04-22T00:39:11Z

docs/sql-ref-identifier.md

+CREATE TABLE test (a.b int);
+
+-- This CREATE TABLE works
+CREATE TABLE test (`a.b` int);


We don't need to describe a multi-part name like a.b.c for DSv2? @cloud-fan

It's still evolving, we don't mention anything about it in the sql reference for now.

maropu · 2020-04-23T07:30:29Z

Ur, I forgot to submit my reviews last night... some comments might be stale...

SparkQA · 2020-04-23T22:37:24Z

Test build #121703 has finished for PR 28277 at commit 837a691.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

maropu · 2020-04-23T23:19:11Z

Looks fine now cc: @srowen @cloud-fan

cloud-fan · 2020-04-24T04:28:19Z

docs/sql-ref-identifier.md

+
+### Description
+
+An identifier is a string used to identify a database object such as a table, view, schema, column, etc. Spark SQL has regular identifiers and delimited identifiers, which are enclosed within backticks. When `spark.sql.caseSensitive` is set to false (default behavior since Spark 2.4), both regular identifiers and delimited identifiers are case-insensitive.


spark.sql.caseSensitive is an internal config. Shall we ignore it and just say Spark is case insensitive in the doc?

Sure. Will update.
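As a rough illustration of what "case insensitive" means for identifier lookup, the sketch below matches a requested name against declared names after lowercasing both. The function `resolve` is a hypothetical helper, and plain lowercasing is an assumption made for the example; it is not taken from Spark's resolution code.

```python
from typing import List, Optional


def resolve(name: str, declared: List[str]) -> Optional[str]:
    """Return the declared identifier matching name case-insensitively, else None."""
    wanted = name.lower()  # assumption: simple lowercasing models the comparison
    for candidate in declared:
        if candidate.lower() == wanted:
            return candidate
    return None
```

With this model, `COL1`, `Col1`, and `col1` all resolve to the same declared column.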

cloud-fan · 2020-04-24T04:40:33Z

docs/sql-ref-identifier.md

+CREATE TABLE test (`a.b` int);
+
+-- This CREATE TABLE fails
+CREATE TABLE test1 (`a`b` int);


this should fail the parser?

Yes. This throws ParseException. Do you want me to change L71 to
-- This CREATE TABLE fails with ParseException?

Can we include the error message in the example?

Updated. Thanks!
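The escaping rule discussed in this thread (a backtick inside a delimited identifier is escaped with another backtick) can be sketched as a pair of helpers. `quote_identifier` and `unquote_identifier` are hypothetical names for illustration only; the sketch models the rule from the quoted doc text and does not reproduce Spark's parser or its error messages.

```python
def quote_identifier(name: str) -> str:
    """Wrap name in backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def unquote_identifier(text: str) -> str:
    """Inverse of quote_identifier; raise ValueError on an unescaped backtick."""
    if len(text) < 2 or text[0] != "`" or text[-1] != "`":
        raise ValueError("not a delimited identifier: " + text)
    body = text[1:-1]
    # After removing every doubled backtick, any backtick left is unescaped.
    if "`" in body.replace("``", ""):
        raise ValueError("unescaped backtick in: " + text)
    return body.replace("``", "`")
```

Under this model the column name from the failing example, written with a single inner backtick, is rejected, while the doubled form round-trips to the name containing one backtick.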

SparkQA · 2020-04-24T06:28:45Z

Test build #121722 has finished for PR 28277 at commit 15f0402.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2020-04-24T06:58:47Z

Test build #121724 has finished for PR 28277 at commit 21b6eb1.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

cloud-fan · 2020-04-24T08:05:40Z

thanks, merging to master/3.0!

### What changes were proposed in this pull request?
Document identifier in SQL Reference

### Why are the changes needed?
make SQL Reference complete

### Does this PR introduce any user-facing change?
Yes

### How was this patch tested?
Manually build and check

Closes #28277 from huaxingao/identifier.

Authored-by: Huaxin Gao <huaxing@us.ibm.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit b14b980)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>

huaxingao · 2020-04-24T16:18:39Z

Thank you all!

probot-autolabeler bot added the DOCS label Apr 21, 2020

huaxingao added 2 commits April 22, 2020 23:57

[SPARK-31502][SQL][DOCS] Document identifier in SQL Reference

3e16242

add blank lines'

b0b1262

huaxingao force-pushed the identifier branch from 2dc2910 to b0b1262 Compare April 23, 2020 06:57

maropu reviewed Apr 23, 2020

address comments

837a691

maropu approved these changes Apr 23, 2020

cloud-fan reviewed Apr 24, 2020

address comments

15f0402

include the error message in the example

21b6eb1

cloud-fan closed this in b14b980 Apr 24, 2020

huaxingao deleted the identifier branch April 24, 2020 16:18

