Skip to content

Conversation

@tanelk
Copy link
Contributor

@tanelk tanelk commented Sep 13, 2020

What changes were proposed in this pull request?

Made sure, that all the expressions in the FunctionRegistry have the fields usage, examples and since filled in their ExpressionDescription. Added UT to ExpressionInfoSuite, to make sure, that all new expressions will also fill those fields.

Why are the changes needed?

Documentation improvement

Does this PR introduce any user-facing change?

Better generated SQL built in functions documentation

How was this patch tested?

Checked the fix version in the following jiras:
SPARK-1251 - UnaryMinus, Add, Subtract, Multiply, Divide, Remainder, Explode, Not, In, And, Or, Equals, LessThan, LessThanOrEqual, GreaterThan, GreaterThanOrEqual, If, Cast
SPARK-2053 - CaseWhen
SPARK-2665 - EqualNullSafe
SPARK-3176 - Abs
SPARK-6542 - CreateStruct
SPARK-7135 - MonotonicallyIncreasingID
SPARK-7152 - SparkPartitionID
SPARK-7295 - bitwiseAND, bitwiseOR, bitwiseXOR, bitwiseNOT
SPARK-8005 - InputFileName
SPARK-8203 - Greatest
SPARK-8204 - Least
SPARK-8220 - UnaryPositive
SPARK-8221 - Pmod
SPARK-8230 - Size
SPARK-8231 - ArrayContains
SPARK-8232 - SortArray
SPARK-8234 - md5
SPARK-8235 - sha1
SPARK-8236 - crc32
SPARK-8237 - sha2
SPARK-8240 - Concat
SPARK-8246 - GetJsonObject
SPARK-8407 - CreateNamedStruct
SPARK-9617 - JsonTuple
SPARK-10810 - CurrentDatabase
SPARK-12480 - Murmur3Hash
SPARK-14061 - CreateMap
SPARK-14160 - TimeWindow
SPARK-14580 - AssertTrue
SPARK-16274 - XPathBoolean
SPARK-16278 - MapKeys
SPARK-16279 - MapValues
SPARK-16284 - CallMethodViaReflection
SPARK-16286 - Stack
SPARK-16288 - Inline
SPARK-16289 - PosExplode
SPARK-16318 - XPathShort, XPathInt, XPathLong, XPathFloat, XPathDouble, XPathString, XPathList
SPARK-16730 - Cast aliases
SPARK-17495 - HiveHash
SPARK-18702 - InputFileBlockStart, InputFileBlockLength
SPARK-20910 - UUID

@tanelk
Copy link
Contributor Author

tanelk commented Sep 13, 2020

cc @maropu and @HyukjinKwon , you reviewed the previous similar pull request

@tanelk tanelk changed the title [WIP][SPARK-32870][DOCS][SQL] Make sure that all expressions have their ExpressionDescription filled [SPARK-32870][DOCS][SQL] Make sure that all expressions have their ExpressionDescription filled Sep 13, 2020
@HyukjinKwon
Copy link
Member

ok to test

@HyukjinKwon
Copy link
Member

@HyukjinKwon
Copy link
Member

Thanks for working on this, @tanelk.

@SparkQA
Copy link

SparkQA commented Sep 14, 2020

Test build #128610 has finished for PR 29743 at commit c7488e0.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 14, 2020

Test build #128629 has finished for PR 29743 at commit 9b9bfca.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 14, 2020

Test build #128636 has finished for PR 29743 at commit 802f338.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 14, 2020

Test build #128640 has finished for PR 29743 at commit b67c967.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@maropu maropu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the hard work, @tanelk! This update looks very useful. I've checked them and the versions I think you investigated in this PR looks correct.

@SparkQA
Copy link

SparkQA commented Sep 14, 2020

Test build #128646 has finished for PR 29743 at commit 98287e4.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 14, 2020

Test build #128641 has finished for PR 29743 at commit 930394c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 14, 2020

Test build #128644 has finished for PR 29743 at commit 4e9291e.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 14, 2020

Test build #128655 has finished for PR 29743 at commit 3ba2515.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

assert(info.getUsage.nonEmpty)
assert(info.getExamples.startsWith("\n Examples:\n"))
assert(info.getExamples.endsWith("\n "))
assert(info.getSince.matches("[0-9]+\\.[0-9]+\\.[0-9]+"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks these checks have already been done in the ExpressionInfo constructor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These ones are a bit more strict.
For example this extra space made the Example for typeof not appear in http://spark.apache.org/docs/latest/api/sql/#typeof

https://github.com/apache/spark/pull/29743/files#diff-25282ab1377a3d87999d1d0d7a8ec270R208-R214

I didn't want to add these checks to the constructor, because I'm afraid that they might break some UDFs and also I have no way to exclude some from the checks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't want to add these checks to the constructor, because I'm afraid that they might break some UDFs and also I have no way to exclude some from the checks.

If we make the checks more strict in the ctor and we find existing expressions throwing exceptions, I think it is okay just to fix them. Any problem there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, problems with current ones - only some dummy functions in tests.

@SparkQA
Copy link

SparkQA commented Sep 15, 2020

Test build #128692 has finished for PR 29743 at commit 499c32c.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 15, 2020

Test build #128697 has finished for PR 29743 at commit 2bcc663.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

# Conflicts:
#	sql/core/src/test/resources/sql-functions/sql-expression-schema.md
@SparkQA
Copy link

SparkQA commented Sep 19, 2020

Test build #128892 has finished for PR 29743 at commit 4162b41.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@HyukjinKwon HyukjinKwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good but I will leave it to @maropu.

@maropu
Copy link
Member

maropu commented Sep 23, 2020

Thanks! Merged to master.

@maropu maropu closed this Sep 23, 2020
maropu pushed a commit that referenced this pull request Sep 23, 2020
…pressionDescription filled

### What changes were proposed in this pull request?

Made sure, that all the expressions in the `FunctionRegistry ` have the fields `usage`, `examples` and `since` filled in their `ExpressionDescription`. Added UT to `ExpressionInfoSuite`, to make sure, that all new expressions will also fill those fields.

### Why are the changes needed?

Documentation improvement

### Does this PR introduce _any_ user-facing change?

Better generated SQL built in functions documentation

### How was this patch tested?

Checked the fix version in the following jiras:
SPARK-1251 - UnaryMinus, Add, Subtract, Multiply, Divide, Remainder, Explode, Not, In, And, Or, Equals, LessThan, LessThanOrEqual, GreaterThan, GreaterThanOrEqual, If, Cast
SPARK-2053 - CaseWhen
SPARK-2665 - EqualNullSafe
SPARK-3176 - Abs
SPARK-6542 - CreateStruct
SPARK-7135 - MonotonicallyIncreasingID
SPARK-7152 - SparkPartitionID
SPARK-7295 - bitwiseAND, bitwiseOR, bitwiseXOR, bitwiseNOT
SPARK-8005 - InputFileName
SPARK-8203 - Greatest
SPARK-8204 - Least
SPARK-8220 - UnaryPositive
SPARK-8221 - Pmod
SPARK-8230 - Size
SPARK-8231 - ArrayContains
SPARK-8232 - SortArray
SPARK-8234 - md5
SPARK-8235 - sha1
SPARK-8236 - crc32
SPARK-8237 - sha2
SPARK-8240 - Concat
SPARK-8246 - GetJsonObject
SPARK-8407 - CreateNamedStruct
SPARK-9617 - JsonTuple
SPARK-10810 - CurrentDatabase
SPARK-12480 - Murmur3Hash
SPARK-14061 - CreateMap
SPARK-14160 - TimeWindow
SPARK-14580 - AssertTrue
SPARK-16274 - XPathBoolean
SPARK-16278 - MapKeys
SPARK-16279 - MapValues
SPARK-16284 - CallMethodViaReflection
SPARK-16286 - Stack
SPARK-16288 - Inline
SPARK-16289 - PosExplode
SPARK-16318 - XPathShort, XPathInt, XPathLong, XPathFloat, XPathDouble, XPathString, XPathList
SPARK-16730 - Cast aliases
SPARK-17495 - HiveHash
SPARK-18702 - InputFileBlockStart, InputFileBlockLength
SPARK-20910 - UUID

Closes #29743 from tanelk/SPARK-32870.

Authored-by: [email protected] <[email protected]>
Signed-off-by: Takeshi Yamamuro <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants