[MINOR][DOCS] Match several documentation changes in Scala to R/Python #17429
| Collection function: returns True if the array contains the given value. The collection | ||
| elements and value must be of the same type. | ||
| Collection function: returns null if the array is null, true if the array contains the | ||
| given value, and false otherwise. |
Other documentation in this file uses true rather than True, so I matched this to true. I am willing to sweep if anyone feels this should be fixed.
The reason I removed "The collection elements and value must be of the same type" is that it seems we can provide other types that are implicitly castable.
This is not documented in Scala/R either, so I instead provided a doctest as an example below in the Python documentation.
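The null/true/false result described in the revised docstring can be sketched in plain Python. This is only a model of the docstring's wording, not Spark's implementation: `array_contains_model` is a hypothetical helper, Python's `None` stands in for null, and the case of null elements inside the array is not covered by the docstring and is not modeled here.

```python
from typing import Optional, Sequence


def array_contains_model(arr: Optional[Sequence[object]], value: object) -> Optional[bool]:
    """Hypothetical pure-Python model of the documented array_contains result."""
    if arr is None:
        # Null array: the docstring says the result is null (None here).
        return None
    # Otherwise: True if the value is present, False if it is not.
    return value in arr


print(array_contains_model(["a", "b", "c"], "a"))  # True
print(array_contains_model([], "a"))               # False
print(array_contains_model(None, "a"))             # None
```

The model compares values with plain Python equality, so it says nothing about the implicit casting discussed in this thread.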
like my other comment, probably should say True when in Python, @holdenk?
|
Test build #75211 has finished for PR 17429 at commit
|
| #' array_contains | ||
| #' | ||
| #' Returns true if the array contain the value. | ||
| #' Returns null if the array is null, true if the array contains the value, and false otherwise. |
for null, we need to be more careful - null in JVM should show up as NA in R.
also, should true be TRUE and false be FALSE to match R type?
Yea, I agree with being careful. For this PR, I just followed the others. I skimmed again and it seems we have not used the notation for None, True and False in functions.py, and NA, TRUE and FALSE in functions.R.
I can grep and replace.
I'm fine as-is and then another iteration to handle R/python difference. |
python/pyspark/sql/functions.py
Outdated
| >>> df.select(array_contains(df.data, "a")).collect() | ||
| [Row(array_contains(data, a)=True), Row(array_contains(data, a)=False)] | ||
| >>> df = spark.createDataFrame([(["1", "2", "3"],), ([],)], ['data']) | ||
| >>> df.select(array_contains(df.data, 1)).collect() |
generally docstring we keep as simple as possible - any particular reason to add "1" vs the existing "a"?
Yea, the data is a string array and the given value is a number. This example covers the implicit casting case. I added it and removed "The collection elements and value must be of the same type". This is the bit I described in #17429 (comment)
ah got it. it's quite subtle though, I wonder if it might confuse more than help...
Maybe I could just remove this.. though another advantage of this is to test this function with an implicit cast case.
unless Holden thought otherwise, let's remove this from docstring and ok to add to test explicitly :)
|
Test build #75247 has finished for PR 17429 at commit
|
|
Test build #75248 has finished for PR 17429 at commit
|
|
thanks, merged to master, since part of the fix in scala was in master only. if you think it should also be in branch-2.1, let me know. |
|
Thank you @felixcheung, IMO, I think it is fine. Let me open a backport including the part of the fix if anyone feels it should be. |
What changes were proposed in this pull request?
This PR proposes to match the minor documentation changes in #17399 and #17380 to R/Python.
How was this patch tested?
Manual tests in Python, Python tests via
./python/run-tests.py --module=pyspark-sql and lint-checks for Python/R.