Skip to content

Conversation

@HyukjinKwon
Copy link
Member

What changes were proposed in this pull request?

This PR proposes to match minor documentations changes in #17399 and #17380 to R/Python.

How was this patch tested?

Manual tests in Python , Python tests via ./python/run-tests.py --module=pyspark-sql and lint-checks for Python/R.

Collection function: returns True if the array contains the given value. The collection
elements and value must be of the same type.
Collection function: returns null if the array is null, true if the array contains the
given value, and false otherwise.
Copy link
Member Author

@HyukjinKwon HyukjinKwon Mar 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other documentation in this file use true rather than True. So, I matach this to true. I am willing to sweep if anyone feels this should be fixed.
The reason I removed The collection elements and value must be of the same type is it seems we can provide other types that are implicitly castable.
This is not documented in Scala/R too. So, I instead provided a doctest as an example below in the Python documentation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

like my other comment, probably should say True when in Python, @holdenk?

Copy link
Member Author

@HyukjinKwon HyukjinKwon Mar 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am willing to grep and replace too. Please let me know @holdenk.

FWIW, If deciding this takes a while and holds off this PR, I would like to ask to merge this as is if you and @holdenk do not strongly feel about this.

@SparkQA
Copy link

SparkQA commented Mar 25, 2017

Test build #75211 has finished for PR 17429 at commit 5f0ebdd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

#' array_contains
#'
#' Returns true if the array contain the value.
#' Returns null if the array is null, true if the array contains the value, and false otherwise.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for null, we need to be more careful - null in JVM should show up as NA in R.
also, should true be TRUE and false be FALSE to match R type?

Copy link
Member Author

@HyukjinKwon HyukjinKwon Mar 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I agree with being careful. For this PR, I just followed the others. I skimmed again and it seems we have not used the notation for None, True and False in functions.py, and NA, TRUE and FALSE in functions.R.

I can grep and replace.

Collection function: returns True if the array contains the given value. The collection
elements and value must be of the same type.
Collection function: returns null if the array is null, true if the array contains the
given value, and false otherwise.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

like my other comment, probably should say True when in Python, @holdenk?

@felixcheung
Copy link
Member

I'm fine as-is and then another iteration to handle R/python difference.

>>> df.select(array_contains(df.data, "a")).collect()
[Row(array_contains(data, a)=True), Row(array_contains(data, a)=False)]
>>> df = spark.createDataFrame([(["1", "2", "3"],), ([],)], ['data'])
>>> df.select(array_contains(df.data, 1)).collect()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generally docstring we keep as simple as possible - any particular reason to add "1" vs the existing "a"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, the data is string array and the given value is a number. This case is about implicit casting case. I added this example and removed The collection elements and value must be of the same type. This one is the bit I described in #17429 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah got it. it's quite subtle though, I wonder if it might confuse than help more...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I could just remove this.. though another advantage of this is to test this function with a implicit cast case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unless Holden thought otherwise, let's remove this from docstring and ok to add to test explicitly :)

@SparkQA
Copy link

SparkQA commented Mar 27, 2017

Test build #75247 has finished for PR 17429 at commit 33f1328.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 27, 2017

Test build #75248 has finished for PR 17429 at commit d05aba5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@felixcheung
Copy link
Member

thanks, merged to master, since part of the fix in scala was in master only. if you think it should also be in branch-2.1, let me know.

@asfgit asfgit closed this in 3fbf0a5 Mar 27, 2017
@HyukjinKwon
Copy link
Member Author

HyukjinKwon commented Mar 27, 2017

Thank you @felixcheung, IMO, I think it is fine. Let me open a backport including the part of the fix if anyone feels it should be.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants