@xuanyuanking
Member

@xuanyuanking xuanyuanking commented Sep 9, 2018

What changes were proposed in this pull request?

Update the documentation for the behavior change in PySpark Row creation from #22140.

How was this patch tested?

Existing unit tests.

@SparkQA

SparkQA commented Sep 9, 2018

Test build #95842 has finished for PR 22369 at commit d257a38.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@BryanCutler BryanCutler left a comment

Thanks @xuanyuanking! I had some thoughts on slightly different wording.

## Upgrading From Spark SQL 2.3.0 to 2.3.1 and above

- As of version 2.3.1 Arrow functionality, including `pandas_udf` and `toPandas()`/`createDataFrame()` with `spark.sql.execution.arrow.enabled` set to `True`, has been marked as experimental. These are still evolving and not currently recommended for use in production.
- In version 2.3.1 and earlier, it is possible for PySpark to create a Row object by providing more value than column number through the customized Row class. Since Spark 2.3.3, Spark will confirm value length is less or equal than column length in PySpark. See [SPARK-25072](https://issues.apache.org/jira/browse/SPARK-25072) for details.

Maybe say "...by providing more values than the number of fields through a customized Row class. As of Spark 2.3.3, PySpark will raise a `ValueError` if the number of values is more than the number of fields. See..."
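
For reference, the kind of check being described can be sketched in plain Python. This is only an illustration under assumptions: `SketchRow` is a hypothetical stand-in written for this example, not PySpark's `Row` implementation, and the error message text is invented for the sketch.

```python
class SketchRow(tuple):
    """Hypothetical stand-in for a customized Row class; not PySpark's code."""

    def __new__(cls, *fields):
        # The instance itself stores the field names, e.g. ("name", "age").
        return super().__new__(cls, fields)

    def __call__(self, *values):
        # Reject calls that pass more values than there are fields.
        if len(values) > len(self):
            raise ValueError(
                "Cannot create row with fields %s, expected %d values but got %s"
                % (tuple(self), len(self), values)
            )
        # Fewer or equal values are accepted; pair them with the field names.
        return dict(zip(self, values))


Person = SketchRow("name", "age")
ok = Person("Alice", 11)  # two values for two fields: accepted

try:
    Person("Alice", 11, "extra")  # three values for two fields: rejected
    error = None
except ValueError as exc:
    error = str(exc)
```

The sketch raises on the extra value instead of silently keeping it, which is the behavior change the proposed wording is trying to describe.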

Member Author

Thanks Bryan, I'll address this after discussion.

@HyukjinKwon
Member

@xuanyuanking, no need to rush. Let's wait and discuss a bit more before proposing a change.

@xuanyuanking
Member Author

Got it, thanks @HyukjinKwon.

@xuanyuanking
Member Author

Per the comment in #22140 (comment), I think this doc change is no longer needed, so I'll close this. Thanks @BryanCutler and @HyukjinKwon!

4 participants