[SPARK-30539][PYTHON][SQL] Add DataFrame.tail in PySpark #27251

Closed
HyukjinKwon wants to merge 2 commits into apache:master from HyukjinKwon:SPARK-30539

@HyukjinKwon
Member

What changes were proposed in this pull request?

#26809 added the Dataset.tail API. It would be good to have it in the PySpark API as well.

Why are the changes needed?

To support consistent APIs.

Does this PR introduce any user-facing change?

No. It adds a new API.

How was this patch tested?

Manually tested, and a doctest was added.

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM (only a minor comment on doc-string)

@SparkQA

SparkQA commented Jan 17, 2020

Test build #116900 has finished for PR 27251 at commit 565e3be.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jan 17, 2020

Test build #116912 has finished for PR 27251 at commit 438c8c9.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member

Retest this please.

@SparkQA

SparkQA commented Jan 17, 2020

Test build #116920 has finished for PR 27251 at commit 438c8c9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member

Also, cc @BryanCutler .

@dongjoon-hyun
Member

Merged to master. Thank you, @HyukjinKwon .

@HyukjinKwon
Member Author

Thanks @dongjoon-hyun!

@zero323
Member

zero323 commented Jan 18, 2020

Just thinking out loud: should there be a tailToPandas variant? For head, we can easily do df.limit(n).toPandas().

@HyukjinKwon
Member Author

HyukjinKwon commented Jan 20, 2020

We could have things like df.limit(n, reverse=True).toPandas(); however, this has not been added yet because we need to think about how it would work with other SQL APIs. It seems pretty clear how it would fit in the Scala/Python APIs; however, I wasn't sure about SQL.
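
For reference, a user-side workaround for the tailToPandas question above might look like the sketch below. It is not part of this PR or of PySpark: the names tail_to_pandas and demo are hypothetical, and the sketch assumes Spark 3.0 or later (where DataFrame.tail is available) plus pandas on the driver. Because it skips the dtype handling that toPandas() performs, column dtypes may differ from what toPandas() would return.

```python
import pandas as pd


def tail_to_pandas(df, n):
    """Hypothetical helper (not a PySpark API): last n rows as a pandas DataFrame.

    Relies only on df.tail(n), which returns a list of Row objects (tuples),
    and df.columns, the list of column names.
    """
    rows = df.tail(n)  # collects the rows to the driver, so keep n small
    return pd.DataFrame.from_records(rows, columns=df.columns)


def demo():
    """Illustrative usage; needs a local Spark installation to run."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[1]").appName("tail-demo").getOrCreate()
    try:
        df = spark.range(10)  # single column named "id"
        print(tail_to_pandas(df, 3))
    finally:
        spark.stop()
```

Calling demo() runs the helper against a small local DataFrame. Like tail itself, the helper moves all n rows into the driver's memory, so it is only suitable for small n.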

@HyukjinKwon HyukjinKwon deleted the SPARK-30539 branch March 3, 2020 01:16