[SPARK-41212][CONNECT][PYTHON] Implement DataFrame.isEmpty
#38734
Conversation
I think we talked about not doing the caching now, but thinking about building a general caching layer cc @HyukjinKwon
here #38546 (comment)
I am fine with that, but we could even do the caching in the DataFrame instead of in Spark Connect, I guess (?)
Oh, let me update it. Maybe it is time to design how to do the caching; I will work on it.
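To illustrate the "caching in the DataFrame" idea being discussed, here is a minimal sketch. The class, attribute, and method names are hypothetical and do not reflect Spark Connect's actual design:

```python
# Hypothetical sketch only: names are illustrative, not Spark Connect's actual design.
from typing import Any, Optional


class ConnectDataFrame:
    def __init__(self, plan: Any) -> None:
        self._plan = plan
        self._cached_schema: Optional[Any] = None  # reset whenever the plan changes

    @property
    def schema(self) -> Any:
        # Fetch the schema from the server once, then reuse the cached value
        # so repeated accesses avoid extra round trips.
        if self._cached_schema is None:
            self._cached_schema = self._fetch_schema_from_server()
        return self._cached_schema

    def _fetch_schema_from_server(self) -> Any:
        ...  # placeholder for the actual RPC call
```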
Nit:
When I read this, I'm not sure whether it means the size of the DataFrame or that the DataFrame is None. I guess it is the size. Not sure if there is a better way to clarify this.
This doc was copied from the PySpark and Scala APIs. I think the DataFrame itself is expected to be non-None; otherwise, this method cannot be called.
LGTM
Force-pushed from e6769ba to 34accba.
bool
    Whether it's empty DataFrame or not.
"""
return len(self.take(1)) == 0
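For context, the behavior this enables from the user side, with the same semantics as the existing PySpark API. This usage example assumes an active Spark (Connect) session bound to `spark`:

```python
# Assumes an active SparkSession (or Spark Connect session) named `spark`.
df = spark.createDataFrame([(1, "a")], ["id", "value"])
print(df.isEmpty())                    # False
print(df.filter(df.id > 1).isEmpty())  # True -- only take(1) runs, no full count
```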
This makes me think that maybe we should have some shared code between PySpark and the Python Connect client, to share API definitions (API docs, function signatures) and functions with default implementations. cc @HyukjinKwon
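To make that suggestion concrete, one possible shape is a shared base class that both clients inherit, so the docstring, signature, and default implementation live in one place. This is purely an illustrative sketch; the class name and structure are assumptions, not an agreed design:

```python
# Illustrative only -- nothing here reflects an actual shared module in Spark.
from typing import Any, List


class SharedDataFrameAPI:
    """API surface shared by the classic PySpark DataFrame and the Connect client."""

    def take(self, num: int) -> List[Any]:
        raise NotImplementedError  # each client provides its own execution path

    def isEmpty(self) -> bool:
        """Returns True if this DataFrame is empty."""
        # Default implementation reused by both clients; only `take` differs.
        return len(self.take(1)) == 0
```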
Merged into master, thanks all for the reviews.
What changes were proposed in this pull request?
Implement `DataFrame.isEmpty`

Why are the changes needed?
API coverage

Does this PR introduce any user-facing change?
Yes, new API

How was this patch tested?
Added a unit test
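The added unit test itself is not shown on this page; a test along the following lines would exercise the new method (the fixture name and sample data are assumptions, not the PR's actual test):

```python
# Sketch only; the test actually added by the PR may be structured differently.
def test_is_empty(spark):
    df = spark.createDataFrame([(1, "a")], ["id", "value"])
    assert not df.isEmpty()
    assert df.filter("id > 1").isEmpty()
```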