Skip to content

Conversation

@zhengruifeng
Copy link
Contributor

What changes were proposed in this pull request?

Implement DataFrame.isEmpty

Why are the changes needed?

API Coverage

Does this PR introduce any user-facing change?

Yes, new api

How was this patch tested?

added UT

@zhengruifeng
Copy link
Contributor Author

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we talked about that do not do cache now but thinking about build general caching layer cc @HyukjinKwon

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine with that but we can even do the caching in the DataFrame instead of Spark Connect I guess (?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, let me update it. Maybe it is time to design how to do the caching, I will work on it.

Copy link
Contributor

@amaliujia amaliujia Nov 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit:

When I read this, I actually not sure if it means the size of DataFrame or it means DataFrame is None. I guess it is the size. Not sure if there is a better way to clarify this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doc was copied from pyspark and scala api. I think the dataframe itself is expected to be not None, otherwise, this method can not be called.

@amaliujia
Copy link
Contributor

LGTM

bool
Whether it's empty DataFrame or not.
"""
return len(self.take(1)) == 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes me think that maybe we should have some shared code between pyspark and python connect client, to share some API definitions (api doc, function signature) and functions with default implementations. cc @HyukjinKwon

@zhengruifeng
Copy link
Contributor Author

merged into master, thanks all for reviews

@zhengruifeng zhengruifeng deleted the connect_df_is_empty branch November 22, 2022 09:16
SandishKumarHN pushed a commit to SandishKumarHN/spark that referenced this pull request Dec 12, 2022
### What changes were proposed in this pull request?
Implement `DataFrame.isEmpty`

### Why are the changes needed?
API Coverage

### Does this PR introduce _any_ user-facing change?
Yes, new api

### How was this patch tested?
added UT

Closes apache#38734 from zhengruifeng/connect_df_is_empty.

Authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
beliefer pushed a commit to beliefer/spark that referenced this pull request Dec 15, 2022
### What changes were proposed in this pull request?
Implement `DataFrame.isEmpty`

### Why are the changes needed?
API Coverage

### Does this PR introduce _any_ user-facing change?
Yes, new api

### How was this patch tested?
added UT

Closes apache#38734 from zhengruifeng/connect_df_is_empty.

Authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
beliefer pushed a commit to beliefer/spark that referenced this pull request Dec 18, 2022
### What changes were proposed in this pull request?
Implement `DataFrame.isEmpty`

### Why are the changes needed?
API Coverage

### Does this PR introduce _any_ user-facing change?
Yes, new api

### How was this patch tested?
added UT

Closes apache#38734 from zhengruifeng/connect_df_is_empty.

Authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants