12 changes: 12 additions & 0 deletions python/pyspark/sql/connect/dataframe.py
@@ -122,6 +122,18 @@ def withPlan(cls, plan: plan.LogicalPlan, session: "RemoteSparkSession") -> "Dat
        new_frame._plan = plan
        return new_frame

    def isEmpty(self) -> bool:
        """Returns ``True`` if this :class:`DataFrame` is empty.
@amaliujia (Contributor), Nov 22, 2022:

Nit:

When I read this, I'm actually not sure whether "empty" refers to the size of the DataFrame or to the DataFrame being None. I guess it is the size. Not sure if there is a better way to clarify this.

Contributor Author:

This doc was copied from the PySpark and Scala APIs. I think the DataFrame itself is expected to be non-None; otherwise, this method could not be called.
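
To make the distinction in this thread concrete, here is a minimal sketch. `TinyFrame` is a hypothetical stand-in that holds rows in a local list; it is not the Spark Connect `DataFrame`, and only the `len(self.take(1)) == 0` expression is taken from the diff. It shows that `isEmpty` answers a question about rows, while a `None` reference fails before the method can run.

```python
from typing import Any, List, Optional


class TinyFrame:
    """Hypothetical stand-in for a DataFrame; not the Spark Connect class."""

    def __init__(self, rows: List[Any]) -> None:
        self._rows = list(rows)

    def take(self, num: int) -> List[Any]:
        # Return at most `num` rows from the local list.
        return self._rows[:num]

    def isEmpty(self) -> bool:
        # Same expression as in the diff: one row is enough to decide.
        return len(self.take(1)) == 0


empty = TinyFrame([])
non_empty = TinyFrame([("X", 1)])
missing: Optional[TinyFrame] = None

print(empty.isEmpty())      # True: the object exists but holds no rows
print(non_empty.isEmpty())  # False

try:
    missing.isEmpty()  # type: ignore[union-attr]
except AttributeError:
    # None has no isEmpty method, so the "DataFrame is None" reading cannot apply.
    print("None has no isEmpty")
```

Because the method can only be reached through an existing object, "empty" in the docstring can only mean "has zero rows".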


        .. versionadded:: 3.4.0

        Returns
        -------
        bool
            Whether it's empty DataFrame or not.
        """
        return len(self.take(1)) == 0
Contributor:

This makes me think that maybe we should have some shared code between PySpark and the Python Connect client, to share API definitions (API docs, function signatures) and functions with default implementations. cc @HyukjinKwon
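
One way the idea in this comment could look is sketched below. This is an illustration only, not how PySpark is organized: `DataFrameApi`, `ClassicFrame`, and `ConnectFrame` are hypothetical names, and both subclasses use local data in place of a JVM or a remote server. The shared base declares the signature and docstring once and provides a default `isEmpty` written purely in terms of the abstract `take`.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, List


class DataFrameApi(ABC):
    """Hypothetical shared base: one place for signatures, docs, and defaults."""

    @abstractmethod
    def take(self, num: int) -> List[Any]:
        """Returns the first ``num`` rows."""

    def isEmpty(self) -> bool:
        """Returns ``True`` if this :class:`DataFrame` is empty."""
        # Default implementation that relies only on the abstract API.
        return len(self.take(1)) == 0


class ClassicFrame(DataFrameApi):
    """Stand-in for a locally backed frame (assumption for this sketch)."""

    def __init__(self, rows: List[Any]) -> None:
        self._rows = list(rows)

    def take(self, num: int) -> List[Any]:
        return self._rows[:num]


class ConnectFrame(DataFrameApi):
    """Stand-in for a remote client; `fetch` simulates a server round trip."""

    def __init__(self, fetch: Callable[[int], List[Any]]) -> None:
        self._fetch = fetch

    def take(self, num: int) -> List[Any]:
        return self._fetch(num)


classic = ClassicFrame([])
remote = ConnectFrame(lambda n: [1, 2, 3][:n])

print(classic.isEmpty())  # True
print(remote.isEmpty())   # False
# Both classes expose the same docstring because it is defined once.
print(ClassicFrame.isEmpty.__doc__ == ConnectFrame.isEmpty.__doc__)  # True
```

With this shape, each backend implements only `take`, and the docstring and default behavior of `isEmpty` cannot drift apart between the two clients.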


    def select(self, *cols: "ExpressionOrString") -> "DataFrame":
        return DataFrame.withPlan(plan.Project(self._plan, *cols), session=self._session)

5 changes: 5 additions & 0 deletions python/pyspark/sql/tests/connect/test_connect_basic.py
@@ -319,6 +319,11 @@ def test_empty_dataset(self):
        self.assertEqual(1, len(pdf.columns))  # one column
        self.assertEqual("X", pdf.columns[0])

    def test_is_empty(self):
        # SPARK-41212: Test is empty
        self.assertFalse(self.connect.sql("SELECT 1 AS X").isEmpty())
        self.assertTrue(self.connect.sql("SELECT 1 AS X LIMIT 0").isEmpty())

    def test_session(self):
        self.assertEqual(self.connect, self.connect.sql("SELECT 1").sparkSession())
