Introduce df.spark.repartition #1864

Merged
merged 15 commits into from
Oct 24, 2020
Disable df.repartition
xinrong-meng committed Oct 22, 2020

commit 133baedb81e973f3e72489dac048c4160e5f2cec
7 changes: 0 additions & 7 deletions databricks/koalas/frame.py
@@ -9286,13 +9286,6 @@ def gen_names(v, curnames):
else:
return DataFrame(internal)

-    def repartition(
-        self, numPartitions: Union[int, str], index_col: Optional[Union[str, List[str]]] = None
-    ) -> "ks.DataFrame":
-        return self.spark.repartition(numPartitions, index_col)
-
-    repartition.__doc__ = SparkFrameMethods.repartition.__doc__
-
     def keys(self):
         """
         Return alias for columns.
4 changes: 2 additions & 2 deletions databricks/koalas/tests/test_dataframe.py
@@ -4549,11 +4549,11 @@ def test_explain_hint(self):
     def test_repartition(self):
         kdf = ks.DataFrame({"age": [5, 5, 2, 2], "name": ["Bob", "Bob", "Alice", "Alice"]})

-        kdf = kdf.repartition(7)
+        kdf = kdf.spark.repartition(7)
         self.assertEqual(kdf.to_spark().rdd.getNumPartitions(), 7)

         kdf = kdf.set_index("age")
-        nkdf = kdf.repartition(5, "age")
+        nkdf = kdf.spark.repartition(5, "age")
         self.assertEqual(nkdf.to_spark().rdd.getNumPartitions(), 5)
         self.assertEqual(kdf.index.name, nkdf.index.name)
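
The net effect of this commit is that `repartition` is no longer reachable directly on the DataFrame and is only exposed through the `spark` accessor. The following is a minimal pure-Python sketch of that accessor pattern, not the Koalas implementation: `ToyFrame`, `ToySparkMethods`, and the round-robin redistribution are hypothetical stand-ins chosen so the example runs without Spark.

```python
from typing import List


class ToySparkMethods:
    """Hypothetical accessor that groups Spark-specific methods (not Koalas code)."""

    def __init__(self, frame: "ToyFrame") -> None:
        self._frame = frame

    def repartition(self, num_partitions: int) -> "ToyFrame":
        # Flatten the existing partitions, then deal rows out round-robin.
        # Real Spark repartitioning works differently; this only mimics
        # the observable effect on the partition count.
        rows = [row for part in self._frame.partitions for row in part]
        parts: List[list] = [[] for _ in range(num_partitions)]
        for i, row in enumerate(rows):
            parts[i % num_partitions].append(row)
        return ToyFrame(parts)


class ToyFrame:
    """Hypothetical frame that deliberately has no top-level repartition()."""

    def __init__(self, partitions: List[list]) -> None:
        self.partitions = partitions

    @property
    def spark(self) -> ToySparkMethods:
        # Spark-specific operations live under this namespace.
        return ToySparkMethods(self)


frame = ToyFrame([[5, 5, 2, 2]])
result = frame.spark.repartition(7)
num_partitions = len(result.partitions)
has_top_level = hasattr(frame, "repartition")
```

In this sketch, callers that previously wrote `frame.repartition(7)` would get an `AttributeError` and have to switch to `frame.spark.repartition(7)`, which mirrors the change made to `test_repartition` above.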