Commit be95298

panbingkun authored and zhengruifeng committed
[SPARK-44524][BUILD] Balancing pyspark-pandas-connect and pyspark-pandas-slow-connect GA testing time
### What changes were proposed in this pull request?
This PR aims to balance the GA testing time of `pyspark-pandas-connect` and `pyspark-pandas-slow-connect`.

### Why are the changes needed?
After #42146, the testing times of `pyspark-pandas-connect` and `pyspark-pandas-slow-connect` differ significantly, which affects the overall running time. Balancing them makes the GA runs more efficient and stable.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
- Pass GA.
- Manually monitor GA.

Closes #42115 from panbingkun/free_disk_space.

Lead-authored-by: panbingkun <[email protected]>
Co-authored-by: panbingkun <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
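As a rough illustration of what the rebalancing amounts to, the sketch below treats the two modules as one ordered list of test goals with a movable boundary. `ModuleSketch` and `split_goals` are hypothetical stand-ins, not the `Module` class from `dev/sparktestsupport/modules.py`, and the goal list is a short sample taken from the diff rather than the full lists.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ModuleSketch:
    """Hypothetical stand-in for Module; only the fields used here are modeled."""

    name: str
    python_test_goals: List[str] = field(default_factory=list)


def split_goals(goals: List[str], boundary: int) -> Tuple[ModuleSketch, ModuleSketch]:
    """Split one ordered goal list into two modules at index `boundary`."""
    fast = ModuleSketch("pyspark-pandas-connect", goals[:boundary])
    slow = ModuleSketch("pyspark-pandas-slow-connect", goals[boundary:])
    return fast, slow


# Sample goal names copied from the diff; the real lists are much longer.
goals = [
    "pyspark.pandas.tests.connect.test_parity_window",
    "pyspark.pandas.tests.connect.indexes.test_parity_base",
    "pyspark.pandas.tests.connect.indexes.test_parity_datetime",
    "pyspark.pandas.tests.connect.computation.test_parity_pivot",
    "pyspark.pandas.tests.connect.frame.test_parity_attrs",
    "pyspark.pandas.tests.connect.frame.test_parity_constructor",
]

# Before: the first module ended at indexes.test_parity_base.
before_fast, before_slow = split_goals(goals, 2)
# After: the first module ends at computation.test_parity_pivot.
after_fast, after_slow = split_goals(goals, 4)
```

In this sketch the set of goals that runs is unchanged; only the point at which the list is cut between the two modules moves.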
1 parent 55391f6 commit be95298

1 file changed

+17
-18
lines changed

dev/sparktestsupport/modules.py

Lines changed: 17 additions & 18 deletions
@@ -891,7 +891,7 @@ def __hash__(self):

 pyspark_pandas_connect = Module(
     name="pyspark-pandas-connect",
-    dependencies=[pyspark_connect, pyspark_pandas],
+    dependencies=[pyspark_connect, pyspark_pandas, pyspark_pandas_slow],
     source_file_regexes=[
         "python/pyspark/pandas",
     ],
@@ -949,23 +949,6 @@ def __hash__(self):
         "pyspark.pandas.tests.connect.test_parity_utils",
         "pyspark.pandas.tests.connect.test_parity_window",
         "pyspark.pandas.tests.connect.indexes.test_parity_base",
-    ],
-    excluded_python_implementations=[
-        "PyPy"  # Skip these tests under PyPy since they require numpy, pandas, and pyarrow and
-        # they aren't available there
-    ],
-)
-
-
-# This module should contain the same test list with 'pyspark_pandas_slow' for maintenance.
-pyspark_pandas_slow_connect = Module(
-    name="pyspark-pandas-slow-connect",
-    dependencies=[pyspark_connect, pyspark_pandas_slow],
-    source_file_regexes=[
-        "python/pyspark/pandas",
-    ],
-    python_test_goals=[
-        # pandas-on-Spark unittests
         "pyspark.pandas.tests.connect.indexes.test_parity_datetime",
         "pyspark.pandas.tests.connect.indexes.test_parity_align",
         "pyspark.pandas.tests.connect.indexes.test_parity_indexing",
@@ -985,6 +968,22 @@ def __hash__(self):
         "pyspark.pandas.tests.connect.computation.test_parity_melt",
         "pyspark.pandas.tests.connect.computation.test_parity_missing_data",
         "pyspark.pandas.tests.connect.computation.test_parity_pivot",
+    ],
+    excluded_python_implementations=[
+        "PyPy"  # Skip these tests under PyPy since they require numpy, pandas, and pyarrow and
+        # they aren't available there
+    ],
+)
+
+
+pyspark_pandas_slow_connect = Module(
+    name="pyspark-pandas-slow-connect",
+    dependencies=[pyspark_connect, pyspark_pandas, pyspark_pandas_slow],
+    source_file_regexes=[
+        "python/pyspark/pandas",
+    ],
+    python_test_goals=[
+        # pandas-on-Spark unittests
         "pyspark.pandas.tests.connect.frame.test_parity_attrs",
         "pyspark.pandas.tests.connect.frame.test_parity_constructor",
         "pyspark.pandas.tests.connect.frame.test_parity_conversion",
