Commit eba4f5c

[SPARK-37531][INFRA][PYTHON][TESTS] Use PyArrow 6.0.0 in Python 3.9 tests at GitHub Action job
### What changes were proposed in this pull request?

This PR aims to use `PyArrow 6.0.0` in `Python 3.9` unit tests at GitHub Action jobs. Although the main change is removing the `<5.0.0` limitation, there are other minor version changes because the image was built more recently.

- dongjoon-hyun/ApacheSparkGitHubActionImage@4f7408f

```
- RUN python3.9 -m pip install numpy 'pyarrow<5.0.0' pandas scipy xmlrunner plotly>=4.8 sklearn 'mlflow>=1.0'
+ RUN python3.9 -m pip install numpy pyarrow pandas scipy xmlrunner plotly>=4.8 sklearn 'mlflow>=1.0'
```

```
$ docker run -it --rm dongjoon/apache-spark-github-action-image:20211116 pip3.9 list > 20211116
$ docker run -it --rm dongjoon/apache-spark-github-action-image:20210930 pip3.9 list > 20210930
$ diff 20210930 20211116
# The following is manually formatted for simplicity.
...
Jinja2        3.0.1   3.0.3
mlflow        1.20.2  1.21.0
numpy         1.21.2  1.21.4
pandas        1.3.3   1.3.4
plotly        5.3.1   5.4.0
pyarrow       4.0.1   6.0.0
scikit-learn  1.0     1.0.1
scipy         1.7.1   1.7.2
```

### Why are the changes needed?

SPARK-37342 upgraded Apache Arrow to 6.0.0 in Java/Scala. This is the corresponding upgrade in PySpark.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the GitHub Action.

Closes #34793 from dongjoon-hyun/SPARK-37531.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
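The manual `diff` of the two `pip3.9 list` outputs above could also be done programmatically. The following is a minimal sketch, not part of this commit: the helper names `parse_pip_list` and `changed_versions` are hypothetical, the sample inputs reuse a few versions from the table above, and `unchanged-pkg` is an invented entry included only to show that identical versions are filtered out.

```python
def parse_pip_list(text):
    """Parse the column output of `pip list` into {package: version}."""
    versions = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the "Package Version" header, and the dashed rule.
        if len(parts) < 2 or parts[0] == "Package" or set(parts[0]) == {"-"}:
            continue
        versions[parts[0]] = parts[1]
    return versions


def changed_versions(old, new):
    """Return {package: (old_version, new_version)} where the versions differ."""
    return {
        name: (old[name], new[name])
        for name in sorted(old.keys() & new.keys())
        if old[name] != new[name]
    }


# Sample inputs: versions taken from the table above, plus one hypothetical
# package ("unchanged-pkg") that is identical in both images.
LIST_20210930 = """\
Package       Version
------------- -------
numpy         1.21.2
pandas        1.3.3
pyarrow       4.0.1
unchanged-pkg 0.0.1
"""

LIST_20211116 = """\
Package       Version
------------- -------
numpy         1.21.4
pandas        1.3.4
pyarrow       6.0.0
unchanged-pkg 0.0.1
"""

if __name__ == "__main__":
    diff = changed_versions(
        parse_pip_list(LIST_20210930), parse_pip_list(LIST_20211116)
    )
    for name, (before, after) in diff.items():
        print(f"{name:<14}{before:<8}{after}")
```

In practice the two strings would be read from the `20210930` and `20211116` files produced by the `docker run` commands.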
1 parent f99e2e6 commit eba4f5c

1 file changed: +3 -3 lines changed
.github/workflows/build_and_test.yml

Lines changed: 3 additions & 3 deletions
@@ -254,7 +254,7 @@ jobs:
     name: "Build modules (${{ format('{0}, {1} job', needs.configure-jobs.outputs.branch, needs.configure-jobs.outputs.type) }}): ${{ matrix.modules }}"
     runs-on: ubuntu-20.04
     container:
-      image: dongjoon/apache-spark-github-action-image:20210930
+      image: dongjoon/apache-spark-github-action-image:20211116
     strategy:
       fail-fast: false
       matrix:
@@ -358,7 +358,7 @@ jobs:
     name: "Build modules: sparkr"
     runs-on: ubuntu-20.04
     container:
-      image: dongjoon/apache-spark-github-action-image:20210930
+      image: dongjoon/apache-spark-github-action-image:20211116
     env:
       HADOOP_PROFILE: ${{ needs.configure-jobs.outputs.hadoop }}
       HIVE_PROFILE: hive2.3
@@ -425,7 +425,7 @@ jobs:
       PYSPARK_DRIVER_PYTHON: python3.9
       PYSPARK_PYTHON: python3.9
     container:
-      image: dongjoon/apache-spark-github-action-image:20210930
+      image: dongjoon/apache-spark-github-action-image:20211116
     steps:
     - name: Checkout Spark repository
       uses: actions/checkout@v2
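
Because the same image tag is bumped in three separate `container:` blocks, one of them is easy to miss. The following is a rough sketch of a consistency check, not part of this commit: the helper name `image_tags` and the regular expression are assumptions for illustration, and the sample text is a trimmed stand-in for `.github/workflows/build_and_test.yml` rather than the real file.

```python
import re

# Matches lines of the form "image: <repository>:<tag>" with any indentation.
IMAGE_RE = re.compile(r"^[ \t]*image:[ \t]*(\S+?):(\S+)[ \t]*$", re.MULTILINE)

REPOSITORY = "dongjoon/apache-spark-github-action-image"


def image_tags(workflow_text, repository):
    """Return the tags of every `image:` line that uses the given repository."""
    return [
        tag
        for repo, tag in IMAGE_RE.findall(workflow_text)
        if repo == repository
    ]


# Trimmed stand-in for the workflow file after this commit (illustrative only).
SAMPLE_WORKFLOW = """\
jobs:
  build:
    container:
      image: dongjoon/apache-spark-github-action-image:20211116
  sparkr:
    container:
      image: dongjoon/apache-spark-github-action-image:20211116
  pyspark:
    container:
      image: dongjoon/apache-spark-github-action-image:20211116
"""

if __name__ == "__main__":
    tags = image_tags(SAMPLE_WORKFLOW, REPOSITORY)
    # All jobs should reference a single image tag.
    if len(set(tags)) == 1:
        print(f"{len(tags)} jobs use tag {tags[0]}")
    else:
        print(f"inconsistent tags: {sorted(set(tags))}")
```

A text-based check like this avoids a YAML parser dependency, at the cost of not understanding the file's structure.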
