
[SPARK-38563][PYTHON] Upgrade to Py4J 0.10.9.5 #35907

Closed

HyukjinKwon wants to merge 1 commit into apache:master from HyukjinKwon:SPARK-38563-followup


Conversation

@HyukjinKwon
Member

What changes were proposed in this pull request?

This PR is a retry of #35871, bumping the version up to 0.10.9.5.
The previous upgrade was reverted because it broke Python 3.10, which was not officially supported in Py4J at the time.

Py4J 0.10.9.5 fixes that issue (py4j/py4j#475) and adds official Python 3.10 support with CI coverage (py4j/py4j#477).
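
If it helps reproduce the check, here is a minimal sketch (my own, not part of this PR) that fails fast when an older bundled Py4J is used on Python 3.10; the minimum-version constant is the only assumption, taken from the version bumped here:

```python
# Sketch: guard against a Py4J release that predates official Python 3.10 support.
import sys
import py4j

MIN_PY4J = (0, 10, 9, 5)  # first Py4J release with Python 3.10 in CI (per this PR)
installed = tuple(int(part) for part in py4j.__version__.split("."))

if sys.version_info >= (3, 10) and installed < MIN_PY4J:
    raise RuntimeError(
        f"py4j {py4j.__version__} predates official Python 3.10 support; "
        "upgrade to 0.10.9.5 or later"
    )
```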

Why are the changes needed?

See #35871

Does this PR introduce any user-facing change?

See #35871

How was this patch tested?

Py4J now runs CI against Python 3.10, and I manually tested PySpark on Python 3.10 with this patch:

```bash
./bin/pyspark
```

```
import py4j
py4j.__version__
spark.range(10).show()
```

```
Using Python version 3.10.0 (default, Mar  3 2022 03:57:21)
Spark context Web UI available at http://172.30.5.50:4040
Spark context available as 'sc' (master = local[*], app id = local-1647571387534).
SparkSession available as 'spark'.
>>> import py4j
>>> py4j.__version__
'0.10.9.5'
>>> spark.range(10).show()
+---+
| id|
+---+
...
```
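
For completeness, the same smoke test can be scripted and run non-interactively; the snippet below is only an illustrative sketch (the file name and app name are made up), not something added by this PR:

```python
# smoke_test_py310.py -- hypothetical standalone version of the manual check above.
# Run with a Python 3.10 interpreter after installing/building PySpark with this patch.
import py4j
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .master("local[*]")
    .appName("py4j-upgrade-smoke-test")
    .getOrCreate()
)

print("py4j version:", py4j.__version__)  # expected: 0.10.9.5 with this patch
spark.range(10).show()                    # should print ids 0 through 9
spark.stop()
```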

@HyukjinKwon
Member Author

cc @dongjoon-hyun @wangyum FYI

Member

@dongjoon-hyun dongjoon-hyun left a comment


Actually, we cannot use the same JIRA ID because branch-3.2 has Py4J 0.10.9.4 already with SPARK-38563. Could you use a new JIRA ID for Py4J 0.10.9.5? You can still land it to branch-3.2 too.

@dongjoon-hyun
Member

dongjoon-hyun commented Mar 18, 2022

SPARK-38563 solved the resource leakage in branch-3.2, and the new JIRA adds Python 3.10 support (on top of it).

@HyukjinKwon
Member Author

Oh no. That's not released yet. I reverted it from branch-3.2 too.

@HyukjinKwon
Member Author

HyukjinKwon commented Mar 18, 2022

BTW, Python 3.10 already works with Spark 3.2 too; the Py4J upgrade broke that (unofficial) support.

@HyukjinKwon
Member Author

Technically what you said is correct, because branch-3.2 only officially supports Python up to 3.9, but I will port this to branch-3.2 as well, if you don't mind, just to reduce the breakage (even though that support is not official). The risk here is very small.

@dongjoon-hyun
Member

Oh, got it. If you reverted cleanly from all branches, there is no problem.

Member

@dongjoon-hyun dongjoon-hyun left a comment


+1, LGTM.

@HyukjinKwon
Member Author

Merged to master, branch-3.3 and branch-3.2.

HyukjinKwon added a commit that referenced this pull request Mar 18, 2022

Closes #35907 from HyukjinKwon/SPARK-38563-followup.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit 97335ea)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
HyukjinKwon added a commit that referenced this pull request Mar 18, 2022

Closes #35907 from HyukjinKwon/SPARK-38563-followup.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit 97335ea)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
kazuyukitanimura pushed a commit to kazuyukitanimura/spark that referenced this pull request Aug 10, 2022

Closes apache#35907 from HyukjinKwon/SPARK-38563-followup.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit 97335ea)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
@HyukjinKwon HyukjinKwon deleted the SPARK-38563-followup branch January 15, 2024 00:50