[SPARK-44433][3.5][PYTHON][CONNECT][SS][FOLLOWUP] Terminate listener process with removeListener and improvements #42340

WweiL · 2023-08-04T07:19:49Z

Master Branch PR: #42283

What changes were proposed in this pull request?

This is a followup to #42116. It addresses the following issues:

1. Before this PR, calling removeListener on a listener left its Python process running. Now the process is stopped as well.
2. In non-connect mode, when removeListener is called multiple times on the same listener, the subsequent calls are no-ops. Before this PR, Connect threw an error in that case, which did not align with the existing behavior. This PR fixes that.
3. Set the socket timeout to None (no timeout) for foreachBatch_worker and listener_worker, because there can be a long gap between microbatches. Without this, the socket times out and the worker cannot process new data.
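The first two issues above (stop the worker process on removal, and treat repeated removal as a no-op) can be illustrated with a minimal Python sketch. This is not the code in this PR: `ListenerRegistry`, `add_listener`, and `remove_listener` are hypothetical names, and a sleeping subprocess stands in for the listener worker.

```python
import subprocess
import sys


class ListenerRegistry:
    """Hypothetical registry mapping listener ids to their worker processes."""

    def __init__(self):
        self._workers = {}

    def add_listener(self, listener_id):
        # A sleeping Python process stands in for the real listener worker.
        proc = subprocess.Popen(
            [sys.executable, "-c", "import time; time.sleep(60)"]
        )
        self._workers[listener_id] = proc
        return proc

    def remove_listener(self, listener_id):
        # pop() with a default makes a repeated removal a no-op instead of an error.
        proc = self._workers.pop(listener_id, None)
        if proc is None:
            return False
        # Stop the worker process so it does not outlive its listener.
        proc.terminate()
        proc.wait(timeout=10)
        return True


registry = ListenerRegistry()
worker = registry.add_listener("listener-1")
first = registry.remove_listener("listener-1")   # stops the worker
second = registry.remove_listener("listener-1")  # nothing left to remove
```

Returning a boolean here is only a convenience for the sketch; the point is that the second call neither raises nor touches a process that is already gone.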

scala> Streaming query listener worker is starting with url sc://localhost:15002/;user_id=wei.liu and sessionId 886191f0-2b64-4c44-b067-de511f04b42d.
Traceback (most recent call last):
  File "/usr/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/wei.liu/oss-spark/python/lib/pyspark.zip/pyspark/sql/connect/streaming/worker/listener_worker.py", line 95, in <module>
  File "/home/wei.liu/oss-spark/python/lib/pyspark.zip/pyspark/sql/connect/streaming/worker/listener_worker.py", line 82, in main
  File "/home/wei.liu/oss-spark/python/lib/pyspark.zip/pyspark/serializers.py", line 557, in loads
  File "/home/wei.liu/oss-spark/python/lib/pyspark.zip/pyspark/serializers.py", line 594, in read_int
  File "/usr/lib/python3.9/socket.py", line 704, in readinto
    return self._sock.recv_into(b)
socket.timeout: timed out
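The failure mode in the traceback above can be reproduced outside Spark with the standard `socket` module. The following is a minimal sketch under stated assumptions: a local `socket.socketpair()` stands in for the JVM-to-worker connection, and the 0.05 second timeout is an arbitrary value chosen to trigger the error quickly. It is not the worker code from this PR.

```python
import socket

# A connected pair of local sockets stands in for the JVM <-> Python worker link.
jvm_side, worker_side = socket.socketpair()

# With a finite timeout, a blocking read fails if no data arrives in time,
# which is what happens when the gap between microbatches is long.
worker_side.settimeout(0.05)
try:
    worker_side.recv(4)
    timed_out = False
except socket.timeout:
    timed_out = True

# settimeout(None) puts the socket in blocking mode with no timeout, so a
# read waits until the peer sends data or closes the connection.
worker_side.settimeout(None)
current_timeout = worker_side.gettimeout()

# Once data does arrive, the read completes normally.
jvm_side.sendall((42).to_bytes(4, "big"))
value = int.from_bytes(worker_side.recv(4), "big")

jvm_side.close()
worker_side.close()
```

The trade-off of an infinite timeout is that the worker blocks for as long as the peer stays silent, so it relies on the other side closing the connection (for example when the listener is removed) to exit.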

Why are the changes needed?

Necessary improvements

Does this PR introduce any user-facing change?

No

How was this patch tested?

Manual test + unit test

WweiL · 2023-08-04T07:24:14Z

core/src/main/scala/org/apache/spark/api/python/StreamingPythonRunner.scala

+      pythonWorkerFactory = Some(workerFactory)
+    } finally {
+      conf.set(PYTHON_USE_DAEMON, prevConf)
+    }


This block and the stop() method differ from the master branch because the createPythonWorker method did not support custom modules at that point:

https://github.com/WweiL/oss-spark/blob/f8b312a22eae3ce1176da49a693182832c1f1402/core/src/main/scala/org/apache/spark/api/python/StreamingPythonRunner.scala#L72-L74

cc @ueshin to double check this

ueshin

Otherwise, LGTM, pending tests.

ueshin · 2023-08-05T02:40:13Z

Thanks! merging to 3.5.

…process with removeListener and improvements

Closes #42340 from WweiL/SPARK-44433-listener-followup-3.5.

Authored-by: Wei Liu <[email protected]>
Signed-off-by: Takuya UESHIN <[email protected]>

resolve conflict

8a47233

WweiL changed the base branch from master to branch-3.5 August 4, 2023 07:20

github-actions bot added SQL ML MLLIB STRUCTURED STREAMING KUBERNETES WEB UI GRAPHX MESOS BUILD SPARK SHELL YARN EXAMPLES DOCS CORE INFRA PYTHON R AVRO PANDAS API ON SPARK CONNECT PROTOBUF labels Aug 4, 2023

WweiL commented Aug 4, 2023

ueshin reviewed Aug 4, 2023

core/src/main/scala/org/apache/spark/api/python/StreamingPythonRunner.scala

remove conf recover in stop runner

0c0b4e9

github-actions bot removed ML MLLIB KUBERNETES WEB UI labels Aug 4, 2023

github-actions bot removed GRAPHX MESOS BUILD SPARK SHELL YARN EXAMPLES DOCS INFRA R AVRO PANDAS API ON SPARK PROTOBUF labels Aug 4, 2023

WweiL requested a review from ueshin August 4, 2023 18:22

ueshin approved these changes Aug 4, 2023

core/src/main/scala/org/apache/spark/api/python/StreamingPythonRunner.scala (outdated)

WweiL added 2 commits August 4, 2023 12:41

address comments

94b2578

minor

5f09d6f

ueshin closed this Aug 5, 2023
