[SPARK-35949][CORE] Add keep-spark-context-alive arg to prevent closing the Spark context after invoking main in some cases
#33154
Conversation
Can one of the admins verify this patch?
cc @kotlovs, @dongjoon-hyun, @mridulm FYI
@sunpe Could you please explain in more detail what the problem is in client mode? This code is called after the main() method exits, when the application is moving toward termination.
Agree with @kotlovs - this happens when the application is terminating.
Issue SPARK-34674 said the Spark context could not be closed on K8s, but this PR closes the context for everything except the shell or the Thrift server. In my case, I use Spring Boot and Spark together to build a web app. The app waits for user requests and runs jobs on Spark. I register the `SparkSession` object as a Spring bean, then package the Spring application as a jar and submit it. For testing, I added three log statements in core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala. With the current code, the Spark context stops soon after the application starts; after the fix, it stays alive.
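A minimal sketch of registering a long-lived `SparkSession` as a Spring bean (the configuration class, bean method, and app name here are illustrative assumptions, not taken from the original application):

```
import org.apache.spark.sql.SparkSession
import org.springframework.context.annotation.{Bean, Configuration}

// Illustrative Spring configuration: the SparkSession is created once and
// reused for every incoming web request, so it must stay alive for the
// lifetime of the server process.
@Configuration
class SparkConfig {

  // destroyMethod = "stop" lets Spring stop the session only when the
  // application context itself shuts down.
  @Bean(destroyMethod = "stop")
  def sparkSession(): SparkSession =
    SparkSession.builder()
      .appName("spark-web-app") // illustrative app name
      .getOrCreate()
}
```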
Thanks for the detailed explanation; now I understand your problem. You create a server inside the app and await user requests. I still think a dedicated flag would be a more universal solution. I already have code adding such a flag in one of my branches, and I could open that PR if this solution is acceptable.
Hello @kotlovs. Thanks for the reply. After discussing with my partner, I think adding another arg is a good solution.
Title history:
- is-server arg for to prevent closing spark context on server mode
- is-server arg for to prevent closing spark context when starting as a server
- keep-saprk-context-alive arg for to prevent closing spark context after invoking main for some case
- keep-spark-context-alive arg for to prevent closing spark context after invoking main for some case
Hello @kotlovs. I added an arg called `keep-spark-context-alive`.
…psWithState in Structured Streaming
This PR aims to add support for specifying a user-defined initial state for arbitrary Structured Streaming stateful processing using the [flat]MapGroupsWithState operator.
Users can load the previous state of their stateful processing as an initial state instead of redoing the entire processing from scratch.
Yes, this PR introduces a new API:
```
def mapGroupsWithState[S: Encoder, U: Encoder](
    timeoutConf: GroupStateTimeout,
    initialState: KeyValueGroupedDataset[K, S])(
    func: (K, Iterator[V], GroupState[S]) => U): Dataset[U]

def flatMapGroupsWithState[S: Encoder, U: Encoder](
    outputMode: OutputMode,
    timeoutConf: GroupStateTimeout,
    initialState: KeyValueGroupedDataset[K, S])(
    func: (K, Iterator[V], GroupState[S]) => Iterator[U]): Dataset[U]
```
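A hedged usage sketch of the new overload, assuming a rate source and a simple running count per key (the object name, keys, and logic below are illustrative and not taken from the PR or its tests):

```
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.{GroupState, GroupStateTimeout, OutputMode}

object InitialStateSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("initial-state-sketch").getOrCreate()
    import spark.implicits._

    // Previously computed per-key counts, loaded as the initial state.
    val initialState = Seq(("0", 100L), ("1", 50L))
      .toDS()
      .groupByKey(_._1)
      .mapValues(_._2)

    // Illustrative streaming source producing keys "0" and "1".
    val events = spark.readStream
      .format("rate")
      .option("rowsPerSecond", "10")
      .load()
      .selectExpr("CAST(value % 2 AS STRING) AS key")
      .as[String]

    // The new overload: the running count starts from the loaded state
    // instead of from zero.
    val counts = events
      .groupByKey(identity)
      .mapGroupsWithState(GroupStateTimeout.NoTimeout, initialState) {
        (key: String, values: Iterator[String], state: GroupState[Long]) =>
          val newCount = state.getOption.getOrElse(0L) + values.size
          state.update(newCount)
          (key, newCount)
      }

    counts.writeStream
      .outputMode(OutputMode.Update())
      .format("console")
      .start()
      .awaitTermination()
  }
}
```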
Through unit tests in FlatMapGroupsWithStateSuite
Closes #33093 from rahulsmahadev/flatMapGroupsWithState.
Authored-by: Rahul Mahadev <[email protected]>
Signed-off-by: Gengliang Wang <[email protected]>
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
Hello, any chance to see this PR merged?
Hello @clementguillot. Thank you for your attention. This bug has been fixed in v3.2.0. Please refer to this code https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L963 for more detail.
Hello @sunpe, thank you for your very fast answer. Let me give you some more context: I am using Spark v3.3.0 on K8s with the Spark on K8s operator. I tried a few things, and here are my findings. The Spark Operator pod submits the application with a spark-submit command, and when the driver pod starts, the driver logs show the Spark context being stopped. Do you have a clue why the context is still being closed?
Hello @clementguillot. I understand the problem now. This bug was introduced by commit c625eb4 in v3.1. In your case, starting a server in Spark on K8s still has a problem because of commit fd3e9ce. In my view, we should delete the code in SparkSubmit.scala L963-L969 and stop the Spark context in a signal/shutdown hook instead. cc @dongjoon-hyun.
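As a rough application-level illustration of the "stop the Spark context in a shutdown/signal hook" idea (the object name is hypothetical and the server startup is omitted; this is not the actual change proposed for SparkSubmit.scala):

```
import org.apache.spark.sql.SparkSession

object LongRunningServerSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("long-running-server").getOrCreate()

    // Stop the Spark context when the JVM is asked to terminate,
    // not immediately after main() returns.
    sys.addShutdownHook {
      if (!spark.sparkContext.isStopped) {
        spark.stop()
      }
    }

    // ... start the embedded web server here and block until it exits ...
  }
}
```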
SPARK-35949
What changes were proposed in this pull request?
From v3.1, the Spark context is closed after the main method returns. In some cases it is necessary to keep the Spark context alive, such as when the app starts a server. I added a `keep-spark-context-alive` arg to control whether the Spark context should be kept alive after the main method.

Why are the changes needed?
Due to PR c625eb4#diff-f8564df81d845c0cd2f621bc2ed22761cbf9731f28cb2828d9cbd0491f4e7584, in client mode the Spark context is stopped when the application starts. It is necessary to keep the Spark context alive in cases such as starting the app as a server.
Does this PR introduce any user-facing change?
Yes. Added a `keep-spark-context-alive` arg to keep the Spark context alive until the app exits. Usage: `spark-submit --keep-spark-context-alive true`

How was this patch tested?
Manually tested.