Skip to content

Conversation

@martin-g
Copy link
Member

What changes were proposed in this pull request?

Sort the debug info printed by SparkSubmit when -verbose is enabled.

Why are the changes needed?

This way it is easier to find settings/properties and their values.

Does this PR introduce any user-facing change?

Yes. The log output changes - some data is sorted.

How was this patch tested?

Old unit tests still pass & manually check the sorted output.

 Main class:
org.apache.spark.deploy.k8s.submit.KubernetesClientApplication
Arguments:
--main-class
--primary-java-resource
local:///opt/spark/examples/jars/spark-examples_2.13-3.3.0-SNAPSHOT.jar
org.apache.spark.examples.SparkPi
Spark config:
(spark.app.name,spark-on-k8s-app)
(spark.app.submitTime,1645106476125)
(spark.driver.cores,1)
(spark.driver.extraJavaOptions,-Dio.netty.tryReflectionSetAccessible=true)
(spark.driver.memory,2048m)
(spark.dynamicAllocation.enabled,true)
(spark.dynamicAllocation.shuffleTracking.enabled,true)
(spark.executor.cores,2)
(spark.executor.extraJavaOptions,-Dio.netty.tryReflectionSetAccessible=true)
(spark.executor.instances,3)
(spark.executor.memory,2048m)
(spark.jars,local:///opt/spark/examples/jars/spark-examples_2.13-3.3.0-SNAPSHOT.jar)
(spark.kubernetes.allocation.batch.delay,1)
(spark.kubernetes.allocation.batch.size,3)
(spark.kubernetes.authenticate.driver.serviceAccountName,spark-account-name)
(spark.kubernetes.driver.container.image,spark/spark:3.3.0-SNAPSHOT-scala_2.13-11-jre-slim)
(spark.kubernetes.executor.container.image,spark/spark:3.3.0-SNAPSHOT-scala_2.13-11-jre-slim)
(spark.kubernetes.namespace,spark-on-k8s)
(spark.master,k8s://https://192.168.49.2:8443)
(spark.network.timeout,300)
(spark.submit.deployMode,cluster)
(spark.submit.pyFiles,)
Classpath elements:

@github-actions github-actions bot added the CORE label Feb 17, 2022
@martin-g martin-g changed the title [SPARK-38242][SS] Sort the SparkSubmit debug output [SPARK-38242][CORE] Sort the SparkSubmit debug output Feb 17, 2022
Copy link
Member

@srowen srowen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable to me. I always fear it breaks somebody relying on the output, but I can't see a reasonable use case for that

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah wait a sec, we can't sort these. Order of args matters; you might have --flag value. You can see the issue in your example, where the arg order doesn't reflect what was passed and loses this information

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah .. I wouldn't sort it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Reverted it!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise classpath order matters, so not sure we should do this.
Spark configs ... yeah order should not matter, so that should be sorted.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted it!

@srowen
Copy link
Member

srowen commented Feb 23, 2022

Hm, I think the test failures may be legitimate here. They are tests related to conf. Can you check the failed tests and see if they need an adjustment after the output order changes?

@martin-g martin-g closed this Feb 24, 2022
@martin-g martin-g reopened this Feb 24, 2022
@srowen
Copy link
Member

srowen commented Feb 24, 2022

@HyukjinKwon do you know why it seems like the main tests weren't run here? does he need to enable tests in his branch?

@HyukjinKwon
Copy link
Member

Ah, after enabling, he needs to push an empty commit or rebase to trigger properly. ... I think probably this is a bug we need to fix

@HyukjinKwon
Copy link
Member

HyukjinKwon commented Feb 25, 2022

I pushed (and removed back) some arbitrary changes to trigger the CI

@martin-g
Copy link
Member Author

Thank you, @HyukjinKwon !
Nice trick with the Github suggestions feature!
Allow edits and access to secrets by maintainers is also enabled, so you could even push an empty commit

I used close/reopen PR to re-trigger the checks but this needs https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#pull_request and this event is not enabled at

. Do you want me to open a new PR to add this event or it is not there intentionally ?
The change is to add these two lines - https://github.com/apache/avro/blob/f10fa6b9e30e38ec5cde6e00288f0b31b24bf1de/.github/workflows/test-lang-c.yml#L21-L22

@srowen srowen closed this in 9eab255 Feb 26, 2022
@srowen
Copy link
Member

srowen commented Feb 26, 2022

Merged to master

@martin-g martin-g deleted the sort-debug-info-SparkSubmit branch February 26, 2022 20:22
@martin-g
Copy link
Member Author

Thank you, @srowen and @HyukjinKwon !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants