feat: transfer Apache Spark runtime conf to native engine #1649
Conversation
Some parts were rolled back in #1101.
```diff
  debug = COMET_DEBUG_ENABLED.get(),
- explain = COMET_EXPLAIN_NATIVE_ENABLED.get())
+ explain = COMET_EXPLAIN_NATIVE_ENABLED.get(),
+ sparkConfig = SparkEnv.get.conf.getAll.toMap)
```
Doesn't toMap return a Scala Map (and not a Java HashMap)? Does this translate correctly to a JMap?
Also, SparkConf.getAll will return all Spark confs. Should we filter based on Comet conf and Spark confs of interest only?
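The filtering the comment suggests could look like the sketch below. This is illustrative Java, and the prefix list (`spark.comet.`, `spark.hadoop.`) is an assumption about which confs are of interest, not the project's actual list:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ConfFilter {
    // Hypothetical prefixes for "confs of interest"; the real set is an assumption.
    private static final List<String> PREFIXES =
        List.of("spark.comet.", "spark.hadoop.");

    // Keep only entries whose key starts with one of the prefixes,
    // so the native side never sees unrelated Spark confs.
    public static Map<String, String> filter(Map<String, String> all) {
        Map<String, String> out = new HashMap<>();
        for (Map.Entry<String, String> e : all.entrySet()) {
            for (String p : PREFIXES) {
                if (e.getKey().startsWith(p)) {
                    out.put(e.getKey(), e.getValue());
                    break;
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> all = new HashMap<>();
        all.put("spark.comet.enabled", "true");
        all.put("spark.hadoop.fs.defaultFS", "hdfs://nn:8020");
        all.put("spark.app.name", "test");
        System.out.println(ConfFilter.filter(all).size()); // prints 2
    }
}
```

On the Scala side, note that a `scala.collection.Map` would still need an explicit `.asJava` (via `scala.jdk.CollectionConverters`) before JNI can treat it as a `java.util.Map`.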
The comments are valid; everything is done in a local branch, I just haven't pushed it yet.
I commented because I thought we were planning to get this into the 0.8.0 release :)
native/core/src/execution/jni_api.rs (outdated)
```diff
  task_attempt_id: jlong,
  debug_native: jboolean,
  explain_native: jboolean,
+ spark_conf: JObject,
```
An alternate approach could be to encode the Spark config in protobuf format and pass those bytes into native code rather than have native code make calls to the JVM to iterate over the map.
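That alternative could be as small as adding a map field to a message serialized on the JVM side and decoded once in native code. A hypothetical sketch (the message name, field name, and field number are illustrative, not the actual Comet proto):

```protobuf
// Hypothetical message; not part of the actual Comet schema.
message SparkConfig {
  // Selected Spark conf entries, serialized once on the JVM side and
  // decoded once natively -- no JNI calls while iterating the map.
  map<string, string> entries = 1;
}
```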
While testing I realized that Apache Spark already sends defaultFS (with the scheme) to the logical plan and then on to Comet, so this test works. However, when running spark-shell, another param should be used.
@parthchandra @andygrove I'm planning to close this PR, since Spark transparently sends the conf to native. However, I'm keeping the Spark conf code in this PR, in a DataFusion-like style, in case we still need it in the future.
Code is in 3e93995.
Which issue does this PR close?
Related #1360.
Rationale for this change
The native engine's behavior very often depends on external Spark job params (HDFS configuration, INT96 handling, etc.), but native code has no access to the Spark configuration.
What changes are included in this PR?
Extending ExecutionContext to enclose Spark params from the JVM.
How are these changes tested?