[SPARK-44284][CONNECT] Create simple conf system for sql/api #41838
Conversation
     * See [[get]] for more information.
     */
    def setSQLConfGetter(getter: () => SQLConf): Unit = {
      SqlApiConf.setConfGetter(getter)
This is just a bit more efficient.
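One plausible reading of the "bit more efficient" remark, shown as a minimal, self-contained sketch (the class names are stand-ins, not Spark's real ones): forwarding the caller's getter function directly avoids wrapping it in an extra closure that every later conf lookup would have to go through.

```scala
object GetterForwardingSketch {
  trait FakeApiConf
  class FakeSQLConf extends FakeApiConf

  private var confGetter: () => FakeApiConf = () => new FakeSQLConf

  def setConfGetter(getter: () => FakeApiConf): Unit = { confGetter = getter }

  // Forwarding the caller's function directly: one function call per lookup.
  def setSQLConfGetterDirect(getter: () => FakeSQLConf): Unit =
    setConfGetter(getter)

  // Wrapping it in another closure: an extra function call on every lookup.
  def setSQLConfGetterWrapped(getter: () => FakeSQLConf): Unit =
    setConfGetter(() => getter())
}
```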
     * Check if `this` and `other` are the same data type when ignoring nullability
     * (`StructField.nullable`, `ArrayType.containsNull`, and `MapType.valueContainsNull`).
     */
    def sameType(left: DataType, right: DataType): Boolean =
We should move this back to DataType.
Sure, we should do it.
cc @amaliujia
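For readers following along, here is a self-contained sketch of the idea behind sameType, using a toy type ADT rather than Spark's real DataType hierarchy: structural equality after normalizing every nullability flag.

```scala
object SameTypeSketch {
  sealed trait DT
  case object IntDT extends DT
  case object StringDT extends DT
  case class ArrayDT(element: DT, containsNull: Boolean) extends DT
  case class MapDT(key: DT, value: DT, valueContainsNull: Boolean) extends DT
  case class StructFieldDT(name: String, dataType: DT, nullable: Boolean)
  case class StructDT(fields: Seq[StructFieldDT]) extends DT

  // Normalize every nullability flag, then compare structurally.
  private def asNullable(dt: DT): DT = dt match {
    case ArrayDT(e, _)    => ArrayDT(asNullable(e), containsNull = true)
    case MapDT(k, v, _)   => MapDT(asNullable(k), asNullable(v), valueContainsNull = true)
    case StructDT(fields) => StructDT(fields.map(f =>
      f.copy(dataType = asNullable(f.dataType), nullable = true)))
    case other            => other
  }

  def sameType(left: DT, right: DT): Boolean = asNullable(left) == asNullable(right)

  def main(args: Array[String]): Unit = {
    val a = StructDT(Seq(StructFieldDT("x", IntDT, nullable = false)))
    val b = StructDT(Seq(StructFieldDT("x", IntDT, nullable = true)))
    assert(sameType(a, b))   // differs only in nullability
    assert(!sameType(a, StructDT(Seq(StructFieldDT("x", StringDT, nullable = true)))))
  }
}
```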
     * then its values are bound to the currently set SQLConf. In connect mode it will default to
     * hardcoded values.
     */
    private[sql] trait SqlApiConf {
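As a rough, hypothetical illustration of the shape that doc comment describes (field names and default values below are assumptions, not the real trait's contents): a small trait of the settings sql/api needs, plus a hardcoded fallback used when no SQLConf is available.

```scala
// Illustrative only: the real trait is private[sql] and its members differ.
trait SqlApiConfSketch {
  def ansiEnabled: Boolean
  def caseSensitiveAnalysis: Boolean
  def maxToStringFields: Int
}

// Hardcoded defaults, e.g. for the Connect client where no SQLConf is set.
object DefaultSqlApiConfSketch extends SqlApiConfSketch {
  override def ansiEnabled: Boolean = false
  override def caseSensitiveAnalysis: Boolean = false
  override def maxToStringFields: Int = 25
}
```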
No biggie, just dropping a comment: I wonder if we should name it something like SQLConfiguration or just SQLConf (since org.apache.spark.sql.internal.SQLConf in sql/core isn't API anyway).
Probably not SQLConf, as that would trigger a huge amount of refactoring.
Is org.apache.spark.SparkConf a public API? How about naming it SparkSQLConf?
SparkSQLConf SGTM
SparkSQLConf SGTM
+1
It really should be an internal thing. It is marked private[sql] for this reason. Users should only interact with RuntimeConf, and perhaps SparkSession.Builder to get/set options.
Yeah, if it's internal, I am good with whatever name :-)
amaliujia left a comment:
LGTM. I only see minor comments from other reviewers; should be good once those are addressed.
Co-authored-by: Hyukjin Kwon <[email protected]>
Merging.
    def get: SqlApiConf = confGetter.get()()

    // Force load SQLConf. This will trigger the installation of a confGetter that points to SQLConf.
    Try(SparkClassUtils.classForName("org.apache.spark.sql.internal.SQLConf$"))
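Putting the quoted pieces together, a hedged sketch of the lookup mechanism might look like the following (names are illustrative stand-ins, not Spark's internals): an AtomicReference holds the current getter, defaulting to hardcoded values, and a best-effort class load of SQLConf's companion gives sql/core a chance to install itself as the source of truth.

```scala
import java.util.concurrent.atomic.AtomicReference
import scala.util.Try

trait ApiConfSketch { def ansiEnabled: Boolean }
object DefaultApiConfSketch extends ApiConfSketch { val ansiEnabled = false }

object ApiConfRegistrySketch {
  // Holds "how to obtain the current conf"; starts at the hardcoded defaults.
  private val confGetter =
    new AtomicReference[() => ApiConfSketch](() => DefaultApiConfSketch)

  def setConfGetter(getter: () => ApiConfSketch): Unit = confGetter.set(getter)

  // Two calls: get() on the AtomicReference yields the function, () invokes it.
  def get: ApiConfSketch = confGetter.get()()

  // Best-effort class load: if sql/core's SQLConf companion is on the classpath,
  // its initializer is expected to call setConfGetter; otherwise the hardcoded
  // defaults above stay in effect.
  Try(Class.forName("org.apache.spark.sql.internal.SQLConf$"))
}
```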
Hi @hvanhovell, we hit a deadlock in an Iceberg compaction job that seems related to this patch.
I have raised https://issues.apache.org/jira/browse/SPARK-50620 and attached the thread dump: https://issues.apache.org/jira/secure/attachment/13073575/iceberg_compaction.txt
"Executor 92 task launch worker for task 2910, task 69.0 in stage 7.0 (TID 2910)" #152 daemon prio=5 os_prio=0 cpu=616.25ms elapsed=258408.34s tid=0x00007f77d67cc330 nid=0x22c9e in Object.wait() [0x00007f77755fb000]
java.lang.Thread.State: RUNNABLE
at org.apache.spark.sql.internal.SQLConf$.<init>(SQLConf.scala:184)
- waiting on the Class initialization monitor for org.apache.spark.sql.internal.SqlApiConf$
at org.apache.spark.sql.internal.SQLConf$.<clinit>(SQLConf.scala)
"Executor 92 task launch worker for task 5362, task 521.0 in stage 10.0 (TID 5362)" #123 daemon prio=5 os_prio=0 cpu=2443.78ms elapsed=258409.29s tid=0x00007f77d60ecad0 nid=0x22c7c in Object.wait() [0x00007f777e591000]
java.lang.Thread.State: RUNNABLE
at java.lang.Class.forName0([email protected]/Native Method)
- waiting on the Class initialization monitor for org.apache.spark.sql.internal.SQLConf$
at java.lang.Class.forName([email protected]/Class.java:467)
at org.apache.spark.util.SparkClassUtils.classForName(SparkClassUtils.scala:41)
at org.apache.spark.util.SparkClassUtils.classForName$(SparkClassUtils.scala:36)
at org.apache.spark.util.SparkClassUtils$.classForName(SparkClassUtils.scala:83)
at org.apache.spark.sql.internal.SqlApiConf$.$anonfun$new$1(SqlApiConf.scala:73)
The first task is at org.apache.spark.sql.internal.SQLConf$.<clinit>(SQLConf.scala), waiting on the class initialization monitor for org.apache.spark.sql.internal.SqlApiConf$.
The second one is at org.apache.spark.sql.internal.SqlApiConf$.$anonfun$new$1(SqlApiConf.scala:73), waiting on the class initialization monitor for org.apache.spark.sql.internal.SQLConf$.
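In other words, this is the textbook class-initialization deadlock: each thread holds the initialization lock of one object while waiting for the other's. A minimal, self-contained illustration of the pattern (not Spark code; the names are hypothetical):

```scala
object InitDeadlockDemo {
  object A {
    // Initializing A touches B, so A's initializer waits for B's init lock.
    val fromB: Int = B.value
    val value: Int = 1
  }

  object B {
    // Initializing B force-loads A (analogous to the classForName call above),
    // so B's initializer waits for A's init lock.
    val fromA: Int = { Class.forName("InitDeadlockDemo$A$"); 2 }
    val value: Int = 2
  }

  def main(args: Array[String]): Unit = {
    val t1 = new Thread(() => { A.value; () })  // starts initializing A
    val t2 = new Thread(() => { B.value; () })  // starts initializing B
    t1.start(); t2.start()
    // If both initializers run concurrently, each waits on the other's class
    // initialization monitor and the joins below may never return.
    t1.join(); t2.join()
  }
}
```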
Resolved by #44602, please ignore.
What changes were proposed in this pull request?
This PR introduces a configuration system for classes that are in sql/api.
Why are the changes needed?
We are moving a number of components into sql/api that rely on confs being set when used with sql/core. The conf system added here gives us the flexibility to do so.
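As a hedged usage sketch of what this enables (the `ansiEnabled` field and the helper below are assumptions for illustration, not code from this PR): code inside sql/api reads the active values through SqlApiConf.get instead of depending on sql/core's SQLConf.

```scala
package org.apache.spark.sql.catalyst.util

import org.apache.spark.sql.internal.SqlApiConf

// Hypothetical helper; SqlApiConf is private[sql], so this pattern only applies
// to code that already lives inside the org.apache.spark.sql packages.
object AnsiModeHelperSketch {
  // Resolves against the current SQLConf in classic Spark, or against the
  // hardcoded defaults when running as a Connect client.
  def failOnOverflow: Boolean = SqlApiConf.get.ansiEnabled
}
```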
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Existing tests.