This repository was archived by the owner on Nov 15, 2024. It is now read-only.

Commit ff93071

dongjoon-hyun authored and MatthewRBruce committed
[SPARK-20256][SQL] SessionState should be created more lazily
## What changes were proposed in this pull request?

`SessionState` is designed to be created lazily. In reality, however, it is created eagerly in `SparkSession.Builder.getOrCreate` ([here](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala#L943)). This PR restores the lazy behavior by storing the builder's options in `initialSessionOptions` and applying them only when the session state is first instantiated. The benefit: users can start `spark-shell` and use RDD operations even when `SessionState` cannot be created, e.g. because the Hive warehouse directory is not readable by the current user, as reported in SPARK-20256.

**BEFORE**

```scala
$ bin/spark-shell
java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder'
...
Caused by: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.security.AccessControlException: Permission denied: user=spark, access=READ, inode="/apps/hive/warehouse":hive:hdfs:drwx------
```

**AFTER**

```scala
$ bin/spark-shell
...
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.3.0-SNAPSHOT
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_112)
Type in expressions to have them evaluated.
Type :help for more information.

scala> sc.range(0, 10, 1).count()
res0: Long = 10
```

## How was this patch tested?

Manual. This closes apache#18512.

Author: Dongjoon Hyun <[email protected]>

Closes apache#18501 from dongjoon-hyun/SPARK-20256.

(cherry picked from commit 1b50e0e)
Signed-off-by: gatorsmile <[email protected]>
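The pattern behind this fix can be illustrated with a minimal, self-contained sketch (these are hypothetical stand-in classes, not Spark's actual `SparkSession`/`SessionState` API): the builder buffers configuration options in a plain map, and the expensive state object is constructed, and the buffered options applied, only when it is first accessed via a `lazy val`.

```scala
// Hypothetical sketch of the eager-buffer / lazy-apply pattern used by this commit.
// Class and method names are illustrative, not Spark's real API.
import scala.collection.mutable

class SessionConf {
  private val settings = mutable.HashMap[String, String]()
  def setConfString(k: String, v: String): Unit = settings(k) = v
  def get(k: String): Option[String] = settings.get(k)
}

class Session {
  // Options recorded eagerly by the builder; applied lazily below.
  private val initialOptions = mutable.HashMap[String, String]()

  def putOption(k: String, v: String): Unit = initialOptions(k) = v

  // The state is only constructed on first access, so a failure inside its
  // constructor (e.g. an unreadable warehouse directory) cannot break
  // session creation itself.
  lazy val state: SessionConf = {
    val conf = new SessionConf
    initialOptions.foreach { case (k, v) => conf.setConfString(k, v) }
    conf
  }
}

object LazySessionDemo extends App {
  val s = new Session
  s.putOption("spark.sql.warehouse.dir", "/tmp/warehouse") // buffered, not applied yet
  println(s.state.get("spark.sql.warehouse.dir"))          // forces creation
}
```

Because `state` is a `lazy val`, a user who never touches SQL functionality never triggers the constructor, which is exactly why RDD-only `spark-shell` sessions succeed after this change.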
1 parent c617986 · commit ff93071

File tree: 1 file changed (+10 −2 lines)

sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala

Lines changed: 10 additions & 2 deletions
```diff
@@ -112,6 +112,12 @@ class SparkSession private(
     existingSharedState.getOrElse(new SharedState(sparkContext))
   }
 
+  /**
+   * Initial options for session. This options are applied once when sessionState is created.
+   */
+  @transient
+  private[sql] val initialSessionOptions = new scala.collection.mutable.HashMap[String, String]
+
   /**
    * State isolated across sessions, including SQL configurations, temporary tables, registered
    * functions, and everything else that accepts a [[org.apache.spark.sql.internal.SQLConf]].
@@ -127,9 +133,11 @@ class SparkSession private(
       parentSessionState
         .map(_.clone(this))
         .getOrElse {
-          SparkSession.instantiateSessionState(
+          val state = SparkSession.instantiateSessionState(
             SparkSession.sessionStateClassName(sparkContext.conf),
             self)
+          initialSessionOptions.foreach { case (k, v) => state.conf.setConfString(k, v) }
+          state
         }
   }
 
@@ -935,7 +943,7 @@ object SparkSession {
       }
 
       session = new SparkSession(sparkContext, None, None, extensions)
-      options.foreach { case (k, v) => session.sessionState.conf.setConfString(k, v) }
+      options.foreach { case (k, v) => session.initialSessionOptions.put(k, v) }
       defaultSession.set(session)
 
       // Register a successfully instantiated context to the singleton. This should be at the
```