
Conversation

@turboFei
Member

@turboFei turboFei commented May 13, 2023

Why are the changes needed?

Currently we check `sparkContext.hadoopConfiguration` to determine whether to apply `HIVE_DELEGATION_TOKEN`.

Here is the method that creates the sparkContext `hadoopConfiguration`. It adds `__spark_hadoop_conf__.xml` as a Hadoop configuration resource:

```
  /**
   * Return an appropriate (subclass) of Configuration. Creating config can initialize some Hadoop
   * subsystems.
   */
  def newConfiguration(conf: SparkConf): Configuration = {
    val hadoopConf = SparkHadoopUtil.newConfiguration(conf)
    hadoopConf.addResource(SparkHadoopUtil.SPARK_HADOOP_CONF_FILE)
    hadoopConf
  }
```

```
  /**
   * Name of the file containing the gateway's Hadoop configuration, to be overlayed on top of the
   * cluster's Hadoop config. It is up to the Spark code launching the application to create
   * this file if it's desired. If the file doesn't exist, it will just be ignored.
   */
  private[spark] val SPARK_HADOOP_CONF_FILE = "__spark_hadoop_conf__.xml"
```
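
As the doc comment notes, a resource added by name is silently skipped when it is absent from the classpath (Hadoop `Configuration` defaults to quiet mode). A minimal sketch of that behavior, assuming a K8s driver where `__spark_hadoop_conf__.xml` was never created:

```
import org.apache.hadoop.conf.Configuration

val conf = new Configuration()
// No-op if the resource is absent from the classpath (quiet mode default),
// so the gateway's Hive settings never reach this configuration on K8s.
conf.addResource("__spark_hadoop_conf__.xml")
// Expected to print null when the overlay file does not exist.
println(conf.get("hive.metastore.uris"))
```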

Per the code, this file is only created in the YARN module.

Spark on YARN

After unzipping `__spark_conf__.zip` in the Spark staging dir, there is a file named `__spark_hadoop_conf__.xml`:

```
 grep hive.metastore.uris  __spark_hadoop_conf__.xml
<property><name>hive.metastore.uris</name><value>thrift://*******:9083</value><source>programatically</source></property>
```

Spark on K8S

It seems that for Spark on K8s there is no file named `__spark_hadoop_conf__.xml`.

We need to check the `hiveConf` instead of the `hadoopConf`.
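
A minimal sketch of the intended check (the helper name and the reliance on `hive.metastore.uris` are illustrative, not the exact Kyuubi code): `HiveConf` loads `hive-site.xml` from the classpath, so it sees the metastore URIs even when `sparkContext.hadoopConfiguration` does not.

```
import org.apache.hadoop.hive.conf.HiveConf

// Hypothetical helper: decide whether a HIVE_DELEGATION_TOKEN is needed by
// consulting HiveConf (which loads hive-site.xml) rather than hadoopConf.
def needHiveDelegationToken(): Boolean = {
  val hiveConf = new HiveConf()
  hiveConf.getTrimmed("hive.metastore.uris", "").nonEmpty
}
```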

How was this patch tested?

  • Add some test cases that check the changes thoroughly including negative and positive cases if possible

  • Add screenshots for manual tests if appropriate

  • Run tests locally before making a pull request

@turboFei turboFei self-assigned this May 13, 2023
@turboFei turboFei requested review from pan3793 and yaooqinn May 13, 2023 04:54
@turboFei turboFei changed the title Using hive conf to check whether HIVE_DELEGATION_TOKEN is needed [K8S] Using hive conf to check whether HIVE_DELEGATION_TOKEN is needed May 13, 2023
@turboFei turboFei changed the title [K8S] Using hive conf to check whether HIVE_DELEGATION_TOKEN is needed [K8S] Using hive conf to check whether to need HIVE_DELEGATION_TOKEN May 13, 2023
@turboFei turboFei changed the title [K8S] Using hive conf to check whether to need HIVE_DELEGATION_TOKEN [K8S] Using hive conf to check whether to apply HIVE_DELEGATION_TOKEN May 13, 2023
@codecov-commenter

Codecov Report

Merging #4835 (7657cbb) into master (1e310a0) will increase coverage by 0.03%.
The diff coverage is 80.00%.

```
@@             Coverage Diff              @@
##             master    #4835      +/-   ##
============================================
+ Coverage     58.03%   58.07%   +0.03%
  Complexity       13       13
============================================
  Files           583      583
  Lines         32561    32570       +9
  Branches       4318     4320       +2
============================================
+ Hits          18897    18914      +17
+ Misses        11845    11836       -9
- Partials       1819     1820       +1
```

| Impacted Files | Coverage Δ |
| --- | --- |
| ...ubi/engine/spark/SparkTBinaryFrontendService.scala | 81.63% <80.00%> (-0.39%) ⬇️ |

... and 6 files with indirect coverage changes


@pan3793
Member

pan3793 commented May 14, 2023

> For Spark on K8s, the `hive-site.xml` is placed into the working directory after the Spark context is launched. See details in apache/spark#37417

Sorry, I don't get your point. With apache/spark#37417, Spark downloads and adds those files during the spark-submit phase, which should happen before SparkContext initialization.

@turboFei
Member Author

turboFei commented May 15, 2023

Not sure why `hive-site.xml` is not loaded.

Anyway, I think `hiveConf` is the correct one to check. @pan3793

@turboFei
Member Author

turboFei commented May 15, 2023

```
  /**
   * Return an appropriate (subclass) of Configuration. Creating config can initialize some Hadoop
   * subsystems.
   */
  def newConfiguration(conf: SparkConf): Configuration = {
    val hadoopConf = SparkHadoopUtil.newConfiguration(conf)
    hadoopConf.addResource(SparkHadoopUtil.SPARK_HADOOP_CONF_FILE)
    hadoopConf
  }
```

```
  /**
   * Name of the file containing the gateway's Hadoop configuration, to be overlayed on top of the
   * cluster's Hadoop config. It is up to the Spark code launching the application to create
   * this file if it's desired. If the file doesn't exist, it will just be ignored.
   */
  private[spark] val SPARK_HADOOP_CONF_FILE = "__spark_hadoop_conf__.xml"
```

It seems that for Spark on K8s there is no file named `__spark_hadoop_conf__.xml`, while on YARN it contains the metastore URIs:

```
 grep hive.metastore.uris  __spark_hadoop_conf__.xml
<property><name>hive.metastore.uris</name><value>thrift://*******:9083</value><source>programatically</source></property>
```
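
A quick way to observe the discrepancy, sketched under the assumption that the driver has `hive-site.xml` on its classpath (hypothetical snippet for `spark-shell`):

```
// Built from SPARK_HADOOP_CONF_FILE, which only exists on YARN,
// so this may be null on K8s.
val fromHadoopConf = spark.sparkContext.hadoopConfiguration.get("hive.metastore.uris")

// HiveConf loads hive-site.xml from the classpath, so this should be set.
val fromHiveConf = new org.apache.hadoop.hive.conf.HiveConf().get("hive.metastore.uris")

println(s"hadoopConf: $fromHadoopConf, hiveConf: $fromHiveConf")
```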

@turboFei
Member Author

@pan3793 I have updated the PR description.

@turboFei turboFei closed this in 6b5c138 May 15, 2023
turboFei added a commit that referenced this pull request May 15, 2023
…ELEGATION_TOKEN

### _Why are the changes needed?_

Currently we check `sparkContext.hadoopConfiguration` to determine whether to apply `HIVE_DELEGATION_TOKEN`.

Here is the method that creates the sparkContext `hadoopConfiguration`. It adds `__spark_hadoop_conf__.xml` as a Hadoop configuration resource:
```
  /**
   * Return an appropriate (subclass) of Configuration. Creating config can initialize some Hadoop
   * subsystems.
   */
  def newConfiguration(conf: SparkConf): Configuration = {
    val hadoopConf = SparkHadoopUtil.newConfiguration(conf)
    hadoopConf.addResource(SparkHadoopUtil.SPARK_HADOOP_CONF_FILE)
    hadoopConf
  }
```

```
  /**
   * Name of the file containing the gateway's Hadoop configuration, to be overlayed on top of the
   * cluster's Hadoop config. It is up to the Spark code launching the application to create
   * this file if it's desired. If the file doesn't exist, it will just be ignored.
   */
  private[spark] val SPARK_HADOOP_CONF_FILE = "__spark_hadoop_conf__.xml"
```
<img width="1091" alt="image" src="https://github.com/apache/kyuubi/assets/6757692/f2a87a23-4565-4164-9eaa-5f7e166519de">

Per the code, this file is only created in the YARN module.

#### Spark on YARN
After unzipping `__spark_conf__.zip` in the Spark staging dir, there is a file named `__spark_hadoop_conf__.xml`:
```
 grep hive.metastore.uris  __spark_hadoop_conf__.xml
<property><name>hive.metastore.uris</name><value>thrift://*******:9083</value><source>programatically</source></property>
```

#### Spark on K8S
It seems that for Spark on K8s there is no file named `__spark_hadoop_conf__.xml`.
<img width="1580" alt="image" src="https://github.com/apache/kyuubi/assets/6757692/99de73d0-3519-4af3-8f0a-90967949ec0e">

<img width="875" alt="image" src="https://github.com/apache/kyuubi/assets/6757692/f7c477a5-23ca-4b25-8638-4b040b78899d">

We need to check the `hiveConf` instead of `hadoopConf`.

### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run tests](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before making a pull request

Closes #4835 from turboFei/hive_token.

Closes #4835

7657cbb [fwang12] hive conf
7c0af67 [fwang12] save

Authored-by: fwang12 <[email protected]>
Signed-off-by: fwang12 <[email protected]>
(cherry picked from commit 6b5c138)
Signed-off-by: fwang12 <[email protected]>
@pan3793 pan3793 added this to the v1.7.2 milestone May 16, 2023