Skip to content

Conversation

@ulysses-you
Copy link
Contributor

@ulysses-you ulysses-you commented Mar 25, 2021

ulysses-you Closes #451 200 17 27 Target Issue Test Plan Feature Powered by Pull Request Badge

Why are the changes needed?

Manual cherry-pick some Spark patch into Kyuubi.

  1. Support query auto timeout cancel on thriftserver
  2. Add config to control if cancel invoke interrupt task on thriftserver

In order to keep backward with early Spark version, we hard code the config key instead of refer to Spark SQLConf.

Note that, the exists timeout of operator (kyuubi.operation.idle.timeout) is to cancel that client has no access with engine. That said if a query run a long time and the client is alive, the query would not be cancelled. Then the new added config spark.sql.thriftServer.queryTimeout can handle this case.

How was this patch tested?

Add new test.

@ulysses-you
Copy link
Contributor Author

cc @pan3793 @yaooqinn


package org.apache.kyuubi.engine.spark

object Constants {
Copy link
Member

@yaooqinn yaooqinn Mar 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should define kyuubi.server.query.timeout at the server-side rather than the engine-side.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Users can't infer time unit from kyuubi.server.query.timeout, and there are some configurations defined in second and others defined in millisecond, should we make the configuration name more clear?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pan3793
I see the point, but the exists conf already named with *.timeout.. How about enhance the kyuubi-defaults.conf.template ? I guess it's useful if we write some common config into it.

@codecov
Copy link

codecov bot commented Mar 25, 2021

Codecov Report

Merging #451 (212f579) into master (cfa2d04) will increase coverage by 0.15%.
The diff coverage is 83.33%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #451      +/-   ##
==========================================
+ Coverage   80.67%   80.82%   +0.15%     
==========================================
  Files         109      109              
  Lines        3871     3907      +36     
  Branches      460      465       +5     
==========================================
+ Hits         3123     3158      +35     
  Misses        504      504              
- Partials      244      245       +1     
Impacted Files Coverage Δ
...uubi/engine/spark/operation/ExecuteStatement.scala 77.19% <68.42%> (-5.31%) ⬇️
...ache/kyuubi/operation/KyuubiOperationManager.scala 60.56% <85.71%> (+2.10%) ⬆️
...ine/spark/operation/SparkSQLOperationManager.scala 91.89% <100.00%> (ø)
...in/scala/org/apache/kyuubi/config/KyuubiConf.scala 93.59% <100.00%> (+0.28%) ⬆️
...org/apache/kyuubi/operation/ExecuteStatement.scala 79.01% <100.00%> (+3.06%) ⬆️
...n/scala/org/apache/kyuubi/service/Serverable.scala 44.44% <0.00%> (-2.23%) ⬇️
...rg/apache/kyuubi/engine/spark/SparkSQLEngine.scala 69.23% <0.00%> (-1.14%) ⬇️
.../scala/org/apache/kyuubi/server/KyuubiServer.scala 47.22% <0.00%> (-0.15%) ⬇️
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cfa2d04...212f579. Read the comment docs.

}
}

test("SPARK-26533: Support query auto timeout cancel on thriftserver - setQueryTimeout") {
Copy link
Member

@yaooqinn yaooqinn Mar 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to move these tests to the superclass to test both the server and engine.

@ulysses-you ulysses-you marked this pull request as draft March 25, 2021 03:00

// If a timeout value `queryTimeout` is specified by users and it is smaller than
// a global timeout value, we use the user-specified value.
// This code follows the Hive timeout behaviour (See #29933 for details).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about replace #29933 by SPARK-26533?

@yaooqinn
Copy link
Member

a local test run is needed when new configs are comming

import org.apache.kyuubi.config.KyuubiConf
import org.apache.kyuubi.operation.JDBCTestUtils

class SparkEngineSuites extends KyuubiFunSuite {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This new test class is for easily test which need modify the global config.

.createWithDefault(true)

val OPERATION_QUERY_TIMEOUT: ConfigEntry[Long] =
buildConf("operation.query.timeout")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Decide to hide the Spark thing that in SparkThriftServer, so define the config with Kyuubi style @pan3793

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Kyuubi style timeConf defined in ms? The comments here is in seconds.

Copy link
Member

@pan3793 pan3793 Mar 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think we don't need to specific time unit in kyuubi timeConf style, user should not care about what we use internal, they just define it via ISO-8601 duration string

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I missed this comment. What @pan3793 pointed out is true, we don't need to tell the unit of a time conf, it's Duration ISO-8601 strings..

Here is a bug at caller side! the client timeout is conf, and the system timeout is ms... @ulysses-you

@ulysses-you ulysses-you marked this pull request as ready for review March 25, 2021 09:31
@ulysses-you
Copy link
Contributor Author

After the refector, provide some summary for reviewer:

  1. Add the query timeout at Kyuubi Server and that can be modified by session global config and statement query tiemout.
  2. Engine side just receive the query timeout and to do the timeout check.
  3. Add a interrupt flag to Spark local property with true default that is also a session global config.
  4. In order to add engine test in one class eaily, add a new class IndividualSparkSuite and change some related test code place.

@pan3793 pan3793 added the kind:feature Feature request label Mar 25, 2021
@pan3793 pan3793 added this to the v1.2.0 milestone Mar 25, 2021
Copy link
Member

@yaooqinn yaooqinn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, if the test flakiness fixed

@yaooqinn yaooqinn closed this in fecdba3 Mar 26, 2021
yaooqinn pushed a commit that referenced this pull request Mar 26, 2021
![ulysses-you](https://badgen.net/badge/Hello/ulysses-you/green) [![Closes #451](https://badgen.net/badge/Preview/Closes%20%23451/blue)](https://github.com/yaooqinn/kyuubi/pull/451) ![200](https://badgen.net/badge/%2B/200/red) ![17](https://badgen.net/badge/-/17/green) ![27](https://badgen.net/badge/commits/27/yellow) ![Target Issue](https://badgen.net/badge/Missing/Target%20Issue/ff0000) ![Test Plan](https://badgen.net/badge/Missing/Test%20Plan/ff0000) ![Feature](https://badgen.net/badge/Label/Feature/) [<img width="16" alt="Powered by Pull Request Badge" src="https://user-images.githubusercontent.com/1393946/111216524-d2bb8e00-85d4-11eb-821b-ed4c00989c02.png">](https://pullrequestbadge.com/?utm_medium=github&utm_source=yaooqinn&utm_campaign=badge_info)<!-- PR-BADGE: PLEASE DO NOT REMOVE THIS COMMENT -->

<!--
Thanks for sending a pull request!

Here are some tips for you:
  1. If this is your first time, please read our contributor guidelines: https://kyuubi.readthedocs.io/en/latest/community/contributions.html
  2. If the PR is related to an issue in https://github.com/yaooqinn/kyuubi/issues, add '[KYUUBI #XXXX]' in your PR title, e.g., '[KYUUBI #XXXX] Your PR title ...'.
  3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][KYUUBI #XXXX] Your PR title ...'.
-->

### _Why are the changes needed?_
<!--
Please clarify why the changes are needed. For instance,
  1. If you add a feature, you can talk about the use case of it.
  2. If you fix a bug, you can clarify why it is a bug.
-->
Manual cherry-pick some Spark patch into Kyuubi.
1. [Support query auto timeout cancel on thriftserver](apache/spark#29933)
2. [Add config to control if cancel invoke interrupt task on thriftserver](apache/spark#30481)

In order to keep backward with early Spark version, we hard code the config key instead of refer to Spark SQLConf.

Note that, the exists timeout of operator (`kyuubi.operation.idle.timeout`) is to cancel that client has no access with engine. That said if a query run a long time and the client is alive, the query would not be cancelled. Then the new added config `spark.sql.thriftServer.queryTimeout` can handle this case.

### _How was this patch tested?_
Add new test.

Closes #451 from ulysses-you/query-timeout.

212f579 [ulysses-you] docs
9206538 [ulysses-you] empty flaky test
ddab9bf [ulysses-you] flaty test
1da02a0 [ulysses-you] flaty test
edfadf1 [ulysses-you] nit
3f9920b [ulysses-you] address comment
9492c48 [ulysses-you] correct timeout
5df997e [ulysses-you] nit
2124952 [ulysses-you] address comment
192fdcc [ulysses-you] fix tets
d684af6 [ulysses-you] global config
1d1adda [ulysses-you] empty
967a63e [ulysses-you] correct import
128948e [ulysses-you] add session conf in session
144d51b [ulysses-you] fix
a90248b [ulysses-you] unused import
c90386f [ulysses-you] timeout move to operation manager
d780965 [ulysses-you] update docs
a5f7138 [ulysses-you] fix test
f7c7308 [ulysses-you] config name
7f3fb3d [ulysses-you] change conf place
97a011e [ulysses-you] unnecessary change
0953a76 [ulysses-you] move test
38ac0c0 [ulysses-you] Merge branch 'master' of https://github.com/yaooqinn/kyuubi into query-timeout
71bea97 [ulysses-you] refector implementation
35ef6f9 [ulysses-you] update conf
0cad8e2 [ulysses-you] Support query auto timeout cancel on thriftserver

Authored-by: ulysses-you <[email protected]>
Signed-off-by: Kent Yao <[email protected]>
(cherry picked from commit fecdba3)
Signed-off-by: Kent Yao <[email protected]>
@yaooqinn
Copy link
Member

thanks, merged to master for v1.2.0 and branch 1.1 for v1.1.1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind:feature Feature request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants