Skip to content

Conversation

@wangyum
Copy link
Member

@wangyum wangyum commented Sep 20, 2019

What changes were proposed in this pull request?

This PR enable ThriftServerQueryTestSuite and fix previously flaky test by:

  1. Start thriftserver in beforeAll().
  2. Disable spark.sql.hive.thriftServer.async.

Why are the changes needed?

Improve test coverage.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

build/sbt "hive-thriftserver/test-only *.ThriftServerQueryTestSuite "  -Phive-thriftserver
build/mvn -Dtest=none -DwildcardSuites=org.apache.spark.sql.hive.thriftserver.ThriftServerQueryTestSuite test  -Phive-thriftserver

@SparkQA
Copy link

SparkQA commented Sep 20, 2019

Test build #111064 has finished for PR 25868 at commit 2a422ea.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum
Copy link
Member Author

wangyum commented Sep 20, 2019

retest this please

@SparkQA
Copy link

SparkQA commented Sep 20, 2019

Test build #111066 has finished for PR 25868 at commit 2a422ea.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

cc @zsxwing

@SparkQA
Copy link

SparkQA commented Sep 22, 2019

Test build #111144 has finished for PR 25868 at commit 0b75d5b.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 22, 2019

Test build #111146 has finished for PR 25868 at commit 1fede27.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum
Copy link
Member Author

wangyum commented Sep 22, 2019

retest this please

@SparkQA
Copy link

SparkQA commented Sep 22, 2019

Test build #111160 has finished for PR 25868 at commit eeda699.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum wangyum changed the title [WIP][SPARK-28527] Enable ThriftServerQueryTestSuite [SPARK-28527][test-maven] Enable ThriftServerQueryTestSuite Sep 22, 2019
@wangyum
Copy link
Member Author

wangyum commented Sep 22, 2019

retest this please

@wangyum wangyum changed the title [SPARK-28527][test-maven] Enable ThriftServerQueryTestSuite [WIP][SPARK-28527][test-hadoop3.2][test-java11] Enable ThriftServerQueryTestSuite Sep 22, 2019
@wangyum
Copy link
Member Author

wangyum commented Sep 22, 2019

retest this please

@wangyum wangyum changed the title [WIP][SPARK-28527][test-hadoop3.2][test-java11] Enable ThriftServerQueryTestSuite [WIP][SPARK-28527][SQL][test-hadoop3.2] Enable ThriftServerQueryTestSuite Sep 22, 2019
@wangyum
Copy link
Member Author

wangyum commented Sep 22, 2019

retest this please

@SparkQA
Copy link

SparkQA commented Sep 22, 2019

Test build #111155 has finished for PR 25868 at commit 1fede27.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 22, 2019

Test build #111162 has finished for PR 25868 at commit eeda699.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 22, 2019

Test build #111161 has finished for PR 25868 at commit eeda699.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum
Copy link
Member Author

wangyum commented Sep 22, 2019

retest this please

@wangyum wangyum changed the title [WIP][SPARK-28527][SQL][test-hadoop3.2] Enable ThriftServerQueryTestSuite [SPARK-28527][SQL] Enable ThriftServerQueryTestSuite Sep 22, 2019
@wangyum
Copy link
Member Author

wangyum commented Sep 22, 2019

retest this please

@SparkQA
Copy link

SparkQA commented Sep 22, 2019

Test build #111169 has finished for PR 25868 at commit eeda699.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 22, 2019

Test build #111170 has finished for PR 25868 at commit eeda699.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum
Copy link
Member Author

wangyum commented Sep 22, 2019

retest this please

2 similar comments
@wangyum
Copy link
Member Author

wangyum commented Sep 22, 2019

retest this please

@wangyum
Copy link
Member Author

wangyum commented Sep 22, 2019

retest this please

@SparkQA
Copy link

SparkQA commented Sep 23, 2019

Test build #111174 has finished for PR 25868 at commit eeda699.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 23, 2019

Test build #111179 has finished for PR 25868 at commit 2eac612.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum
Copy link
Member Author

wangyum commented Sep 23, 2019

retest this please

@SparkQA
Copy link

SparkQA commented Sep 23, 2019

Test build #111183 has finished for PR 25868 at commit 2eac612.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum
Copy link
Member Author

wangyum commented Sep 23, 2019

retest this please

@SparkQA
Copy link

SparkQA commented Sep 23, 2019

Test build #111184 has finished for PR 25868 at commit 2eac612.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum wangyum changed the title [SPARK-28527][SQL] Enable ThriftServerQueryTestSuite [SPARK-28527][SQL][test-hadoop3.2][test-java11] Enable ThriftServerQueryTestSuite Sep 23, 2019
@wangyum
Copy link
Member Author

wangyum commented Sep 23, 2019

retest this please

@SparkQA
Copy link

SparkQA commented Sep 23, 2019

Test build #111186 has finished for PR 25868 at commit 2eac612.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum
Copy link
Member Author

wangyum commented Sep 23, 2019

retest this please

@SparkQA
Copy link

SparkQA commented Sep 23, 2019

Test build #111234 has finished for PR 25868 at commit 2eac612.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member

So, this adds 9 min 35 seconds additionally?

@wangyum
Copy link
Member Author

wangyum commented Sep 24, 2019

Yes. It will add 9 min 35 seconds for maven testing.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-28527][SQL][test-hadoop3.2][test-java11] Enable ThriftServerQueryTestSuite [SPARK-28527][SQL] Enable ThriftServerQueryTestSuite Sep 24, 2019
Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM.

@dongjoon-hyun
Copy link
Member

Thank you, @wangyum . Please merge this after PST midnight to avoid Jenkins reset.

@wangyum wangyum changed the title [SPARK-28527][SQL] Enable ThriftServerQueryTestSuite [SPARK-28527][SQL][TEST] Enable ThriftServerQueryTestSuite Sep 24, 2019
@wangyum wangyum closed this in b8b67ae Sep 24, 2019
@wangyum wangyum deleted the SPARK-28527-enable branch September 24, 2019 07:45
@wangyum
Copy link
Member Author

wangyum commented Sep 24, 2019

Merged to master.

@wangyum
Copy link
Member Author

wangyum commented Sep 24, 2019

Thank you @HyukjinKwon @dongjoon-hyun

override def sparkConf: SparkConf = super.sparkConf
// Hive Thrift server should not executes SQL queries in an asynchronous way
// because we may set session configuration.
.set(HiveUtils.HIVE_THRIFT_SERVER_ASYNC, false)
Copy link
Member

@zsxwing zsxwing Sep 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean this is broken and the user should also turn if off? If so, should we change the default value? Otherwise, our tests are actually testing something that's rarely used.

Hive Thrift server should not executes SQL queries in an asynchronous way because we may set session configuration.

Could you clarify what's the exact issue? Is it because the background thread is missing some thread-local variables because threads are reused? Can we copy them from the parent thread here? https://github.com/apache/spark/blob/master/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala#L186

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ur, this suite is not aiming concurrency stress test. It's just targeting SQL execution one by one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it exposes some bugs in the default mode. Seems worth to fix the bug.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me, that doesn't imply a bug in the default mode. That means we want to run one by one simply in this test suite.

Copy link
Member

@dongjoon-hyun dongjoon-hyun Sep 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course, if there is a bug, definitely we should fix it. But, let's not enable that in this test suite. That is completely a separate issue, isn't it? With a separate UT and a patch, that should be handled.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With HiveUtils.HIVE_THRIFT_SERVER_ASYNC enabled the Thriftserver will still execute queries one by one. The difference is that it will not block the request:

  • With HIVE_THRIFT_SERVER_ASYNC=false client sends a query in an TExecuteStatementReq. The query executes, and only after it finishes the server responds with a TExecuteStatementResp. Then the client calls TGetOperationStatusReq to see if the result was a success or failure, and then potentially continues fetching results...
  • With HIVE_THRIFT_SERVER_ASYNC=true client sends a query in an TExecuteStatementReq, and the server starts it in a background thread and immediately returns a handle in the response. Then the client periodically polls with TGetOperationStatusReq until the query is finished, an then potentailly continues fetching results...

In both cases, the Hive JDBC driver executes one query at once and there is no concurrency.

I think this setting does not need to be set.

Copy link
Contributor

@juliuszsompolski juliuszsompolski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wangyum late post-review as I only now noticed this PR.

override def sparkConf: SparkConf = super.sparkConf
// Hive Thrift server should not executes SQL queries in an asynchronous way
// because we may set session configuration.
.set(HiveUtils.HIVE_THRIFT_SERVER_ASYNC, false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With HiveUtils.HIVE_THRIFT_SERVER_ASYNC enabled the Thriftserver will still execute queries one by one. The difference is that it will not block the request:

  • With HIVE_THRIFT_SERVER_ASYNC=false client sends a query in an TExecuteStatementReq. The query executes, and only after it finishes the server responds with a TExecuteStatementResp. Then the client calls TGetOperationStatusReq to see if the result was a success or failure, and then potentially continues fetching results...
  • With HIVE_THRIFT_SERVER_ASYNC=true client sends a query in an TExecuteStatementReq, and the server starts it in a background thread and immediately returns a handle in the response. Then the client periodically polls with TGetOperationStatusReq until the query is finished, an then potentailly continues fetching results...

In both cases, the Hive JDBC driver executes one query at once and there is no concurrency.

I think this setting does not need to be set.


override def beforeEach(): Unit = {
override def beforeAll(): Unit = {
super.beforeAll()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In an earlier PR I commented that the flakiness may be because of async issues.
I meant that calling startThriftServer actually starts some things asynchronously in the background before the server becomes ready.
Moving it to beforeAll instead of beforeEach should help the flakiness by having this race only at the start of the suite and not before every test, but I think adding a 3 or 5 s sleep would make sure it never happens.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I seem to be hitting it in the first test of the suite quite often:

[info] ThriftServerQueryTestSuite:
[info] - query_regex_column.sql *** FAILED *** (184 milliseconds)
[info]   java.sql.SQLException: Could not open client transport with JDBC Uri: jdbc:hive2://localhost:15421: java.net.ConnectException: Connection refused (Connection refused)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pgSQL/text.sql should fail if spark.sql.hive.thriftServer.async is enabled: #25567 (comment)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway, let's try to enable it: #26172

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants