Skip to content

Conversation

@wangyum
Copy link
Member

@wangyum wangyum commented Oct 19, 2019

What changes were proposed in this pull request?

This PR test ThriftServerQueryTestSuite in an asynchronous way.

Why are the changes needed?

The default value of spark.sql.hive.thriftServer.async is true.

Does this PR introduce any user-facing change?

No

How was this patch tested?

build/sbt "hive-thriftserver/test-only *.ThriftServerQueryTestSuite" -Phive-thriftserver
build/mvn -Dtest=none -DwildcardSuites=org.apache.spark.sql.hive.thriftserver.ThriftServerQueryTestSuite test -Phive-thriftserver

@SparkQA
Copy link

SparkQA commented Oct 19, 2019

Test build #112309 has finished for PR 26172 at commit b3a26e1.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member

cc @zsxwing since he also asked this before.

The way I see this is that a new extended test coverage.

@wangyum
Copy link
Member Author

wangyum commented Oct 20, 2019

It seems the reason is SQLConf.get usage in ParseDriver.scala#L91-L103. We can reproduce this issue after this commit:

build/sbt "hive-thriftserver/test-only *.ThriftServerQueryTestSuite -- -z postgreSQL"  -Phive-thriftserver

The output is:

================================================================================================
statement: set spark.sql.ansi.enabled=false
spark.sql.ansi.enabled before: true
SQLConf.get.ansiEnabled: true
spark.sql.ansi.enabled after: false
================================================================================================
statement: select i, left('ahoj', i), right('ahoj', i) from range(-5, 6) t(i) order by i
spark.sql.ansi.enabled before: false
SQLConf.get.ansiEnabled: true
================================================================================================
statement: set spark.sql.ansi.enabled=true
spark.sql.ansi.enabled before: false
SQLConf.get.ansiEnabled: true
spark.sql.ansi.enabled after: true
[info] - postgreSQL/text.sql *** FAILED *** (1 second, 489 milliseconds)
[info]   "-1            
[info]   -2             
[info]   -3             
[info]   -4             
[info]   -5             
[info]   0              
[info]   1      a       j
[info]   2      ah      oj
[info]   3      aho     hoj
[info]   4      ahoj    ahoj
[info]   5      ahoj    ahoj" did not contain "Exception" Exception did not match for query #19
[info]   select i, left('ahoj', i), right('ahoj', i) from range(-5, 6) t(i) order by i, expected: -1            
[info]   -2             
[info]   -3             
[info]   -4             
[info]   -5             
[info]   0              
[info]   1      a       j
[info]   2      ah      oj
[info]   3      aho     hoj
[info]   4      ahoj    ahoj
[info]   5      ahoj    ahoj, but got: java.sql.SQLException
[info]   Error running query: org.apache.spark.sql.catalyst.parser.ParseException: 
[info]   no viable alternative at input 'left'(line 1, pos 10)
[info]   
[info]   == SQL ==
[info]   select i, left('ahoj', i), right('ahoj', i) from range(-5, 6) t(i) order by i
[info]   ----------^^^ (ThriftServerQueryTestSuite.scala:225)

@wangyum
Copy link
Member Author

wangyum commented Oct 20, 2019

@SparkQA
Copy link

SparkQA commented Oct 20, 2019

Test build #112325 has finished for PR 26172 at commit 5c896ce.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AngersZhuuuu
Copy link
Contributor

AngersZhuuuu commented Oct 21, 2019

This error happen because in SparkSession.sql() method, parse process is not under current SparkSession's conf. So some configuration about parser such as spark.sql.ansi.enable get failed in thread level

lexer.legacy_setops_precedence_enbled = SQLConf.get.setOpsPrecedenceEnforced

@SparkQA
Copy link

SparkQA commented Oct 21, 2019

Test build #112367 has finished for PR 26172 at commit 8fd1efd.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum
Copy link
Member Author

wangyum commented Oct 21, 2019

retest this please

* @since 2.0.0
*/
def sql(sqlText: String): DataFrame = {
SparkSession.setActiveSession(this)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To keep it in scope of thriftserver, could we set it in SparkExecuteStatementOperation.execute, using the sqlContext available there?
I'm not sure what other implications it can have here, it would be safer to set there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To keep it in scope of thriftserver, could we set it in SparkExecuteStatementOperation.execute, using the sqlContext available there?
I'm not sure what other implications it can have here, it would be safer to set there.

It's ok to set in SparkExecuteStatementOperation.execute.
#26187 (comment)
I think it ok to set this before parse sql, It was the right thing to do

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Let the reviewers of #26187 decide it there. (@cloud-fan @zsxwing )

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the issue with async queries was that the parser was using the wrong dialect fallback, because it was picking the wrong SQLConf, because it was not set in the background execution thread?
@wangyum @AngersZhuuuu Does fixing this potentially fix any other of the failures in ThriftServerQueryTestSuite that are currently blacklisted to be ignored?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@juliuszsompolski Other failures can't be fixed by #26187:

// Missing UDF
"postgreSQL/boolean.sql",
"postgreSQL/case.sql",
// SPARK-28624
"date.sql",
// SPARK-28620
"postgreSQL/float4.sql",
// SPARK-28636
"decimalArithmeticOperations.sql",
"literals.sql",
"subquery/scalar-subquery/scalar-subquery-predicate.sql",
"subquery/in-subquery/in-limit.sql",
"subquery/in-subquery/in-group-by.sql",
"subquery/in-subquery/simple-in.sql",
"subquery/in-subquery/in-order-by.sql",
"subquery/in-subquery/in-set-operations.sql"

@SparkQA
Copy link

SparkQA commented Oct 21, 2019

Test build #112369 has finished for PR 26172 at commit 8fd1efd.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

cloud-fan pushed a commit that referenced this pull request Oct 22, 2019
### What changes were proposed in this pull request?
As I have comment in  [SPARK-29516](#26172 (comment))
SparkSession.sql() method parse process not under current sparksession's conf, so some configuration about parser is not valid in multi-thread situation.

In this pr, we add a SQLConf parameter to AbstractSqlParser and initial it with SessionState's conf.
Then for each SparkSession's parser process. It will use's it's own SessionState's SQLConf and to be thread safe

### Why are the changes needed?
Fix bug

### Does this PR introduce any user-facing change?
NO

### How was this patch tested?
NO

Closes #26187 from AngersZhuuuu/SPARK-29530.

Authored-by: angerszhu <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
@SparkQA
Copy link

SparkQA commented Oct 22, 2019

Test build #112415 has finished for PR 26172 at commit 8f08b3d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • abstract class AbstractSqlParser(conf: SQLConf) extends ParserInterface with Logging
  • class CatalystSqlParser(conf: SQLConf) extends AbstractSqlParser(conf)
  • class SparkSqlParser(conf: SQLConf) extends AbstractSqlParser(conf)

@SparkQA
Copy link

SparkQA commented Oct 22, 2019

Test build #112416 has finished for PR 26172 at commit f291ed4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Oct 22, 2019

Test build #112430 has finished for PR 26172 at commit 175c6f4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class BitAndAgg(child: Expression) extends DeclarativeAggregate with ExpectsInputTypes
  • case class BitOrAgg(child: Expression) extends DeclarativeAggregate with ExpectsInputTypes
  • case class RepairTableStatement(tableName: Seq[String]) extends ParsedStatement

@wangyum wangyum changed the title [WIP][SPARK-29516][SQL][TEST] Test ThriftServerQueryTestSuite asynchronously [SPARK-29516][SQL][TEST][test-maven] Test ThriftServerQueryTestSuite asynchronously Oct 22, 2019
@wangyum
Copy link
Member Author

wangyum commented Oct 22, 2019

retest this please

@SparkQA
Copy link

SparkQA commented Oct 22, 2019

Test build #112435 has finished for PR 26172 at commit 175c6f4.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class BitAndAgg(child: Expression) extends DeclarativeAggregate with ExpectsInputTypes
  • case class BitOrAgg(child: Expression) extends DeclarativeAggregate with ExpectsInputTypes
  • case class RepairTableStatement(tableName: Seq[String]) extends ParsedStatement

Copy link
Contributor

@juliuszsompolski juliuszsompolski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@wangyum wangyum changed the title [SPARK-29516][SQL][TEST][test-maven] Test ThriftServerQueryTestSuite asynchronously [SPARK-29516][SQL][TEST] Test ThriftServerQueryTestSuite asynchronously Oct 22, 2019
@wangyum wangyum closed this in 3163b6b Oct 22, 2019
@wangyum wangyum deleted the SPARK-29516 branch October 22, 2019 10:21
@wangyum
Copy link
Member Author

wangyum commented Oct 22, 2019

Thank you all!

@wangyum
Copy link
Member Author

wangyum commented Oct 22, 2019

Merged to master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants