@peter-toth
Contributor

@peter-toth peter-toth commented Oct 4, 2019

What changes were proposed in this pull request?

This PR makes two changes to exception handling in SQLQueryTestSuite and ThriftServerQueryTestSuite:

  • fixes an expected-output sorting issue in ThriftServerQueryTestSuite: if a query throws an exception, the expected output must not be sorted
  • introduces common exception handling in both suites with a new handleExceptions method

Why are the changes needed?

Currently ThriftServerQueryTestSuite passes on master, but it fails on one of my PRs (#23531) with this error (https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/111651/testReport/org.apache.spark.sql.hive.thriftserver/ThriftServerQueryTestSuite/sql_3/):

org.scalatest.exceptions.TestFailedException: Expected "
[Recursion level limit 100 reached but query has not exhausted, try increasing spark.sql.cte.recursion.level.limit
org.apache.spark.SparkException]
", but got "
[org.apache.spark.SparkException
Recursion level limit 100 reached but query has not exhausted, try increasing spark.sql.cte.recursion.level.limit]
" Result did not match for query #4 WITH RECURSIVE r(level) AS (   VALUES (0)   UNION ALL   SELECT level + 1 FROM r ) SELECT * FROM r

The unexpected reversed order of expected output (error message comes first, then the exception class) is due to this line: https://github.com/apache/spark/pull/26028/files#diff-b3ea3021602a88056e52bf83d8782de8L146. It should not sort the expected output if there was an error during execution.
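
A minimal sketch of why sorting reverses the two lines, assuming the output of a failed query is held as a Seq[String] with the exception class first and the message second. The names SortOrderSketch and expectedLines, and the failed flag, are hypothetical and not taken from the test suites; how a suite actually detects a failed query is not shown here.

```scala
object SortOrderSketch {
  // Output lines recorded for the failing query quoted above:
  // exception class first, then the error message.
  val errorOutput: Seq[String] = Seq(
    "org.apache.spark.SparkException",
    "Recursion level limit 100 reached but query has not exhausted, " +
      "try increasing spark.sql.cte.recursion.level.limit")

  // Hypothetical helper: sort only the output of successful queries.
  // `failed` is an assumed flag supplied by the caller.
  def expectedLines(lines: Seq[String], failed: Boolean): Seq[String] =
    if (failed) lines else lines.sorted

  def main(args: Array[String]): Unit = {
    // Default String ordering compares characters, and 'R' sorts before 'o',
    // so sorting moves the message ahead of the class name.
    println(errorOutput.sorted)
    // Leaving a failed query's output unsorted keeps the original order.
    println(expectedLines(errorOutput, failed = true))
  }
}
```

Sorting stays useful for successful queries whose row order is not deterministic; the sketch only skips it when the output describes an error.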

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Existing UTs.

@peter-toth
Contributor Author

Besides thinking this change is useful in itself, I ran into some UT failures while testing #23531 that this PR fixes.

@peter-toth
Contributor Author

cc @wangyum

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-29359] Better exception handling in SQLQueryTestSuite and ThriftServerQueryTestSuite [SPARK-29359][SQL] Better exception handling in SQLQueryTestSuite and ThriftServerQueryTestSuite Oct 4, 2019
@dongjoon-hyun dongjoon-hyun changed the title [SPARK-29359][SQL] Better exception handling in SQLQueryTestSuite and ThriftServerQueryTestSuite [SPARK-29359][SQL][TESTS] Better exception handling in (SQL|ThriftServer)QueryTestSuite Oct 4, 2019
select 30 day day
-- !query 22 schema
struct<>

Member

@dongjoon-hyun dongjoon-hyun Oct 4, 2019

Sorry, but let's keep the original form. Most of the changes are due to this, but the first contribution seems to be on the edge.

Contributor Author

@peter-toth peter-toth Oct 5, 2019

Thanks @dongjoon-hyun for the review. Let me try to convince you that these changes make sense; if you still disagree, just let me know and I will drop them.

The only cases where I replaced the expected schema from struct<> to nothing are those where some error occurs and an exception is thrown. In those cases there is no data returned, so there is no schema at all, not even an empty struct.

-- !query 22
select 30 day day
-- !query 22 schema

-- !query 22 output
org.apache.spark.sql.catalyst.parser.ParseException

IMHO struct<> makes sense where a statement was successful but data returned is empty and there are no columns in it. In those cases I left the expected output intact.

-- !query 0
CREATE OR REPLACE TEMPORARY VIEW view1 AS SELECT 2 AS i1
-- !query 0 schema
struct<>
-- !query 0 output

An empty expected schema also makes it easy to recognize a statement that ended in an error. Otherwise we would probably need to check whether the output contains an exception, which seems less elegant, or whether the schema is struct<> while the output is non-empty, which seems less intuitive.
I used the empty expected schema to fix a sorting issue in ThriftServerQueryTestSuite: https://github.com/apache/spark/pull/26028/files#diff-b3ea3021602a88056e52bf83d8782de8R147, and there might be other cases in the future where this change could help.

Member

I fully understand that, but it's not worth this huge change.

Contributor Author

All right, I've dropped that part of the changes.

@SparkQA

SparkQA commented Oct 5, 2019

Test build #111787 has finished for PR 26028 at commit 2e1c235.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@dongjoon-hyun dongjoon-hyun left a comment

I'll review this again after the removal of the first one. Thanks.

@peter-toth peter-toth force-pushed the SPARK-29359-better-exception-handling branch from 2e1c235 to 58e1cf1 Compare October 6, 2019 17:46
@peter-toth
Contributor Author

I'll review this again after the removal of the first one. Thanks.

Ok, thanks. I removed the first one.

@dongjoon-hyun
Member

Thank you for updating, @peter-toth !

@dongjoon-hyun
Member

@peter-toth, the rest of the contribution is a kind of preventive approach, and there is no change in the currently generated result, right?

Why are the changes needed?

For more robust exception handling.

Member

@dongjoon-hyun dongjoon-hyun left a comment

Could you give us a more concrete example of when this PR becomes meaningful? For now, it does not seem to be urgently required.

@peter-toth
Contributor Author

Could you give us a more concrete example of when this PR becomes meaningful? For now, it does not seem to be urgently required.

Currently ThriftServerQueryTestSuite passes on master, but it fails on one of my PRs (#23531) with this error (https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/111651/testReport/org.apache.spark.sql.hive.thriftserver/ThriftServerQueryTestSuite/sql_3/):

org.scalatest.exceptions.TestFailedException: Expected "
[Recursion level limit 100 reached but query has not exhausted, try increasing spark.sql.cte.recursion.level.limit
org.apache.spark.SparkException]
", but got "
[org.apache.spark.SparkException
Recursion level limit 100 reached but query has not exhausted, try increasing spark.sql.cte.recursion.level.limit]
" Result did not match for query #4 WITH RECURSIVE r(level) AS (   VALUES (0)   UNION ALL   SELECT level + 1 FROM r ) SELECT * FROM r

The unexpected reversed order of expected output (error message comes first, then the exception class) is due to this line: https://github.com/apache/spark/pull/26028/files#diff-b3ea3021602a88056e52bf83d8782de8L146. It should not sort the expected output if there was an error during execution.

The other changes belong to the second point, a minor improvement that handles exceptions in a common place. There is no change in the generated expected results.

@SparkQA

SparkQA commented Oct 6, 2019

Test build #111821 has finished for PR 26028 at commit 58e1cf1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member

dongjoon-hyun commented Oct 7, 2019

Got it. Thanks, @peter-toth ! I updated the second section of the PR description with your example.

case _ => plan.children.iterator.exists(isSorted)
}

protected def handleExceptions(result: => (String, Seq[String])): (String, Seq[String]) = {
Member

Could you add a function description because we override this differently?

  • SQLQueryTestSuite seems to return (struct<>, ...)
  • ThriftServerQueryTestSuite seems to return ("", answer.sorted)

Contributor Author

No, both return a (String, Seq[String]) tuple where the first element is the schema and the second is the result. Since it's impossible to get the exact Spark schema back from a java.sql.ResultSet, we use an empty string in ThriftServerQueryTestSuite.
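
A minimal sketch of a shared handler with the signature shown in the diff, assuming a failed query is reported as the exception class name followed by its message. The object name, the emptySchema string, and the catch-all NonFatal case are illustrative assumptions; the method in the PR treats some exception types specially (for example the "#\\d+" replacement visible in the diff), and an override may return a different schema value such as an empty string.

```scala
import scala.util.control.NonFatal

object HandleExceptionsSketch {
  // Assumed placeholder schema for failed queries; a suite that cannot
  // recover a schema could use "" here instead.
  val emptySchema: String = "struct<>"

  // `result` is a by-name parameter, so the query runs inside the try block
  // and any exception it throws is converted into a (schema, output) tuple.
  def handleExceptions(result: => (String, Seq[String])): (String, Seq[String]) =
    try result
    catch {
      case NonFatal(e) =>
        // Exception class first, message second, and no sorting applied.
        (emptySchema, Seq(e.getClass.getName, e.getMessage))
    }

  def main(args: Array[String]): Unit = {
    println(handleExceptions(("struct<i1:int>", Seq("2"))))
    println(handleExceptions(throw new IllegalStateException("boom")))
  }
}
```

Because both the success and failure paths return the same tuple shape, callers can compare schema and output without their own try/catch.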

Contributor Author

I've added a description to it and to its override.

  // with a generic pattern "###".
  val msg = if (a.plan.nonEmpty) a.getSimpleMessage else a.getMessage
- (StructType(Seq.empty), Seq(a.getClass.getName, msg.replaceAll("#\\d+", "#x")))
+ (emptySchema, Seq(a.getClass.getName, msg.replaceAll("#\\d+", "#x")))
Member

Could you add a test case for which this is required?

Contributor Author

No particular test case. Since I touched this method and StructType(Seq.empty) was used 3 times, I just moved it to a val.

@dongjoon-hyun
Member

Also, cc @wangyum .

@SparkQA

SparkQA commented Oct 7, 2019

Test build #111852 has finished for PR 26028 at commit 019e37f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@peter-toth
Contributor Author

peter-toth commented Oct 10, 2019

@dongjoon-hyun @wangyum do you think this PR is ok now?

Member

@wangyum wangyum left a comment

LGTM

@dongjoon-hyun
Member

Hi, @wangyum . You can merge this after manual testing.
Or, we can wait until our Jenkins is back again.

@wangyum
Member

wangyum commented Oct 12, 2019

retest this please

@SparkQA

SparkQA commented Oct 12, 2019

Test build #111951 has finished for PR 26028 at commit 019e37f.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum
Member

wangyum commented Oct 12, 2019

retest this please

@SparkQA

SparkQA commented Oct 12, 2019

Test build #111958 has finished for PR 26028 at commit 019e37f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum wangyum closed this in 9e12c94 Oct 13, 2019
@wangyum
Member

wangyum commented Oct 13, 2019

Thank you @peter-toth @dongjoon-hyun

@wangyum
Member

wangyum commented Oct 13, 2019

Merged to master.

@peter-toth
Contributor Author

Thank you @dongjoon-hyun and @wangyum for the review.
