-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-29359][SQL][TESTS] Better exception handling in (SQL|ThriftServer)QueryTestSuite #26028
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-29359][SQL][TESTS] Better exception handling in (SQL|ThriftServer)QueryTestSuite #26028
Conversation
|
Besides I think this change is useful, I run into some UT failures while testing #23531 which can be fixed by this PR. |
|
cc @wangyum |
| select 30 day day | ||
| -- !query 22 schema | ||
| struct<> | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, but let's keep the original form. Most of the changes are due to this, but the first contribution seems to be on the edge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @dongjoon-hyun for the review, let me try to convince you that these changes make sense, but if you still disagree just let me know and I will drop them.
The only cases where I replaced the expected schema from struct<> to nothing are those where some error occurs and an exception is thrown. In those cases there is no data returned, so there is no schema at all, not even an empty struct.
-- !query 22
select 30 day day
-- !query 22 schema
-- !query 22 output
org.apache.spark.sql.catalyst.parser.ParseException
IMHO struct<> makes sense where a statement was successful but data returned is empty and there are no columns in it. In those cases I left the expected output intact.
-- !query 0
CREATE OR REPLACE TEMPORARY VIEW view1 AS SELECT 2 AS i1
-- !query 0 schema
struct<>
-- !query 0 output
Empty expected schema is also useful to easily recognize a statement that ended up in an error (otherwise we probably need to check the output for containing exception which seems less elegant, or the schema containing struct<> and output being non-empty which seems less intuitive).
I utilized empty expected schema to fix a sorting issue in ThriftServerQueryTestSuite: https://github.com/apache/spark/pull/26028/files#diff-b3ea3021602a88056e52bf83d8782de8R147, and there might be other cases in the future where this change could help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fully understood that, but it's not worth of this huge change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All right, I've dropped that part of changes.
|
Test build #111787 has finished for PR 26028 at commit
|
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll review this again after the removal of the first one. Thanks.
…ftServerQueryTestSuite
2e1c235 to
58e1cf1
Compare
Ok, thanks. I removed the first one. |
|
Thank you for updating, @peter-toth ! |
|
@peter-toth . For the rest of the contribution, they are a kind of preventive approach and there is no change in the current generated result, right?
|
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you give us more concrete example when this PR becomes more meaningful? For now, this seems to be not required urgently.
Currently The unexpected reversed order of expected output (error message comes first, then the exception class) is due to this line: https://github.com/apache/spark/pull/26028/files#diff-b3ea3021602a88056e52bf83d8782de8L146. It should not sort the expected output if there was an error during execution. Other changes belong to the second point, a minor improvement to handle exceptions at a common place. There is no change in generated expected result. |
|
Test build #111821 has finished for PR 26028 at commit
|
|
Got it. Thanks, @peter-toth ! I updated the second section of the PR description with your example. |
| case _ => plan.children.iterator.exists(isSorted) | ||
| } | ||
|
|
||
| protected def handleExceptions(result: => (String, Seq[String])): (String, Seq[String]) = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a function description because we override this differently?
- SQLQueryTestSuite seems to return
(struct<>, ...) - ThriftServerQueryTestSuite seems to return
("", answer.sorted)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, both returns a (String, Seq[String]) tuple where the first is the schema and the second is the result. Since it's impossible to get the exact spark schema back from a java.sql.ResultSet we use empty string in ThriftServerQueryTestSuite.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a description to it and to its override.
| // with a generic pattern "###". | ||
| val msg = if (a.plan.nonEmpty) a.getSimpleMessage else a.getMessage | ||
| (StructType(Seq.empty), Seq(a.getClass.getName, msg.replaceAll("#\\d+", "#x"))) | ||
| (emptySchema, Seq(a.getClass.getName, msg.replaceAll("#\\d+", "#x"))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a test case which this is required?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No particular test case. Since I touched this method and StructType(Seq.empty) was used 3 times so I just moved it to a val.
|
Also, cc @wangyum . |
|
Test build #111852 has finished for PR 26028 at commit
|
|
@dongjoon-hyun @wangyum do you think this PR is ok now? |
wangyum
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
Hi, @wangyum . You can merge this after manual testing. |
|
retest this please |
|
Test build #111951 has finished for PR 26028 at commit
|
|
retest this please |
|
Test build #111958 has finished for PR 26028 at commit
|
|
Thank you @peter-toth @dongjoon-hyun |
|
Merged to master. |
|
Thank you @dongjoon-hyun and @wangyum for the review. |
What changes were proposed in this pull request?
This PR adds 2 changes regarding exception handling in
SQLQueryTestSuiteandThriftServerQueryTestSuiteThriftServerQueryTestSuiteas if there is an exception then there is no need for sorthandleExceptionsmethodWhy are the changes needed?
Currently
ThriftServerQueryTestSuitepasses on master, but it fails on one of my PRs (#23531) with this error (https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/111651/testReport/org.apache.spark.sql.hive.thriftserver/ThriftServerQueryTestSuite/sql_3/):The unexpected reversed order of expected output (error message comes first, then the exception class) is due to this line: https://github.com/apache/spark/pull/26028/files#diff-b3ea3021602a88056e52bf83d8782de8L146. It should not sort the expected output if there was an error during execution.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Existing UTs.