-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-32412][SQL] Unify error handling for spark thrift server operations #29204
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #126411 has finished for PR 29204 at commit
|
| } | ||
| previousFetchEndOffset = resultOffset | ||
| log.info(s"Returning result set with ${curRow} rows from offsets " + | ||
| logInfo(s"Returning result set with ${curRow} rows from offsets " + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is log.info and logInfo the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logInfo will check log.isInfoEnabled
| throw new HiveSQLException("The background threadpool cannot accept" + | ||
| " new task for execution, please retry the operation", rejected) | ||
| case NonFatal(e) => | ||
| logError(s"Error executing query in background", e) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we keep the message ... in background in https://github.com/apache/spark/pull/29204/files#diff-424526b50bfb53733a4c2e6c6a3ddd8dR121 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we can keep this.
|
Test build #126463 has finished for PR 29204 at commit
|
|
Test build #126479 has finished for PR 29204 at commit
|
|
retest this please |
|
Test build #126606 has finished for PR 29204 at commit
|
| protected def sqlContext: SQLContext | ||
|
|
||
| protected var statementId = getHandle().getHandleIdentifier().getPublicId().toString() | ||
| protected val statementId: String = getHandle.getHandleIdentifier.toString |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we revert this change? I see places that use statementId = UUID.randomUUID().toString. It's better to have a separated PR to unify this part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HandleIdentifer.publicId is actually UUID.randomUUID() too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will revert this change in this pr
|
Test build #126622 has finished for PR 29204 at commit
|
| throw new HiveSQLException("The background threadpool cannot accept" + | ||
| " new task for execution, please retry the operation", rejected) | ||
| case _ => | ||
| val tips = if (shouldRunAsync()) "in background" else "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
val tips = if (shouldRunAsync()) " in background" else "" and ...$getType$tips ${e.getMessage}, to avoid extra space.
|
Test build #126633 has finished for PR 29204 at commit
|
|
gentle ping @cloud-fan |
|
thanks, merging to master! |
| " new task for execution, please retry the operation", rejected) | ||
| case _ => | ||
| val tips = if (shouldRunAsync()) " in background" else "" | ||
| throw new HiveSQLException(s"Error operating $getType$tips: ${e.getMessage}", e) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it looks like we missed some details:
- Shall we expose the internal details (
getType) to users or just sayError running query:? - Shall we include the full exception string (
e.toStringlike https://github.com/apache/spark/pull/29204/files#diff-72dcd8f81a51c8a815159fdf0332acdcL316), not just the message?
Since the PR is small, can we revert and resend it after proper discussion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
…er operations" ### What changes were proposed in this pull request? This reverts commit 510a165. ### Why are the changes needed? see #29204 (comment) ### Does this PR introduce _any_ user-facing change? NO ### How was this patch tested? pass ci tools Closes #29531 from yaooqinn/revert. Authored-by: Kent Yao <[email protected]> Signed-off-by: Wenchen Fan <[email protected]>
What changes were proposed in this pull request?
Log error/warn message only once at the server-side for both sync and async modes
Why are the changes needed?
In b151194 we make the error logging for SparkExecuteStatementOperation with
runInBackground=truenot duplicated, but the operations with runInBackground=false and other metadata operation still will be log twice which happened in the operation'srunInternalmethod and ThriftCLIService.In this PR, I propose to reflect the logic to get a unified error handling approach.
Does this PR introduce any user-facing change?
Yes, when spark.sql.hive.thriftServer.async=false and people call sync APIs the error message will be logged only once at server-side.
How was this patch tested?
locally verified the result in target/unit-test.log
add unit tests.