-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-23711][SPARK-25140][SQL] Catch correct exceptions when expr codegen fails #22154
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #94961 has finished for PR 22154 at commit
|
|
IIUC |
| case CodegenError(_) => createInterpretedObject(in) | ||
| case _: Exception => | ||
| // We should have already see error message in `CodeGenerator` | ||
| logError("Expr codegen disabled and falls back to the interpreter mode") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Expr codegen error and falls back to interpreter mode?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
| case CodegenError(_) => createInterpretedObject(in) | ||
| case _: Exception => | ||
| // We should have already see error message in `CodeGenerator` | ||
| logError("Expr codegen disabled and falls back to the interpreter mode") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logWarning instead of logError? It uses logWarning in WholeStageCodegenExec too.
| case CodegenError(_) => InterpretedUnsafeProjection.createProjection(unsafeExprs) | ||
| case _: Exception => | ||
| // We should have already see error message in `CodeGenerator` | ||
| logError("Expr codegen disabled and falls back to the interpreter mode") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto. logWarning.
| case CodegenError(_) => InterpretedUnsafeProjection.createProjection(unsafeExprs) | ||
| case _: Exception => | ||
| // We should have already see error message in `CodeGenerator` | ||
| logError("Expr codegen error and falls back to interpreter mode") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logWarning too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, I missed...
|
LGTM |
|
Test build #94983 has finished for PR 22154 at commit
|
|
Test build #94985 has finished for PR 22154 at commit
|
|
retest this please. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for adding the logging and also catching that we weren't catching the correct exception type! (Pun intended ;-)
Mostly LGTM. Can I ask for an additional test case of the CODEGEN_ONLY mode with codegen failure, where we should assert an exception was thrown and the cause of the exception is the intended Janino exception type? Thanks!
| override protected def createCodeGeneratedObject(in: Seq[Expression]): UnsafeProjection = { | ||
| val invalidCode = new CodeAndComment("invalid code", Map.empty) | ||
| CodeGenerator.compile(invalidCode) | ||
| null.asInstanceOf[UnsafeProjection] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: do we need the cast here?
| } catch { | ||
| case CodegenError(_) => InterpretedUnsafeProjection.createProjection(unsafeExprs) | ||
| case _: Exception => | ||
| // We should have already see error message in `CodeGenerator` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Perhaps // the error message is already logged in CodeGenerator? Or // We should have already seen the error message in CodeGenerator?
| case CodegenError(_) => InterpretedUnsafeProjection.createProjection(unsafeExprs) | ||
| case _: Exception => | ||
| // We should have already see error message in `CodeGenerator` | ||
| logWarning("Expr codegen error and falls back to interpreter mode") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I prefer "falling back" instead of "falls back" here. e.g. "Expr codegen error, falling back to interpreter mode"
| } | ||
|
|
||
| test("fallback to the interpreter mode") { | ||
| val input = Seq(IntegerType).zipWithIndex.map(x => BoundReference(x._2, x._1, true)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: if we only wanted a one-element-sequence, would it be cleaner to just write
Seq(BoundReference(0, IntegerType, true))?
Or for those that prefer the cons list syntax, BoundReference(0, IntegerType, true) :: Nil. I prefer the former one but I don't have a strong opinion here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
| val fallback = CodegenObjectFactoryMode.FALLBACK.toString | ||
| withSQLConf(SQLConf.CODEGEN_FACTORY_MODE.key -> fallback) { | ||
| val obj = FailedCodegenProjection.createObject(input) | ||
| assert(obj.isInstanceOf[InterpretedUnsafeProjection]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should/could we also assert on the log message being printed out? Perhaps something similar to https://github.com/apache/spark/pull/22103/files#diff-cf187b40d98ff322d4bde4185701baa2R508 ?
This is optional; the fact that we've gotten back an InterpretedUnsafeProjection is a good enough indicator to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea, we could. If other reviewers say +1, I'll update (either's fine to me).
| } catch { | ||
| case CodegenError(_) => createInterpretedObject(in) | ||
| case _: Exception => | ||
| // We should have already see error message in `CodeGenerator` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as the comment below.
|
Test build #94992 has finished for PR 22154 at commit
|
|
retest this please. |
|
@rednaxelafx Thanks for your checks ;) addressed. |
|
Test build #95004 has finished for PR 22154 at commit
|
rednaxelafx
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks a lot!
|
Test build #95014 has finished for PR 22154 at commit
|
|
retest this please |
|
Test build #95025 has finished for PR 22154 at commit
|
|
Test build #95015 has finished for PR 22154 at commit
|
|
retest this please |
| createCodeGeneratedObject(in) | ||
| } catch { | ||
| case CodegenError(_) => createInterpretedObject(in) | ||
| case _: Exception => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NonFatal ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah that's a good point. +1 on using the NonFatal extractor, and we probably wanna do the same change for whole-stage codegen's fallback logic as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok.
| case CodegenError(_) => createInterpretedObject(in) | ||
| case _: Exception => | ||
| // We should have already seen the error message in `CodeGenerator` | ||
| logWarning("Expr codegen error and falling back to interpreter mode") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to include the exception type we captured in the message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could; although the detail exception type/message/stack trace info would already be printed in the preceding logging from the CodeGenerator so I wouldn't push strongly for adding the exception type here.
(It certainly is easier to do data science on just single lines of logs instead of having to search across a few lines for context, so I do see the benefits of adding the type information albeit redundant. I'll leave it up to you guys @maropu @viirya @cloud-fan @gatorsmile )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to keep the current message. CodeGenerator already has printed many infos for errors.
| GenerateUnsafeProjection.generate(unsafeExprs, subexpressionEliminationEnabled) | ||
| } catch { | ||
| case CodegenError(_) => InterpretedUnsafeProjection.createProjection(unsafeExprs) | ||
| case _: Exception => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same here.
| object CodegenError { | ||
| def unapply(throwable: Throwable): Option[Exception] = throwable match { | ||
| case e: InternalCompilerException => Some(e) | ||
| case e: CompileException => Some(e) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not have a test case for checking the fallback before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We didn't in the original PR. @maropu -san added one in this PR which is great, plugs the hole.
|
Test build #95047 has finished for PR 22154 at commit
|
|
Test build #95060 has finished for PR 22154 at commit
|
|
Re: Your build failure ('statefulOperators.scala:95: value asJava is not a member of scala.collection.immutable.Map[String,Long]). I am also seeing this in my fork on my laptop (I just updated my fork about 10-15 minutes ago). ..and it was fixed: #22175 |
|
(it it jus a note) btw, currently, if expr codegen fails, the many error messages could happen in both a driver side and executor sides. I feel this is a little noisy for users. I think it'd be super nice if we could validate if all the expr codegen can works well in a driver side in a similar way of |
|
Test build #95069 has finished for PR 22154 at commit
|
|
LGTM Thanks! Merged to master. |
|
|
||
| override protected def createCodeGeneratedObject(in: Seq[Expression]): UnsafeProjection = { | ||
| val invalidCode = new CodeAndComment("invalid code", Map.empty) | ||
| // We assume this compilation throws an exception |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd use this comment as part of an exception (say IllegalStateException or similar) that should be thrown rather than returning null. I think that would make the comment part of the code itself and can be checked in tests (by catching the exception).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The suggested change is only for making this test suite cleaner, right? In that case I'd +1 with the suggestion of being able to clearly check we're catching the exception we know we're throwing.
Would you like to submit a PR for it?
rather than returning
null
The intent is never to actually reach the null-return, but always cause an exception to be thrown atCodeGenerator.compile()and abruptly return to the caller with the exception. To make the compiler happy you'll have to have some definite-returning statement to end the function, so a useless null-return would probably have to be there anyway (since the compiler can't tell you'll always be throwing an exception unless you do a throw inline)
What changes were proposed in this pull request?
This pr is to fix bugs when expr codegen fails; we need to catch
java.util.concurrent.ExecutionExceptioninstead ofInternalCompilerExceptionandCompileException. This handling is the same with theWholeStageCodegenExecone:spark/sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala
Line 585 in 60af250
How was this patch tested?
Added tests in
CodeGeneratorWithInterpretedFallbackSuite