-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-25113][SQL] Add logging to CodeGenerator when any generated method's bytecode size goes above HugeMethodLimit #22103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1258,7 +1258,8 @@ abstract class CodeGenerator[InType <: AnyRef, OutType <: AnyRef] extends Loggin | |
|
|
||
| object CodeGenerator extends Logging { | ||
|
|
||
| // This is the value of HugeMethodLimit in the OpenJDK JVM settings | ||
| // This is the default value of HugeMethodLimit in the OpenJDK HotSpot JVM, | ||
| // beyond which methods will be rejected from JIT compilation | ||
| final val DEFAULT_JVM_HUGE_METHOD_LIMIT = 8000 | ||
|
|
||
| // The max valid length of method parameters in JVM. | ||
|
|
@@ -1385,9 +1386,15 @@ object CodeGenerator extends Logging { | |
| try { | ||
| val cf = new ClassFile(new ByteArrayInputStream(classBytes)) | ||
| val stats = cf.methodInfos.asScala.flatMap { method => | ||
| method.getAttributes().filter(_.getClass.getName == codeAttr.getName).map { a => | ||
| method.getAttributes().filter(_.getClass eq codeAttr).map { a => | ||
| val byteCodeSize = codeAttrField.get(a).asInstanceOf[Array[Byte]].length | ||
| CodegenMetrics.METRIC_GENERATED_METHOD_BYTECODE_SIZE.update(byteCodeSize) | ||
|
|
||
| if (byteCodeSize > DEFAULT_JVM_HUGE_METHOD_LIMIT) { | ||
| logInfo("Generated method too long to be JIT compiled: " + | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. why info and not debug?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. when we hit this, the JIT will very likely not work and performance may drop a lot. This even worth a warning... Since it's just an estimation and Spark SQL can still work, I think info is fine here.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit:
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd say either way is fine. They're different tenses and the nuances are slightly different. "This story is too good to be true" |
||
| s"${cf.getThisClassName}.${method.getName} is $byteCodeSize bytes") | ||
| } | ||
|
|
||
| byteCodeSize | ||
| } | ||
| } | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need change this condition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The
getNamewas accessing the class name unnecessarily and then doing string comparison unnecessarily. Just changing it when touching the code around it.The JVM guarantees that a
(defining class loader, full class name)pair is unique at runtime, in this case thejava.lang.Classinstance is guaranteed to be unique, so a reference equality check is fast and sufficient.There's no worry of cross class loader issue here, because if there is, the code that follows won't work anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I know The current comparison is more strict. Although the previous comparison was only for name, the current comparison is for a pair of class loader and name.
I worried whether the strictness may change behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I can tell. And as I've mentioned above, although my new check is stricter, it doesn't make the behavior any "worse" than before, because we're reflectively accessing the
codefield immediately after viacodeAttrField.get(a), and that won't work unless the classes are matching exactly.The old code before my change would actually be too permissive -- in the case of class loader mismatch, the old check will allow the it go run to the reflective access site, but it'll then fail because reflection doesn't allow access from the wrong class.
This can be exemplified by the following pseudocode