[SPARK-22716][SQL] Avoid the creation of mutable states in addReferenceObj #19916
mgaido91 wants to merge 7 commits into apache:master from
Conversation
|
Test build #84575 has finished for PR 19916 at commit
|
|
Yea, I noticed that. Thanks. |
    val plan = df.queryExecution.executedPlan
    val sortExec = plan.children.head.asInstanceOf[SortExec]
    sortExec.produce(ctx, plan.asInstanceOf[CodegenSupport])
    // we expect 8 global variables:
This test and the one in BroadcastJoinSuite seem a bit like overkill.
Thanks, I was not very happy with them either, but I have not found a better way to test them. Do you have any ideas or suggestions? Thanks.
If we make sure addReferenceMinorObj works well, the two tests may not be necessary, IMHO. Let's wait for others' opinions.
+1, we don't need new tests if we trust addReferenceMinorObj
thanks, I'll remove them
|
LGTM with one minor comment. |
|
Actually I think |
|
I also have the same feeling when I just looked at |
|
my only concern about this PR and about removing |
|
I am reporting the results of my tests. I ran: In the compiled code there is an additional operation What do you think given these results? Should I go on with the PR and also remove the remaining
|
Thanks for the test! Yea, let's remove all of them!
|
Test build #84651 has finished for PR 19916 at commit
|
|
@mgaido91 Thank you for running the benchmark. I have two questions.
|
|
Thanks for your answer @kiszk. I included the code in a main function and ran it with the command I showed in the previous comment. For reference I can report here the bytecode: Did I answer your questions properly?
|
Thank you for checking the Java bytecode. I am talking about native code, not bytecode. The HotSpot compiler may eliminate statements. I think that this elapsed time includes interpreter execution, JIT compilation, and native code execution. It would be good to add a warm-up. Did you see the native code sequence generated by the HotSpot compiler? I think that it is not the best to write a long-running loop in How about code like this? WDYT?
|
Sure, I can run the test as you suggested. I'll report the results here in a few minutes, thanks.
|
Thank you again very much for your comment @kiszk! Results are different now and they show a difference in performance. and the results now are: Thus the time needed for casting is visible now. PS I edited the comment because I forgot the warm-up cycle yesterday. Thanks again to @kiszk for his suggestions about how to run this benchmark properly.
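For readers following the benchmark discussion, here is a minimal sketch of the kind of comparison being described: reading a value through a pre-cast field versus casting an element of an Object[] on every access, with a warm-up pass over the same methods that are later timed. The class and method names (ReferenceAccessBench, sumViaField, sumViaCast, timeNanos) are hypothetical and this is not the code that was actually run in this PR.

```java
import java.util.function.IntToLongFunction;

// Hypothetical microbenchmark sketch; not the code used in this PR.
final class ReferenceAccessBench {
    private final Object[] references; // mimics the generated class's references array
    private final int[] cached;        // mimics a global variable holding the cast result

    ReferenceAccessBench(int[] data) {
        this.references = new Object[] { data };
        this.cached = (int[]) references[0];
    }

    // Access through the field: no cast inside the loop.
    long sumViaField(int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += cached[i % cached.length];
        }
        return sum;
    }

    // Access through the array: load and cast on every iteration.
    long sumViaCast(int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            int[] data = (int[]) references[0];
            sum += data[i % data.length];
        }
        return sum;
    }

    // Times one call; the result is checked so the JIT cannot discard the work.
    static long timeNanos(IntToLongFunction body, int iterations) {
        long start = System.nanoTime();
        long result = body.applyAsLong(iterations);
        long elapsed = System.nanoTime() - start;
        if (result == Long.MIN_VALUE) {
            System.out.println(result);
        }
        return elapsed;
    }

    // Warm up the exact methods that are measured, then time them.
    static void run(int warmupRounds, int iterations) {
        ReferenceAccessBench bench = new ReferenceAccessBench(new int[] { 1, 2, 3 });
        for (int i = 0; i < warmupRounds; i++) {
            bench.sumViaField(iterations);
            bench.sumViaCast(iterations);
        }
        System.out.println("field: " + timeNanos(bench::sumViaField, iterations) + " ns");
        System.out.println("cast:  " + timeNanos(bench::sumViaCast, iterations) + " ns");
    }
}
```

Calling ReferenceAccessBench.run(20, 100_000_000) would print two timings; the numbers depend on the JVM and hardware, and the JIT may still hoist the cast out of the loop, so a harness such as JMH is a safer choice for conclusions.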
|
In a real application I think the difference is much smaller than 2.4%. So I think it's ok to remove One problem is readability of the generated code, maybe we can generate
|
@cloud-fan thanks for your answer. I don't think that something like: would be readable. But it is just my opinion. Do you want me to add a local variable to generate more readable code? Or if you think that adding the comment is the right thing to do, I can do that. |
|
Adding a local variable to generate more readable code is better, but that needs a lot of caller-side changes, which may not be worth it.
|
@cloud-fan I think a lot of caller-side changes would be needed as well, since now we are not passing any variable name when we are referencing with Sorry for this additional comment, I just want to make sure that I am providing you all the details to choose the best option. I swear that now if you say, let's add the comment, I'll do it.
I mean something like Then no caller side change is needed. In the future we can also remove |
|
@cloud-fan thanks I am doing this. Actually I already removed |
|
I have a question about the benchmark. What is the purpose of this warm-up? Why does this code perform a warm-up run for a loop that will not be measured?
|
@kiszk sorry, may I please ask you to elaborate a bit more on what you meant? Thanks.
876235a to b36e470
b36e470 to bfa3bae
    case other =>
      ev.copy(code = "", value = ctx.addReferenceMinorObj(value, ctx.javaType(dataType)))
    case _ =>
      ev.copy(code = "", value = ctx.addReferenceObj("literalValue", value,
nit: literal instead of literalValue
      test(null, null, null)
    }

    test("SPARK-22716: UnixTimestamp should not use global variables") {
We don't need a lot of tests of this kind, just one test to make sure ctx.addReferenceObj doesn't add a global variable.
|
Test build #84716 has finished for PR 19916 at commit
|
|
Test build #84762 has finished for PR 19916 at commit
|
|
Jenkins, retest this please |
|
LGTM, pending jenkins |
|
Test build #84767 has finished for PR 19916 at commit
|
|
Looks like a valid test failure. |
|
@viirya it is caused by the fact that the generated code now is bigger and it triggered |
|
@mgaido91 I am asking about the following code. This code will be translated to native code. However, I think that this code does not have any effect on
|
@kiszk, yes, sorry, I see now that the code I pasted is not up to date. I am calling the two methods instead of that fake warm-up cycle. Sorry, my bad. I am updating my previous comment with the right code.
|
The PR title and description need to be modified accordingly. |
|
Test build #84771 has finished for PR 19916 at commit
|
|
thanks, merging to master! |
What changes were proposed in this pull request?
We have two methods to reference an object: addReferenceMinorObj and addReferenceObj. The latter creates a new global variable, which means new entries in the constant pool.
The PR unifies the two methods into a single addReferenceObj, which returns the code to access the object in the references array and doesn't add new mutable states.
How was this patch tested?
Added UTs.
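To make the description concrete, here is a minimal Java sketch of the two approaches, written as a simplified stand-in rather than Spark's actual CodegenContext. The class CodegenContextSketch, the method addReferenceAsField, and the exact strings produced are assumptions for illustration only. The first method declares a field per referenced object, while the second only stores the object and returns an inline access expression, with the name kept as a comment for readability.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for a codegen context; not Spark's actual class.
final class CodegenContextSketch {
    // Objects passed to the generated class when it is instantiated.
    final List<Object> references = new ArrayList<>();
    // Field declarations the generated class would contain (global variables).
    final List<String> mutableStates = new ArrayList<>();

    // Global-variable style: store the object and declare a field that
    // holds the cast value. Returns the field name to use in generated code.
    String addReferenceAsField(String name, Object obj, String className) {
        int idx = references.size();
        references.add(obj);
        String field = name + idx;
        mutableStates.add("private " + className + " " + field
            + " = (" + className + ") references[" + idx + "];");
        return field;
    }

    // Inline style: store the object and return an expression that reads it
    // from the references array. No field is declared.
    String addReferenceObj(String name, Object obj, String className) {
        int idx = references.size();
        references.add(obj);
        return "((" + className + ") references[" + idx + "] /* " + name + " */)";
    }
}
```

With this sketch, a test of the kind requested in review only needs to call addReferenceObj and check that mutableStates is still empty afterwards.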