-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-25374][SQL] SafeProjection supports fallback to an interpreted mode #22468
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #96251 has finished for PR 22468 at commit
|
96b0f62 to
d8f55da
Compare
|
Test build #96252 has finished for PR 22468 at commit
|
d8f55da to
0ee6a00
Compare
|
Test build #96261 has finished for PR 22468 at commit
|
0ee6a00 to
bc5f144
Compare
|
Test build #96262 has finished for PR 22468 at commit
|
|
retest this please |
|
Test build #96918 has finished for PR 22468 at commit
|
|
Test build #97393 has finished for PR 22468 at commit
|
|
cc: @cloud-fan @viirya |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does SafeProjection need to handle NoOp? It's only used with MutableProjection in aggregate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC the input expressions in UnsafeProjection possibly have NoOps passed from aggregate expressions? So, IIUC GenerateSafeProjection handles NoOps here:
Line 153 in 3b45567
| case (NoOp, _) => "" |
I'm not 100% sure though...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CodeGenerator.isPrimitiveType
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok. BTW, NVM, we don't need isPrimitive is a general helper function, so can we move this func. from CodeGenerator to an other place, e.g., object DataType?isPrimitivieType here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not add this optimization at the beginning. We can add it later with a benchmark.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we share it with other interpreted projections?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we change it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's because we use MyDenseVectorUDT in UnsafeRowConverterSuite.scala for unit tests. (MyDenseVectorUDT is located in the core, but UnsafeRowConverterSuite located in the catalyst).
|
Test build #97891 has finished for PR 22468 at commit
|
|
Test build #97963 has finished for PR 22468 at commit
|
|
Test build #99592 has finished for PR 22468 at commit
|
|
retest this please |
|
Test build #99610 has finished for PR 22468 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can rebase now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
|
Test build #99639 has finished for PR 22468 at commit
|
| } | ||
|
|
||
| /** | ||
| * A projection that could turn UnsafeRow into GenericInternalRow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we keep this comment?
| val exprs = Seq(Add(BoundReference(0, IntegerType, nullable = true), Literal.create(1)), NoOp) | ||
| val input = InternalRow.fromSeq(1 :: 1 :: Nil) | ||
| val expected = 2 :: null :: Nil | ||
| withSQLConf(SQLConf.CODEGEN_FACTORY_MODE.key -> codegenOnly) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we use testWithBothCodegenAndIntepreted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nvm, this is the code style in this test suite
|
Test build #99641 has finished for PR 22468 at commit
|
|
|
||
| // Since `ArrayBasedMapData` does not override `equals` and `hashCode`, | ||
| // we need to take care of it to compare rows. | ||
| def toComparable(d: Any): Any = d match { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this does nothing, isn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we cannot compare ArrayBasedMapDatas directly (that is, assert(mapResultRow === mapExpectedRow) fails), I just converted them into the Seqs of keys/values by this method.
|
|
||
| val mapResultRow = convertBackToInternalRow(mapRow, fields4).toSeq(fields4) | ||
| val mapExpectedRow = mapRow.toSeq(fields4) | ||
| // Since `ArrayBasedMapData` does not override `equals` and `hashCode`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we should implement equals and hashCode in ArrayBasedMapData and UnsafeMapData.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or we can use ExpressionEvalHelper.checkResult here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed code to use ExpressionEvalHelper.checkResult.
I don't remember correctly though, we might have some historical reasons about that; ArrayBasedMapData has no hashCode and equals. Probably, somebody might know this... cc: @hvanhovell @viirya
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ArrayBasedMapData/UnsafeMapData does not have equals() or hashCode() implemented because we do not have a good story around map equality. Implementing equals/hashcode for map is only half of the solution, we would also need a comparable binary format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, thanks. I remember that its related to SPARK-18134.
|
Test build #99650 has finished for PR 22468 at commit
|
|
Test build #99647 has finished for PR 22468 at commit
|
|
retest this please |
|
Test build #99655 has finished for PR 22468 at commit
|
|
thanks, merging to master! |
… mode ## What changes were proposed in this pull request? In SPARK-23711, we have implemented the expression fallback logic to an interpreted mode. So, this pr fixed code to support the same fallback mode in `SafeProjection` based on `CodeGeneratorWithInterpretedFallback`. ## How was this patch tested? Add tests in `CodeGeneratorWithInterpretedFallbackSuite` and `UnsafeRowConverterSuite`. Closes apache#22468 from maropu/SPARK-25374-3. Authored-by: Takeshi Yamamuro <[email protected]> Signed-off-by: Wenchen Fan <[email protected]>
What changes were proposed in this pull request?
In SPARK-23711, we have implemented the expression fallback logic to an interpreted mode. So, this pr fixed code to support the same fallback mode in
SafeProjectionbased onCodeGeneratorWithInterpretedFallback.How was this patch tested?
Add tests in
CodeGeneratorWithInterpretedFallbackSuiteandUnsafeRowConverterSuite.