@maropu
Member

maropu commented Sep 19, 2018

What changes were proposed in this pull request?

SPARK-23711 implemented the logic for falling back from generated code to an interpreted mode for expressions. This PR adds the same fallback mode to SafeProjection, based on CodeGeneratorWithInterpretedFallback.

How was this patch tested?

Added tests in CodeGeneratorWithInterpretedFallbackSuite and UnsafeRowConverterSuite.
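
The fallback pattern described above can be sketched as follows. This is a minimal, self-contained illustration and not Spark's actual `CodeGeneratorWithInterpretedFallback`: the names `WithInterpretedFallback`, `Mode`, and `ToyProjection` are hypothetical, and the "code-generated" path is simulated by a function that throws on inputs it cannot handle.

```scala
import scala.util.control.NonFatal

object FallbackSketch {
  // Hypothetical stand-in for a factory-mode setting.
  sealed trait Mode
  case object Fallback extends Mode
  case object CodegenOnly extends Mode
  case object NoCodegen extends Mode

  abstract class WithInterpretedFallback[IN, OUT] {
    protected def createCodeGenerated(in: IN): OUT
    protected def createInterpreted(in: IN): OUT

    // Try the code-generated path first; on a non-fatal failure,
    // build the interpreted object instead.
    def create(in: IN, mode: Mode = Fallback): OUT = mode match {
      case CodegenOnly => createCodeGenerated(in)
      case NoCodegen => createInterpreted(in)
      case Fallback =>
        try createCodeGenerated(in)
        catch { case NonFatal(_) => createInterpreted(in) }
    }
  }

  // Toy factory: the simulated codegen path rejects inputs with more than two elements.
  object ToyProjection extends WithInterpretedFallback[Seq[Int], String] {
    protected def createCodeGenerated(in: Seq[Int]): String =
      if (in.length > 2) throw new IllegalStateException("simulated codegen failure")
      else "codegen"

    protected def createInterpreted(in: Seq[Int]): String = "interpreted"
  }
}
```

With the default mode, `FallbackSketch.ToyProjection.create(Seq(1, 2, 3))` takes the interpreted path because the simulated codegen path throws; the two forced modes exist so that tests can exercise each path separately.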

@SparkQA

SparkQA commented Sep 19, 2018

Test build #96251 has finished for PR 22468 at commit 96b0f62.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      • class InterpretedSafeProjection(expressions: Seq[Expression]) extends Projection

@SparkQA

SparkQA commented Sep 19, 2018

Test build #96252 has finished for PR 22468 at commit d8f55da.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      • class InterpretedSafeProjection(expressions: Seq[Expression]) extends Projection

@SparkQA

SparkQA commented Sep 19, 2018

Test build #96261 has finished for PR 22468 at commit 0ee6a00.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      • class InterpretedSafeProjection(expressions: Seq[Expression]) extends Projection

@SparkQA

SparkQA commented Sep 19, 2018

Test build #96262 has finished for PR 22468 at commit bc5f144.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      • class InterpretedSafeProjection(expressions: Seq[Expression]) extends Projection

@maropu
Member Author

maropu commented Oct 4, 2018

retest this please

@SparkQA

SparkQA commented Oct 4, 2018

Test build #96918 has finished for PR 22468 at commit bc5f144.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      • class InterpretedSafeProjection(expressions: Seq[Expression]) extends Projection

@SparkQA

SparkQA commented Oct 15, 2018

Test build #97393 has finished for PR 22468 at commit f658ac9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      • class InterpretedSafeProjection(expressions: Seq[Expression]) extends Projection

@maropu
Member Author

maropu commented Oct 16, 2018

cc: @cloud-fan @viirya

Contributor

Does SafeProjection need to handle NoOp? It's only used with MutableProjection in aggregates.

Member Author

IIUC the input expressions in UnsafeProjection possibly have NoOps passed from aggregate expressions? So, IIUC GenerateSafeProjection handles NoOps here:


I'm not 100% sure though...
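
To make the question concrete, here is a toy sketch of an interpreted projection that skips `NoOp` entries. It uses a hypothetical miniature expression tree (`BoundRef`, `Lit`, `Add`, `NoOp`), not Spark's Catalyst classes, and it assumes that a skipped slot is simply left as `null`.

```scala
object NoOpSketch {
  // Hypothetical miniature expression tree, loosely modeled on the PR's test case.
  sealed trait Expr
  case class BoundRef(ordinal: Int) extends Expr
  case class Lit(value: Int) extends Expr
  case class Add(left: Expr, right: Expr) extends Expr
  case object NoOp extends Expr

  // Interpreted evaluation of a single expression against an input row.
  def eval(e: Expr, row: IndexedSeq[Any]): Any = e match {
    case BoundRef(i) => row(i)
    case Lit(v) => v
    case Add(l, r) =>
      (eval(l, row), eval(r, row)) match {
        case (a: Int, b: Int) => a + b
        case _ => null // null-in, null-out
      }
    case NoOp => throw new UnsupportedOperationException("NoOp cannot be evaluated")
  }

  // The projection never evaluates NoOp; it leaves that output slot as null.
  def project(exprs: Seq[Expr], row: IndexedSeq[Any]): Seq[Any] =
    exprs.map {
      case NoOp => null
      case e => eval(e, row)
    }
}
```

Projecting `Seq(Add(BoundRef(0), Lit(1)), NoOp)` over the row `(1, 1)` yields `2` in the first slot and `null` in the second.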

Contributor

CodeGenerator.isPrimitiveType

Member Author

maropu commented Oct 23, 2018

ok. BTW, isPrimitiveType is a general helper function, so can we move it from CodeGenerator to another place, e.g., object DataType? NVM, we don't need isPrimitiveType here.

Contributor

Let's not add this optimization at the beginning. We can add it later with a benchmark.

Member Author

ok

Contributor

ditto

Contributor

shall we share it with other interpreted projections?

Member Author

maropu commented Oct 23, 2018

ok, I'll brush up. The master doesn't have this logic yet in the other interpreted projections, and #22512 has the same logic. So, I'll fix #22512 first, then share it in this pr.

Contributor

why do we change it?

Member Author

That's because we use MyDenseVectorUDT in UnsafeRowConverterSuite.scala for unit tests. (MyDenseVectorUDT is located in core, but UnsafeRowConverterSuite is located in catalyst.)

@SparkQA

SparkQA commented Oct 23, 2018

Test build #97891 has finished for PR 22468 at commit 9d7b519.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 24, 2018

Test build #97963 has finished for PR 22468 at commit 2b25d09.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Dec 3, 2018

Test build #99592 has finished for PR 22468 at commit 1c5231e.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Member Author

maropu commented Dec 3, 2018

retest this please

@SparkQA

SparkQA commented Dec 3, 2018

Test build #99610 has finished for PR 22468 at commit 1c5231e.

  • This patch fails build dependency tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

We can rebase now

Member Author

ok

@SparkQA

SparkQA commented Dec 4, 2018

Test build #99639 has finished for PR 22468 at commit cec8480.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

}

/**
* A projection that could turn UnsafeRow into GenericInternalRow
Contributor

can we keep this comment?

val exprs = Seq(Add(BoundReference(0, IntegerType, nullable = true), Literal.create(1)), NoOp)
val input = InternalRow.fromSeq(1 :: 1 :: Nil)
val expected = 2 :: null :: Nil
withSQLConf(SQLConf.CODEGEN_FACTORY_MODE.key -> codegenOnly) {
Contributor

can we use testWithBothCodegenAndIntepreted?

Contributor

nvm, this is the code style in this test suite

@SparkQA

SparkQA commented Dec 4, 2018

Test build #99641 has finished for PR 22468 at commit 0b23adb.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


// Since `ArrayBasedMapData` does not override `equals` and `hashCode`,
// we need to take care of it to compare rows.
def toComparable(d: Any): Any = d match {
Contributor

this does nothing, does it?

Member Author

Since we cannot compare ArrayBasedMapData instances directly (that is, assert(mapResultRow === mapExpectedRow) fails), I convert them into Seqs of keys and values with this method.
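
A minimal sketch of that conversion, assuming a hypothetical stand-in class `ArrayMap` in place of `ArrayBasedMapData` (the body of the real `toComparable` is not shown in this thread, so this is an illustration of the idea only):

```scala
object MapCompareSketch {
  // Stand-in for a map container that does not override equals/hashCode,
  // so == falls back to reference equality.
  final class ArrayMap(val keys: Array[Any], val values: Array[Any])

  // Convert map-like values into structurally comparable Seqs of keys and values;
  // everything else passes through unchanged.
  def toComparable(d: Any): Any = d match {
    case m: ArrayMap =>
      (m.keys.toSeq.map(toComparable), m.values.toSeq.map(toComparable))
    case other => other
  }
}
```

Two `ArrayMap` instances with identical contents are not `==` to each other, but their `toComparable` results are, because tuples and `Seq`s compare element by element.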


val mapResultRow = convertBackToInternalRow(mapRow, fields4).toSeq(fields4)
val mapExpectedRow = mapRow.toSeq(fields4)
// Since `ArrayBasedMapData` does not override `equals` and `hashCode`,
Contributor

maybe we should implement equals and hashCode in ArrayBasedMapData and UnsafeMapData.

Contributor

Or we can use ExpressionEvalHelper.checkResult here.

Member Author

Fixed the code to use ExpressionEvalHelper.checkResult.

I don't remember exactly, but there might be historical reasons why ArrayBasedMapData has no hashCode and equals. Somebody probably knows... cc: @hvanhovell @viirya

Contributor

ArrayBasedMapData/UnsafeMapData do not implement equals() or hashCode() because we do not have a good story around map equality. Implementing equals/hashCode for maps is only half of the solution; we would also need a comparable binary format.
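
One way to see the difficulty, as a sketch under the assumption that entry order is part of the problem: two maps stored as parallel key/value arrays can hold the same entries in different orders, so a positional comparison and a logical comparison disagree. `KVArrays` and the helpers below are hypothetical and are not Spark classes.

```scala
object MapOrderSketch {
  // Hypothetical map stored as parallel key and value sequences.
  final case class KVArrays(keys: Vector[Int], values: Vector[String])

  // Positional comparison: sensitive to entry order.
  def positionalEquals(l: KVArrays, r: KVArrays): Boolean = l == r

  // Logical comparison: the same set of key/value pairs, regardless of order.
  def logicalEquals(l: KVArrays, r: KVArrays): Boolean =
    l.keys.zip(l.values).toMap == r.keys.zip(r.values).toMap

  // Sorting entries by key gives a canonical layout that can be compared positionally.
  def normalize(m: KVArrays): KVArrays = {
    val sorted = m.keys.zip(m.values).sortBy(_._1)
    KVArrays(sorted.map(_._1), sorted.map(_._2))
  }
}
```

Here `KVArrays(Vector(1, 2), Vector("x", "y"))` and `KVArrays(Vector(2, 1), Vector("y", "x"))` are logically equal but positionally different until both are normalized.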

Member Author

Aha, thanks. I remember that it's related to SPARK-18134.

@SparkQA

SparkQA commented Dec 4, 2018

Test build #99650 has finished for PR 22468 at commit fbfbbff.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Dec 4, 2018

Test build #99647 has finished for PR 22468 at commit 7ef5f86.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Member Author

maropu commented Dec 4, 2018

retest this please

@SparkQA

SparkQA commented Dec 4, 2018

Test build #99655 has finished for PR 22468 at commit fbfbbff.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

thanks, merging to master!

@asfgit asfgit closed this in 2612848 Dec 4, 2018
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
… mode

## What changes were proposed in this pull request?
In SPARK-23711, we have implemented the expression fallback logic to an interpreted mode. So, this pr fixed code to support the same fallback mode in `SafeProjection` based on `CodeGeneratorWithInterpretedFallback`.

## How was this patch tested?
Add tests in `CodeGeneratorWithInterpretedFallbackSuite` and `UnsafeRowConverterSuite`.

Closes apache#22468 from maropu/SPARK-25374-3.

Authored-by: Takeshi Yamamuro <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>