Skip to content
Closed
Show file tree
Hide file tree
Changes from 7 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -1261,8 +1261,42 @@ case class InitializeJavaBean(beanInstance: Expression, setters: Map[String, Exp
override def children: Seq[Expression] = beanInstance +: setters.values.toSeq
override def dataType: DataType = beanInstance.dataType

override def eval(input: InternalRow): Any =
throw new UnsupportedOperationException("Only code-generated evaluation is supported.")
private lazy val resolvedSetters = {
assert(beanInstance.dataType.isInstanceOf[ObjectType])

val ObjectType(beanClass) = beanInstance.dataType
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better to put assert(beanInstance.dataType.isInstanceOf[ObjectType]) in the constructor?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

setters.map {
case (name, expr) =>
// Looking for known type mapping first, then using Class attached in `ObjectType`.
// Finally also looking for general `Object`-type parameter for generic methods.
val paramTypes = CallMethodViaReflection.typeMapping.getOrElse(expr.dataType,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As #20753, we need to add other types into the mapping.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, who is making those changes? You or @kiszk?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may have no time to do tonight, if @kiszk do not make those changes, I will do it tomorrow.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the similar variable typeJavaMapping and typeJavaMapping. This is because typeMapping is used for CallMethodViaReflection whose comment says For now, only types defined in Reflect.typeMapping are supported (basically primitives and string) as input types.

If we can expand this support to additional classes (e.g. DateType, TimestampType, BinaryType, and CalendarIntervalType), I can merge three variables into one typeMapping.

WDYT? @hvanhovell and @viirya

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any special reason it only supports basically primitives and string?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to expand the support, maybe we can have another PR to expand it in CallMethodViaReflection and merge those variables.

Copy link
Member

@kiszk kiszk Mar 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Is it better to have separate map variables, for now? cc @hvanhovell

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for not coming back to this sooner. AFAIK CallMethodViaReflection expression was only designed to work with a couple of primitives. I think we are looking for something a little bit more complete here, i.e. support all types in Spark SQL's type system. I also don't think that we should put the mappings in CallMethodViaReflection because the mapping is now using in more expressions, ScalaReflection is IMO a better place for this logic.

And finally which PR will implement this. cc @maropu for visibility.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in #20753 and #20797

Seq(expr.dataType.asInstanceOf[ObjectType].cls)) ++ Seq(classOf[Object])
val methods = paramTypes.flatMap { fieldClass =>
try {
Some(beanClass.getDeclaredMethod(name, fieldClass))
} catch {
case e: NoSuchMethodException => None
}
}
if (methods.isEmpty) {
throw new NoSuchMethodException(s"""A method named "$name" is not declared """ +
"in any enclosing class nor any supertype, nor through a static import")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error message is copied from compiling error thrown in codegen execution. So we can test the same error message for both execution.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove the last bit , nor through a static import?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

}
methods.head -> expr
}
}

override def eval(input: InternalRow): Any = {
val instance = beanInstance.eval(input)
if (instance != null) {
val bean = instance.asInstanceOf[Object]
resolvedSetters.foreach {
case (setter, expr) =>
setter.invoke(bean, expr.eval(input).asInstanceOf[Object])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the cast to .asInstanceOf[Object]? And why not use AnyRef?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Without cast:

type mismatch;
[error]  found   : Any
[error]  required: Object
[error]           setter.invoke(bean, expr.eval(input))

As you seen it is required Object, but I think AnyRef is equivalent and works too. Do you prefer AnyRef?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AnyRef is a bit more scala like, that is all.

}
}
instance
}

override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
val instanceGen = beanInstance.genCode(ctx)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,8 @@ trait ExpressionEvalHelper extends GeneratorDrivenPropertyChecks {

protected def checkEvaluation(
expression: => Expression, expected: Any, inputRow: InternalRow = EmptyRow): Unit = {
val expr = prepareEvaluation(expression)
// Make it as method to obtain fresh expression everytime.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The content of bean instance will be changed after first evaluation of interpreted execution. For example, in the added unit test, the input bean of the later evaluation will become [1] not []. So the later evaluation result will be [1, 1].

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we using a literal? Ok, makes sense.

def expr = prepareEvaluation(expression)
val catalystValue = CatalystTypeConverters.convertToCatalyst(expected)
checkEvaluationWithoutCodegen(expr, catalystValue, inputRow)
checkEvaluationWithGeneratedMutableProjection(expr, catalystValue, inputRow)
Expand Down Expand Up @@ -111,12 +112,14 @@ trait ExpressionEvalHelper extends GeneratorDrivenPropertyChecks {
val errMsg = intercept[T] {
eval
}.getMessage
if (errMsg != expectedErrMsg) {
if (!errMsg.contains(expectedErrMsg)) {
Copy link
Member Author

@viirya viirya Mar 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For codegen error, it has very verbose message like:

Code generation of initializejavabean([], (nonexisting,1)) failed:
[info]   java.util.concurrent.ExecutionException: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 41, Column 22
: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 41, Column 22: A method named "nonexisting
" is not declared in any enclosing class nor any supertype, nor through a static import
[info]   java.util.concurrent.ExecutionException: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 41, Column 22
: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 41, Column 22: A method named "nonexisting
" is not declared in any enclosing class nor any supertype, nor through a static import
[info]          at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:306)
[info]          at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:293)

So changes it to test if it contains the given error message.

fail(s"Expected error message is `$expectedErrMsg`, but `$errMsg` found")
}
}
}
val expr = prepareEvaluation(expression)

// Make it as method to obtain fresh expression everytime.
def expr = prepareEvaluation(expression)
checkException(evaluateWithoutCodegen(expr, inputRow), "non-codegen mode")
checkException(evaluateWithGeneratedMutableProjection(expr, inputRow), "codegen mode")
if (GenerateUnsafeProjection.canSupport(expr.dataType)) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -68,6 +68,23 @@ class ObjectExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
mapEncoder.serializer.head, mapExpected, mapInputRow)
}

test("SPARK-23593: InitializeJavaBean should support interpreted execution") {
val list = new java.util.LinkedList[Int]()
list.add(1)

val initializeBean = InitializeJavaBean(Literal.fromObject(new java.util.LinkedList[Int]),
Map("add" -> Literal(1)))
checkEvaluation(initializeBean, list, InternalRow.fromSeq(Seq()))

val initializeWithNonexistingMethod = InitializeJavaBean(
Literal.fromObject(new java.util.LinkedList[Int]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add a test for when the parameter types do not match up?

Copy link
Member Author

@viirya viirya Mar 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added below.

Note that because for generic method, its parameter type is Object, so for LinkedList[Int], it doesn't make sense to test it with something like Map("add" -> Literal("a string")). So I add TestBean to test this case.

Map("nonexisting" -> Literal(1)))
checkExceptionInExpression[Exception](initializeWithNonexistingMethod,
InternalRow.fromSeq(Seq()),
"""A method named "nonexisting" is not declared in any enclosing class """ +
"nor any supertype, nor through a static import")
}

test("SPARK-23585: UnwrapOption should support interpreted execution") {
val cls = classOf[Option[Int]]
val inputObject = BoundReference(0, ObjectType(cls), nullable = true)
Expand Down