@maropu
Member

@maropu maropu commented Jul 7, 2015

The current implementation can't handle List<> as a return type in a Hive UDF and
throws an uninformative scala.MatchError.
Consider the UDF below:
import java.util.Arrays;
import java.util.List;
import org.apache.hadoop.hive.ql.exec.UDF;
public class UDFToListString extends UDF {
  public List<String> evaluate(Object o) {
    return Arrays.asList("xxx", "yyy", "zzz");
  }
}
When the UDF is used, a scala.MatchError is thrown as follows:
scala.MatchError: interface java.util.List (of class java.lang.Class)
at org.apache.spark.sql.hive.HiveInspectors$class.javaClassToDataType(HiveInspectors.scala:174)
at org.apache.spark.sql.hive.HiveSimpleUdf.javaClassToDataType(hiveUdfs.scala:76)
at org.apache.spark.sql.hive.HiveSimpleUdf.dataType$lzycompute(hiveUdfs.scala:106)
at org.apache.spark.sql.hive.HiveSimpleUdf.dataType(hiveUdfs.scala:106)
at org.apache.spark.sql.catalyst.expressions.Alias.toAttribute(namedExpressions.scala:131)
at org.apache.spark.sql.catalyst.planning.PhysicalOperation$$anonfun$collectAliases$1.applyOrElse(patterns.scala:95)
at org.apache.spark.sql.catalyst.planning.PhysicalOperation$$anonfun$collectAliases$1.applyOrElse(patterns.scala:94)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
at scala.collection.TraversableLike$$anonfun$collect$1.apply(TraversableLike.scala:278)
...
To make the failure easier for UDF developers to understand, we need to throw a more descriptive exception.
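As a rough illustration of the idea (not the actual patch), the sketch below maps a return class to a type name and fails with an explicit message for `java.util.List` instead of falling through an unmatched case. The class `ReturnTypeCheck`, the method `toSqlTypeName`, and the message wording are all hypothetical and are not part of Spark.

```java
import java.util.List;

// Hypothetical stand-in for a class-to-type mapping; not Spark's actual API.
final class ReturnTypeCheck {
    // Maps a UDF return class to a type name, or fails with a descriptive message.
    static String toSqlTypeName(Class<?> clazz) {
        if (clazz == String.class) {
            return "string";
        }
        if (clazz == Integer.class || clazz == int.class) {
            return "int";
        }
        if (clazz == List.class) {
            // Explicit case: tell the UDF author what went wrong and what to do.
            // The wording is an assumption made for this sketch.
            throw new UnsupportedOperationException(
                "java.util.List is not supported as a UDF return type because the "
                + "element type is erased; consider using GenericUDF instead");
        }
        throw new UnsupportedOperationException(
            "Unsupported return type: " + clazz.getName());
    }

    public static void main(String[] args) {
        System.out.println(toSqlTypeName(String.class));
        try {
            toSqlTypeName(List.class);
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the explicit `List` branch is that the error names the offending type and suggests an alternative, rather than surfacing an internal matching failure.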

@maropu
Member Author

maropu commented Jul 7, 2015

@marmbrus Following the discussion in #5395, I think it is hard to support Java List<> types in SparkSQL because of type erasure. ISTM that UDF developers who need this type would be better off using the GenericUDF interface instead of the UDF one. So I re-created this PR to throw a meaningful exception when such types are used.

Any thoughts?
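To illustrate the type-erasure point, here is a small self-contained Java sketch. It assumes a plain class shaped like the UDF in the description (without the Hive base class) and shows what the raw `Class` view of the return type looks like, which is what a match on `java.lang.Class` values sees. The names `ErasureDemo` and `ToListString` are made up for this example.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ErasureDemo {
    // Same shape as the UDF in the description, minus the Hive base class.
    static class ToListString {
        public List<String> evaluate(Object o) {
            return Arrays.asList("xxx", "yyy", "zzz");
        }
    }

    // Returns the raw return class of evaluate(Object).
    static Class<?> rawReturnType() {
        try {
            Method m = ToListString.class.getMethod("evaluate", Object.class);
            return m.getReturnType();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // Lists with different element types share one runtime class.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> ints = new ArrayList<>();
        return strings.getClass() == ints.getClass();
    }

    public static void main(String[] args) {
        System.out.println(rawReturnType());    // prints "interface java.util.List"
        System.out.println(sameRuntimeClass()); // prints "true"
    }
}
```

The raw class carries no `String` component, which matches the `interface java.util.List (of class java.lang.Class)` text in the reported error.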

Contributor

Assign the result of this function to a variable and check that the message is correct.
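A minimal sketch of that test pattern in plain Java, assuming no test framework: capture the thrown exception in a variable, then assert on its message. The helper `intercept`, the method `failingCall`, and the message text are hypothetical and only stand in for the code under review.

```java
final class InterceptDemo {
    // Hypothetical helper: runs the body and returns the exception it throws,
    // failing if nothing is thrown or the type is unexpected.
    static <T extends Throwable> T intercept(Class<T> expected, Runnable body) {
        try {
            body.run();
        } catch (Throwable t) {
            if (expected.isInstance(t)) {
                return expected.cast(t);
            }
            throw new AssertionError("Unexpected exception type: " + t, t);
        }
        throw new AssertionError(
            "Expected " + expected.getName() + " but nothing was thrown");
    }

    // Stand-in for the call under test; the message is made up for this sketch.
    static void failingCall() {
        throw new UnsupportedOperationException("List type is unsupported");
    }

    public static void main(String[] args) {
        // Keep the exception in a variable so its message can be checked too.
        UnsupportedOperationException e =
            intercept(UnsupportedOperationException.class, InterceptDemo::failingCall);
        if (!e.getMessage().contains("unsupported")) {
            throw new AssertionError("Unexpected message: " + e.getMessage());
        }
        System.out.println("message check passed");
    }
}
```

Checking the message as well as the type guards against the test passing on an unrelated exception of the same class.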

Member Author

Fixed. Does this address your comment?

@marmbrus
Contributor

marmbrus commented Jul 7, 2015

ok to test

@marmbrus
Contributor

marmbrus commented Jul 7, 2015

This looks great! One minor comment on the tests.

@maropu
Member Author

maropu commented Jul 7, 2015

@marmbrus OK, thanks.
Once this patch is merged, I'll make a similar patch for Map<>, since it has the same issue.

@SparkQA

SparkQA commented Jul 7, 2015

Test build #36628 has finished for PR 7248 at commit 56305de.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@marmbrus
Contributor

marmbrus commented Jul 7, 2015

Thanks! Merging to master.

@asfgit asfgit closed this in 1821fc1 Jul 7, 2015
@SparkQA

SparkQA commented Jul 7, 2015

Test build #36629 has finished for PR 7248 at commit 1c3df2a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request Jul 8, 2015
…ap<K,V> types used in Hive UDF

To help UDF developers understand the failure, throw an exception when unsupported Map<K,V> types are used in a Hive UDF. This fix mirrors #7248.

Author: Takeshi YAMAMURO <[email protected]>

Closes #7257 from maropu/ThrowExceptionWhenMapUsed and squashes the following commits:

916099a [Takeshi YAMAMURO] Fix style errors
7886dcc [Takeshi YAMAMURO] Throw an exception when Map<> used in Hive UDF
@maropu maropu deleted the FixBugInHiveInspectors branch July 5, 2017 11:41