[SPARK-21533][SQL] Print warning messages when override function configure found in Hive UDFs #18768

maropu wants to merge 3 commits into apache:master
@gatorsmile @hvanhovell What do you think of this? Thanks!
Test build #80035 has finished for PR 18768 at commit
Test build #80037 has finished for PR 18768 at commit
Test build #80038 has finished for PR 18768 at commit
Jenkins, retest this please.
```scala
}

def validateHiveUserDefinedFunction(udfClass: Class[_]): Unit = {
  if (hasInheritanceOf[GenericUDF]("configure", udfClass) ||
```
Since when does the GenericUDF API have a configure method? It seems GenericUDF at 0.10.0 has no such method:
https://hive.apache.org/javadocs/r0.10.0/api/org/apache/hadoop/hive/ql/udf/generic/GenericUDF.html
I checked; it seems this API was implemented in v0.11.0 (https://github.com/apache/hive/blob/release-0.11.0/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDF.java). This is the commit for this API.
If run with a Hive version in which configure is not implemented yet, is hasInheritanceOf safe from NoSuchMethodException?
IIUC Spark always refers to hive-exec-1.2.1.spark2.jar, so it seems we have no chance of hitting the exception. But I think it is not a bad idea to catch NoSuchMethodException there for understandability.
Oh, that's right. As you said, it is still good to catch the exception.
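For illustration, here is a minimal sketch of how such a reflection-based check could stay safe against NoSuchMethodException. The helper name `overridesConfigure` and its body are assumptions for this sketch, not the PR's actual `hasInheritanceOf` implementation.

```scala
import org.apache.hadoop.hive.ql.exec.MapredContext
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF

// Hypothetical helper (not the PR's code): true when a UDF class overrides
// GenericUDF.configure(MapredContext).
def overridesConfigure(udfClass: Class[_]): Boolean = {
  try {
    classOf[GenericUDF].isAssignableFrom(udfClass) &&
      // getMethod resolves to the most-derived declaration, so the method is
      // overridden whenever its declaring class is not GenericUDF itself.
      udfClass.getMethod("configure", classOf[MapredContext])
        .getDeclaringClass != classOf[GenericUDF]
  } catch {
    // Defensive: a Hive release without `configure` (< 0.11.0) would make the
    // lookup fail, so report "not overridden" instead of propagating.
    case _: NoSuchMethodException => false
  }
}
```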
Test build #80040 has finished for PR 18768 at commit
```scala
}

private def isSubClassOf(t: Type, parent: Class[_]): Boolean = t match {
  case cls: Class[_] => parent.isAssignableFrom(cls)
```
Will you pass in something that is not a Class[_]? If not, we can simply inline the isAssignableFrom check into hasInheritanceOf. Then we don't need Type at all.
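A rough sketch of what that simplification might look like; the signature mirrors the call `hasInheritanceOf[GenericUDF]("configure", udfClass)` above, but the body is an assumption, not the PR's actual code.

```scala
import scala.reflect.{classTag, ClassTag}

// Hypothetical inlined variant: no java.lang.reflect.Type pattern match, just
// an isAssignableFrom check plus a scan for a method declared in a strict
// subclass of `parent`.
private def hasInheritanceOf[T: ClassTag](methodName: String, udfClass: Class[_]): Boolean = {
  val parent = classTag[T].runtimeClass
  parent.isAssignableFrom(udfClass) &&
    udfClass.getMethods.exists { m =>
      m.getName == methodName &&
        parent.isAssignableFrom(m.getDeclaringClass) && m.getDeclaringClass != parent
    }
}
```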
Btw, is it possible to add a unit test for it?
Test build #80052 has finished for PR 18768 at commit
Test build #80053 has finished for PR 18768 at commit
Is this the only function we miss for different types of Hive UD(A/T)Fs?
Could we document this limitation in https://spark.apache.org/docs/latest/sql-programming-guide.html#unsupported-hive-functionality?
```scala
 * Construct a [[FunctionBuilder]] based on the provided class that represents a function.
 */
private def makeFunctionBuilder(name: String, clazz: Class[_]): FunctionBuilder = {
  validateHiveUserDefinedFunction(clazz)
```
I am neutral on introducing a warning message for this case. Not sure how helpful the warning message will be.
ok. So, I think it's ok to revisit this again if we have more reports from users.
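For context on what the hook called above might do, here is a rough sketch of a possible validateHiveUserDefinedFunction body. The warning text is invented, `logWarning` is assumed to come from the surrounding class's Logging trait, and `overridesConfigure` refers to the hypothetical check sketched earlier in this conversation; none of this is the PR's actual implementation.

```scala
// Sketch only -- not the PR's actual implementation.
def validateHiveUserDefinedFunction(udfClass: Class[_]): Unit = {
  if (overridesConfigure(udfClass)) {
    // Spark never calls configure(MapredContext), so warn that any setup done
    // there will silently not happen.
    logWarning(s"UDF class ${udfClass.getName} overrides configure(MapredContext), " +
      "but Spark does not call this method; initialization done there will be skipped.")
  }
}
```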
@viirya I have no idea how to add tests for this kind of case, and I think the previous PRs similar to this one (which just add code for a warning) also had no tests. But if we could, we'd better do so... @gatorsmile I checked these Hive classes and I found some other functions that Spark currently ignores... (e.g.,
Yes! Thanks for your efforts! Please update the documentation first.
ok, I'll make a pr later. Thanks!
I think the documentation about these unsupported functions is enough for users, so I'll close this for now. If we get more responses from users, we can revisit this. Thanks.
## What changes were proposed in this pull request?

This pr added documents about unsupported functions in Hive UDF/UDTF/UDAF. This pr relates to apache#18768 and apache#18527.

## How was this patch tested?

N/A

Author: Takeshi Yamamuro <yamamuro@apache.org>

Closes apache#18792 from maropu/HOTFIX-20170731.
## What changes were proposed in this pull request?

This pr added code to print warning messages when an unsupported function `public void configure(MapredContext mapredContext)` is used in `GenericUDF` and `GenericUDTF`. This is because Spark does not call `configure` internally, so relying on it is error-prone.
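To make the failure mode concrete, here is a hypothetical UDF (class name and behaviour invented for this sketch, not code from the PR) that relies on `configure` being called. Hive's MapReduce execution invokes `configure`, but Spark never does, so the field it sets stays null; this is the kind of UDF the proposed warning is meant to flag.

```scala
import org.apache.hadoop.hive.ql.exec.MapredContext
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF.DeferredObject
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory

// Hypothetical UDF that depends on configure(); under Spark, `jobName` is
// never initialized because configure() is not called.
class MyConfiguredUDF extends GenericUDF {
  private var jobName: String = _

  override def configure(context: MapredContext): Unit = {
    // Only MapReduce-based Hive calls this; Spark does not.
    jobName = context.getJobConf.get("mapreduce.job.name")
  }

  override def initialize(args: Array[ObjectInspector]): ObjectInspector =
    PrimitiveObjectInspectorFactory.javaStringObjectInspector

  override def evaluate(args: Array[DeferredObject]): AnyRef = jobName

  override def getDisplayString(children: Array[String]): String = "my_configured_udf()"
}
```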
## How was this patch tested?

Manually checked. If this pr is applied, you hit the warning message below;