[SPARK-32668][SQL] HiveGenericUDTF initialize UDTF should use StructObjectInspector method #29490
ulysses-you wants to merge 11 commits into apache:master
protected lazy val inputInspectors = children.map(toInspector)
protected lazy val inputInspectors = {
  val inspectors = children.map(toInspector)
  val fields = inspectors.indices.map(index => s"col$index").asJava
The field names are not important, so use col0, col1, ...

Test build #127692 has finished for PR 29490 at commit

Yes, I have the same problem when using a Hive UDTF in spark-sql or spark.sql, because of not overriding

Then what do you think about this? cc @dongjoon-hyun @sunchao

Thank you for pinging me, @ulysses-you.

Test build #131822 has finished for PR 29490 at commit

Oops, sorry @ulysses-you, I just remembered that you pinged me on this PR. This looks mostly good to me except for one question: since the new API was added in 0.13 and Spark still supports Hive 0.12, do we need to take care of backward compatibility here? cc @wangyum

cc @somani

Thanks @sunchao. If I'm not missing something, Hive 1.2 has been removed since SPARK-32981 in branch-3.1. Hive 0.12 is far behind the master branch, so we don't need to care about compatibility with it. cc @dongjoon-hyun, is that right?

@sunchao, do you think this will be affected by the Hive metastore client versions?
This is the part I'm not sure about, that is, Spark loading permanent UDF classes from an HMS with version 0.12. But apparently Hive 0.12 doesn't support permanent UDFs, so this does not seem to be an issue.

BTW, #30665 is trying to solve the same issue, and we should consolidate on one PR.

retest this please.

Kubernetes integration test starting
Kubernetes integration test status failure
Test build #133578 has finished for PR 29490 at commit
Test build #133580 has finished for PR 29490 at commit
Kubernetes integration test starting
Kubernetes integration test starting
Kubernetes integration test status failure
Kubernetes integration test status success
Kubernetes integration test starting
Test build #133581 has finished for PR 29490 at commit
Kubernetes integration test status failure
Kubernetes integration test starting
Test build #133582 has finished for PR 29490 at commit
Kubernetes integration test status success
Test build #133583 has finished for PR 29490 at commit

sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUDFs.scala (outdated, resolved)
sql/hive/src/test/java/org/apache/spark/sql/hive/execution/UDTFStack3.java (outdated, resolved)
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala (outdated, resolved)

I think there could be cases where one registers Hive 0.12 UDFs (although I'm not sure how rare this is) in an HMS that Spark talks to, and things may fail when calling the UDFs since they don't have the new

@sunchao do you mean a user creates a permanent UDF from a Hive 0.12 built-in function? If so, I believe that is really rare.

@ulysses-you nvm, please ignore my comment above :) I was thinking of the case where Spark somehow loads the Hive 0.12

Do we still need this error message?

Test build #133632 has finished for PR 29490 at commit
Kubernetes integration test starting
Kubernetes integration test status success

Merged to master.

Thanks all!

What changes were proposed in this pull request?
Use initialize(StructObjectInspector argOIs) instead of initialize(ObjectInspector[] args) in HiveGenericUDTF.

Why are the changes needed?
In our case, we implement a Hive GenericUDTF and override initialize(StructObjectInspector argOIs). It then runs fine with Hive but fails with Spark SQL. Here is the Spark SQL error msg:

The reason is that Spark's HiveGenericUDTF calls initialize(ObjectInspector[] argOIs) to initialize a UDTF, but that method is deprecated. We should call initialize(StructObjectInspector argOIs) instead, so that UDTFs overriding either of the two methods keep working, the same as in Hive.

Does this PR introduce any user-facing change?
Yes, it fixes the UDTF initialize method.

How was this patch tested?
Manual test, and HiveUDFDynamicLoadSuite passed.
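
The dispatch problem described under "Why are the changes needed?" can be modelled without Hive on the classpath. What follows is a minimal sketch, not Hive's or Spark's actual code: ObjectInspector, StructObjectInspector, GenericUDTF, StructOnlyUDTF, ArrayOnlyUDTF, and wrap are simplified stand-ins written for illustration. It assumes that the struct-based initialize delegates to the array-based one by default, and that the array-based one fails unless a subclass overrides it. The col0, col1, ... field names follow the review comment in the conversation.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the two initialize overloads (stand-ins, not Hive classes).
final class UdtfInitSketch {

    // Stand-in for Hive's ObjectInspector; it only carries a type name.
    static final class ObjectInspector {
        final String typeName;
        ObjectInspector(String typeName) { this.typeName = typeName; }
    }

    // Stand-in for StructObjectInspector: parallel lists of field names and inspectors.
    static final class StructObjectInspector {
        final List<String> fieldNames;
        final List<ObjectInspector> fieldInspectors;
        StructObjectInspector(List<String> fieldNames, List<ObjectInspector> fieldInspectors) {
            this.fieldNames = fieldNames;
            this.fieldInspectors = fieldInspectors;
        }
    }

    // Stand-in for GenericUDTF. Assumed behaviour: the struct overload unpacks
    // the fields and delegates to the deprecated array overload, which fails
    // unless a subclass overrides it.
    abstract static class GenericUDTF {
        StructObjectInspector initialize(StructObjectInspector argOIs) {
            return initialize(argOIs.fieldInspectors.toArray(new ObjectInspector[0]));
        }

        @Deprecated
        StructObjectInspector initialize(ObjectInspector[] argOIs) {
            throw new IllegalStateException("Should not be called directly");
        }
    }

    // Hypothetical UDTF that overrides only the struct overload (the failing case).
    static final class StructOnlyUDTF extends GenericUDTF {
        @Override
        StructObjectInspector initialize(StructObjectInspector argOIs) {
            return argOIs;
        }
    }

    // Hypothetical legacy UDTF that overrides only the deprecated array overload.
    static final class ArrayOnlyUDTF extends GenericUDTF {
        @Override
        StructObjectInspector initialize(ObjectInspector[] argOIs) {
            return wrap(argOIs);
        }
    }

    // Wraps per-argument inspectors in a struct with placeholder names col0, col1, ...
    static StructObjectInspector wrap(ObjectInspector[] inspectors) {
        List<String> names = new ArrayList<>();
        List<ObjectInspector> fields = new ArrayList<>();
        for (int i = 0; i < inspectors.length; i++) {
            names.add("col" + i);
            fields.add(inspectors[i]);
        }
        return new StructObjectInspector(names, fields);
    }

    public static void main(String[] args) {
        ObjectInspector[] inputs = { new ObjectInspector("int"), new ObjectInspector("string") };

        // Old call path: the array overload breaks a UDTF that only overrides the struct overload.
        try {
            new StructOnlyUDTF().initialize(inputs);
        } catch (IllegalStateException e) {
            System.out.println("array overload failed: " + e.getMessage());
        }

        // New call path: the struct overload works for both kinds of UDTF.
        System.out.println(new StructOnlyUDTF().initialize(wrap(inputs)).fieldNames);
        System.out.println(new ArrayOnlyUDTF().initialize(wrap(inputs)).fieldNames);
    }
}
```

In this model, calling the struct overload is the only entry point that serves both kinds of UDTF, which is the compatibility argument the description makes; the real change lives in hiveUDFs.scala and may differ in detail.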