[SPARK-33704][SQL] Support latest version of initialize() in HiveGenericUDTF #30665
southernriver wants to merge 3 commits into apache:branch-2.4
cc @wangyum
ok to test
Kubernetes integration test starting
Kubernetes integration test status success
Test build #132480 has finished for PR 30665 at commit
@southernriver Could you open this PR against the master branch?
wangyum left a comment
Remove build/._scala-2.12.10 and build/._zinc-0.3.15.
try {
  udfExpr = Some(HiveGenericUDTF(name, new HiveFunctionWrapper(clazz.getName), input))
  // Force it to check input data types.
  udfExpr.get.asInstanceOf[HiveGenericUDTF].elementSchema
} catch {
  case exception: Exception =>
    logInfo(s"HiveGenericUDTF initialize(ObjectInspector[] args) is deprecated, and" +
      s" we will suit the latest version of initialize(StructObjectInspector argOIs).")
    udfExpr = Some(HiveGenericUDTF(name, new HiveFunctionWrapper(clazz.getName),
      input, false))
    udfExpr.get.asInstanceOf[HiveGenericUDTF].elementSchema
How about?
val funcWrapper = new HiveFunctionWrapper(clazz.getName)
try {
  udfExpr = Some(HiveGenericUDTF(name, funcWrapper, input, true))
  // Force it to check data types.
  udfExpr.get.asInstanceOf[HiveGenericUDTF].elementSchema
} catch {
  case e: IllegalStateException if e.getMessage.equals("Should not be called directly") =>
    logInfo("Fallback to use the non deprecated UDTF constructor.")
    udfExpr = Some(HiveGenericUDTF(name, funcWrapper, input, false))
    // Force it to check data types.
    udfExpr.get.asInstanceOf[HiveGenericUDTF].elementSchema
}

  funcWrapper: HiveFunctionWrapper,
- children: Seq[Expression])
+ children: Seq[Expression],
+ deprecated: Boolean = true)
deprecated: Boolean = true -> isDeprecatedConstructor: Boolean?
protected lazy val outputInspector =
  if (deprecated) {
    function.initialize(inputInspectors.toArray)
  } else {
    function.initialize(rowOI)
  }
How about?
protected lazy val outputInspector = {
  if (isDeprecatedConstructor) {
    function.initialize(inputInspectors.toArray)
  } else {
    val rowOI = ObjectInspectorFactory.getStandardStructObjectInspector(
      children.zipWithIndex.map(e => s"_col${e._2}").asJava, inputInspectors.asJava)
    function.initialize(rowOI)
  }
}

val num =
  sql("SELECT udtf_stack2(2, 'A', 10, date '2015-01-01', 'B', 20, date '2016-01-01')").count()
assert(num === 2)
How about?
checkAnswer(
  sql("SELECT udtf_stack2(2, 'A', 10, date '2015-01-01', 'B', 20, date '2016-01-01')"),
  Seq(Row("A", 10, java.sql.Date.valueOf("2015-01-01")),
    Row("B", 20, java.sql.Date.valueOf("2016-01-01"))))
@wangyum Thanks a lot, that's a good catch. I'll fix this according to your suggestions.
@southernriver Any update?
  // Force it to check input data types.
  udfExpr.get.asInstanceOf[HiveGenericUDTF].elementSchema
} catch {
  case exception: Exception =>
Why do we need this fallback mechanism? Can we just switch to the new API without the deprecated flag?
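For context on this question, here is a minimal sketch of what "just switch to the new API" could look like. It does not use the real Hive or Spark classes: `FieldOI`, `StructOI`, `SketchUDTF`, and `DirectSwitchSketch` are hypothetical stand-ins, and the base class only models the assumption that Hive's default `initialize(StructObjectInspector)` unpacks the struct fields and delegates to the deprecated array overload. Under that assumption, always calling the struct overload serves both old-style and new-style UDTFs without a flag or a try/catch.

```scala
// Hypothetical stand-ins for Hive's inspector types; not the real classes.
final case class FieldOI(typeName: String)
final case class StructOI(fieldNames: Seq[String], fields: Seq[FieldOI])

// Simplified model of the two overloads on Hive's GenericUDTF (an assumption).
abstract class SketchUDTF {
  // Struct overload: by default, unpack the fields and delegate to the array overload.
  def initialize(argOIs: StructOI): StructOI =
    initialize(argOIs.fields.toArray[FieldOI])

  // Deprecated array overload: the default refuses direct calls.
  def initialize(argOIs: Array[FieldOI]): StructOI =
    throw new IllegalStateException("Should not be called directly")
}

object DirectSwitchSketch {
  // A UDTF that overrides only the deprecated array overload.
  class OldStyle extends SketchUDTF {
    override def initialize(argOIs: Array[FieldOI]): StructOI =
      StructOI(Seq("out"), Seq(argOIs.head))
  }

  // A UDTF that overrides only the struct overload.
  class NewStyle extends SketchUDTF {
    override def initialize(argOIs: StructOI): StructOI =
      StructOI(Seq("out"), Seq(argOIs.fields.head))
  }

  // Build a row-level struct with generated names _col0, _col1, ... and
  // always go through the struct overload, with no flag and no fallback.
  def initViaStruct(udtf: SketchUDTF, inputs: Seq[FieldOI]): StructOI = {
    val names = inputs.indices.map(i => s"_col$i")
    udtf.initialize(StructOI(names, inputs))
  }
}
```

Whether this holds for every Hive version that Spark supports is not established in this thread; a Hive release that predates the struct overload would still need the array call.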
@southernriver is busy these days. Let's close this.
What changes were proposed in this pull request?
For HiveGenericUDTF, there are two initialization methods: initialize(ObjectInspector[] args), which is deprecated, and initialize(StructObjectInspector argOIs).
As https://issues.apache.org/jira/browse/HIVE-5737 mentions, Hive now passes a StructObjectInspector to UDTFs rather than an ObjectInspector[], but Spark SQL still supports only the deprecated method.
Before this fix, an exception is reported.
This PR resolves the exception.
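The failure described above can be reproduced in miniature. The sketch below is not Hive or Spark code: `FieldOI`, `StructOI`, `SketchUDTF`, and `UdtfInitSketch` are hypothetical stand-ins, and the exception message is the one quoted in the review suggestion. It shows a caller that always uses the deprecated array overload, which works for a UDTF that overrides that overload and fails for one that overrides only the struct overload.

```scala
// Hypothetical stand-ins for Hive's inspector types; not the real classes.
final case class FieldOI(typeName: String)
final case class StructOI(fieldNames: Seq[String], fields: Seq[FieldOI])

// Simplified model of the two overloads on Hive's GenericUDTF (an assumption).
abstract class SketchUDTF {
  def initialize(argOIs: StructOI): StructOI =
    initialize(argOIs.fields.toArray[FieldOI])

  // Deprecated overload: throws unless a subclass overrides it.
  def initialize(argOIs: Array[FieldOI]): StructOI =
    throw new IllegalStateException("Should not be called directly")
}

// Written against the deprecated array overload.
class OldStyleUDTF extends SketchUDTF {
  override def initialize(argOIs: Array[FieldOI]): StructOI =
    StructOI(Seq("col1"), Seq(argOIs.head))
}

// Written against the newer struct overload only.
class NewStyleUDTF extends SketchUDTF {
  override def initialize(argOIs: StructOI): StructOI =
    StructOI(Seq("col1"), Seq(argOIs.fields.head))
}

object UdtfInitSketch {
  val inputs: Seq[FieldOI] = Seq(FieldOI("string"), FieldOI("int"))

  // Models the pre-patch call path: always use the array overload.
  // Returns the exception message on the left when initialization fails.
  def initDeprecated(udtf: SketchUDTF): Either[String, StructOI] =
    try Right(udtf.initialize(inputs.toArray[FieldOI]))
    catch { case e: IllegalStateException => Left(e.getMessage) }

  def main(args: Array[String]): Unit = {
    println(initDeprecated(new OldStyleUDTF)) // succeeds
    println(initDeprecated(new NewStyleUDTF)) // fails with the message above
  }
}
```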
Why are the changes needed?
When migrating from Hive to Spark SQL, we found that many UDTFs throw exceptions because of this issue. Fixing it is a significant improvement for attracting users to Spark SQL.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Manual testing.