You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I tried to apply these UDFs in spark with hive support. Here is the code:
// register UDF
spark.sql("create temporary function id_card_province as 'cc.shanruifeng.functions.card.UDFChinaIdCardProvince'");
// get file
Dataset<Row> rawdata = spark.read().csv("./src/main/resources/starM.csv");
// use UDF
rawdata.createOrReplaceTempView("starM");
Dataset<Row> udfModified = spark.sql("SELECT *, id_card_province(_c13) FROM starM");
udfModified.show();
and I got error:
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public org.apache.hadoop.io.Text cc.shanruifeng.functions.card.UDFChinaIdCardProvince.evaluate(org.apache.hadoop.io.Text) on object cc.shanruifeng.functions.card.UDFChinaIdCardProvince@5c622859 of class cc.shanruifeng.functions.card.UDFChinaIdCardProvince with arguments {652423184510291234:org.apache.hadoop.io.Text} of size 1
at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:981)
at org.apache.spark.sql.hive.HiveSimpleUDF.eval(hiveUDFs.scala:91)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply_6$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:235)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:228)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:957)
... 18 more
Caused by: java.lang.NullPointerException
at org.apache.hadoop.io.Text.encode(Text.java:450)
at org.apache.hadoop.io.Text.set(Text.java:198)
at cc.shanruifeng.functions.card.UDFChinaIdCardProvince.evaluate(UDFChinaIdCardProvince.java:23)
... 23 more
Could you please give me some advice on this problem? The column is not null anyway.
The text was updated successfully, but these errors were encountered:
ranmx
changed the title
when apply the udf, appears null pointer error
[BUG]when apply the udf, appears null pointer error
Sep 28, 2017
Hello,
I tried to apply these UDFs in spark with hive support. Here is the code:
and I got error:
Could you please give me some advice on this problem? The column is not null anyway.
The text was updated successfully, but these errors were encountered: