spark-issues mailing list archives

From "Shenghua Wan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-4854) Custom UDTF with Lateral View throws ClassNotFound exception in Spark SQL CLI
Date Tue, 16 Dec 2014 08:34:13 GMT

    [ https://issues.apache.org/jira/browse/SPARK-4854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14247938#comment-14247938 ]

Shenghua Wan commented on SPARK-4854:
-------------------------------------

I was trying to debug the Spark source code to fix this issue, and I found that function name
resolution might not work correctly: in the normal SELECT + custom UDTF case, the function name
resolves to the fully qualified class name, but in the SELECT + LATERAL VIEW + custom UDTF case,
it resolves to just the alias I gave in the "create temporary function" clause rather than the
fully qualified name. In other words, the function-name-to-class-name translation fails in that
situation.
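For illustration, here is a minimal sketch of the two cases (the table, column, alias, and class
names are made up, not from the actual report):

    CREATE TEMPORARY FUNCTION my_explode AS 'org.xxx.MyExplode';

    -- Passes: the alias is resolved to the fully qualified class name.
    SELECT my_explode(arr) FROM src;

    -- Throws ClassNotFoundException: the generator is constructed with the
    -- raw alias "my_explode" instead of 'org.xxx.MyExplode'.
    SELECT t.v FROM src LATERAL VIEW my_explode(arr) t AS v;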

I discovered a workaround trick while debugging the Spark source code.

The trick is to remove the package declaration (e.g. "org.xxx") from the Java code of your custom
UDTF, so that the class name equals its fully qualified name, and then use that class name as the
alias in the "create temporary function" clause. That way, even though Spark uses the unresolved
alias as the function name, the name can still be resolved in the default package.
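A minimal sketch of the workaround, again with made-up names; it assumes MyExplode.java is
compiled without any package declaration, so "MyExplode" is already its fully qualified name:

    ADD JAR /path/to/my-udtf.jar;

    -- The alias is deliberately identical to the package-less class name,
    -- so even the unresolved alias can be loaded from the default package.
    CREATE TEMPORARY FUNCTION MyExplode AS 'MyExplode';

    SELECT t.v FROM src LATERAL VIEW MyExplode(arr) t AS v;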

> Custom UDTF with Lateral View throws ClassNotFound exception in Spark SQL CLI
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-4854
>                 URL: https://issues.apache.org/jira/browse/SPARK-4854
>             Project: Spark
>          Issue Type: Bug
>    Affects Versions: 1.1.0, 1.1.1
>            Reporter: Shenghua Wan
>
> Hello, 
> I ran into a problem when using the Spark SQL CLI: a custom UDTF with LATERAL VIEW throws a
> ClassNotFoundException. I did a couple of experiments in the same environment (Spark versions
> 1.1.0 and 1.1.1):

> select + same custom UDTF (Passed) 
> select + lateral view + custom UDTF (ClassNotFoundException) 
> select + lateral view + built-in UDTF (Passed) 
> I have done some googling these days and found one related Spark issue ticket,
> https://issues.apache.org/jira/browse/SPARK-4811
> which is about "Custom UDTFs not working in Spark SQL".
> It would be helpful to put actual code here to reproduce the problem. However, corporate
> regulations might prohibit this, so sorry about that. Directly packaging explode's source code
> into a jar will reproduce it anyway.
> Here is a portion of the stack trace at the time of the exception, just in case:
> java.lang.ClassNotFoundException: XXX 
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366) 
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355) 
>         at java.security.AccessController.doPrivileged(Native Method) 
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354) 
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:425) 
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:358) 
>         at org.apache.spark.sql.hive.HiveFunctionFactory$class.createFunction(hiveUdfs.scala:81)
>         at org.apache.spark.sql.hive.HiveGenericUdtf.createFunction(hiveUdfs.scala:247)
>         at org.apache.spark.sql.hive.HiveGenericUdtf.function$lzycompute(hiveUdfs.scala:254)
>         at org.apache.spark.sql.hive.HiveGenericUdtf.function(hiveUdfs.scala:254)
>         at org.apache.spark.sql.hive.HiveGenericUdtf.outputInspectors$lzycompute(hiveUdfs.scala:261)
>         at org.apache.spark.sql.hive.HiveGenericUdtf.outputInspectors(hiveUdfs.scala:260)
>         at org.apache.spark.sql.hive.HiveGenericUdtf.outputDataTypes$lzycompute(hiveUdfs.scala:265)
>         at org.apache.spark.sql.hive.HiveGenericUdtf.outputDataTypes(hiveUdfs.scala:265)
>         at org.apache.spark.sql.hive.HiveGenericUdtf.makeOutput(hiveUdfs.scala:269)
>         at org.apache.spark.sql.catalyst.expressions.Generator.output(generators.scala:60)
>         at org.apache.spark.sql.catalyst.plans.logical.Generate$$anonfun$1.apply(basicOperators.scala:50)
>         at org.apache.spark.sql.catalyst.plans.logical.Generate$$anonfun$1.apply(basicOperators.scala:50)
>         at scala.Option.map(Option.scala:145)
>         at org.apache.spark.sql.catalyst.plans.logical.Generate.generatorOutput(basicOperators.scala:50)
>         at org.apache.spark.sql.catalyst.plans.logical.Generate.output(basicOperators.scala:60)
>         at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolveChildren$1.apply(LogicalPlan.scala:79)
>         at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolveChildren$1.apply(LogicalPlan.scala:79)
>         at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
>         at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
>         at scala.collection.immutable.List.foreach(List.scala:318)
>         at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251)
>         at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105)
> ... (the rest is omitted)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
