spark-dev mailing list archives

From Ajith shetty <>
Subject Permanent UDF support across session
Date Wed, 19 Sep 2018 08:54:17 GMT
I have a question about permanent UDFs in Spark with Hive support enabled.

When we run CREATE FUNCTION, the function is registered with Hive:
 spark-sql>create function customfun as 'org.apache.hadoop.hive.ql.udf.generic.GenericUDFLastDay'
using jar 'hdfs:///tmp/hive-exec.jar';
 call stack:

But when we call a registered UDF, Spark issues an ADD JAR call to Hive:
 spark-sql> select customfun('2015-08-22');
 call stack:

So is the ADD JAR call to Hive necessary when we invoke an already registered UDF? As I see it,
if we follow the current code:
1. Hive can look up already registered UDFs without an explicit ADD JAR call from Spark. Refer fixed via
( When the function is referenced for the first time by a Hive session, these resources will
be added to the environment. )
2. We cannot use the UDF across sessions, because each new session must again do ADD JAR internally on
the UDF call, which will fail unless the caller has the admin role set ( Hive requires ADD JAR
to be run only via the admin role ).
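
To make the two points above concrete, here is a toy model in Python. It is only a sketch of the behaviour as described in this mail, not verified Spark or Hive behaviour, and every class and method name in it (ToyMetastore, ToySession, invoke_with_explicit_add_jar, invoke_with_lazy_resources) is hypothetical and not part of any Spark or Hive API.

```python
# Toy model only: all names are hypothetical, not Spark or Hive APIs.

class PermissionDenied(Exception):
    """Raised when a non-admin session attempts an explicit ADD JAR."""


class ToyMetastore:
    """Persistent catalog shared by all sessions: name -> (class, jar)."""

    def __init__(self):
        self.functions = {}

    def create_function(self, name, class_name, jar):
        self.functions[name] = (class_name, jar)


class ToySession:
    def __init__(self, metastore, is_admin):
        self.metastore = metastore
        self.is_admin = is_admin
        self.loaded_jars = set()

    def add_jar(self, jar):
        # Assumption taken from the mail: explicit ADD JAR needs the admin role.
        if not self.is_admin:
            raise PermissionDenied("ADD JAR requires the admin role")
        self.loaded_jars.add(jar)

    def invoke_with_explicit_add_jar(self, name):
        # Flow described as current: an explicit ADD JAR precedes the call.
        class_name, jar = self.metastore.functions[name]
        self.add_jar(jar)
        return class_name

    def invoke_with_lazy_resources(self, name):
        # Proposed flow: resources are attached on first reference,
        # with no explicit ADD JAR issued by the caller.
        class_name, jar = self.metastore.functions[name]
        self.loaded_jars.add(jar)
        return class_name


metastore = ToyMetastore()
metastore.create_function(
    "customfun",
    "org.apache.hadoop.hive.ql.udf.generic.GenericUDFLastDay",
    "hdfs:///tmp/hive-exec.jar",
)

# A new, non-admin session that did not create the function.
user = ToySession(metastore, is_admin=False)

try:
    user.invoke_with_explicit_add_jar("customfun")
    explicit_failed = False
except PermissionDenied:
    explicit_failed = True

lazy_result = user.invoke_with_lazy_resources("customfun")
```

In this model the explicit ADD JAR path fails for the non-admin session, while the lazy path resolves the function from the shared catalog, which is the difference the question is about.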

Please correct me if I am wrong. Can we avoid ADD JAR when we invoke a registered UDF? Are there any
side effects if I modify this flow?
