phoenix-dev mailing list archives

From "Volodymyr Kvych (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PHOENIX-4958) Hbase does not load updated UDF class simultaneously on whole cluster
Date Sat, 06 Oct 2018 08:33:00 GMT

     [ https://issues.apache.org/jira/browse/PHOENIX-4958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Volodymyr Kvych updated PHOENIX-4958:
-------------------------------------
    Description: 
To update a UDF within the limitations described at [https://phoenix.apache.org/udf.html], I take the following steps:
 # Drop the existing function and delete its JAR file:
{code:sql}
DROP FUNCTION my_function;
DELETE JAR 'hdfs:/.../udf-v1.jar';{code}
 # Remove the JAR file from the local file system on every cluster node, for example:
{code:java}
rm ${hbase.local.dir}/jars/udf-v1.jar{code}
 # Upload the updated JAR file and re-create the function:
{code:sql}
ADD JARS '/.../udf-v2.jar';
CREATE FUNCTION my_function(...) ... using jar 'hdfs:/.../udf-v2.jar';
{code}

The problem is that each RegionServer may keep the previously loaded function for an undefined
period of time, until GC collects the DynamicClassLoader instance that loaded the old UDF
class. As a result, some RegionServers might execute the new function's code while others
still execute the old one. There is no way to ensure that the function has been reloaded by
the whole cluster.

As a proposed fix, I've updated UDFExpression to keep DynamicClassLoaders keyed per tenant and
per JAR. Since the JAR name must change to correctly update the UDF, this works for the
described use case.
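
To illustrate the idea of keying class loaders per tenant and per JAR, here is a minimal sketch. It is not the attached patch: the class name UdfClassLoaderCache, its Key type and the loaderFor method are hypothetical, and java.net.URLClassLoader stands in for HBase's DynamicClassLoader so that the example runs without any HBase dependency.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Paths;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: cache one class loader per (tenant, JAR path) pair. */
final class UdfClassLoaderCache {

    /** Composite cache key; a changed JAR path yields a different key. */
    static final class Key {
        final String tenantId;
        final String jarPath;

        Key(String tenantId, String jarPath) {
            this.tenantId = tenantId;
            this.jarPath = jarPath;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            return Objects.equals(tenantId, other.tenantId)
                    && Objects.equals(jarPath, other.jarPath);
        }

        @Override
        public int hashCode() {
            return Objects.hash(tenantId, jarPath);
        }
    }

    private final Map<Key, ClassLoader> loaders = new ConcurrentHashMap<>();

    /** Returns the loader for this tenant and JAR, creating it on first use. */
    ClassLoader loaderFor(String tenantId, String jarPath) {
        return loaders.computeIfAbsent(new Key(tenantId, jarPath), k -> newLoader(k.jarPath));
    }

    // Assumption: a plain URLClassLoader over a local path replaces the
    // DynamicClassLoader that the real code would construct here.
    private static ClassLoader newLoader(String jarPath) {
        try {
            URL url = Paths.get(jarPath).toUri().toURL();
            return new URLClassLoader(new URL[] {url},
                    UdfClassLoaderCache.class.getClassLoader());
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Invalid JAR path: " + jarPath, e);
        }
    }
}
```

With such a key, a lookup for udf-v2.jar can never return the loader created for udf-v1.jar, so a RegionServer does not depend on GC timing to pick up the new class. Entries for dropped JARs would still need to be evicted, which this sketch does not cover.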


> Hbase does not load updated UDF class simultaneously on whole cluster
> ---------------------------------------------------------------------
>
>                 Key: PHOENIX-4958
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4958
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Volodymyr Kvych
>            Priority: Major
>              Labels: UDF
>         Attachments: PHOENIX-4958.patch
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
