spark-user mailing list archives

From Davies Liu
Subject Re: access javaobject in rdd map
Date Tue, 23 Sep 2014 17:43:31 GMT
Right now there is no way to access the JVM from a Python worker. To
make this possible, we would need to:

1. set up py4j in the Python worker
2. serialize the JVM objects and transfer them to the executors
3. link the JVM objects and py4j together to get an interface

Until that happens, you could set up a service for the model
(such as a RESTful service) and access it from map via RPC.
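
A rough sketch of the RPC approach follows. It is only an illustration under
assumptions: the endpoint URL, the JSON request/response shape, and the names
http_post_json and predict_partition are all made up here, and the rows are
assumed to be JSON-serializable lists of numbers. It uses mapPartitions rather
than map so that each worker sends one request per batch instead of one per row.

```python
import json
import urllib.request
from itertools import islice

# Hypothetical endpoint: assumes the Scala model is wrapped in an HTTP service
# that accepts {"features": [[...], ...]} and returns {"predictions": [...]}.
MODEL_URL = "http://model-host:8080/predict"


def http_post_json(url, payload, timeout=10):
    """POST a JSON payload and decode the JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def predict_partition(rows, url=MODEL_URL, batch_size=100, post=http_post_json):
    """Yield one prediction per input row, calling the service once per batch."""
    rows = iter(rows)
    while True:
        batch = list(islice(rows, batch_size))
        if not batch:
            return
        reply = post(url, {"features": batch})
        for prediction in reply["predictions"]:
            yield prediction


# On a cluster, each Python worker would run this over its own partition:
#   predictions = rdd.mapPartitions(predict_partition)
```

The post argument is there only so the transport can be swapped out (for
example with a stub when trying this locally without the service running).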

On Tue, Sep 23, 2014 at 9:48 AM, Tamas Jambor wrote:
> Hi Davies,
> Thanks for the reply. I saw that you do it that way in the code. Is
> there no other way?
> I have implemented all the predict functions in Scala, so I would prefer
> not to reimplement the whole thing in Python.
> thanks,
> On Tue, Sep 23, 2014 at 5:40 PM, Davies Liu wrote:
>> You should create a pure Python object (copy the attributes from the Java
>> object); then it can be used in map.
>> Davies
>> On Tue, Sep 23, 2014 at 8:48 AM, jamborta wrote:
>>> Hi all,
>>> I have a Java object that contains an ML model which I would like to use for
>>> prediction (in Python). I just want to iterate the data through a mapper and
>>> predict for each value. Unfortunately, this fails when it tries to serialise
>>> the object to send it to the nodes.
>>> Is there a trick around this? Surely, this object could be picked up by
>>> reference at the nodes.
>>> many thanks,
