spark-dev mailing list archives

From Holden Karau <>
Subject Re: Creating a python port for a Scala Spark Project
Date Thu, 23 Jun 2016 02:12:37 GMT
PySpark RDDs are (on the Java side) essentially RDDs of pickled objects
and are mostly (but not entirely) opaque to the JVM. It is possible (by
using some internals) to pass a PySpark DataFrame to a Scala library (you
may or may not find the talk I gave at Spark Summit useful, as well as
some of the Python examples in
). Good luck! :)
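To illustrate the first point, here is a minimal sketch in plain Python (no Spark installation needed) of why RDD contents are opaque to the JVM: PySpark serializes each element with pickle, so the Java side holds only byte arrays it cannot interpret.

```python
import pickle

# An ordinary Python object, as it might appear in an RDD partition.
record = {"user": "daniel", "scores": [1, 2, 3]}

# What the JVM-side RDD effectively stores for this element: raw bytes.
payload = pickle.dumps(record)
print(type(payload))  # <class 'bytes'> -- meaningless to Scala code

# Only a Python worker can turn the bytes back into a usable object.
restored = pickle.loads(payload)
assert restored == record
```

This is the reason the DataFrame route works better: a PySpark DataFrame wraps a JVM-side object directly, and the internal (unsupported) `_jdf` attribute together with the py4j `spark._jvm` gateway can hand that Java object to Scala code with no pickling involved.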

On Wed, Jun 22, 2016 at 7:07 PM, Daniel Imberman <> wrote:

> Hi All,
> I've developed a Spark module in Scala that I would like to add a Python
> port for. I want to be able to allow users to create a pyspark RDD and send
> it to my system. I've been looking into the pyspark source code as well as
> py4J and was wondering if there has been anything like this implemented
> before.
> Thank you

Cell : 425-233-8271
