spark-user mailing list archives

From Anders Bennehag <and...@tajitsu.com>
Subject Re: pyspark: Importing other py-files in PYTHONPATH
Date Wed, 05 Mar 2014 13:45:45 GMT
I just discovered that putting myLib in /usr/local/lib/python2.7/dist-packages/
on the worker nodes lets me import the module in a pyspark script...

That works as a workaround, but it would be nice if modules already on the
workers' PYTHONPATH were picked up as well.
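
For anyone else hitting this: the pyFiles route from the documentation
sidesteps the worker environment entirely by shipping the module with the
job. A minimal sketch, assuming myLib lives in a single file myLib.py; the
master URL, app name, and path below are placeholders, not values from this
thread:

    from pyspark import SparkContext

    # Ship myLib.py to every worker at job submission.
    # "spark://master:7077", "myApp", and the path are placeholders.
    sc = SparkContext(
        "spark://master:7077",
        "myApp",
        pyFiles=["/path/to/myLib.py"],
    )

    # Verify that the import resolves on the workers, not just the driver.
    print(sc.parallelize([1, 2, 3])
            .map(lambda x: __import__("myLib").__name__)
            .collect())

There is also sc.addPyFile(path) for attaching a dependency after the
context has been created.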


On Wed, Mar 5, 2014 at 1:34 PM, Anders Bennehag <anders@tajitsu.com> wrote:

> Hi there,
>
> I am running Spark 0.9.0 standalone on a cluster. The documentation at
> http://spark.incubator.apache.org/docs/latest/python-programming-guide.html
> states that code dependencies can be deployed through the pyFiles argument
> to the SparkContext.
>
> But in my case, the relevant code, let's call it myLib, is already
> available on PYTHONPATH on the worker nodes. However, when trying to
> access this code through a regular 'import myLib' in the script sent to
> pyspark, the spark workers seem to hang in the middle of the script
> without any specific errors.
>
> If I start a regular Python shell on the workers, there is no problem
> importing myLib and accessing it.
>
> Why is this?
>
> /Anders
>
>
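
Regarding the hang with no errors in the quoted question: one way to narrow
it down is to compare sys.path on the driver with what the Python worker
processes actually see. A rough diagnostic, assuming a live SparkContext
named sc:

    import sys

    def worker_sys_path(_):
        # Runs on a worker; the import happens in the worker process.
        import sys as worker_sys
        return worker_sys.path

    print("driver:", sys.path)
    print("worker:", sc.parallelize([0], 1).map(worker_sys_path).first())

If myLib's directory shows up in the driver's list but not the worker's,
the worker daemons were likely started without that PYTHONPATH in their
environment.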
