spark-user mailing list archives

From Javier Domingo Cansino
Subject Python3 Spark execution problems
Date Tue, 11 Aug 2015 09:33:37 GMT

I have been trying to use Spark to process some logs, and I have run into
several difficulties along the way. I could overcome most of them, but I am
really stuck on the last one.

I would really like to know how Spark is supposed to be deployed. For now,
I have an SSH key on the master that can log in to any worker, and that works.

According to the docs, I crafted the following command:
 ~/projects/bigdata/spark/spark/bin/spark-submit --py-files
/home/javier/projects/bigdata/bdml/dist/ --master='spark://' ml/ /srv/bdml/raw2json/json-logs.gz

First, when I tried to deploy my project, it seemed impossible. I kept
getting module import errors:
Traceback (most recent call last):
  File "/home/javier/projects/bigdata/bdml/ml/", line 10,
in <module>
    from .files import get_interesting_files
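
A traceback that stops at a relative import like this one is commonly seen when a file inside a package is executed as a top-level script, because the script then has no parent package for `.files` to resolve against. Whether that is what happened here cannot be confirmed from the truncated traceback. The following is a minimal sketch that reproduces the effect without Spark, assuming a package named `ml` with a `files.py` module; the script name `job.py` and the helper `run_both_ways` are hypothetical.

```python
import os
import subprocess
import sys
import tempfile


def run_both_ways():
    """Build a throwaway package and run one module as a script and with -m.

    Returns the two exit codes as (as_script, as_module).
    """
    with tempfile.TemporaryDirectory() as root:
        pkg = os.path.join(root, "ml")
        os.mkdir(pkg)
        # An empty __init__.py marks the directory as a regular package.
        open(os.path.join(pkg, "__init__.py"), "w").close()
        with open(os.path.join(pkg, "files.py"), "w") as f:
            f.write("def get_interesting_files():\n    return []\n")
        # job.py is a hypothetical name for the script that uses the relative import.
        with open(os.path.join(pkg, "job.py"), "w") as f:
            f.write("from .files import get_interesting_files\n")

        quiet = {"stdout": subprocess.DEVNULL, "stderr": subprocess.DEVNULL}
        # Run as a plain script: the module is __main__ and has no parent package.
        as_script = subprocess.run(
            [sys.executable, os.path.join(pkg, "job.py")], cwd=root, **quiet
        )
        # Run as a module of the package: the relative import can be resolved.
        as_module = subprocess.run(
            [sys.executable, "-m", "ml.job"], cwd=root, **quiet
        )
        return as_script.returncode, as_module.returncode


if __name__ == "__main__":
    script_rc, module_rc = run_both_ways()
    print("as script:", script_rc, "as module:", module_rc)
```

In this sketch the script run exits with an error while the `-m` run succeeds, which is consistent with the workaround of switching to absolute imports or keeping everything in one file.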

I tried everything, and at one point I even had to dig into the Scala code
to trace that error. In the end I just merged all the functions of the
project into one file.

Then I started to get the following error:
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0
in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage
0.0 (TID 3, org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/root/spark/python/lib/", line 64, in
    ("%d.%d" % sys.version_info[:2], version))
Exception: Python in worker has different version 2.7 than that in driver
3.4, PySpark cannot run with different minor versions
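
The message above suggests that the worker compares its own major.minor Python version against a version string sent by the driver. The following is a minimal sketch of that kind of check, reconstructed only from the traceback line and the error text shown here, not copied from PySpark; the function name `check_python_version` and its parameters are hypothetical.

```python
import sys


def check_python_version(driver_version, worker_info=None):
    """Raise if the worker's major.minor version differs from the driver's.

    driver_version is a string such as "3.4"; worker_info defaults to the
    interpreter running this function and can be overridden for illustration.
    """
    if worker_info is None:
        worker_info = sys.version_info
    # Same formatting as the traceback line: "%d.%d" % sys.version_info[:2]
    worker_version = "%d.%d" % tuple(worker_info[:2])
    if worker_version != driver_version:
        raise Exception(
            "Python in worker has different version %s than that in driver %s, "
            "PySpark cannot run with different minor versions"
            % (worker_version, driver_version)
        )
    return worker_version


if __name__ == "__main__":
    # A worker started with Python 2.7 while the driver reports 3.4.
    try:
        check_python_version("3.4", worker_info=(2, 7, 10))
    except Exception as exc:
        print(exc)
```

Under this reading, the error means the worker processes were started with a 2.7 interpreter, so the setting that selects the worker interpreter was not the one in effect when the job ran.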

I have specified #!/usr/bin/env python3 at the top of the file, and my on each worker contains the following lines.
export PYSPARK_PYTHON=python3.4

I had to specify the PYTHONHASHSEED because it wasn't propagating to the

I hope you can help me,
Javier Domingo Cansino
Research & Development Engineer
+34 946545847
Skype: javier.domingo.fon
All information in this email is confidential
