spark-user mailing list archives

From "李奇平" <qiping....@alibaba-inc.com>
Subject Can't find pyspark when using PySpark on YARN
Date Tue, 10 Jun 2014 13:35:17 GMT
Dear all,

When I submit a pyspark application using this command:
./bin/spark-submit --master yarn-client examples/src/main/python/wordcount.py "hdfs://..."
I get the following exception:
Error from python worker:
Traceback (most recent call last):
File "/usr/ali/lib/python2.5/runpy.py", line 85, in run_module
loader = get_loader(mod_name)
File "/usr/ali/lib/python2.5/pkgutil.py", line 456, in get_loader
return find_loader(fullname)
File "/usr/ali/lib/python2.5/pkgutil.py", line 466, in find_loader
for importer in iter_importers(fullname):
File "/usr/ali/lib/python2.5/pkgutil.py", line 422, in iter_importers
__import__(pkg)
ImportError: No module named pyspark
PYTHONPATH was:
/home/xxx/spark/python:/home/xxx/spark_on_yarn/python/lib/py4j-0.8.1-src.zip:/disk11/mapred/tmp/usercache/xxxx/filecache/11/spark-assembly-1.0.0-hadoop2.0.0-ydh2.0.0.jar
Maybe `pyspark/python` and `py4j-0.8.1-src.zip` are not included on the YARN workers. How can
I distribute these files with my application? Can I use `--py-files python.zip,py4j-0.8.1-src.zip`? Or
how can I package the pyspark modules into a .egg file?
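For reference, something like this is what I have in mind (untested; `pyspark.zip` is just a placeholder
name, and I am running everything from the Spark home directory):

# build a zip of the pyspark package so it can be shipped to the YARN workers
cd python && zip -r ../pyspark.zip pyspark && cd ..

# pass both archives via --py-files so they land on each worker's PYTHONPATH
./bin/spark-submit --master yarn-client \
  --py-files pyspark.zip,python/lib/py4j-0.8.1-src.zip \
  examples/src/main/python/wordcount.py "hdfs://..."

My understanding is that `--py-files` accepts .zip, .egg, or .py files, so a plain zip of the pyspark
package should work as well as an .egg would, but I am not sure.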

