Hi,

Are you using Spark on one machine or many?

If on many, are you sure numpy is correctly installed on all machines?

To check that the environment is set up correctly, you can try something like:

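import os
# Ask the executors (not the driver) which PYTHONPATH they see.
# `sc` is the SparkContext, e.g. the one provided by the pyspark shell.
pythonpaths = sc.range(10).map(lambda i: os.environ.get("PYTHONPATH")).collect()
print(pythonpaths)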
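
A slightly extended version of the same check, as a sketch only: it reports which interpreter a Python process runs and whether numpy can be imported there. The helper name `probe_numpy` is made up for this illustration, and `sc` is again assumed to be the SparkContext of a running pyspark shell.

```python
import socket
import sys


def probe_numpy(_):
    """Report host, interpreter and numpy status for the current Python process."""
    try:
        import numpy
        status = "numpy " + numpy.__version__
    except ImportError as exc:
        status = "ImportError: " + str(exc)
    return (socket.gethostname(), sys.executable, status)


# On the driver:
print(probe_numpy(None))

# On the executors (assumes `sc` is the SparkContext of a running pyspark shell):
# print(sc.range(10).map(probe_numpy).distinct().collect())
```

If the driver line reports a numpy version but the executor results show an ImportError, the workers are most likely using a different interpreter or a different PYTHONPATH than the driver.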

HTH

Eike

2016-06-02 15:32 GMT+02:00 Bhupendra Mishra <bhupendra.mishra@gmail.com>:
That did not resolve it. :(

On Thu, Jun 2, 2016 at 3:01 PM, Sergio Fernández <wikier@apache.org> wrote:

On Thu, Jun 2, 2016 at 9:59 AM, Bhupendra Mishra <bhupendra.mishra@gmail.com> wrote:
I have already exported the environment variable in spark-env.sh as follows, but the error is still there: ImportError: No module named numpy

export PYSPARK_PYTHON=/usr/bin/python

According to the documentation at http://spark.apache.org/docs/latest/configuration.html#environment-variables, the PYSPARK_PYTHON environment variable is for pointing to the Python interpreter binary.

If you check the programming guide at https://spark.apache.org/docs/0.9.0/python-programming-guide.html#installing-and-configuring-pyspark, it says you need to add your custom path to PYTHONPATH (the script automatically adds bin/pyspark there).

So typically in Linux you would need to add the following (assuming you installed numpy there):

export PYTHONPATH=$PYTHONPATH:/usr/lib/python2.7/dist-packages
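
To see which PYTHONPATH entries actually contain numpy, something like the following sketch may help. The function name `dirs_containing` is hypothetical, and it only looks for a plain package directory or a single-file module, so eggs, zip archives and .pth files are not covered.

```python
import os


def dirs_containing(package, pythonpath):
    """Return the PYTHONPATH entries that hold `package` as a directory or .py file."""
    hits = []
    for entry in pythonpath.split(os.pathsep):
        if not entry:
            continue  # skip empty entries, such as the one a leading ':' produces
        as_package = os.path.join(entry, package, "__init__.py")
        as_module = os.path.join(entry, package + ".py")
        if os.path.isfile(as_package) or os.path.isfile(as_module):
            hits.append(entry)
    return hits


print(dirs_containing("numpy", os.environ.get("PYTHONPATH", "")))
```

An empty list means none of the listed directories holds numpy in that form, so the export above would need to point at a different directory.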

Hope that helps.




On Thu, Jun 2, 2016 at 12:04 AM, Julio Antonio Soto de Vicente <julio@esbet.es> wrote:
Try adding the following to spark-env.sh (rename the file first if it still has .template at the end):

PYSPARK_PYTHON=/path/to/your/bin/python

where bin/python belongs to your actual Python environment with NumPy installed.
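
Before putting a path into PYSPARK_PYTHON, it can be worth checking that this particular interpreter is able to import numpy. A minimal sketch, where `can_import` is a made-up helper and /usr/bin/python is only the example path mentioned in this thread:

```python
import subprocess
import sys


def can_import(python_path, module):
    """Return True if the interpreter at `python_path` can import `module`."""
    try:
        proc = subprocess.Popen(
            [python_path, "-c", "import " + module],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
    except OSError:
        return False  # the interpreter path does not exist or is not executable
    proc.communicate()  # wait for the child process and discard its output
    return proc.returncode == 0


# Checks the interpreter running this script; replace sys.executable with the
# path you intend to use, e.g. "/usr/bin/python".
print(can_import(sys.executable, "numpy"))
```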


On 1 Jun 2016, at 20:16, Bhupendra Mishra <bhupendra.mishra@gmail.com> wrote:

I have numpy installed, but where should I set up PYTHONPATH?


On Wed, Jun 1, 2016 at 11:39 PM, Sergio Fernández <wikier@apache.org> wrote:
sudo pip install numpy

On Wed, Jun 1, 2016 at 5:56 PM, Bhupendra Mishra <bhupendra.mishra@gmail.com> wrote:
Thanks.
How can this be resolved?

On Wed, Jun 1, 2016 at 9:02 PM, Holden Karau <holden@pigscanfly.ca> wrote:
Generally this means numpy isn't installed on the system or your PYTHONPATH has somehow gotten pointed somewhere odd.
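
One way to check for an oddly pointed PYTHONPATH is to list the entries that do not exist on disk. A small sketch; `missing_entries` is a hypothetical helper name:

```python
import os


def missing_entries(pythonpath):
    """Return the non-empty PYTHONPATH entries that do not exist on disk."""
    entries = [e for e in pythonpath.split(os.pathsep) if e]
    return [e for e in entries if not os.path.exists(e)]


print(missing_entries(os.environ.get("PYTHONPATH", "")))
```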

On Wed, Jun 1, 2016 at 8:31 AM, Bhupendra Mishra <bhupendra.mishra@gmail.com> wrote:
Could anyone please help me with the following error?

 File "/opt/mapr/spark/spark-1.6.1/python/lib/pyspark.zip/pyspark/mllib/__init__.py", line 25, in <module>

ImportError: No module named numpy


Thanks in advance!




--
Sergio Fernández
Partner Technology Manager
Redlink GmbH
m: +43 6602747925
e: sergio.fernandez@redlink.co
w: http://redlink.co









--
------------------------------------------------
Jan Eike von Seggern
Data Scientist

------------------------------------------------ 
Sevenval Technologies GmbH 

FRONT-END-EXPERTS SINCE 1999

Köpenicker Straße 154 | 10997 Berlin

office   +49 30 707 190 - 229
mail     eike.seggern@sevenval.com

 
Registered office: Cologne, HRB 79823
Managing directors: Jan Webering (CEO), Thorsten May,
Sascha Langfus, Joern-Carlos Kuntze

We increase the return on investment of your mobile and web projects. Get in touch:
http://roi.sevenval.com/