spark-user mailing list archives

From Spico Florin <spicoflo...@gmail.com>
Subject Re: Run/install tensorframes on zeppelin pyspark
Date Fri, 10 Aug 2018 08:47:50 GMT
Hello!
Thank you very much for your response.
As I understand it, to use tensorframes in a Zeppelin pyspark notebook
with a local Spark master, we should:
1. run the command pip install tensorframes
2. set PYSPARK_PYTHON in conf/zeppelin-env.sh

I performed the above steps as follows:

python2.7 -m pip install tensorframes==0.2.7
export PYSPARK_PYTHON=python2.7 in conf/zeppelin-env.sh
"zeppelin.pyspark.python": "python2.7" in conf/interpreter.json

As you can see, the installation and the configuration both refer to the
same python2.7 version.
After performing all of these steps, I'm still getting the same error:
*"ImportError: No module named tensorframes"*
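
A quick way to check which interpreter the notebook actually runs, and
whether that interpreter can see the package, is a sketch like the one
below. It uses only the standard library and is meant to be pasted into a
%pyspark paragraph; describe_module is a hypothetical helper name, not part
of Zeppelin or tensorframes.

```python
import importlib
import sys


def describe_module(name):
    """Report whether `name` can be imported by the running interpreter."""
    # sys.executable is the Python binary that is executing this code.
    info = {"executable": sys.executable, "found": False, "location": None}
    try:
        module = importlib.import_module(name)
    except ImportError:
        # Either the module or one of its own imports is missing.
        return info
    info["found"] = True
    info["location"] = getattr(module, "__file__", None)
    return info


print(describe_module("tensorframes"))
```

If the reported executable is not the python2.7 that pip installed into, the
PYSPARK_PYTHON setting is probably not being picked up by the interpreter.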

I'm still puzzled why this import works fine in the pyspark shell from the
Spark distribution but results in errors in, for example, python2.7.
I've also observed that the pyspark shell from /spark/bin doesn't need the
tensorframes Python package installed, which is even more confusing.
Is the Zeppelin pyspark interpreter not using the same approach as the Spark
pyspark shell?
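
One possible explanation, which is an assumption I have not confirmed for
Zeppelin: when the pyspark shell is started with --packages, Spark may put
the resolved jar on the Python search path, and Python can import modules
directly from a jar because a jar is a zip archive. The sketch below lists
the archive entries on sys.path so that the two environments can be
compared; archive_entries is a hypothetical helper name.

```python
import sys


def archive_entries(paths, suffixes=(".jar", ".zip", ".egg")):
    """Return the entries of `paths` that are archives Python could import from."""
    return [p for p in paths if p.lower().endswith(suffixes)]


# Run this in both the pyspark shell and a %pyspark paragraph, then compare.
for entry in archive_entries(sys.path):
    print(entry)
```

If a tensorframes jar shows up in the pyspark shell but not in the Zeppelin
paragraph, that difference would account for the behaviour described above.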

Has anyone succeeded in importing and using tensorframes in Zeppelin with
the default Spark master setup (local[*])? If yes, how?

I look forward to your answers.

Regards,
 Florin

On Thu, Aug 9, 2018 at 3:52 AM, Jeff Zhang <zjffdu@gmail.com> wrote:

>
> Make sure you use the correct Python, the one that has tensorframes installed. Use PYSPARK_PYTHON
> to configure the Python interpreter.
>
>
>
> On Wed, Aug 8, 2018 at 9:59 PM, Spico Florin <spicoflorin@gmail.com> wrote:
>
>> Hi!
>>
>> I would like to use tensorframes in my pyspark notebook.
>>
>> I have performed the following:
>>
>> 1. In the spark interpreter, added a new repository: http://dl.bintray.com/spark-packages/maven
>> 2. In the spark interpreter, added the dependency databricks:tensorframes:0.2.9-s_2.11
>> 3. pip install tensorframes
>>
>>
>> In both 0.7.3 and 0.8.0:
>> 1. The following code resulted in the error "ImportError: No module named tensorframes":
>>
>> %pyspark
>> import tensorframes as tfs
>>
>> 2. The following code succeeded:
>> %spark
>> import org.tensorframes.{dsl => tf}
>> import org.tensorframes.dsl.Implicits._
>> val df = spark.createDataFrame(Seq(1.0->1.1, 2.0->2.2)).toDF("a", "b")
>>
>> // As in Python, scoping is recommended to prevent name collisions.
>> val df2 = tf.withGraph {
>>     val a = df.block("a")
>>     // Unlike python, the scala syntax is more flexible:
>>     val out = a + 3.0 named "out"
>>     // The 'mapBlocks' method is added using implicits to dataframes.
>>     df.mapBlocks(out).select("a", "out")
>> }
>>
>> // The transform is all lazy at this point, let's execute it with collect:
>> df2.collect()
>>
>> I ran the code above directly with the spark interpreter using the default
>> configuration (master set to local[*], so not via the spark-submit
>> command).
>>
>> Also, I installed Spark locally and ran the command:
>>
>> $SPARK_HOME/bin/pyspark --packages databricks:tensorframes:0.2.9-s_2.11
>>
>> and the code below worked as expected:
>>
>> import tensorframes as tfs
>>
>> Can you please help me solve this?
>>
>> Thanks,
>>
>>  Florin