spark-user mailing list archives

From "xiaobo " <guxiaobo1...@qq.com>
Subject Re: Does Pyspark Support Graphx?
Date Tue, 20 Feb 2018 04:40:41 GMT
When using the --jars option, we have to include it every time we submit a job. It seems that
adding the jars to the classpath on every slave node is the only way to "install" Spark packages.




------------------ Original ------------------
From: Nicholas Hakobian <nicholas.hakobian@rallyhealth.com>
Date: Tue,Feb 20,2018 3:37 AM
To: xiaobo <guxiaobo1982@qq.com>
Cc: Denny Lee <denny.g.lee@gmail.com>, user@spark.apache.org <user@spark.apache.org>
Subject: Re: Does Pyspark Support Graphx?



If you copy the Jar file and all of the dependencies to the machines, you can manually add
them to the classpath. If you are using Yarn and HDFS you can alternatively use --jars and
point it to the hdfs locations of the jar files and it will (in most cases) distribute them
to the worker nodes at job submission time.
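
As a rough illustration of the --jars approach described above, the sketch below assembles a
spark-submit command line that points --jars at jar files stored on HDFS. The helper name
build_submit_command, the HDFS paths, and the application file my_job.py are hypothetical
placeholders, not values taken from this thread. It only builds and prints the command; it
does not submit anything.

```python
import shlex


def build_submit_command(app, jars, master="yarn"):
    """Return a spark-submit argv that ships `jars` along with the job."""
    if not jars:
        raise ValueError("expected at least one jar path")
    # --jars takes a single comma-separated list, with no spaces between entries.
    return ["spark-submit", "--master", master, "--jars", ",".join(jars), app]


# Hypothetical HDFS locations; substitute the real paths on your cluster.
jars = [
    "hdfs:///libs/graphframes-0.5.0-spark2.0-s_2.11.jar",
    "hdfs:///libs/some-dependency.jar",
]
cmd = build_submit_command("my_job.py", jars)

# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(part) for part in cmd))
```

Wrapping the command in a small script like this is one way to avoid retyping the jar list on
every submission.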

Nicholas Szandor Hakobian, Ph.D.
Staff Data Scientist
Rally Health
nicholas.hakobian@rallyhealth.com

On Sun, Feb 18, 2018 at 7:24 PM, xiaobo <guxiaobo1982@qq.com> wrote:
Another question is how to install GraphFrames permanently when the Spark nodes cannot connect
to the internet.




------------------ Original ------------------
From: Denny Lee <denny.g.lee@gmail.com>
Date: Mon,Feb 19,2018 10:23 AM
To: xiaobo <guxiaobo1982@qq.com>
Cc: user@spark.apache.org <user@spark.apache.org>
Subject: Re: Does Pyspark Support Graphx?



Note that the --packages option works for both PySpark and Spark (Scala). For the SparkLauncher
class, you should be able to include packages à la:

spark.addSparkArg("--packages", "graphframes:graphframes:0.5.0-spark2.0-s_2.11")
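
For the PySpark side, here is a minimal sketch of how the same --packages argument could be
assembled and sanity-checked before launching. It assumes the groupId:artifactId:version form
that --packages expects for Maven coordinates. The helper name packages_args is hypothetical,
and the code only builds the argument list rather than starting Spark.

```python
def packages_args(*coordinates):
    """Return ["--packages", "<comma-separated Maven coordinates>"]."""
    if not coordinates:
        raise ValueError("expected at least one Maven coordinate")
    for coord in coordinates:
        parts = coord.split(":")
        # Each coordinate needs three non-empty parts: groupId, artifactId, version.
        if len(parts) != 3 or not all(parts):
            raise ValueError(f"expected groupId:artifactId:version, got {coord!r}")
    return ["--packages", ",".join(coordinates)]


args = packages_args("graphframes:graphframes:0.5.0-spark2.0-s_2.11")

# The resulting list can be appended to a pyspark or spark-submit command line.
print(["pyspark"] + args)
```

Note that --packages resolves coordinates from a Maven repository, so this route needs either
network access or a locally reachable repository.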


On Sun, Feb 18, 2018 at 3:30 PM xiaobo <guxiaobo1982@qq.com> wrote:

Hi Denny,
The pyspark script uses the --packages option to load the GraphFrames library. What about the
SparkLauncher class?




------------------ Original ------------------
From: Denny Lee <denny.g.lee@gmail.com>
Date: Sun,Feb 18,2018 11:07 AM
To: 94035420 <guxiaobo1982@qq.com>
Cc: user@spark.apache.org <user@spark.apache.org>
Subject: Re: Does Pyspark Support Graphx?



That’s correct. You can use GraphFrames, though, as it does support PySpark.
On Sat, Feb 17, 2018 at 17:36 94035420 <guxiaobo1982@qq.com> wrote:

I cannot find anything about the GraphX module in the Python API documentation. Does that mean
it is not supported yet?