spark-user mailing list archives

From Russell Jurney <russell.jur...@gmail.com>
Subject Re: Automating lengthy command to pyspark with configuration?
Date Mon, 29 Aug 2016 18:47:21 GMT
I've got most of it working through spark.jars.
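
For the record, a minimal sketch of what that looks like in $SPARK_HOME/conf/spark-defaults.conf, using the same jar paths as the commands quoted below (absolute paths are safer in practice, since relative paths resolve from wherever pyspark is launched):

    # spark-defaults.conf -- sketch only, adjust jar paths to your install
    spark.executor.memory        10g
    spark.jars                   ../lib/mongo-hadoop-spark-2.0.0-rc0.jar,../lib/mongo-java-driver-3.2.2.jar,../lib/mongo-hadoop-2.0.0-rc0.jar,../lib/elasticsearch-hadoop-2.3.4.jar
    spark.driver.extraClassPath  ../lib/mongo-hadoop-spark-2.0.0-rc0.jar:../lib/mongo-java-driver-3.2.2.jar:../lib/mongo-hadoop-2.0.0-rc0.jar:../lib/elasticsearch-hadoop-2.3.4.jar

With that in place, a bare pyspark invocation should pick the jars up without the extra flags.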

On Sunday, August 28, 2016, ayan guha <guha.ayan@gmail.com> wrote:

> Best to create an alias and place it in your bashrc
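
A rough sketch of that alias approach, with a made-up alias name and the jar paths from the first command, placed in ~/.bashrc:

    # hypothetical alias -- the name and the relative paths are only examples
    alias pyspark-mongo='pyspark --executor-memory 10g --jars ../lib/mongo-hadoop-spark-2.0.0-rc0.jar,../lib/mongo-java-driver-3.2.2.jar,../lib/mongo-hadoop-2.0.0-rc0.jar --driver-class-path ../lib/mongo-hadoop-spark-2.0.0-rc0.jar:../lib/mongo-java-driver-3.2.2.jar:../lib/mongo-hadoop-2.0.0-rc0.jar'
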
> On 29 Aug 2016 08:30, "Russell Jurney" <russell.jurney@gmail.com> wrote:
>
>> In order to use PySpark with MongoDB and ElasticSearch, I currently run
>> the rather long commands of:
>>
>> 1) pyspark --executor-memory 10g
>> --jars ../lib/mongo-hadoop-spark-2.0.0-rc0.jar,../lib/mongo-java-driver-3.2.2.jar,../lib/mongo-hadoop-2.0.0-rc0.jar
>> --driver-class-path ../lib/mongo-hadoop-spark-2.0.0-rc0.jar:../lib/mongo-java-driver-3.2.2.jar:../lib/mongo-hadoop-2.0.0-rc0.jar
>>
>> 2) pyspark --jars ../lib/elasticsearch-hadoop-2.3.4.jar
>> --driver-class-path ../lib/elasticsearch-hadoop-2.3.4.jar
>>
>> Can all these things be made part of my configuration, so that I don't
>> have to pass these lengthy options to pyspark?
>>
>> Thanks!
>> --
>> Russell Jurney twitter.com/rjurney russell.jurney@gmail.com relato.io
>>
>

-- 
Russell Jurney twitter.com/rjurney russell.jurney@gmail.com relato.io
