spark-user mailing list archives

From Marcelo Vanzin <van...@cloudera.com>
Subject Re: Is SPARK_CLASSPATH really deprecated?
Date Mon, 02 Mar 2015 18:12:49 GMT
Just a note for whoever writes the doc, spark.executor.extraClassPath
is *prepended* to the executor's classpath, which is a rather
important distinction. :-)
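
As an illustration of why prepending matters, here is a rough shell sketch (the jar paths are made up) of how the resulting executor classpath ends up ordered:

```shell
# Hypothetical paths. Entries from spark.executor.extraClassPath are
# placed *before* Spark's own jars, so they win any class conflicts.
extra_cp="/opt/hbase/lib/hbase-client.jar"
spark_cp="/opt/spark/jars/*"
executor_cp="${extra_cp}:${spark_cp}"   # extra entries come first
echo "$executor_cp"
```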

On Fri, Feb 27, 2015 at 12:21 AM, Patrick Wendell <pwendell@gmail.com> wrote:
> I think we need to just update the docs, it is a bit unclear right
> now. At the time, we made it worded fairly sternly because we really
> wanted people to use --jars when we deprecated SPARK_CLASSPATH. But
> there are other types of deployments where there is a legitimate need
> to augment the classpath of every executor.
>
> I think it should probably say something more like
>
> "Extra classpath entries to append to the classpath of executors. This
> is sometimes used in deployment environments where dependencies of
> Spark are present in a specific place on all nodes".
>
> Kannan - if you want to submit a patch, I can help review it.
>
> On Thu, Feb 26, 2015 at 8:24 PM, Kannan Rajah <krajah@maprtech.com> wrote:
>> Thanks Marcelo. Do you think it would be useful to have
>> spark.executor.extraClassPath pick up an environment variable that can be
>> set from spark-env.sh? Here is an example.
>>
>> spark-env.sh
>> ------------------
>> executor_extra_cp=$(get_hbase_jars_for_cp)
>> export executor_extra_cp
>>
>> spark-defaults.conf
>> ---------------------
>> spark.executor.extraClassPath = ${executor_extra_cp}
>>
>> This will let us put logic inside the get_hbase_jars_for_cp function to
>> pick the right version of the HBase jars, since multiple versions could be
>> installed on a node.
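
A rough sketch of what such a get_hbase_jars_for_cp function could look like (the install layout and the HBASE_HOME default are assumptions, not an actual HBase convention):

```shell
# Hypothetical sketch: build a colon-separated classpath from the HBase
# jars actually installed on this node. Paths are assumptions.
get_hbase_jars_for_cp() {
  local hbase_home="${HBASE_HOME:-/opt/hbase}"
  local cp=""
  # Join every jar under lib/ into a single classpath string.
  for jar in "$hbase_home"/lib/*.jar; do
    cp="${cp:+$cp:}$jar"
  done
  echo "$cp"
}
```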
>>
>>
>>
>> --
>> Kannan
>>
>> On Thu, Feb 26, 2015 at 6:08 PM, Marcelo Vanzin <vanzin@cloudera.com> wrote:
>>>
>>> On Thu, Feb 26, 2015 at 5:12 PM, Kannan Rajah <krajah@maprtech.com> wrote:
>>> > Also, I would like to know if there is a localization overhead when we
>>> > use spark.executor.extraClassPath. Again, in the case of HBase, these
>>> > jars would typically be available on all nodes, so there is no need to
>>> > localize them from the node where the job was submitted. I am wondering
>>> > whether the SPARK_CLASSPATH approach would skip localization; that
>>> > would be an added benefit. Please clarify.
>>>
>>> spark.executor.extraClassPath doesn't localize anything. It just
>>> prepends those classpath entries to the usual classpath used to launch
>>> the executor. There's no copying of files or anything, so they're
>>> expected to exist on the nodes.
>>>
>>> It's basically exactly the same as SPARK_CLASSPATH, but broken down into
>>> two options (one for the executors and one for the driver).
>>>
>>> --
>>> Marcelo
>>
>>
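
For reference, the two options that replaced SPARK_CLASSPATH can be set together in spark-defaults.conf; a minimal sketch (the HBase jar path is a hypothetical example):

```
spark.driver.extraClassPath    /opt/hbase/lib/hbase-client.jar
spark.executor.extraClassPath  /opt/hbase/lib/hbase-client.jar
```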



-- 
Marcelo

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org

