spark-dev mailing list archives

From Marcelo Vanzin <van...@cloudera.com>
Subject Re: replacement for SPARK_JAVA_OPTS
Date Thu, 07 Aug 2014 21:47:14 GMT
Andrew has been working on a fix:
https://github.com/apache/spark/pull/1770
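
(Until that lands, here is a rough sketch of the interim workaround discussed further down the thread: wrap spark-shell so that the driver options from spark-defaults.conf are passed via --driver-java-options. The function names and the SPARK_CONF_FILE / SPARK_SHELL_CMD variables are made up for illustration and are not part of Spark. The parser only handles simple one-line "key value" or "key=value" entries, so treat it as a starting point rather than a real properties parser.)

```shell
#!/usr/bin/env bash
# Sketch only: these function and variable names are hypothetical, not Spark's.

# Print the last value set for a key in a properties-style file.
# Handles one-line "key value" and "key=value" entries and skips # comments;
# escapes, ":" separators and line continuations are NOT handled.
conf_value() {
  awk -v key="$1" '
    /^[ \t]*#/ { next }                  # skip comment lines
    {
      line = $0
      sub(/^[ \t]+/, "", line)           # drop leading whitespace
      if (index(line, key) != 1) next    # line must start with the key
      rest = substr(line, length(key) + 1)
      if (rest !~ /^[ \t=]/) next        # key must end at whitespace or "="
      sub(/^[ \t]*=?[ \t]*/, "", rest)   # strip the separator
      value = rest
    }
    END { if (value != "") print value }
  ' "$2"
}

# Run spark-shell, adding --driver-java-options when the conf file sets
# spark.driver.extraJavaOptions. Both environment variables are assumptions
# introduced here so the paths can be overridden.
spark_shell_with_conf_opts() {
  local conf="${SPARK_CONF_FILE:-conf/spark-defaults.conf}"
  local shell_cmd="${SPARK_SHELL_CMD:-./bin/spark-shell}"
  local opts
  opts="$(conf_value spark.driver.extraJavaOptions "$conf")"
  if [ -n "$opts" ]; then
    "$shell_cmd" --driver-java-options "$opts" "$@"
  else
    "$shell_cmd" "$@"
  fi
}
```

After sourcing this, `spark_shell_with_conf_opts -v` should behave like running `./bin/spark-shell --driver-java-options <value> -v`. It only covers the driver in client mode and does nothing for executor-side propagation on Mesos.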

On Thu, Aug 7, 2014 at 2:35 PM, Cody Koeninger <cody@koeninger.org> wrote:
> Just wanted to check in on this, see if I should file a bug report
> regarding the mesos argument propagation.
>
>
> On Thu, Jul 31, 2014 at 8:35 AM, Cody Koeninger <cody@koeninger.org> wrote:
>
>> 1. I've tried with and without escaping equals sign, it doesn't affect the
>> results.
>>
>> 2. Yeah, exporting SPARK_SUBMIT_OPTS from spark-env.sh works for getting
>> system properties set in the local shell (although not for executors).
>>
>> 3. We're using the default fine-grained mesos mode, not setting
>> spark.mesos.coarse, so it doesn't seem immediately related to that ticket.
>> Should I file a bug report?
>>
>>
>> On Thu, Jul 31, 2014 at 1:33 AM, Patrick Wendell <pwendell@gmail.com>
>> wrote:
>>
>>> The third issue may be related to this:
>>> https://issues.apache.org/jira/browse/SPARK-2022
>>>
>>> We can take a look at this during the bug fix period for the 1.1
>>> release next week. If we come up with a fix we can backport it into
>>> the 1.0 branch also.
>>>
>>> On Wed, Jul 30, 2014 at 11:31 PM, Patrick Wendell <pwendell@gmail.com>
>>> wrote:
>>> > Thanks for digging around here. I think there are a few distinct issues.
>>> >
>>> > 1. Properties containing the '=' character need to be escaped.
>>> > I was able to load properties fine as long as I escape the '='
>>> > character. But maybe we should document this:
>>> >
>>> > == spark-defaults.conf ==
>>> > spark.foo a\=B
>>> > == shell ==
>>> > scala> sc.getConf.get("spark.foo")
>>> > res2: String = a=B
>>> >
>>> > 2. spark.driver.extraJavaOptions, when set in the properties file,
>>> > doesn't affect the driver when running in client mode (always the case
>>> > for mesos). We should probably document this. In this case you need to
>>> > either use --driver-java-options or set SPARK_SUBMIT_OPTS.
>>> >
>>> > 3. Arguments aren't propagated on Mesos (this might be because of the
>>> > other issues, or a separate bug).
>>> >
>>> > - Patrick
>>> >
>>> > On Wed, Jul 30, 2014 at 3:10 PM, Cody Koeninger <cody@koeninger.org> wrote:
>>> >> In addition, spark.executor.extraJavaOptions does not seem to behave as I
>>> >> would expect; java arguments don't seem to be propagated to executors.
>>> >>
>>> >>
>>> >> $ cat conf/spark-defaults.conf
>>> >>
>>> >> spark.master mesos://zk://etl-01.mxstg:2181,etl-02.mxstg:2181,etl-03.mxstg:2181/masters
>>> >> spark.executor.extraJavaOptions -Dfoo.bar.baz=23
>>> >> spark.driver.extraJavaOptions -Dfoo.bar.baz=23
>>> >>
>>> >>
>>> >> $ ./bin/spark-shell
>>> >>
>>> >> scala> sc.getConf.get("spark.executor.extraJavaOptions")
>>> >> res0: String = -Dfoo.bar.baz=23
>>> >>
>>> >> scala> sc.parallelize(1 to 100).map{ i => (
>>> >>      |  java.net.InetAddress.getLocalHost.getHostName,
>>> >>      |  System.getProperty("foo.bar.baz")
>>> >>      | )}.collect
>>> >>
>>> >> res1: Array[(String, String)] = Array((dn-01.mxstg,null),
>>> >> (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null),
>>> >> (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null),
>>> >> (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null),
>>> >> (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-02.mxstg,null),
>>> >> (dn-02.mxstg,null), ...
>>> >>
>>> >>
>>> >>
>>> >> Note that this is a mesos deployment, although I wouldn't expect that to
>>> >> affect the availability of spark.driver.extraJavaOptions in a local spark
>>> >> shell.
>>> >>
>>> >>
>>> >> On Wed, Jul 30, 2014 at 4:18 PM, Cody Koeninger <cody@koeninger.org> wrote:
>>> >>
>>> >>> Either whitespace or an equals sign is a valid key/value separator in a
>>> >>> properties file.
>>> >>> Here's an example:
>>> >>>
>>> >>> $ cat conf/spark-defaults.conf
>>> >>> spark.driver.extraJavaOptions -Dfoo.bar.baz=23
>>> >>>
>>> >>> $ ./bin/spark-shell -v
>>> >>> Using properties file: /opt/spark/conf/spark-defaults.conf
>>> >>> Adding default property: spark.driver.extraJavaOptions=-Dfoo.bar.baz=23
>>> >>>
>>> >>>
>>> >>> scala>  System.getProperty("foo.bar.baz")
>>> >>> res0: String = null
>>> >>>
>>> >>>
>>> >>> If you add double quotes, the resulting string value will have double
>>> >>> quotes.
>>> >>>
>>> >>>
>>> >>> $ cat conf/spark-defaults.conf
>>> >>> spark.driver.extraJavaOptions "-Dfoo.bar.baz=23"
>>> >>>
>>> >>> $ ./bin/spark-shell -v
>>> >>> Using properties file: /opt/spark/conf/spark-defaults.conf
>>> >>> Adding default property: spark.driver.extraJavaOptions="-Dfoo.bar.baz=23"
>>> >>>
>>> >>> scala>  System.getProperty("foo.bar.baz")
>>> >>> res0: String = null
>>> >>>
>>> >>>
>>> >>> Neither one of those affects the issue; the underlying problem in my case
>>> >>> seems to be that bin/spark-class uses the SPARK_SUBMIT_OPTS and
>>> >>> SPARK_JAVA_OPTS environment variables, but nothing parses
>>> >>> spark-defaults.conf before the java process is started.
>>> >>>
>>> >>> Here's an example of the process running when only spark-defaults.conf is
>>> >>> being used:
>>> >>>
>>> >>> $ ps -ef | grep spark
>>> >>>
>>> >>> 514       5182  2058  0 21:05 pts/2    00:00:00 bash ./bin/spark-shell -v
>>> >>>
>>> >>> 514       5189  5182  4 21:05 pts/2    00:00:22 /usr/local/java/bin/java
>>> >>> -cp ::/opt/spark/conf:/opt/spark/lib/spark-assembly-1.0.1-hadoop2.3.0-mr1-cdh5.0.2.jar:/etc/hadoop/conf-mx
>>> >>> -XX:MaxPermSize=128m -Djava.library.path= -Xms512m -Xmx512m
>>> >>> org.apache.spark.deploy.SparkSubmit spark-shell -v --class
>>> >>> org.apache.spark.repl.Main
>>> >>>
>>> >>>
>>> >>> Here's an example of it when the command line --driver-java-options is
>>> >>> used (and thus things work):
>>> >>>
>>> >>>
>>> >>> $ ps -ef | grep spark
>>> >>> 514       5392  2058  0 21:15 pts/2    00:00:00 bash ./bin/spark-shell -v
>>> >>> --driver-java-options -Dfoo.bar.baz=23
>>> >>>
>>> >>> 514       5399  5392 80 21:15 pts/2    00:00:06 /usr/local/java/bin/java
>>> >>> -cp ::/opt/spark/conf:/opt/spark/lib/spark-assembly-1.0.1-hadoop2.3.0-mr1-cdh5.0.2.jar:/etc/hadoop/conf-mx
>>> >>> -XX:MaxPermSize=128m -Dfoo.bar.baz=23 -Djava.library.path= -Xms512m
>>> >>> -Xmx512m org.apache.spark.deploy.SparkSubmit spark-shell -v
>>> >>> --driver-java-options -Dfoo.bar.baz=23 --class org.apache.spark.repl.Main
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>> On Wed, Jul 30, 2014 at 3:43 PM, Patrick Wendell <pwendell@gmail.com>
>>> >>> wrote:
>>> >>>
>>> >>>> Cody - in your example you are using the '=' character, but in our
>>> >>>> documentation and tests we use a whitespace to separate the key and
>>> >>>> value in the defaults file.
>>> >>>>
>>> >>>> docs: http://spark.apache.org/docs/latest/configuration.html
>>> >>>>
>>> >>>> spark.driver.extraJavaOptions -Dfoo.bar.baz=23
>>> >>>>
>>> >>>> I'm not sure if the java properties file parser will try to interpret
>>> >>>> the equals sign. If so you might need to do this.
>>> >>>>
>>> >>>> spark.driver.extraJavaOptions "-Dfoo.bar.baz=23"
>>> >>>>
>>> >>>> Do those work for you?
>>> >>>>
>>> >>>> On Wed, Jul 30, 2014 at 1:32 PM, Marcelo Vanzin <vanzin@cloudera.com> wrote:
>>> >>>> > Hi Cody,
>>> >>>> >
>>> >>>> > Could you file a bug for this if there isn't one already?
>>> >>>> >
>>> >>>> > For system properties SparkSubmit should be able to read those
>>> >>>> > settings and do the right thing, but that obviously won't work for
>>> >>>> > other JVM options... the current code should work fine in cluster mode
>>> >>>> > though, since the driver is a different process. :-)
>>> >>>> >
>>> >>>> >
>>> >>>> > On Wed, Jul 30, 2014 at 1:12 PM, Cody Koeninger <cody@koeninger.org> wrote:
>>> >>>> >> We were previously using SPARK_JAVA_OPTS to set java system properties via
>>> >>>> >> -D.
>>> >>>> >>
>>> >>>> >> This was used for properties that varied on a per-deployment-environment
>>> >>>> >> basis, but needed to be available in the spark shell and workers.
>>> >>>> >>
>>> >>>> >> On upgrading to 1.0, we saw that SPARK_JAVA_OPTS had been deprecated, and
>>> >>>> >> replaced by spark-defaults.conf and command line arguments to spark-submit
>>> >>>> >> or spark-shell.
>>> >>>> >>
>>> >>>> >> However, setting spark.driver.extraJavaOptions and
>>> >>>> >> spark.executor.extraJavaOptions in spark-defaults.conf is not a replacement
>>> >>>> >> for SPARK_JAVA_OPTS:
>>> >>>> >>
>>> >>>> >>
>>> >>>> >> $ cat conf/spark-defaults.conf
>>> >>>> >> spark.driver.extraJavaOptions=-Dfoo.bar.baz=23
>>> >>>> >>
>>> >>>> >> $ ./bin/spark-shell
>>> >>>> >>
>>> >>>> >> scala> System.getProperty("foo.bar.baz")
>>> >>>> >> res0: String = null
>>> >>>> >>
>>> >>>> >>
>>> >>>> >> $ ./bin/spark-shell --driver-java-options "-Dfoo.bar.baz=23"
>>> >>>> >>
>>> >>>> >> scala> System.getProperty("foo.bar.baz")
>>> >>>> >> res0: String = 23
>>> >>>> >>
>>> >>>> >>
>>> >>>> >> Looking through the shell scripts for spark-submit and spark-class, I can
>>> >>>> >> see why this is; parsing spark-defaults.conf from bash could be brittle.
>>> >>>> >>
>>> >>>> >> But from an ergonomic point of view, it's a step back to go from a
>>> >>>> >> set-it-and-forget-it configuration in spark-env.sh, to requiring command
>>> >>>> >> line arguments.
>>> >>>> >>
>>> >>>> >> I can solve this with an ad-hoc script to wrap spark-shell with the
>>> >>>> >> appropriate arguments, but I wanted to bring the issue up to see if anyone
>>> >>>> >> else had run into it, or had any direction for a general solution (beyond
>>> >>>> >> parsing java properties files from bash).
>>> >>>> >
>>> >>>> >
>>> >>>> >
>>> >>>> > --
>>> >>>> > Marcelo
>>> >>>>
>>> >>>
>>> >>>
>>>
>>
>>



-- 
Marcelo


