spark-dev mailing list archives

From Gary Malouf <malouf.g...@gmail.com>
Subject Re: replacement for SPARK_JAVA_OPTS
Date Fri, 08 Aug 2014 00:30:58 GMT
Can this be cherry-picked for 1.1 if everything works out?  In my opinion,
it could qualify as a bug fix.


On Thu, Aug 7, 2014 at 5:47 PM, Marcelo Vanzin <vanzin@cloudera.com> wrote:

> Andrew has been working on a fix:
> https://github.com/apache/spark/pull/1770
>
> On Thu, Aug 7, 2014 at 2:35 PM, Cody Koeninger <cody@koeninger.org> wrote:
> > Just wanted to check in on this, see if I should file a bug report
> > regarding the mesos argument propagation.
> >
> >
> > On Thu, Jul 31, 2014 at 8:35 AM, Cody Koeninger <cody@koeninger.org> wrote:
> >
> >> 1. I've tried with and without escaping the equals sign; it doesn't
> >> affect the results.
> >>
> >> 2. Yeah, exporting SPARK_SUBMIT_OPTS from spark-env.sh works for getting
> >> system properties set in the local shell (although not for executors).
> >>
> >> 3. We're using the default fine-grained mesos mode, not setting
> >> spark.mesos.coarse, so it doesn't seem immediately related to that
> >> ticket. Should I file a bug report?
> >>
> >>
> >> On Thu, Jul 31, 2014 at 1:33 AM, Patrick Wendell <pwendell@gmail.com>
> >> wrote:
> >>
> >>> The third issue may be related to this:
> >>> https://issues.apache.org/jira/browse/SPARK-2022
> >>>
> >>> We can take a look at this during the bug fix period for the 1.1
> >>> release next week. If we come up with a fix we can backport it into
> >>> the 1.0 branch also.
> >>>
> >>> On Wed, Jul 30, 2014 at 11:31 PM, Patrick Wendell <pwendell@gmail.com>
> >>> wrote:
> >>> > Thanks for digging around here. I think there are a few distinct
> >>> > issues.
> >>> >
> >>> > 1. Properties containing the '=' character need to be escaped.
> >>> > I was able to load properties fine as long as I escape the '='
> >>> > character. But maybe we should document this:
> >>> >
> >>> > == spark-defaults.conf ==
> >>> > spark.foo a\=B
> >>> > == shell ==
> >>> > scala> sc.getConf.get("spark.foo")
> >>> > res2: String = a=B
> >>> >
> >>> > 2. spark.driver.extraJavaOptions, when set in the properties file,
> >>> > doesn't affect the driver when running in client mode (always the
> >>> > case for mesos). We should probably document this. In this case you
> >>> > need to either use --driver-java-options or set SPARK_SUBMIT_OPTS.
> >>> >
> >>> > 3. Arguments aren't propagated on Mesos (this might be because of the
> >>> > other issues, or a separate bug).
> >>> >
> >>> > - Patrick
> >>> >
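A minimal sketch of the two client-mode workarounds mentioned in point 2, using the thread's illustrative -Dfoo.bar.baz=23 value. The spark-shell invocation is shown only as a comment; the runnable part demonstrates the SPARK_SUBMIT_OPTS mechanism, i.e. that an exported value is visible to a child process (as it would be to the JVM spark-class launches):

```shell
# Workaround A: pass driver JVM options on the command line
# (value is the thread's example, not a real setting):
#   ./bin/spark-shell --driver-java-options "-Dfoo.bar.baz=23"

# Workaround B: export SPARK_SUBMIT_OPTS, e.g. from spark-env.sh.
# spark-class puts $SPARK_SUBMIT_OPTS on the java command line, so an
# exported value reaches the driver JVM. Demonstrate the export is
# visible to a child process:
export SPARK_SUBMIT_OPTS="-Dfoo.bar.baz=23"
result=$(sh -c 'echo "$SPARK_SUBMIT_OPTS"')
echo "$result"
```

Note that, as discussed above, neither workaround affects executors; executor-side options still depend on spark.executor.extraJavaOptions being propagated correctly.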
> >>> > On Wed, Jul 30, 2014 at 3:10 PM, Cody Koeninger <cody@koeninger.org> wrote:
> >>> >> In addition, spark.executor.extraJavaOptions does not seem to
> >>> >> behave as I would expect; java arguments don't seem to be
> >>> >> propagated to executors.
> >>> >>
> >>> >>
> >>> >> $ cat conf/spark-defaults.conf
> >>> >>
> >>> >> spark.master mesos://zk://etl-01.mxstg:2181,etl-02.mxstg:2181,etl-03.mxstg:2181/masters
> >>> >> spark.executor.extraJavaOptions -Dfoo.bar.baz=23
> >>> >> spark.driver.extraJavaOptions -Dfoo.bar.baz=23
> >>> >>
> >>> >>
> >>> >> $ ./bin/spark-shell
> >>> >>
> >>> >> scala> sc.getConf.get("spark.executor.extraJavaOptions")
> >>> >> res0: String = -Dfoo.bar.baz=23
> >>> >>
> >>> >> scala> sc.parallelize(1 to 100).map{ i => (
> >>> >>      |  java.net.InetAddress.getLocalHost.getHostName,
> >>> >>      |  System.getProperty("foo.bar.baz")
> >>> >>      | )}.collect
> >>> >>
> >>> >> res1: Array[(String, String)] = Array((dn-01.mxstg,null),
> >>> >> (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null),
> >>> >> (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null),
> >>> >> (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null),
> >>> >> (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-02.mxstg,null),
> >>> >> (dn-02.mxstg,null), ...
> >>> >>
> >>> >>
> >>> >>
> >>> >> Note that this is a mesos deployment, although I wouldn't expect
> >>> >> that to affect the availability of spark.driver.extraJavaOptions
> >>> >> in a local spark shell.
> >>> >>
> >>> >>
> >>> >> On Wed, Jul 30, 2014 at 4:18 PM, Cody Koeninger <cody@koeninger.org> wrote:
> >>> >>
> >>> >>> Either whitespace or an equals sign is a valid properties file
> >>> >>> format. Here's an example:
> >>> >>>
> >>> >>> $ cat conf/spark-defaults.conf
> >>> >>> spark.driver.extraJavaOptions -Dfoo.bar.baz=23
> >>> >>>
> >>> >>> $ ./bin/spark-shell -v
> >>> >>> Using properties file: /opt/spark/conf/spark-defaults.conf
> >>> >>> Adding default property: spark.driver.extraJavaOptions=-Dfoo.bar.baz=23
> >>> >>>
> >>> >>>
> >>> >>> scala>  System.getProperty("foo.bar.baz")
> >>> >>> res0: String = null
> >>> >>>
> >>> >>>
> >>> >>> If you add double quotes, the resulting string value will have
> >>> >>> double quotes.
> >>> >>>
> >>> >>>
> >>> >>> $ cat conf/spark-defaults.conf
> >>> >>> spark.driver.extraJavaOptions "-Dfoo.bar.baz=23"
> >>> >>>
> >>> >>> $ ./bin/spark-shell -v
> >>> >>> Using properties file: /opt/spark/conf/spark-defaults.conf
> >>> >>> Adding default property: spark.driver.extraJavaOptions="-Dfoo.bar.baz=23"
> >>> >>>
> >>> >>> scala>  System.getProperty("foo.bar.baz")
> >>> >>> res0: String = null
> >>> >>>
> >>> >>>
> >>> >>> Neither one of those affects the issue; the underlying problem in
> >>> >>> my case seems to be that bin/spark-class uses the SPARK_SUBMIT_OPTS
> >>> >>> and SPARK_JAVA_OPTS environment variables, but nothing parses
> >>> >>> spark-defaults.conf before the java process is started.
> >>> >>>
> >>> >>> Here's an example of the process running when only
> >>> >>> spark-defaults.conf is being used:
> >>> >>>
> >>> >>> $ ps -ef | grep spark
> >>> >>>
> >>> >>> 514       5182  2058  0 21:05 pts/2    00:00:00 bash ./bin/spark-shell -v
> >>> >>>
> >>> >>> 514       5189  5182  4 21:05 pts/2    00:00:22 /usr/local/java/bin/java
> >>> >>> -cp ::/opt/spark/conf:/opt/spark/lib/spark-assembly-1.0.1-hadoop2.3.0-mr1-cdh5.0.2.jar:/etc/hadoop/conf-mx
> >>> >>> -XX:MaxPermSize=128m -Djava.library.path= -Xms512m -Xmx512m
> >>> >>> org.apache.spark.deploy.SparkSubmit spark-shell -v --class
> >>> >>> org.apache.spark.repl.Main
> >>> >>>
> >>> >>>
> >>> >>> Here's an example of it when the command-line --driver-java-options
> >>> >>> is used (and thus things work):
> >>> >>>
> >>> >>>
> >>> >>> $ ps -ef | grep spark
> >>> >>> 514       5392  2058  0 21:15 pts/2    00:00:00 bash ./bin/spark-shell -v
> >>> >>> --driver-java-options -Dfoo.bar.baz=23
> >>> >>>
> >>> >>> 514       5399  5392 80 21:15 pts/2    00:00:06 /usr/local/java/bin/java
> >>> >>> -cp ::/opt/spark/conf:/opt/spark/lib/spark-assembly-1.0.1-hadoop2.3.0-mr1-cdh5.0.2.jar:/etc/hadoop/conf-mx
> >>> >>> -XX:MaxPermSize=128m -Dfoo.bar.baz=23 -Djava.library.path= -Xms512m
> >>> >>> -Xmx512m org.apache.spark.deploy.SparkSubmit spark-shell -v
> >>> >>> --driver-java-options -Dfoo.bar.baz=23 --class org.apache.spark.repl.Main
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> On Wed, Jul 30, 2014 at 3:43 PM, Patrick Wendell <pwendell@gmail.com> wrote:
> >>> >>>
> >>> >>>> Cody - in your example you are using the '=' character, but in
> >>> >>>> our documentation and tests we use whitespace to separate the
> >>> >>>> key and value in the defaults file.
> >>> >>>>
> >>> >>>> docs: http://spark.apache.org/docs/latest/configuration.html
> >>> >>>>
> >>> >>>> spark.driver.extraJavaOptions -Dfoo.bar.baz=23
> >>> >>>>
> >>> >>>> I'm not sure if the java properties file parser will try to
> >>> >>>> interpret the equals sign. If so, you might need to do this:
> >>> >>>>
> >>> >>>> spark.driver.extraJavaOptions "-Dfoo.bar.baz=23"
> >>> >>>>
> >>> >>>> Do those work for you?
> >>> >>>>
> >>> >>>> On Wed, Jul 30, 2014 at 1:32 PM, Marcelo Vanzin <vanzin@cloudera.com> wrote:
> >>> >>>> > Hi Cody,
> >>> >>>> >
> >>> >>>> > Could you file a bug for this if there isn't one already?
> >>> >>>> >
> >>> >>>> > For system properties SparkSubmit should be able to read those
> >>> >>>> > settings and do the right thing, but that obviously won't work
> >>> >>>> > for other JVM options... the current code should work fine in
> >>> >>>> > cluster mode though, since the driver is a different process. :-)
> >>> >>>> >
> >>> >>>> >
> >>> >>>> > On Wed, Jul 30, 2014 at 1:12 PM, Cody Koeninger <cody@koeninger.org> wrote:
> >>> >>>> >> We were previously using SPARK_JAVA_OPTS to set java system
> >>> >>>> >> properties via -D.
> >>> >>>> >>
> >>> >>>> >> This was used for properties that varied on a
> >>> >>>> >> per-deployment-environment basis, but needed to be available
> >>> >>>> >> in the spark shell and workers.
> >>> >>>> >>
> >>> >>>> >> On upgrading to 1.0, we saw that SPARK_JAVA_OPTS had been
> >>> >>>> >> deprecated and replaced by spark-defaults.conf and command-line
> >>> >>>> >> arguments to spark-submit or spark-shell.
> >>> >>>> >>
> >>> >>>> >> However, setting spark.driver.extraJavaOptions and
> >>> >>>> >> spark.executor.extraJavaOptions in spark-defaults.conf is not
> >>> >>>> >> a replacement for SPARK_JAVA_OPTS:
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >> $ cat conf/spark-defaults.conf
> >>> >>>> >> spark.driver.extraJavaOptions=-Dfoo.bar.baz=23
> >>> >>>> >>
> >>> >>>> >> $ ./bin/spark-shell
> >>> >>>> >>
> >>> >>>> >> scala> System.getProperty("foo.bar.baz")
> >>> >>>> >> res0: String = null
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >> $ ./bin/spark-shell --driver-java-options "-Dfoo.bar.baz=23"
> >>> >>>> >>
> >>> >>>> >> scala> System.getProperty("foo.bar.baz")
> >>> >>>> >> res0: String = 23
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >> Looking through the shell scripts for spark-submit and
> >>> >>>> >> spark-class, I can see why this is; parsing spark-defaults.conf
> >>> >>>> >> from bash could be brittle.
> >>> >>>> >>
> >>> >>>> >> But from an ergonomic point of view, it's a step back to go
> >>> >>>> >> from a set-it-and-forget-it configuration in spark-env.sh to
> >>> >>>> >> requiring command-line arguments.
> >>> >>>> >>
> >>> >>>> >> I can solve this with an ad-hoc script to wrap spark-shell
> >>> >>>> >> with the appropriate arguments, but I wanted to bring the
> >>> >>>> >> issue up to see if anyone else had run into it, or had any
> >>> >>>> >> direction for a general solution (beyond parsing java
> >>> >>>> >> properties files from bash).
> >>> >>>> >
> >>> >>>> >
> >>> >>>> >
> >>> >>>> > --
> >>> >>>> > Marcelo
> >>> >>>>
> >>> >>>
> >>> >>>
> >>>
> >>
> >>
>
>
>
> --
> Marcelo
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
> For additional commands, e-mail: dev-help@spark.apache.org
>
>
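For reference, the ad-hoc wrapper Cody describes could look roughly like the sketch below. This is an assumption-laden illustration, not anything from the thread: the file path, the sed extraction, and the demo conf contents are made up for the example, only whitespace-separated keys are handled, and quoting/escaping in values is ignored.

```shell
#!/bin/sh
# Sketch: read spark.driver.extraJavaOptions from a spark-defaults.conf
# and forward it via --driver-java-options, since nothing parses the
# defaults file before the driver JVM starts.

CONF="${SPARK_CONF:-/tmp/spark-defaults.conf}"

# Demo input (illustrative value from the thread):
cat > "$CONF" <<'EOF'
spark.driver.extraJavaOptions -Dfoo.bar.baz=23
EOF

# Strip the key and its separating whitespace, keep the value.
opts=$(sed -n 's/^spark\.driver\.extraJavaOptions[[:space:]]\{1,\}//p' "$CONF")

# Echo instead of exec'ing spark-shell, to keep the sketch self-contained:
echo "would run: ./bin/spark-shell --driver-java-options \"$opts\""
```

A real wrapper would `exec ./bin/spark-shell --driver-java-options "$opts" "$@"` instead of echoing, and would need to handle '='-separated and quoted values, which is exactly the brittleness Cody notes about parsing properties files from bash.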
