spark-dev mailing list archives

From Andrew Or <and...@databricks.com>
Subject Re: replacement for SPARK_JAVA_OPTS
Date Fri, 08 Aug 2014 05:19:00 GMT
Ah, great to know this is already being fixed. Thanks Patrick, I have
marked my JIRA as a duplicate.


2014-08-07 21:42 GMT-07:00 Patrick Wendell <pwendell@gmail.com>:

> Andrew - I think your JIRA may duplicate existing work:
> https://github.com/apache/spark/pull/1513
>
>
> On Thu, Aug 7, 2014 at 7:55 PM, Andrew Or <andrew@databricks.com> wrote:
> > @Cody I took a quick glance at the Mesos code and it appears that we
> > currently do not even pass extra java options to executors except in
> > coarse-grained mode, and even in this mode we do not pass them to executors
> > correctly. I have filed a related JIRA here:
> > https://issues.apache.org/jira/browse/SPARK-2921. This is a somewhat
> > serious limitation and we will try to fix this for 1.1.
> >
> > -Andrew
> >
> >
> > 2014-08-07 19:42 GMT-07:00 Andrew Or <andrew@databricks.com>:
> >
> >> Thanks Marcelo, I have moved the changes to a new PR to describe the
> >> problems more clearly: https://github.com/apache/spark/pull/1845
> >>
> >> @Gary Yeah, the goal is to get this into 1.1 as a bug fix.
> >>
> >>
> >> 2014-08-07 17:30 GMT-07:00 Gary Malouf <malouf.gary@gmail.com>:
> >>
> >>> Can this be cherry-picked for 1.1 if everything works out? In my opinion,
> >>> it could be qualified as a bug fix.
> >>>
> >>>
> >>> On Thu, Aug 7, 2014 at 5:47 PM, Marcelo Vanzin <vanzin@cloudera.com>
> >>> wrote:
> >>>
> >>> > Andrew has been working on a fix:
> >>> > https://github.com/apache/spark/pull/1770
> >>> >
> >>> > On Thu, Aug 7, 2014 at 2:35 PM, Cody Koeninger <cody@koeninger.org> wrote:
> >>> > > Just wanted to check in on this, see if I should file a bug report
> >>> > > regarding the mesos argument propagation.
> >>> > >
> >>> > >
> >>> > > On Thu, Jul 31, 2014 at 8:35 AM, Cody Koeninger <cody@koeninger.org> wrote:
> >>> > >
> >>> > >> 1. I've tried with and without escaping the equals sign; it doesn't
> >>> > >> affect the results.
> >>> > >>
> >>> > >> 2. Yeah, exporting SPARK_SUBMIT_OPTS from spark-env.sh works for getting
> >>> > >> system properties set in the local shell (although not for executors).
> >>> > >>
> >>> > >> 3. We're using the default fine-grained mesos mode, not setting
> >>> > >> spark.mesos.coarse, so it doesn't seem immediately related to that ticket.
> >>> > >> Should I file a bug report?
> >>> > >>
> >>> > >>
> >>> > >> On Thu, Jul 31, 2014 at 1:33 AM, Patrick Wendell <pwendell@gmail.com> wrote:
> >>> > >>
> >>> > >>> The third issue may be related to this:
> >>> > >>> https://issues.apache.org/jira/browse/SPARK-2022
> >>> > >>>
> >>> > >>> We can take a look at this during the bug fix period for the 1.1
> >>> > >>> release next week. If we come up with a fix we can backport it into
> >>> > >>> the 1.0 branch also.
> >>> > >>>
> >>> > >>> On Wed, Jul 30, 2014 at 11:31 PM, Patrick Wendell <pwendell@gmail.com> wrote:
> >>> > >>> > Thanks for digging around here. I think there are a few distinct issues.
> >>> > >>> >
> >>> > >>> > 1. Properties containing the '=' character need to be escaped.
> >>> > >>> > I was able to load properties fine as long as I escape the '='
> >>> > >>> > character. But maybe we should document this:
> >>> > >>> >
> >>> > >>> > == spark-defaults.conf ==
> >>> > >>> > spark.foo a\=B
> >>> > >>> > == shell ==
> >>> > >>> > scala> sc.getConf.get("spark.foo")
> >>> > >>> > res2: String = a=B
> >>> > >>> >
> >>> > >>> > 2. spark.driver.extraJavaOptions, when set in the properties file,
> >>> > >>> > doesn't affect the driver when running in client mode (always the case
> >>> > >>> > for mesos). We should probably document this. In this case you need to
> >>> > >>> > either use --driver-java-options or set SPARK_SUBMIT_OPTS.
> >>> > >>> >
> >>> > >>> > 3. Arguments aren't propagated on Mesos (this might be because of the
> >>> > >>> > other issues, or a separate bug).
> >>> > >>> >
> >>> > >>> > - Patrick
> >>> > >>> >
> >>> > >>> > On Wed, Jul 30, 2014 at 3:10 PM, Cody Koeninger <cody@koeninger.org> wrote:
> >>> > >>> >> In addition, spark.executor.extraJavaOptions does not seem to behave as I
> >>> > >>> >> would expect; java arguments don't seem to be propagated to executors.
> >>> > >>> >>
> >>> > >>> >>
> >>> > >>> >> $ cat conf/spark-defaults.conf
> >>> > >>> >>
> >>> > >>> >> spark.master mesos://zk://etl-01.mxstg:2181,etl-02.mxstg:2181,etl-03.mxstg:2181/masters
> >>> > >>> >> spark.executor.extraJavaOptions -Dfoo.bar.baz=23
> >>> > >>> >> spark.driver.extraJavaOptions -Dfoo.bar.baz=23
> >>> > >>> >>
> >>> > >>> >>
> >>> > >>> >> $ ./bin/spark-shell
> >>> > >>> >>
> >>> > >>> >> scala> sc.getConf.get("spark.executor.extraJavaOptions")
> >>> > >>> >> res0: String = -Dfoo.bar.baz=23
> >>> > >>> >>
> >>> > >>> >> scala> sc.parallelize(1 to 100).map{ i => (
> >>> > >>> >>      |  java.net.InetAddress.getLocalHost.getHostName,
> >>> > >>> >>      |  System.getProperty("foo.bar.baz")
> >>> > >>> >>      | )}.collect
> >>> > >>> >>
> >>> > >>> >> res1: Array[(String, String)] = Array((dn-01.mxstg,null),
> >>> > >>> >> (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null),
> >>> > >>> >> (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null),
> >>> > >>> >> (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null),
> >>> > >>> >> (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-02.mxstg,null),
> >>> > >>> >> (dn-02.mxstg,null), ...
> >>> > >>> >>
> >>> > >>> >>
> >>> > >>> >>
> >>> > >>> >> Note that this is a mesos deployment, although I wouldn't expect that to
> >>> > >>> >> affect the availability of spark.driver.extraJavaOptions in a local spark
> >>> > >>> >> shell.
> >>> > >>> >>
> >>> > >>> >>
> >>> > >>> >> On Wed, Jul 30, 2014 at 4:18 PM, Cody Koeninger <cody@koeninger.org> wrote:
> >>> > >>> >>
> >>> > >>> >>> Either whitespace or equals sign are valid properties file formats.
> >>> > >>> >>> Here's an example:
> >>> > >>> >>>
> >>> > >>> >>> $ cat conf/spark-defaults.conf
> >>> > >>> >>> spark.driver.extraJavaOptions -Dfoo.bar.baz=23
> >>> > >>> >>>
> >>> > >>> >>> $ ./bin/spark-shell -v
> >>> > >>> >>> Using properties file: /opt/spark/conf/spark-defaults.conf
> >>> > >>> >>> Adding default property: spark.driver.extraJavaOptions=-Dfoo.bar.baz=23
> >>> > >>> >>>
> >>> > >>> >>>
> >>> > >>> >>> scala>  System.getProperty("foo.bar.baz")
> >>> > >>> >>> res0: String = null
> >>> > >>> >>>
> >>> > >>> >>>
> >>> > >>> >>> If you add double quotes, the resulting string value will have double
> >>> > >>> >>> quotes.
> >>> > >>> >>>
> >>> > >>> >>>
> >>> > >>> >>> $ cat conf/spark-defaults.conf
> >>> > >>> >>> spark.driver.extraJavaOptions "-Dfoo.bar.baz=23"
> >>> > >>> >>>
> >>> > >>> >>> $ ./bin/spark-shell -v
> >>> > >>> >>> Using properties file: /opt/spark/conf/spark-defaults.conf
> >>> > >>> >>> Adding default property: spark.driver.extraJavaOptions="-Dfoo.bar.baz=23"
> >>> > >>> >>>
> >>> > >>> >>> scala>  System.getProperty("foo.bar.baz")
> >>> > >>> >>> res0: String = null
> >>> > >>> >>>
> >>> > >>> >>>
> >>> > >>> >>> Neither one of those affects the issue; the underlying problem in my case
> >>> > >>> >>> seems to be that bin/spark-class uses the SPARK_SUBMIT_OPTS and
> >>> > >>> >>> SPARK_JAVA_OPTS environment variables, but nothing parses
> >>> > >>> >>> spark-defaults.conf before the java process is started.
> >>> > >>> >>>
> >>> > >>> >>> Here's an example of the process running when only spark-defaults.conf is
> >>> > >>> >>> being used:
> >>> > >>> >>>
> >>> > >>> >>> $ ps -ef | grep spark
> >>> > >>> >>>
> >>> > >>> >>> 514       5182  2058  0 21:05 pts/2    00:00:00 bash ./bin/spark-shell -v
> >>> > >>> >>>
> >>> > >>> >>> 514       5189  5182  4 21:05 pts/2    00:00:22 /usr/local/java/bin/java
> >>> > >>> >>> -cp ::/opt/spark/conf:/opt/spark/lib/spark-assembly-1.0.1-hadoop2.3.0-mr1-cdh5.0.2.jar:/etc/hadoop/conf-mx
> >>> > >>> >>> -XX:MaxPermSize=128m -Djava.library.path= -Xms512m -Xmx512m
> >>> > >>> >>> org.apache.spark.deploy.SparkSubmit spark-shell -v --class
> >>> > >>> >>> org.apache.spark.repl.Main
> >>> > >>> >>>
> >>> > >>> >>>
> >>> > >>> >>> Here's an example of it when the command line --driver-java-options is
> >>> > >>> >>> used (and thus things work):
> >>> > >>> >>>
> >>> > >>> >>>
> >>> > >>> >>> $ ps -ef | grep spark
> >>> > >>> >>> 514       5392  2058  0 21:15 pts/2    00:00:00 bash ./bin/spark-shell -v
> >>> > >>> >>> --driver-java-options -Dfoo.bar.baz=23
> >>> > >>> >>>
> >>> > >>> >>> 514       5399  5392 80 21:15 pts/2    00:00:06 /usr/local/java/bin/java
> >>> > >>> >>> -cp ::/opt/spark/conf:/opt/spark/lib/spark-assembly-1.0.1-hadoop2.3.0-mr1-cdh5.0.2.jar:/etc/hadoop/conf-mx
> >>> > >>> >>> -XX:MaxPermSize=128m -Dfoo.bar.baz=23 -Djava.library.path= -Xms512m
> >>> > >>> >>> -Xmx512m org.apache.spark.deploy.SparkSubmit spark-shell -v
> >>> > >>> >>> --driver-java-options -Dfoo.bar.baz=23 --class org.apache.spark.repl.Main
> >>> > >>> >>>
> >>> > >>> >>>
> >>> > >>> >>>
> >>> > >>> >>>
> >>> > >>> >>> On Wed, Jul 30, 2014 at 3:43 PM, Patrick Wendell <pwendell@gmail.com> wrote:
> >>> > >>> >>>
> >>> > >>> >>>> Cody - in your example you are using the '=' character, but in our
> >>> > >>> >>>> documentation and tests we use a whitespace to separate the key and
> >>> > >>> >>>> value in the defaults file.
> >>> > >>> >>>>
> >>> > >>> >>>> docs: http://spark.apache.org/docs/latest/configuration.html
> >>> > >>> >>>>
> >>> > >>> >>>> spark.driver.extraJavaOptions -Dfoo.bar.baz=23
> >>> > >>> >>>>
> >>> > >>> >>>> I'm not sure if the java properties file parser will try to interpret
> >>> > >>> >>>> the equals sign. If so you might need to do this.
> >>> > >>> >>>>
> >>> > >>> >>>> spark.driver.extraJavaOptions "-Dfoo.bar.baz=23"
> >>> > >>> >>>>
> >>> > >>> >>>> Do those work for you?
> >>> > >>> >>>>
> >>> > >>> >>>> On Wed, Jul 30, 2014 at 1:32 PM, Marcelo Vanzin <vanzin@cloudera.com> wrote:
> >>> > >>> >>>> > Hi Cody,
> >>> > >>> >>>> >
> >>> > >>> >>>> > Could you file a bug for this if there isn't one already?
> >>> > >>> >>>> >
> >>> > >>> >>>> > For system properties SparkSubmit should be able to read those
> >>> > >>> >>>> > settings and do the right thing, but that obviously won't work for
> >>> > >>> >>>> > other JVM options... the current code should work fine in cluster mode
> >>> > >>> >>>> > though, since the driver is a different process. :-)
> >>> > >>> >>>> >
> >>> > >>> >>>> >
> >>> > >>> >>>> > On Wed, Jul 30, 2014 at 1:12 PM, Cody Koeninger <cody@koeninger.org> wrote:
> >>> > >>> >>>> >> We were previously using SPARK_JAVA_OPTS to set java system properties via
> >>> > >>> >>>> >> -D.
> >>> > >>> >>>> >>
> >>> > >>> >>>> >> This was used for properties that varied on a per-deployment-environment
> >>> > >>> >>>> >> basis, but needed to be available in the spark shell and workers.
> >>> > >>> >>>> >>
> >>> > >>> >>>> >> On upgrading to 1.0, we saw that SPARK_JAVA_OPTS had been deprecated, and
> >>> > >>> >>>> >> replaced by spark-defaults.conf and command line arguments to spark-submit
> >>> > >>> >>>> >> or spark-shell.
> >>> > >>> >>>> >>
> >>> > >>> >>>> >> However, setting spark.driver.extraJavaOptions and
> >>> > >>> >>>> >> spark.executor.extraJavaOptions in spark-defaults.conf is not a replacement
> >>> > >>> >>>> >> for SPARK_JAVA_OPTS:
> >>> > >>> >>>> >>
> >>> > >>> >>>> >>
> >>> > >>> >>>> >> $ cat conf/spark-defaults.conf
> >>> > >>> >>>> >> spark.driver.extraJavaOptions=-Dfoo.bar.baz=23
> >>> > >>> >>>> >>
> >>> > >>> >>>> >> $ ./bin/spark-shell
> >>> > >>> >>>> >>
> >>> > >>> >>>> >> scala> System.getProperty("foo.bar.baz")
> >>> > >>> >>>> >> res0: String = null
> >>> > >>> >>>> >>
> >>> > >>> >>>> >>
> >>> > >>> >>>> >> $ ./bin/spark-shell --driver-java-options "-Dfoo.bar.baz=23"
> >>> > >>> >>>> >>
> >>> > >>> >>>> >> scala> System.getProperty("foo.bar.baz")
> >>> > >>> >>>> >> res0: String = 23
> >>> > >>> >>>> >>
> >>> > >>> >>>> >>
> >>> > >>> >>>> >> Looking through the shell scripts for spark-submit and spark-class, I can
> >>> > >>> >>>> >> see why this is; parsing spark-defaults.conf from bash could be brittle.
> >>> > >>> >>>> >>
> >>> > >>> >>>> >> But from an ergonomic point of view, it's a step back to go from a
> >>> > >>> >>>> >> set-it-and-forget-it configuration in spark-env.sh, to requiring command
> >>> > >>> >>>> >> line arguments.
> >>> > >>> >>>> >>
> >>> > >>> >>>> >> I can solve this with an ad-hoc script to wrap spark-shell with the
> >>> > >>> >>>> >> appropriate arguments, but I wanted to bring the issue up to see if anyone
> >>> > >>> >>>> >> else had run into it, or had any direction for a general solution (beyond
> >>> > >>> >>>> >> parsing java properties files from bash).
> >>> > >>> >>>> >
> >>> > >>> >>>> >
> >>> > >>> >>>> >
> >>> > >>> >>>> > --
> >>> > >>> >>>> > Marcelo
> >>> > >>> >>>>
> >>> > >>> >>>
> >>> > >>> >>>
> >>> > >>>
> >>> > >>
> >>> > >>
> >>> >
> >>> >
> >>> >
> >>> > --
> >>> > Marcelo
> >>> >
> >>> > ---------------------------------------------------------------------
> >>> > To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
> >>> > For additional commands, e-mail: dev-help@spark.apache.org
> >>> >
> >>> >
> >>>
> >>
> >>
>
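
For reference, the ad-hoc wrapper mentioned in the original message (reading
spark.driver.extraJavaOptions out of spark-defaults.conf and handing it to
spark-shell via --driver-java-options) might look roughly like the sketch
below. It is only an illustration, not code from any of the pull requests in
this thread. The function names read_spark_default and
spark_shell_with_defaults are hypothetical, the /opt/spark fallback is taken
from the paths shown in the thread, and the parser handles only simple
"key value" and "key=value" lines. It does not implement the full Java
properties syntax (no backslash escapes, no line continuations, no ':'
separator), which is exactly the brittleness the thread warns about.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the value of KEY from a spark-defaults.conf-style
# FILE. Key and value may be separated by whitespace or by a single '='.
# If the key appears more than once, the last occurrence wins.
read_spark_default() {
  local key="$1" file="$2"
  awk -v k="$key" '
    /^[ \t]*(#|$)/ { next }                 # skip comments and blank lines
    {
      line = $0
      sub(/^[ \t]+/, "", line)              # drop leading whitespace
      if (index(line, k) != 1) next         # line must start with the key
      rest = substr(line, length(k) + 1)
      if (rest !~ /^[ \t=]/) next           # key must end here, not be a prefix
      sub(/^[ \t]*=?[ \t]*/, "", rest)      # strip the separator
      sub(/[ \t]+$/, "", rest)              # strip trailing whitespace
      val = rest
      found = 1
    }
    END { if (found) print val }
  ' "$file"
}

# Hypothetical wrapper: start spark-shell, passing the configured driver
# options on the command line so they reach the driver JVM in client mode.
spark_shell_with_defaults() {
  local spark_home="${SPARK_HOME:-/opt/spark}"   # assumed install location
  local opts
  opts="$(read_spark_default spark.driver.extraJavaOptions \
          "$spark_home/conf/spark-defaults.conf")"
  if [ -n "$opts" ]; then
    "$spark_home/bin/spark-shell" --driver-java-options "$opts" "$@"
  else
    "$spark_home/bin/spark-shell" "$@"
  fi
}
```

After sourcing the file, calling spark_shell_with_defaults -v would be
equivalent to typing the --driver-java-options argument by hand. This only
covers the driver side; it does nothing for executor option propagation on
Mesos, which the thread treats as a separate bug.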
