spark-issues mailing list archives

From "Patrick Wendell (JIRA)" <>
Subject [jira] [Commented] (SPARK-2678) `Spark-submit` overrides user application options
Date Thu, 07 Aug 2014 23:03:13 GMT


Patrick Wendell commented on SPARK-2678:

Fixed in 1.0.3 via:

> `Spark-submit` overrides user application options
> -------------------------------------------------
>                 Key: SPARK-2678
>                 URL:
>             Project: Spark
>          Issue Type: Bug
>          Components: Deploy
>    Affects Versions: 1.0.1, 1.0.2
>            Reporter: Cheng Lian
>            Assignee: Cheng Lian
>            Priority: Blocker
>             Fix For: 1.1.0, 1.0.3
> Here is an example:
> {code}
> ./bin/spark-submit --class Foo some.jar --help
> {code}
Since {{--help}} appears after the primary resource (i.e. {{some.jar}}), it should be
recognized as a user application option. But it is actually intercepted by {{spark-submit}},
which shows its own help message instead.
> When directly invoking {{spark-submit}}, the constraints here are:
> # Options before primary resource should be recognized as {{spark-submit}} options
> # Options after primary resource should be recognized as user application options
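
The two constraints above can be sketched in shell. The helper below is hypothetical (it is not Spark's actual parser) and, for simplicity, ignores options that take a separate value argument such as {{--class Foo}}; it only illustrates the "first non-option token is the primary resource" rule:

```shell
# Hypothetical sketch of the desired rule, NOT Spark's actual parser.
# Simplification: options that take a separate value (e.g. --class Foo)
# are not handled; the first token that does not start with '-' is
# treated as the primary resource.
split_args() {
  submit_opts=""; app_opts=""; primary=""; seen_primary=""
  for arg in "$@"; do
    if [ -n "$seen_primary" ]; then
      app_opts="$app_opts $arg"          # after primary: user application option
    elif [ "${arg#-}" != "$arg" ]; then
      submit_opts="$submit_opts $arg"    # before primary: spark-submit option
    else
      primary="$arg"                     # first non-option token: primary resource
      seen_primary=1
    fi
  done
}
```

For example, `split_args --verbose some.jar --help` would classify {{--verbose}} as a {{spark-submit}} option and {{--help}} as a user application option, which is the behavior the constraints call for.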
> The tricky part is how to handle scripts like {{spark-shell}} that delegate to {{spark-submit}}.
These scripts allow users to specify both {{spark-submit}} options like {{--master}} and user-defined
application options together. For example, say we'd like to write a new script {{}}
to start the Hive Thrift server; basically we may do this:
> {code}
> $SPARK_HOME/bin/spark-submit --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2
spark-internal $@
> {code}
> Then users may call this script like this:
> {code}
> ./sbin/ --master spark://some-host:7077 --hiveconf key=value
> {code}
> Notice that all options are captured by {{$@}}. If we put it before {{spark-internal}},
they are all recognized as {{spark-submit}} options, so {{--hiveconf}} won't be passed to
{{HiveThriftServer2}}; if we put it after {{spark-internal}}, they *should* all be recognized
as options of {{HiveThriftServer2}}, but because of this bug, {{--master}} is still recognized
as a {{spark-submit}} option, which happens to produce the right behavior.
> Although all scripts that use {{spark-submit}} currently work correctly, we should still
fix this bug, because it causes option name collisions between {{spark-submit}} and user applications:
every time we add a new option to {{spark-submit}}, some existing user applications may
break. However, fixing this bug may require some incompatible changes.
> The suggested solution is to use {{--}} as a separator between {{spark-submit}} options
and user application options. For the Hive Thrift server example above, users should call it
this way:
> {code}
> ./sbin/ --master spark://some-host:7077 -- --hiveconf key=value
> {code}
> And {{SparkSubmitArguments}} should be responsible for splitting the two sets of options
and passing them on correctly.
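
A minimal sketch of the proposed splitting, in hypothetical shell (this is not the actual {{SparkSubmitArguments}} implementation, which is Scala): everything before the first {{--}} goes to {{spark-submit}}, everything after it to the user application.

```shell
# Hypothetical sketch of the proposed "--" separator; NOT the actual
# SparkSubmitArguments implementation.
split_on_separator() {
  submit_opts=""; app_opts=""; past_sep=""
  for arg in "$@"; do
    if [ -n "$past_sep" ]; then
      app_opts="$app_opts $arg"          # after "--": user application option
    elif [ "$arg" = "--" ]; then
      past_sep=1                         # the separator itself is dropped
    else
      submit_opts="$submit_opts $arg"    # before "--": spark-submit option
    fi
  done
}
```

With this rule, `split_on_separator --master spark://some-host:7077 -- --hiveconf key=value` unambiguously routes {{--master}} to {{spark-submit}} and {{--hiveconf}} to the application, with no dependence on the buggy behavior described above.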

This message was sent by Atlassian JIRA

