spark-user mailing list archives

From Serega Sheypak <>
Subject Re: Run spark 2.2 on yarn as usual java application
Date Mon, 19 Mar 2018 13:02:35 GMT
Hi Jörn, thanks for your reply.
Oozie starts its java action as a single "long running" MapReduce mapper.
This mapper is responsible for calling the main class. The main class
belongs to the user, and it starts the Spark job.
yarn-cluster is not an option for me: I would have to do something special
to manage a "runaway" driver. Imagine I want to kill the Spark job. I can
just kill the Oozie workflow, and it kills the spawned mapper with the main
class and the driver inside it.
That won't happen in yarn-cluster mode, since the driver is not running in
the process "managed" by Oozie.

2018-03-19 13:41 GMT+01:00 Jörn Franke <>:

> Maybe you should rather run it in yarn-cluster mode. Yarn client would
> start the driver on the Oozie server.
> On 19. Mar 2018, at 12:58, Serega Sheypak <>
> wrote:
> I'm trying to run it as an Oozie java action and reduce env dependencies.
> The only thing I need is a Hadoop Configuration to talk to HDFS and YARN.
> Spark-submit is a shell thing; I'm trying to do everything from the JVM.
> The Oozie java action starts a main class which instantiates SparkConf and
> a session. It works well in local mode but throws an exception when I try
> to run Spark as yarn-client.
> Mon, 19 Mar 2018 at 07:16, Jacek Laskowski <>:
>> Hi,
>> What's the deployment process then (if not using spark-submit)? How is
>> the AM deployed? Why would you want to skip spark-submit?
>> Jacek
>> On 19 Mar 2018 00:20, "Serega Sheypak" <> wrote:
>>> Hi, is it even possible to run Spark on YARN as a usual Java application?
>>> I've built a jar using Maven with the spark-yarn dependency, and I
>>> manually populate SparkConf with all Hadoop properties.
>>> SparkContext fails to start with this exception:
>>> Caused by: java.lang.IllegalStateException: Library directory
>>> '/hadoop/yarn/local/usercache/root/appcache/application_
>>> 1521375636129_0022/container_e06_1521375636129_0022_01_
>>> 000002/assembly/target/scala-2.11/jars' does not exist; make sure
>>> Spark is built.
>>> at org.apache.spark.launcher.CommandBuilderUtils.checkState(Com
>>> at org.apache.spark.launcher.CommandBuilderUtils.findJarsDir(Co
>>> at org.apache.spark.launcher.YarnCommandBuilderUtils$.findJarsDir(
>>> YarnCommandBuilderUtils.scala:38)
>>> I took a look at the code, and it has some hardcoded paths and checks
>>> for a specific file layout. I don't follow why :)
>>> Is it possible to bypass such checks?
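For context on the check being asked about: Spark's launcher falls back to looking for a local `SPARK_HOME/jars` (or assembly build) layout only when neither `spark.yarn.jars` nor `spark.yarn.archive` is set, which is what raises the `IllegalStateException` above. A minimal sketch of sidestepping it from a plain JVM process by pointing `spark.yarn.jars` at jars pre-staged on HDFS (the HDFS path and app name here are assumptions, not from the thread):

```scala
import org.apache.spark.sql.SparkSession

object EmbeddedYarnClient {
  def main(args: Array[String]): Unit = {
    // Sketch: start Spark 2.x on YARN directly from a JVM, no spark-submit.
    // Assumes the Spark jars were staged up front, e.g.:
    //   hdfs dfs -put $SPARK_HOME/jars/* /apps/spark/jars/
    val spark = SparkSession.builder()
      .appName("embedded-yarn-client")          // hypothetical app name
      .master("yarn")
      .config("spark.submit.deployMode", "client")
      // Pointing Spark at pre-staged jars means it never tries to find a
      // local jars directory, so the layout check above is never hit.
      .config("spark.yarn.jars", "hdfs:///apps/spark/jars/*.jar")
      .getOrCreate()

    // Hadoop Configuration (for HDFS/YARN) is picked up from the classpath
    // or can be injected via spark.hadoop.* entries on the builder.
    spark.stop()
  }
}
```

Alternatively, `spark.yarn.archive` can point at a single zip of the jars, which YARN caches once per node instead of localizing each jar.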
