spark-user mailing list archives

From Tomasz Guziałek <Tomasz.Guzia...@HumanInference.com>
Subject RE: SparkLauncher not notified about finished job - hangs infinitely.
Date Mon, 03 Aug 2015 07:49:20 GMT
Reading from the input stream and the error stream (in separate threads) indeed unblocked the
launcher and it exited properly. Thanks for your responses!

Best regards,
Tomasz

From: Ted Yu [mailto:yuzhihong@gmail.com]
Sent: Friday, July 31, 2015 19:20
To: Elkhan Dadashov
Cc: Tomasz Guziałek; user@spark.apache.org
Subject: Re: SparkLauncher not notified about finished job - hangs infinitely.

Tomasz:
Please take a look at the Redirector class inside:
./launcher/src/test/java/org/apache/spark/launcher/SparkLauncherSuite.java

FYI

On Fri, Jul 31, 2015 at 10:02 AM, Elkhan Dadashov <elkhan8502@gmail.com> wrote:
Hi Tomasz,

Answer to your 1st question:

Read (and thereby clear) the error stream (spark.getErrorStream()) and the output stream
(spark.getInputStream()) before you call spark.waitFor(); it is best to drain them on two
separate threads. Then it should work fine.
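This advice can be sketched with plain java.lang.Process; the same pattern applies to the
Process returned by SparkLauncher.launch(). This is a minimal sketch, not Spark API: the
drain helper and the echo child process are illustrative stand-ins (echo assumes a
Unix-like environment), and in a real launch you would pass spark.getInputStream() and
spark.getErrorStream() to the same helper before calling spark.waitFor().

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

public class DrainExample {

    // Drains a subprocess stream on its own thread so the platform's
    // pipe buffer never fills up and blocks the child.
    static Thread drain(InputStream in, String tag) {
        Thread t = new Thread(() -> {
            try (BufferedReader r = new BufferedReader(new InputStreamReader(in))) {
                String line;
                while ((line = r.readLine()) != null) {
                    System.out.println(tag + line);
                }
            } catch (IOException ignored) {
                // Stream closed when the subprocess exits.
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the Process returned by SparkLauncher.launch().
        Process p = new ProcessBuilder("echo", "hello").start();

        // Start both drainers BEFORE waitFor(), one thread per stream.
        Thread out = drain(p.getInputStream(), "stdout> ");
        Thread err = drain(p.getErrorStream(), "stderr> ");

        int exitCode = p.waitFor(); // safe now: both buffers are being emptied
        out.join();
        err.join();
        System.out.println("exit=" + exitCode);
    }
}
```

Without the two drainer threads, a chatty subprocess can fill the limited stdout/stderr
pipe buffer and block, which is exactly the hang described in this thread.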

Since the Spark job is launched as a subprocess, the Oracle documentation<https://docs.oracle.com/javase/8/docs/api/java/lang/Process.html> on Process applies:

"By default, the created subprocess does not have its own terminal or console. All its standard
I/O (i.e. stdin, stdout, stderr) operations will be redirected to the parent process, where
they can be accessed via the streams obtained using the methods getOutputStream(), getInputStream(),
and getErrorStream(). The parent process uses these streams to feed input to and get output
from the subprocess. Because some native platforms only provide limited buffer size for standard
input and output streams, failure to promptly write the input stream or read the output stream
of the subprocess may cause the subprocess to block, or even deadlock."



On Fri, Jul 31, 2015 at 2:45 AM, Tomasz Guziałek <Tomasz.Guzialek@humaninference.com> wrote:
I am trying to submit a JAR with a Spark job to a YARN cluster from Java code. I am using
SparkLauncher to submit the SparkPi example:

    Process spark = new SparkLauncher()
        .setAppResource("C:\\spark-1.4.1-bin-hadoop2.6\\lib\\spark-examples-1.4.1-hadoop2.6.0.jar")
        .setMainClass("org.apache.spark.examples.SparkPi")
        .setMaster("yarn-cluster")
        .launch();
    System.out.println("Waiting for finish...");
    int exitCode = spark.waitFor();
    System.out.println("Finished! Exit code:" + exitCode);

There are two problems:

1. When submitting in "yarn-cluster" mode, the application is successfully submitted to YARN
and executes successfully (it is visible in the YARN UI, reported as SUCCESS, and the PI value
is printed in the output). However, the submitting application is never notified that processing
has finished: it hangs infinitely after printing "Waiting for finish..." The log of the container
can be found here: http://pastebin.com/LscBjHQc
2. When submitting in "yarn-client" mode, the application does not appear in the YARN UI and
the submitting application hangs at "Waiting for finish..." When the hanging code is killed, the
application shows up in the YARN UI and is reported as SUCCESS, but the output is empty (the PI
value is not printed). The log of the container can be found here: http://pastebin.com/9KHi81r4

I tried to execute the submitting application with both Oracle Java 8 and Java 7.

Any hints what might be wrong?

Best regards,
Tomasz



--

Best regards,
Elkhan Dadashov
