Well, System.exit() will not ensure all data was processed before shutdown.
There should not be a deadlock if onBatchCompleted just starts the thread (that runs stop()) and completes.
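For example, roughly like this (an untested sketch; shouldStop() is a stand-in for whatever stop condition you check):

public void onBatchCompleted(StreamingListenerBatchCompleted batchCompleted) {
  if (shouldStop()) {  // shouldStop() = your own stop condition (hypothetical helper)
    // Hand off to a new thread and return immediately, so the listener
    // bus is not blocked while stop() waits for it to drain.
    new Thread(new Runnable() {
      public void run() {
        jssc.stop(false, false);
        System.out.println("stopped gracefully");
      }
    }).start();
  }
}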

On Wed, Aug 12, 2015 at 1:50 AM, Shushant Arora <shushantarora09@gmail.com> wrote:
Calling jssc.stop(false/true, false/true) from a StreamingListener causes a deadlock, so I created another thread and called jssc.stop() from that, but that too caused a deadlock if onBatchCompleted had not completed before jssc.stop().

So is it safe if I call System.exit(1) from another thread without calling jssc.stop(), since that leads to deadlock?


On Tue, Aug 11, 2015 at 9:54 PM, Shushant Arora <shushantarora09@gmail.com> wrote:
Does stopping the streaming context in the onBatchCompleted event of a StreamingListener not kill the app?

I have the below code in my streaming listener:

public void onBatchCompleted(StreamingListenerBatchCompleted batchCompleted) {
  // check stop condition
  System.out.println("stopping gracefully");
  jssc.stop(false, false);
  System.out.println("stopped gracefully");
}

"stopped gracefully" is never printed.

On the UI no more batches are processed, but the application is never killed/stopped. What's the best way to kill the app after stopping the context?



On Tue, Aug 11, 2015 at 2:55 AM, Shushant Arora <shushantarora09@gmail.com> wrote:
Thanks!



On Tue, Aug 11, 2015 at 1:34 AM, Tathagata Das <tdas@databricks.com> wrote:
1. RPC can be done in many ways, and a web service is one of many ways. An even more hacky version can be the app polling a file in a file system: if the file exists, start shutting down.
2. No need to set a flag. When you get the signal from RPC, you can just call context.stop(stopGracefully = true). Though note that this is blocking, so gotta be careful about doing blocking calls on the RPC thread.
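For example, a rough, untested sketch of the file-polling variant, run on its own thread in the driver (the marker path and poll interval are arbitrary):

new Thread(new Runnable() {
  public void run() {
    java.io.File marker = new java.io.File("/tmp/stop-streaming-app");  // arbitrary marker path
    while (!marker.exists()) {
      try { Thread.sleep(10000); } catch (InterruptedException e) { return; }  // poll every 10s
    }
    // Blocking graceful stop, but on our own thread, not the RPC/listener thread.
    jssc.stop(true, true);
  }
}).start();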

On Mon, Aug 10, 2015 at 12:24 PM, Shushant Arora <shushantarora09@gmail.com> wrote:
By RPC, do you mean a web service exposed on the driver which listens and sets some flag, and the driver checks that flag at the end of each batch and, if set, gracefully stops the application?

On Tue, Aug 11, 2015 at 12:43 AM, Tathagata Das <tdas@databricks.com> wrote:
In general, it is a little risky to put long-running stuff in a shutdown hook, as it may delay shutdown of the process, which may delay other things. That said, you could try it out.

A better way to explicitly shut down gracefully is to use an RPC to signal the driver process to start shutting down, and then the process will gracefully stop the context and terminate. This is more robust than leveraging shutdown hooks.
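For instance, a rough untested sketch using the JDK's built-in HTTP server as the signal endpoint (port and path are arbitrary; exception handling omitted):

import java.net.InetSocketAddress;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;

// In the driver: hitting http://<driver-host>:8099/stop triggers a graceful stop.
HttpServer server = HttpServer.create(new InetSocketAddress(8099), 0);
server.createContext("/stop", new HttpHandler() {
  public void handle(HttpExchange exchange) throws java.io.IOException {
    exchange.sendResponseHeaders(200, -1);  // 200 OK, no body
    exchange.close();
    new Thread(new Runnable() {
      public void run() {
        jssc.stop(true, true);  // blocking graceful stop, on its own thread
      }
    }).start();
  }
});
server.start();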

On Mon, Aug 10, 2015 at 11:56 AM, Shushant Arora <shushantarora09@gmail.com> wrote:
Any recommendation for the best way to gracefully shut down a Spark Streaming application?
I am running it on YARN and need a way to signal shutdown externally, either via the yarn application -kill command or some other way, but the current batch must be processed completely and the checkpoint saved before shutting down.

Runtime.getRuntime().addShutdownHook does not seem to be working. YARN kills the application immediately and does not call the shutdown hook callback.


On Sun, Aug 9, 2015 at 12:45 PM, Shushant Arora <shushantarora09@gmail.com> wrote:
Hi

How do I ensure in Spark Streaming 1.3 with Kafka that when an application is killed, the last running batch is fully processed and offsets are written to the checkpointing dir?

On Fri, Aug 7, 2015 at 8:56 AM, Shushant Arora <shushantarora09@gmail.com> wrote:
Hi

I am using Spark Streaming 1.3 with custom checkpointing to save Kafka offsets.

1. Is doing

Runtime.getRuntime().addShutdownHook(new Thread() {
  @Override
  public void run() {
    jssc.stop(true, true);
    System.out.println("Inside Add Shutdown Hook");
  }
});

to handle stop safe?

2. And do I need to handle saving the checkpoint in the shutdown hook also, or will the driver handle it automatically, since it gracefully stops the stream and handles completion of the foreachRDD function on the stream?

directKafkaStream.foreachRDD(new Function<JavaRDD<byte[][]>, Void>() {
  public Void call(JavaRDD<byte[][]> rdd) { return null; }  // process batch, save offsets here
});

Thanks