spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "sagarcasual ." <sagarcas...@gmail.com>
Subject Re: Pausing spark kafka streaming (direct) or exclude/include some partitions on the fly per batch
Date Fri, 02 Sep 2016 21:33:51 GMT
Hi Cody, thanks for the reply.
I am using Spark 1.6.1 with Kafka 0.9.
When I want to stop streaming, stopping the context sounds ok, but for
temporarily excluding partitions is there any way I can supply
topic-partition info on the fly at the beginning of every pull dynamically.
Will streaminglistener be of any help?

On Fri, Sep 2, 2016 at 10:37 AM, Cody Koeninger <cody@koeninger.org> wrote:

> If you just want to pause the whole stream, just stop the app and then
> restart it when you're ready.
>
> If you want to do some type of per-partition manipulation, you're
> going to need to write some code.  The 0.10 integration makes the
> underlying kafka consumer pluggable, so you may be able to wrap a
> consumer to do what you need.
>
> On Fri, Sep 2, 2016 at 12:28 PM, sagarcasual . <sagarcasual@gmail.com>
> wrote:
> > Hello, this is for
> > Pausing spark kafka streaming (direct) or exclude/include some
> partitions on
> > the fly per batch
> > =========================================================
> > I have following code that creates a direct stream using Kafka connector
> for
> > Spark.
> >
> > final JavaInputDStream<KafkaMessage> msgRecords =
> > KafkaUtils.createDirectStream(
> >             jssc, String.class, String.class, StringDecoder.class,
> > StringDecoder.class,
> >             KafkaMessage.class, kafkaParams, topicsPartitions,
> >             message -> {
> >                 return KafkaMessage.builder()
> >                         .
> >                         .build();
> >             }
> >     );
> >
> > However I want to handle a situation, where I can decide that this
> streaming
> > needs to pause for a while on conditional basis, is there any way to
> achieve
> > this? Say my Kafka is undergoing some maintenance, so between 10AM to
> 12PM
> > stop processing, and then again pick up at 12PM from the last offset,
> how do
> > I do it?
> >
> > Also, assume all of a sudden we want to take one-or-more of the
> partitions
> > for a pull and add it back after some pulls, how do I achieve that?
> >
> > -Regards
> > Sagar
> >
>

Mime
View raw message