spark-user mailing list archives

From Srikanth <srikanth...@gmail.com>
Subject Re: Rebalancing when adding kafka partitions
Date Fri, 22 Jul 2016 18:05:11 GMT
In Spark 1.x, if we restart from a checkpoint, will it read from the new
partitions?

If you can, please point us to a doc/link that covers the Kafka 0.10
integration in Spark 2.0.

On Fri, Jul 22, 2016 at 1:33 PM, Cody Koeninger <cody@koeninger.org> wrote:

> For the kafka 0.8 integration, you are literally starting a
> streaming job against a fixed set of topic-partitions. That set will
> not change throughout the job, so you'll need to restart the spark
> job if you change kafka partitions.
>
> For the kafka 0.10 / spark 2.0 integration, if you use Subscribe
> or SubscribePattern, it should pick up new partitions as they are
> added.
>
> On Fri, Jul 22, 2016 at 11:29 AM, Srikanth <srikanth.ht@gmail.com> wrote:
> > Hello,
> >
> > I'd like to understand how Spark Streaming (direct) handles the
> > addition of Kafka partitions.
> > Will a running job become aware of new partitions and read from
> > them, given that it uses the Kafka APIs to query offsets and
> > handles offsets internally?
> >
> > Srikanth
>
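For reference, the Subscribe-based API Cody describes can be sketched roughly as below (a sketch against the spark-streaming-kafka-0-10 artifact; `ssc` is assumed to be an existing StreamingContext, and the broker address, group id, and topic names are placeholders):

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.{Subscribe, SubscribePattern}

val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> "localhost:9092",          // placeholder broker
  "key.deserializer" -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id" -> "example-group",                    // placeholder group id
  "auto.offset.reset" -> "latest",
  "enable.auto.commit" -> (false: java.lang.Boolean)
)

// Subscribe to named topics; partitions added to these topics after the
// job starts should be picked up without a restart.
val stream = KafkaUtils.createDirectStream[String, String](
  ssc,
  PreferConsistent,
  Subscribe[String, String](Seq("topicA"), kafkaParams)
)

// SubscribePattern additionally matches topics created later, e.g.:
//   import java.util.regex.Pattern
//   SubscribePattern[String, String](Pattern.compile("topic.*"), kafkaParams)
```

With the 0.8 integration there is no equivalent: the topic-partition set is fixed when the direct stream is created, so only a restart picks up new partitions.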
