I did try a test with spark 2.0 + spark-streaming-kafka-0-10-assembly.
Partition was increased using "bin/kafka-topics.sh --alter" after spark job was started.
I don't see messages from new partitions in the DStream.

    KafkaUtils.createDirectStream[Array[Byte], Array[Byte]] (
        ssc, PreferConsistent, Subscribe[Array[Byte], Array[Byte]](topics, kafkaParams) )
    .map(r => (r.key(), r.value()))

Also, no.of partitions did not increase too.
    dataStream.foreachRDD( (rdd, curTime) => {
     logger.info(s"rdd has ${rdd.getNumPartitions} partitions.")

Should I be setting some parameter/config? Is the doc for new integ available?


On Fri, Jul 22, 2016 at 2:15 PM, Cody Koeninger <cody@koeninger.org> wrote:
No, restarting from a checkpoint won't do it, you need to re-define the stream.

Here's the jira for the 0.10 integration


I haven't gotten docs completed yet, but there are examples at


On Fri, Jul 22, 2016 at 1:05 PM, Srikanth <srikanth.ht@gmail.com> wrote:
> In Spark 1.x, if we restart from a checkpoint, will it read from new
> partitions?
> If you can, pls point us to some doc/link that talks about Kafka 0.10 integ
> in Spark 2.0.
> On Fri, Jul 22, 2016 at 1:33 PM, Cody Koeninger <cody@koeninger.org> wrote:
>> For the integration for kafka 0.8, you are literally starting a
>> streaming job against a fixed set of topicapartitions,  It will not
>> change throughout the job, so you'll need to restart the spark job if
>> you change kafka partitions.
>> For the integration for kafka 0.10 / spark 2.0, if you use subscribe
>> or subscribepattern, it should pick up new partitions as they are
>> added.
>> On Fri, Jul 22, 2016 at 11:29 AM, Srikanth <srikanth.ht@gmail.com> wrote:
>> > Hello,
>> >
>> > I'd like to understand how Spark Streaming(direct) would handle Kafka
>> > partition addition?
>> > Will a running job be aware of new partitions and read from it?
>> > Since it uses Kafka APIs to query offsets and offsets are handled
>> > internally.
>> >
>> > Srikanth