spark-user mailing list archives

From Cody Koeninger <>
Subject Re: Kafka Direct Stream
Date Thu, 01 Oct 2015 14:06:07 GMT
You can get the topic for a given partition from the offset range.  You can
either filter using that, or just keep a single RDD and match on the topic
inside mapPartitions or foreachPartition (which I think is the better idea).
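
To make the second option concrete, here is a rough, Spark-free sketch of the
dispatch pattern in plain Python. It assumes that partition i of the batch
lines up with entry i of the offset ranges, which is the one-to-one mapping the
direct stream provides; in Spark itself the ranges would come from casting the
RDD to HasOffsetRanges and the index from something like
mapPartitionsWithIndex. The OffsetRange tuple, the topic names and the two
handlers are made up for illustration and are not part of any Spark API.

```python
from collections import namedtuple

# Stand-in for Spark's OffsetRange; only the fields used here are modeled.
OffsetRange = namedtuple(
    "OffsetRange", ["topic", "partition", "from_offset", "until_offset"]
)


def handle_clicks(records):
    # Hypothetical per-topic logic: upper-case every record.
    return [r.upper() for r in records]


def handle_orders(records):
    # Hypothetical per-topic logic: report the length of every record.
    return [len(r) for r in records]


# One handler per topic; the topic names are assumptions for this sketch.
HANDLERS = {"clicks": handle_clicks, "orders": handle_orders}


def process_partition(index, records, offset_ranges):
    """Look up the topic for partition `index` and dispatch to its handler."""
    topic = offset_ranges[index].topic
    return topic, HANDLERS[topic](records)


def process_batch(partitions, offset_ranges):
    # Plays the role of a per-partition pass over a single RDD.
    return [
        process_partition(i, records, offset_ranges)
        for i, records in enumerate(partitions)
    ]


# Two partitions, one per topic, in the same order as the offset ranges.
offset_ranges = [
    OffsetRange("clicks", 0, 0, 2),
    OffsetRange("orders", 0, 0, 1),
]
partitions = [["a", "b"], ["order-1"]]
results = process_batch(partitions, offset_ranges)
```

The batch stays in one piece and each partition picks its own code path, so
there is no need to filter the same data once per topic.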

On Wed, Sep 30, 2015 at 5:02 PM, Udit Mehta <> wrote:

> Hi,
> I am using the Spark direct stream to consume from multiple topics in Kafka.
> I am able to consume fine, but I am stuck on how to separate the data for
> each topic, since I need to process data differently depending on the topic.
> I basically want to split the RDD consisting of N topics into N RDDs, each
> having 1 topic.
> Any help would be appreciated.
> Thanks in advance,
> Udit
