spark-user mailing list archives

From Neelesh <neele...@gmail.com>
Subject Re: Spark Streaming 1.3 & Kafka Direct Streams
Date Wed, 01 Apr 2015 18:57:17 GMT
Thanks Cody!

On Wed, Apr 1, 2015 at 11:21 AM, Cody Koeninger <cody@koeninger.org> wrote:

> If you want to change topics from batch to batch, you can always just
> create a KafkaRDD repeatedly.
>
> The streaming code as it stands assumes a consistent set of topics,
> though.  The implementation is private, so you can't subclass it without
> building your own Spark.
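
Cody's suggestion can be sketched as follows, assuming the Spark 1.3 `spark-streaming-kafka` artifact. The broker address, topic names, and offset values below are illustrative placeholders, not values from this thread; in practice you would fetch the real offsets from Kafka yourself before each batch.

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkContext
import org.apache.spark.streaming.kafka.{KafkaUtils, OffsetRange}

object PerBatchKafkaRDD {
  // Build a fresh KafkaRDD for whatever topics are current right now.
  // Because this is called per batch, the topic set can change freely.
  def readBatch(sc: SparkContext, topics: Seq[String]): Unit = {
    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092") // placeholder

    // One OffsetRange per topic-partition; the 0L/100L offsets are
    // placeholders for offsets you would look up from the brokers.
    val offsetRanges = topics.map { t =>
      OffsetRange(t, 0, 0L, 100L)
    }.toArray

    val rdd = KafkaUtils.createRDD[String, String, StringDecoder, StringDecoder](
      sc, kafkaParams, offsetRanges)

    rdd.foreach { case (_, value) => println(value) }
  }
}
```

Since each call computes its own offset ranges, this sidesteps the streaming code's fixed-topic assumption entirely, at the cost of managing batch scheduling and offset bookkeeping yourself.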
>
> On Wed, Apr 1, 2015 at 1:09 PM, Neelesh <neeleshs@gmail.com> wrote:
>
>> Thanks Cody, that was really helpful.  I have a much better understanding
>> now. One last question - Kafka topics are initialized once in the driver;
>> is there an easy way to add or remove topics on the fly?
>> KafkaRDD#getPartitions() seems to be computed only once, with no way to
>> refresh it.
>>
>> Thanks again!
>>
>> On Wed, Apr 1, 2015 at 10:01 AM, Cody Koeninger <cody@koeninger.org>
>> wrote:
>>
>>> https://github.com/koeninger/kafka-exactly-once/blob/master/blogpost.md
>>>
>>> The kafka consumers run in the executors.
>>>
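
Cody's answer about where the consumers run can be illustrated with a minimal direct-stream sketch (Spark 1.3 API; the app name, broker address, and topic name are placeholders). The driver only queries the brokers for offset ranges at each batch interval; the actual message fetching happens inside tasks on the executors, one task per Kafka partition.

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object DirectStreamSketch {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(new SparkConf().setAppName("direct"), Seconds(5))
    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092") // placeholder

    // Driver side: only offset metadata is resolved here each interval.
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, Set("events")) // "events" is an illustrative topic

    // Executor side: each RDD partition maps 1:1 to a Kafka partition, and
    // the task processing it consumes its own offset range from the broker.
    stream.foreachRDD { rdd =>
      println(rdd.count())
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

No long-running receiver occupies a core here; consumption is just ordinary task execution, which is why the direct approach has different core-allocation behavior than receiver-based streams.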
>>> On Wed, Apr 1, 2015 at 11:18 AM, Neelesh <neeleshs@gmail.com> wrote:
>>>
>>>> With receivers, it was pretty obvious which code ran where - each
>>>> receiver occupied a core and ran on the workers. However, with the new
>>>> Kafka direct input streams, it's hard for me to understand where the code
>>>> that reads from the Kafka brokers runs. Does it run on the driver (I hope
>>>> not), or does it run on the workers?
>>>>
>>>> Any help appreciated
>>>> thanks!
>>>> -neelesh
>>>>
>>>
>>>
>>
>
