spark-user mailing list archives

From Cody Koeninger <c...@koeninger.org>
Subject Re: Kafka direct approach: blockInterval and topic partitions
Date Mon, 10 Aug 2015 17:58:26 GMT
With the direct approach there's no long-running receiver pushing blocks of
messages, so blockInterval isn't relevant.

Batch interval is what matters.
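
To make the difference concrete, here is a rough arithmetic sketch using the numbers from the question below. The function names are hypothetical and only illustrate the counting; they are not part of any Spark API. It assumes the usual rule of thumb that a receiver-based stream produces roughly (batch interval / block interval) blocks, and therefore tasks, per receiver per batch, while the direct approach produces one RDD partition per Kafka topic partition.

```python
def receiver_tasks_per_batch(batch_interval_ms, block_interval_ms, num_receivers=1):
    """Approximate tasks per batch for a receiver-based stream.

    Each receiver cuts incoming data into one block per block interval,
    and each block becomes one partition (task) of the batch RDD.
    """
    return (batch_interval_ms // block_interval_ms) * num_receivers


def direct_tasks_per_batch(num_topic_partitions):
    """Tasks per batch for the direct approach.

    There is no receiver and no block interval: each Kafka topic
    partition maps to exactly one RDD partition.
    """
    return num_topic_partitions


# Numbers taken from the question: 1000ms batches, 200ms blocks, 16 partitions.
print(receiver_tasks_per_batch(1000, 200))  # one receiver
print(direct_tasks_per_batch(16))           # blockInterval plays no role here
```

Under these assumptions, changing blockInterval only moves the first number; the second depends solely on how many partitions the topic has.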

On Mon, Aug 10, 2015 at 12:52 PM, allonsy <luke1989@gmail.com> wrote:

> Hi everyone,
>
> I recently started using the new Kafka direct approach.
>
> Now, as far as I understood, each Kafka partition /is/ an RDD partition that
> will be processed by a single core.
> What I don't understand is the relation between those partitions and the
> blocks generated every blockInterval.
>
> For example, assume:
>
> 1000ms batch interval
> 16 topic partitions (total of 16 cores available)
>
> Moreover, the blockInterval is set to 200ms.
>
> What am I actually dividing by the blockInterval value in such a scenario?
> I'd like to tune this value but I cannot understand what it stands for.
>
> I hope I made myself clear,
>
> thank you all! :)
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Kafka-direct-approach-blockInterval-and-topic-partitions-tp24197.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
