spark-user mailing list archives

From Alec Ten Harmsel <a...@alectenharmsel.com>
Subject Re: Which is better? One spark app listening to 10 topics vs. 10 spark apps each listening to 1 topic
Date Mon, 27 Oct 2014 10:35:29 GMT

On 10/27/2014 05:19 AM, Jianshi Huang wrote:
> Any suggestion? :)
>
> Jianshi
>
> On Thu, Oct 23, 2014 at 3:49 PM, Jianshi Huang
> <jianshi.huang@gmail.com> wrote:
>
>     The Kafka stream has 10 topics and the data rate is quite high
>     (~100K/s per topic).
>
>     Which configuration do you recommend?
>     - 1 Spark app consuming all Kafka topics
>     - 10 separate Spark apps, each consuming one topic
>
>     Assuming they have the same resource pool.
>
>     Cheers,
>     -- 
>     Jianshi Huang
>
>     LinkedIn: jianshi
>     Twitter: @jshuang
>     Github & Blog: http://huangjs.github.com/
>
>
>
>
> -- 
> Jianshi Huang
>
> LinkedIn: jianshi
> Twitter: @jshuang
> Github & Blog: http://huangjs.github.com/

Do you have time to try and benchmark both? I don't know anything about
Kafka, but I would imagine that the performance of both options would be
similar.

That said, I would recommend running them all separately: adding a new
data stream doesn't require killing a monolithic job, and an error in
one stream would hurt a monolithic job much more than it would hurt a
set of separately running jobs.
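
As a rough illustration of the separate-apps option, here is a minimal
sketch that splits one shared core budget evenly and builds one
spark-submit command per topic. The topic names, core count, jar name,
main class, and master URL are all made up for the example; only the
spark-submit flags themselves are real. It also assumes a receiver-based
stream, where each app needs one core for the receiver and at least one
more for processing.

```python
import shlex

# Hypothetical values for illustration; none of these come from the thread.
TOPICS = [f"topic{i}" for i in range(1, 11)]   # the 10 Kafka topics
TOTAL_CORES = 40                               # shared resource pool
APP_JAR = "stream-consumer.jar"                # hypothetical application jar
MAIN_CLASS = "com.example.TopicConsumer"       # hypothetical main class


def submit_command(topic, cores, master="spark://master:7077"):
    """Build the spark-submit argument list for one single-topic app."""
    return [
        "spark-submit",
        "--master", master,
        "--name", f"consumer-{topic}",
        "--class", MAIN_CLASS,
        "--total-executor-cores", str(cores),
        APP_JAR,
        topic,  # application argument: the one topic this app consumes
    ]


def plan(topics, total_cores):
    """Split a shared core budget evenly across one app per topic."""
    cores_each = total_cores // len(topics)
    # Assumption: a receiver-based stream pins one core for the receiver,
    # so each app needs at least one more core to process the data.
    if cores_each < 2:
        raise ValueError("not enough cores to run one app per topic")
    return [submit_command(t, cores_each) for t in topics]


if __name__ == "__main__":
    for cmd in plan(TOPICS, TOTAL_CORES):
        print(shlex.join(cmd))
```

With this layout, adding an eleventh stream means submitting one more
command (and deciding where its cores come from), and restarting a
failed consumer touches only that topic's app.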

Regards,

Alec
