storm-user mailing list archives

From Vineet Mishra <clearmido...@gmail.com>
Subject Re: Storm Kafka Processing
Date Mon, 02 Feb 2015 18:23:56 GMT
I am already running Kafka with 10 partitions and a replication factor of 3,
which matches the size of my cluster (3 nodes).

bin/kafka-topics.sh --create --zookeeper host1:2181,host2:2181,host3:2181 \
  --replication-factor 3 --partitions 10 --topic test

I am also running the Kafka Storm topology with the spout executor count set to 10:

TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("KafkaSpout", new KafkaSpout(kafkaConfig), 10);
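
For reference, here is a minimal sketch of why the partition count caps the
useful spout parallelism. It assumes partitions are dealt out to spout tasks
round-robin by task index, which is my understanding of how a Kafka spout
behaves and is not confirmed anywhere in this thread. The class and method
names are made up for illustration and are not part of Storm.

```java
import java.util.ArrayList;
import java.util.List;

class PartitionAssignment {
    // Assumed round-robin split (hypothetical helper, not a Storm API):
    // task i takes partitions i, i + totalTasks, i + 2 * totalTasks, ...
    static List<Integer> partitionsForTask(int taskIndex, int totalTasks, int numPartitions) {
        List<Integer> mine = new ArrayList<>();
        for (int p = taskIndex; p < numPartitions; p += totalTasks) {
            mine.add(p);
        }
        return mine;
    }

    public static void main(String[] args) {
        int totalTasks = 10; // spout parallelism passed to setSpout(..., 10)
        // Compare the earlier topic (1 partition) with the current one (10 partitions).
        for (int numPartitions : new int[] {1, 10}) {
            System.out.println("partitions=" + numPartitions);
            for (int task = 0; task < totalTasks; task++) {
                System.out.println("  task " + task + " -> "
                        + partitionsForTask(task, totalTasks, numPartitions));
            }
        }
    }
}
```

Under that assumption, a 1-partition topic gives only task 0 something to read
and leaves the other nine tasks with an empty list, while a 10-partition topic
gives each of the 10 tasks exactly one partition.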

My impression is that the latency started when I changed the replication
factor and the number of partitions from my earlier setup*.

* bin/kafka-topics.sh --create --zookeeper host1:2181,host2:2181,host3:2181 \
  --replication-factor 1 --partitions 1 --topic test

I will try the storm-kafka bundle you pointed to above. I hope that helps!
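
On the producer-side point about writing to all 10 partitions, here is a rough
sketch of how the message key can decide that. It assumes a hash-mod
partitioner, which is an assumption about the producer setup and not something
stated in this thread. The class, the method names, and the sample keys are
hypothetical.

```java
import java.util.Arrays;

class KeySpread {
    // Hypothetical hash-mod partitioner: non-negative hash of the key, modulo partition count.
    static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    // Counts how many of the given keys land on each partition.
    static int[] countPerPartition(String[] keys, int numPartitions) {
        int[] counts = new int[numPartitions];
        for (String key : keys) {
            counts[partitionFor(key, numPartitions)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int numPartitions = 10;

        // Every message carries the same key, so every message maps to one partition.
        String[] fixed = new String[1000];
        Arrays.fill(fixed, "sensor-1");
        System.out.println("fixed key:   " + Arrays.toString(countPerPartition(fixed, numPartitions)));

        // Keys vary per message, so messages spread over several partitions.
        String[] varied = new String[1000];
        for (int i = 0; i < varied.length; i++) {
            varied[i] = "event-" + i;
        }
        System.out.println("varied keys: " + Arrays.toString(countPerPartition(varied, numPartitions)));
    }
}
```

If the producer behaves like this and always sends the same key, nine of the
ten partitions stay empty and only one spout executor has data to fetch, no
matter how many executors are configured.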

Thanks!

On Mon, Feb 2, 2015 at 10:30 PM, Harsha <storm@harsha.io> wrote:

> Vineet,
>        Can you try the spout that ships with Storm:
> https://github.com/apache/storm/tree/master/external/storm-kafka . It
> is published to the Maven repo, so you can use the following dependency:
> <dependency>
> <groupId>org.apache.storm</groupId>
> <artifactId>storm-kafka</artifactId>
> <version>0.9.3</version>
> </dependency>
>
> If you are using a topic with 10 partitions, make sure you have configured
> your Kafka spout with parallelism set to 10. Also make sure that on the producer
> side you are pushing data onto all 10 partitions, so that your Kafka
> spout is fetching data from all 10 partitions.
> -Harsha
>
>
> On Mon, Feb 2, 2015, at 08:55 AM, Vineet Mishra wrote:
>
> Hi Harsha,
>
> I am using the storm.kafka.KafkaSpout implementation from
>
> https://github.com/wurstmeister/storm-kafka-0.8-plus
>
> Thanks!
>
> On Mon, Feb 2, 2015 at 8:14 PM, Harsha <storm@harsha.io> wrote:
>
>
> Vineet,
>         Which kafka spout are you using?
>
> -Harsha
>
>
>
> On Mon, Feb 2, 2015, at 05:25 AM, Vineet Mishra wrote:
>
> Hi,
>
> I am running a Kafka Storm engine to process real-time data generated on a
> 3-node distributed cluster.
>
> Currently I have set 10 executors for the Storm spout, but I don't think they
> are running in parallel.
> Moreover, I was earlier running the Kafka topology with the replication factor
> and partitions both set to 1 (which seemed to run comparatively faster). Now I
> have set the replication factor to 3 and partitions to 10, and I can see a
> performance degradation.
>
> Is there any way I can fully utilize the available resources and get the
> maximum event-processing throughput?
>
> Looking for expert suggestions urgently.
>
> Thanks!
>
