samza-dev mailing list archives

From Jagadish Venkatraman <jagadish1...@gmail.com>
Subject Re: How to partition a topic into multiple and how to create multiple Samza Containers
Date Sat, 19 Mar 2016 02:15:12 GMT
You can use the kafka-topics.sh tool to create a Kafka topic with your
desired number of partitions. You can also use the tool to increase the
number of partitions of an existing topic.
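
As a rough sketch of what that looks like, assuming the topic is named
"input" (from task.inputs=kafka.input in the config quoted below), ZooKeeper
is at localhost:2181, and your Kafka version still accepts the --zookeeper
flag (newer releases use --bootstrap-server instead). The run helper and the
DRY_RUN and KAFKA_TOPICS variables are made up for this example; by default
the script only prints the commands so you can check them first.

```shell
#!/bin/sh
# Assumed values, taken from the job config quoted in this thread.
ZK="localhost:2181"
TOPIC="input"
PARTITIONS=2
# Hypothetical override for the path to the tool, e.g. /opt/kafka/bin/kafka-topics.sh
KAFKA_TOPICS="${KAFKA_TOPICS:-kafka-topics.sh}"

# Hypothetical helper: prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

# Create a new topic with the desired number of partitions.
create_topic() {
  run "$KAFKA_TOPICS" --zookeeper "$ZK" --create --topic "$TOPIC" \
    --partitions "$PARTITIONS" --replication-factor 1
}

# Increase the partition count of an existing topic.
alter_topic() {
  run "$KAFKA_TOPICS" --zookeeper "$ZK" --alter --topic "$TOPIC" \
    --partitions "$PARTITIONS"
}

# Show the partition layout to confirm the change.
describe_topic() {
  run "$KAFKA_TOPICS" --zookeeper "$ZK" --describe --topic "$TOPIC"
}

alter_topic
describe_topic
```

Run it with DRY_RUN=0 to actually execute the commands. Note that adding
partitions does not move records that are already in the topic: the existing
messages stay in the original partition, and only newly produced messages are
spread across both. To process the existing records in parallel you would
need to produce them again into the partitioned topic.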

On Friday, March 18, 2016, Milinda Pathirage <mpathira@umail.iu.edu> wrote:

> Hi Mohan,
>
> Samza maps Kafka topic partitions to containers. So if your topic has only
> 1 partition, only 1 container will be spawned even if you configure the
> Samza job to use more than 1 container. So please partition the input topic
> first. The "Tasks" section of [1] contains more information on this.
>
> Thanks
> Milinda
>
> [1]
>
> https://samza.apache.org/learn/documentation/0.10/introduction/concepts.html
>
> On Fri, Mar 18, 2016 at 9:11 AM, mohanraj v <mohanrajv.cbe@gmail.com> wrote:
>
> > Hi,
> >
> > I'm trying to create more than one container in my application (single
> > machine). I have 100,000 records in one Kafka topic. How can I partition
> > it into two and process it in parallel? I configured my job properties as
> > below, but I didn't get multiple containers. Kindly reply as soon as
> > possible so I can continue working on this application.
> >
> > machine configuration:
> > 4 GB RAM, 2 cores
> >
> > # Job
> >
> > job.factory.class=org.apache.samza.job.yarn.YarnJobFactory
> > job.name=job-parser
> >
> > # YARN
> >
> > yarn.package.path=file:///home/hello-samza/target/hello-samza-0.10.0-dist.tar.gz
> > yarn.container.count=2
> > yarn.container.memory.mb=512
> > yarn.container.cpu.cores=2
> > #yarn.am.container.memory.mb=1024
> >
> > # Task
> > task.class=samza.task.ParserStreamTask
> > task.inputs=kafka.input
> >
> > # Serializers
> >
> > serializers.registry.string.class=org.apache.samza.serializers.StringSerdeFactory
> >
> > # Kafka System
> >
> > systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory
> > systems.kafka.samza.msg.serde=string
> > systems.kafka.consumer.zookeeper.connect=localhost:2181/
> > systems.kafka.producer.bootstrap.servers=localhost:9092
> >
> > # Job Coordinator
> > job.coordinator.system=kafka
> > job.coordinator.replication.factor=1
> >
> >
> >
> > Thanks,
> > Mohan
> >
>
>
>
> --
> Milinda Pathirage
>
> PhD Student | Research Assistant
> School of Informatics and Computing | Data to Insight Center
> Indiana University
>
> twitter: milindalakmal
> skype: milinda.pathirage
> blog: http://milinda.pathirage.org
>


-- 
Sent from my iphone.
