samza-dev mailing list archives

From "Lukas Steiblys" <lu...@doubledutch.me>
Subject Re: Number of partitions
Date Thu, 21 May 2015 19:19:42 GMT
Each job will get all the partitions and each task (500 of them) within the 
job will get 1 partition. So there will be 500 processes working through the 
log.
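
To make the mapping concrete, here is a minimal sketch of one task per partition. The function name, the round-robin placement, and the container count of 4 are assumptions made for illustration; this is not Samza's actual grouping code.

```python
def assign_partitions(num_partitions, num_containers):
    """Create one task per partition and spread the tasks over containers.

    Illustrative only: round-robin placement is an assumption,
    not necessarily how Samza groups tasks.
    """
    if num_partitions < 0 or num_containers <= 0:
        raise ValueError("need num_partitions >= 0 and num_containers > 0")
    # One list of partition ids (one task each) per container.
    assignment = {c: [] for c in range(num_containers)}
    for partition in range(num_partitions):
        assignment[partition % num_containers].append(partition)
    return assignment


# Hypothetical layout: a 500-partition topic on 4 containers.
layout = assign_partitions(500, 4)
for container, partitions in layout.items():
    print(container, len(partitions))
```

However the tasks are placed, the total number of tasks equals the number of partitions, so the partition count is the upper bound on parallelism.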

I'd try to figure out what your scaling needs are for the next 2-3 years and 
then calculate your resource requirements accordingly (how many parallel 
executing tasks you would need). If you need to split, it is not trivial, 
but doable.
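
A rough way to do that calculation is sketched below. The function name, the headroom factor, and all the numbers are hypothetical placeholders, not measurements; substitute your own projected peak rate and measured per-task throughput.

```python
import math


def tasks_needed(peak_msgs_per_sec, msgs_per_sec_per_task, headroom=2.0):
    """Estimate how many parallel tasks (and so partitions) to plan for.

    headroom is an assumed safety factor on top of the projected peak.
    """
    if peak_msgs_per_sec < 0 or msgs_per_sec_per_task <= 0 or headroom <= 0:
        raise ValueError("rates and headroom must be positive")
    # Always plan for at least one task, even for tiny loads.
    return max(1, math.ceil(peak_msgs_per_sec * headroom / msgs_per_sec_per_task))


# Hypothetical inputs: 5000 msg/s projected peak in 2-3 years,
# 500 msg/s sustained by one task.
print(tasks_needed(5000, 500))
```

Since each task handles one partition, the result is also a lower bound on the partition count to choose up front.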

Lukas

-----Original Message----- 
From: Michael Ravits
Sent: Thursday, May 21, 2015 11:17 AM
To: dev@samza.apache.org
Subject: Re: Number of partitions

Well, since the number of partitions can't be changed after the system
starts running, I wanted to have the flexibility to grow a lot without
stopping for an upgrade.
I just wonder what a tolerable number would be for Samza.
For example, if I start with 5 jobs, each will get 100 partitions. Is this
reasonable, or too much for a single job instance?

On Thu, May 21, 2015 at 7:46 PM, Lukas Steiblys <lukas@doubledutch.me>
wrote:

> 500 is a bit extreme unless you're planning on running the job on some 200
> machines and trying to exploit their full power. I personally run 4 in
> production for our system processing 100 messages/s and there's plenty of
> room to grow.
>
> Lukas
>
> On Thursday, May 21, 2015, Michael Ravits <michaelr524@gmail.com> wrote:
>
> > Hi,
> >
> > I wonder what considerations I need to account for regarding the
> > number of partitions in input topics for Samza.
> > When testing a 500-partition topic with one Samza job, I noticed the
> > startup time was very long.
> > Are there any problems that might occur when dealing with this number of
> > partitions?
> >
> > Thanks,
> > Michael
> >
> 

