I'm looking into the "fail to rebalance" messages. We are already on ZK 3.3.4.
Could the high number of partitions be the cause?
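
As a sanity check on the expected split, here is a minimal Python sketch of
range-style assignment. It assumes the consumer rebalance sorts partitions and
consumers and hands each consumer a contiguous slice. The function name
`assign_partitions` and the worker names are made up for illustration; this is
not the actual Kafka code.

```python
def assign_partitions(partitions, consumers):
    """Split sorted partitions into contiguous, near-equal chunks per consumer."""
    partitions = sorted(partitions)
    consumers = sorted(consumers)
    per_consumer, extra = divmod(len(partitions), len(consumers))
    ownership = {}
    for i, consumer in enumerate(consumers):
        # The first `extra` consumers each take one additional partition.
        start = per_consumer * i + min(i, extra)
        count = per_consumer + (1 if i < extra else 0)
        ownership[consumer] = partitions[start:start + count]
    return ownership


# 3 brokers x 16 partitions, 8 workers (hypothetical names).
partitions = [(broker, part) for broker in range(3) for part in range(16)]
consumers = ["worker-%d" % i for i in range(8)]
ownership = assign_partitions(partitions, consumers)

for consumer, owned in sorted(ownership.items()):
    print(consumer, len(owned))
```

Under that assumption, 48 partitions over 8 live consumers comes to 6 each with
none left unowned, so the partition count would not by itself explain 16
partitions having no owner.
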
On Thu, Dec 13, 2012 at 8:03 AM, Jun Rao <junrao@gmail.com> wrote:
> Do you see "fail to rebalance after 4 tries" in the worker not getting any
> data? If so, what's the ZK version? You should use 3.3.4 or above, which
> fixed some bugs that could cause rebalance to fail.
>
> Thanks,
>
> Jun
>
> On Wed, Dec 12, 2012 at 11:17 PM, David Ross <dyross@klout.com> wrote:
>
> > Hello,
> >
> > I am trying to distribute work across several nodes using Kafka. I have 3
> > brokers each with 16 partitions. I have 8 worker servers listening with one
> > message stream on the same topic. I expect each server to own about 1/8 of
> > the partitions, yet I am not seeing this. It seems initially, the work is
> > fairly evenly distributed. However, after running for several hours, I see
> > that only three consumers own any partitions, and only 32 of the 48 have an
> > owner at all.
> >
> > What gives?
> >
> > For reference, we have 0.7.0 on the server and 0.7.2 on the consumer. Also,
> > I set the max rebalance retries to be 10 because I saw a lot of rebalance
> > failures in the logs.
> >
> >
> > Thanks,
> >
> > David
> >
>