kafka-users mailing list archives

From Chen Wang <chen.apache.s...@gmail.com>
Subject Re: Broker keeps rebalancing
Date Thu, 13 Nov 2014 19:01:27 GMT
kafka.config.zookeeper.session.timeout.ms=60000
kafka.config.rebalance.backoff.ms=6000
kafka.config.rebalance.max.retries=6
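
For reference, a minimal sketch of how a ZooKeeper server bounds the session timeout a client requests. It may explain why a 60000 ms request can show up as "negotiated timeout 40000" in the server log quoted below. It assumes the documented defaults (minSessionTimeout = 2 * tickTime, maxSessionTimeout = 20 * tickTime), that neither is overridden in zoo.cfg, and that the client in that log line is one of the consumers. The function name is made up for illustration.

```python
def negotiated_session_timeout(requested_ms, tick_time_ms, min_ms=None, max_ms=None):
    """Clamp a client's requested session timeout to the server's bounds.

    Assumption: when min_ms / max_ms are not set explicitly, the server
    falls back to 2 * tickTime and 20 * tickTime respectively.
    """
    lower = 2 * tick_time_ms if min_ms is None else min_ms
    upper = 20 * tick_time_ms if max_ms is None else max_ms
    return max(lower, min(requested_ms, upper))


# Values taken from this thread: the consumer asks for 60000 ms, zoo.cfg has tickTime=2000.
print(negotiated_session_timeout(60000, 2000))
```

If those assumptions hold, the result with tickTime=2000 is 40000, so setting the consumer value above 40000 would change nothing unless maxSessionTimeout is also raised on the ZooKeeper side.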

On Thu, Nov 13, 2014 at 10:56 AM, Guozhang Wang <wangguoz@gmail.com> wrote:

> I was originally asking about consumer configs, which should contain the
> following:
>
> http://kafka.apache.org/documentation.html#consumerconfigs
>
> zookeeper.session.timeout.ms
> zookeeper.connection.timeout.ms
>
> On Thu, Nov 13, 2014 at 10:40 AM, Manish <maaand@gmail.com> wrote:
>
> > @Guozhang:
> >
> > In server.properties we have:
> >
> > zookeeper.connection.timeout.ms=1000000
> >
> > In zoo.cfg we have:
> >
> > tickTime=2000
> > initLimit=10
> > syncLimit=5
> > dataDir=/opt/zookeeper/data
> > dataLogDir=/opt/zookeeper/logs
> > clientPort=2182
> > server.1=xxxx.com:2888:3888
> > server.2=xxxx.com:2888:3888
> > server.3=xxxx.com:2888:3888
> >
> >
> > On Thu, Nov 13, 2014 at 10:27 AM, Guozhang Wang <wangguoz@gmail.com>
> > wrote:
> >
> > > Chen,
> > >
> > > From the ZK logs, it sounds like ZK kept timing out the consumers,
> > > which triggers the rebalances.
> > >
> > > What is the ZK session timeout config value in your consumers?
> > >
> > > Guozhang
> > >
> > > On Thu, Nov 13, 2014 at 10:15 AM, Chen Wang <chen.apache.solr@gmail.com>
> > > wrote:
> > >
> > > > Thanks for the info.
> > > > It makes sense. However, I didn't see any "session timeout"/"expired"
> > > > entries in the consumer log, but I do see lots of "connection closed"
> > > > entries in the zookeeper log:
> > > >
> > > > 2014-11-13 10:07:53,132 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection for client /10.93.83.50:37180 which had sessionid 0x149a4cc1b580e7d
> > > > 2014-11-13 10:08:04,499 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:NIOServerCnxnFactory@197] - Accepted socket connection from /10.93.80.121:38437
> > > > 2014-11-13 10:08:04,503 [myid:1] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:ZooKeeperServer@822] - Connection request from old client /10.93.80.121:38437; will be dropped if server is in r-o mode
> > > > 2014-11-13 10:08:04,503 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:ZooKeeperServer@868] - Client attempting to establish new session at /10.93.80.121:38437
> > > > 2014-11-13 10:08:04,538 [myid:1] - INFO  [CommitProcessor:1:ZooKeeperServer@617] - Established session 0x149a4cc1b580e7e with negotiated timeout 40000 for client /10.93.80.121:38437
> > > > 2014-11-13 10:08:08,746 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection for client /10.93.80.121:38437 which had sessionid 0x149a4cc1b580e7e
> > > >
> > > > We are using -Xmx2048m for the consumer, and I didn't see any
> > > > GC-related exceptions.
> > > >
> > > > Chen
> > > >
> > > >
> > > >
> > > > On Thu, Nov 13, 2014 at 9:13 AM, Guozhang Wang <wangguoz@gmail.com>
> > > > wrote:
> > > >
> > > > > Hey Chen,
> > > > >
> > > > > As Neha suggested, a typical reason for too many rebalances is that
> > > > > your consumers keep getting timed out from ZK. You can verify this
> > > > > by checking your consumer logs for something like "session timeout"
> > > > > entries (these are not ERROR entries).
> > > > >
> > > > > Guozhang
> > > > >
> > > > > On Wed, Nov 12, 2014 at 5:31 PM, Neha Narkhede <neha.narkhede@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Does this help?
> > > > > >
> > > > > > https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyaretheremanyrebalancesinmyconsumerlog?
> > > > > >
> > > > > > On Wed, Nov 12, 2014 at 3:53 PM, Chen Wang <chen.apache.solr@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi there,
> > > > > > > My kafka client is reading a 3 partition topic from kafka with 3
> > > > > > > threads distributed on different machines. I am seeing frequent
> > > > > > > owner changes on the topics when running:
> > > > > > > bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --group
> > > > > > > my_test_group --topic mytopic -zkconnect localhost:2181
> > > > > > >
> > > > > > > The owner kept changing once in a while, but I didn't see any
> > > > > > > exceptions thrown from the consumer side. When checking the broker
> > > > > > > log, it's full of
> > > > > > >  INFO Closing socket connection to /IP. (kafka.network.Processor)
> > > > > > >
> > > > > > > Is this expected behavior? If so, how can I tell when the leader is
> > > > > > > imbalanced, and rebalance is triggered?
> > > > > > > Thanks,
> > > > > > > Chen
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > -- Guozhang
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>
>
>
> --
> -- Guozhang
>
