kafka-users mailing list archives

From Ewen Cheslack-Postava <e...@confluent.io>
Subject Re: Question regarding to reconnect.backoff.ms
Date Thu, 03 Sep 2015 02:06:25 GMT
Steve,

I don't think there is a better solution at the moment. This is an easy
issue to miss in unit testing because connections to localhost are generally
rejected immediately if nothing is listening on the port. If you're running
in an environment where connection attempts to a down host hang instead of
being rejected, then for now you'll need to wait for the long timeout.

https://issues.apache.org/jira/browse/KAFKA-2120 may also alleviate the
problem by at least reducing the amount of time for the request to fail.
Depending on how adventurous you are, you could try using a version with
that patch and maybe adjust the setting lower than its default.

-Ewen
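
A minimal sketch of the second suggestion above, for readers who want to
try it. The message does not name the setting; this assumes the patch for
KAFKA-2120 exposes it as request.timeout.ms, which is the name used in later
releases. The class name, the helper, and the 5000 ms value are hypothetical
and chosen only for illustration. The sketch builds plain java.util.Properties
so it runs without a Kafka dependency; a real producer would also need
serializers and other required settings.

```java
import java.util.Properties;

public class RequestTimeoutSketch {

    // Returns a copy of the base config with the request timeout overridden.
    // ASSUMPTION: "request.timeout.ms" is the setting added by KAFKA-2120.
    static Properties withRequestTimeout(Properties base, long requestTimeoutMs) {
        if (requestTimeoutMs <= 0) {
            throw new IllegalArgumentException("requestTimeoutMs must be positive");
        }
        Properties props = new Properties();
        props.putAll(base);
        props.setProperty("request.timeout.ms", Long.toString(requestTimeoutMs));
        return props;
    }

    public static void main(String[] args) {
        Properties base = new Properties();
        base.setProperty("bootstrap.servers", "host1:port1,host2:port2,host3:port3");

        // 5000 ms is a hypothetical value, lower than the assumed default.
        Properties props = withRequestTimeout(base, 5_000L);
        System.out.println(props);
    }
}
```

Whether a lower value actually helps depends on how the patched client
handles a connection attempt that never completes, so treat it as something
to test rather than a guaranteed fix.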

On Wed, Sep 2, 2015 at 10:46 AM, Steve Tian <steve.cs.tian@gmail.com> wrote:

> Would the Kafka devs kindly give us some advice on this?
>
> Cheers, Steve
>
> On Tue, Sep 1, 2015, 11:20 PM Steve Tian <steve.cs.tian@gmail.com> wrote:
>
> > Thanks, Rahul!  In my environment I need reconnect.backoff.ms to be
> > longer than the OS default TCP timeout so that NetworkClient can give the
> > second node a try.
> >
> > I believe this is related to
> > https://issues.apache.org/jira/browse/KAFKA-2459 .
> >
> > Cheers, Steve
> >
> > On Tue, Sep 1, 2015, 5:24 PM Rahul Jain <rahulj51@gmail.com> wrote:
> >
> >> We did notice something similar. When a broker node (out of 3) went
> >> down, metadata calls continued to go to the failed node and the producer
> >> kept failing. We were able to make it work by increasing
> >> reconnect.backoff.ms to 1 second.
> >>
> >> Something similar was discussed earlier:
> >>
> >> http://qnalist.com/questions/6002514/new-producer-metadata-update-problem-on-2-node-cluster
> >>
> >> On Mon, Aug 31, 2015 at 11:00 PM, Steve Tian <steve.cs.tian@gmail.com> wrote:
> >>
> >> > Hi everyone,
> >> >
> >> > Are there any concerns with having a long reconnect.backoff.ms for the
> >> > new Java Kafka producer (0.8.2.0/0.8.2.1)?
> >> >
> >> > Assume we have bootstrap.servers=host1:port1,host2:port2,host3:port3 and
> >> > host1 is *down* from the very beginning. If a newly created Kafka
> >> > producer chooses host1 as the first node to connect to for a metadata
> >> > update, then that producer will keep trying host1 *only*, because the
> >> > default TCP timeout is surely longer than the default value of
> >> > reconnect.backoff.ms, which is 10 ms.
> >> >
> >> > I am thinking of setting reconnect.backoff.ms longer than N * T, where N
> >> > is the number of nodes in bootstrap.servers and T is the default TCP
> >> > timeout. Are there any concerns with having a long reconnect.backoff.ms
> >> > like that? Any better solutions?
> >> >
> >> > Cheers, Steve
> >> >
> >>
> >
>
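
The N * T workaround proposed in the original question can be sketched as
follows. This is an illustration under stated assumptions, not a recommended
configuration: the class and method names are hypothetical, the TCP connect
timeout T depends on the operating system and must be measured or looked up
(the 60000 ms used here is a placeholder), and the 1000 ms margin is arbitrary.
Only bootstrap.servers and reconnect.backoff.ms come from the thread. The
sketch builds plain java.util.Properties so it runs without a Kafka
dependency.

```java
import java.util.Properties;

public class ReconnectBackoffSketch {

    // N: number of non-empty host:port entries in bootstrap.servers.
    static int nodeCount(String bootstrapServers) {
        int count = 0;
        for (String entry : bootstrapServers.split(",")) {
            if (!entry.trim().isEmpty()) {
                count++;
            }
        }
        return count;
    }

    // Returns N * T + margin; with a positive margin this is longer than N * T.
    // The *Exact methods throw ArithmeticException instead of overflowing.
    static long backoffMs(int nodes, long tcpTimeoutMs, long marginMs) {
        return Math.addExact(Math.multiplyExact((long) nodes, tcpTimeoutMs), marginMs);
    }

    // Builds the two settings discussed in the thread; nothing else is set.
    static Properties producerConfig(String bootstrapServers, long tcpTimeoutMs, long marginMs) {
        long backoff = backoffMs(nodeCount(bootstrapServers), tcpTimeoutMs, marginMs);
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("reconnect.backoff.ms", Long.toString(backoff));
        return props;
    }

    public static void main(String[] args) {
        // ASSUMPTION: T = 60000 ms and margin = 1000 ms are placeholders.
        Properties props =
            producerConfig("host1:port1,host2:port2,host3:port3", 60_000L, 1_000L);
        System.out.println("reconnect.backoff.ms=" + props.getProperty("reconnect.backoff.ms"));
    }
}
```

Note the trade-off implied by the question: a backoff this long also delays
reconnecting to a node that has come back, so the value applies to healthy
recoveries as well as to the dead-host case.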



