kafka-users mailing list archives

From Dan Brown <...@metamx.com>
Subject Re: Kafka consumer not reconnecting to restarted Kafka node.
Date Tue, 18 Oct 2011 20:29:57 GMT
Hi all,

Mathias' FetcherRunnable error has bitten me a couple of times too,
and I've finally found a way to reproduce it. It doesn't reproduce
with the default config, but it does once I add a path to the zk
connect string. Our deployments do add a path, and I'll go ahead and
wager that Mathias' do, too, and LinkedIn's don't. ;)
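
For anyone unsure what "a path in the zk connect string" means: ZooKeeper
lets you append a chroot path after the host list, so localhost:2181/kafka
is the host list localhost:2181 plus the path /kafka. Here is a minimal
sketch that splits the two parts. zk_hosts and zk_chroot are hypothetical
helper names I made up for illustration, not Kafka or ZooKeeper tools.

```shell
# Split a ZooKeeper connect string into its host list and optional chroot path.
# zk_hosts and zk_chroot are hypothetical helpers, not part of Kafka or ZooKeeper.

zk_hosts() {
  # Everything before the first "/" is the comma-separated host list.
  printf '%s\n' "${1%%/*}"
}

zk_chroot() {
  case "$1" in
    */*) printf '/%s\n' "${1#*/}" ;;  # text after the first "/" is the chroot path
    *)   printf '\n' ;;               # no path component: print an empty line
  esac
}

zk_hosts  "localhost:2181/kafka"
zk_chroot "localhost:2181/kafka"
```

An empty result from zk_chroot corresponds to the default config, which is
the case where I could not reproduce the error.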

The repro below works for me on tag kafka-0.7.0-incubating-candidate-3
from git://git.apache.org/kafka.git, and also on branch kafka-v0.6
from git://github.com/kafka-dev/kafka.git. (Both use
zookeeper-3.3.3.jar.)

 Dan

1. Add a path to the zk connect string in config/server.properties:
-zk.connect=localhost:2181
+zk.connect=localhost:2181/kafka

2. Start zk:
$ bin/zookeeper-server-start.sh config/zookeeper.properties

3. Create the zk path:
$ zkCli.sh -server localhost create /kafka null

4. Start a broker:
$ bin/kafka-server-start.sh config/server.properties

5. Start a consumer:
$ bin/kafka-console-consumer.sh --zookeeper localhost/kafka --topic one

6. Publish a message to create the topic and connect the consumer to the broker:
$ date | bin/kafka-console-producer.sh --zookeeper localhost/kafka --topic one

7. Kill zk (^C), let the consumer disconnect, restart zk, and let the
consumer reconnect.

8. Kill the broker (^C)

After you kill the broker in (8), the consumer should log the error
that Mathias reported:
[2011-10-18 15:57:22,264] INFO multifetch reconnect due to
java.io.EOFException: Received -1 when reading from channel, socket
has likely been closed. (kafka.consumer.SimpleConsumer)
[2011-10-18 15:57:22,266] ERROR error in FetcherRunnable
(kafka.consumer.FetcherRunnable)
java.net.ConnectException: Connection refused
        at sun.nio.ch.Net.connect(Native Method)
        at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:500)
        at kafka.consumer.SimpleConsumer.connect(SimpleConsumer.scala:54)
        at kafka.consumer.SimpleConsumer.liftedTree2$1(SimpleConsumer.scala:130)
        at kafka.consumer.SimpleConsumer.multifetch(SimpleConsumer.scala:122)
        at kafka.consumer.FetcherRunnable.run(FetcherRunnable.scala:64)
[2011-10-18 15:57:22,267] INFO stopping fetcher FetchRunnable-0 to
host 192.168.0.184 (kafka.consumer.FetcherRunnable)

At this point, the consumer should be unresponsive: it's lost the
connection to its broker, and it won't rebalance if you restart the
broker or add new consumers to its group.
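
If you want to check for this state without eyeballing the console, the
two log lines quoted above make a usable signature. This is only a rough
sketch: it assumes you have captured the consumer's output to a file
(consumer.log is a name I picked, not something Kafka writes by default),
and fetcher_stalled is a hypothetical helper, not a Kafka tool.

```shell
# Succeed (exit 0) only if the log file contains both lines of the failure
# signature from the trace above: the FetcherRunnable error and the
# subsequent "stopping fetcher" message.
# fetcher_stalled is a hypothetical helper name.
fetcher_stalled() {
  grep -q 'ERROR error in FetcherRunnable' "$1" &&
    grep -q 'INFO stopping fetcher' "$1"
}

# Usage, assuming the consumer output was captured to consumer.log:
#   bin/kafka-console-consumer.sh --zookeeper localhost/kafka --topic one 2>&1 | tee consumer.log
#   fetcher_stalled consumer.log && echo "fetcher stopped after broker loss"
```

It only tells you the fetcher thread logged that it stopped; it says
nothing about why the consumer fails to rebalance afterwards.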
