kafka-users mailing list archives

From Tim Ward <tim.w...@origamienergy.com.INVALID>
Subject Java consumer error handling on DNS lookup failure
Date Tue, 04 Jun 2019 14:59:37 GMT
I have a Kafka client written in Java running in Kubernetes, and Kafka running in Kubernetes.

When the client is running but no Kafka nodes are running, it appears from the exception below
that the DNS lookup fails, then something catches the exception, logs it, and retries, apparently
without returning or throwing from poll().

This would all be fair enough ... except that the retry happens every few milliseconds, causing
a large stack trace to be logged every few milliseconds, which, if this keeps going for a
few days, eats up an awful lot of space in the cloud logging system. And it *can* keep happening
for days or weeks in a development environment, because a developer working on another part
of the system may not care, or even know, that this part is broken.

What can I do to reduce the volume of logging data? Some combination of interventions that could

  *   Retry less quickly than every few milliseconds
  *   Retry a finite number of times before giving up altogether
  *   Cause poll() to throw rather than retry
  *   Not include the stack trace in the log messages

might be helpful. The general approach to K8s applications seems to be that if a dependency
doesn't exist, the client application should simply crash out, so that Kubernetes' backoff
and retry mechanism will do what's wanted; in that case some way of getting poll() to throw
rather than swallow this exception might be the answer? (A rough sketch of what I mean follows
the stack trace below.)
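
For the first bullet, this is the sort of consumer configuration I am thinking of trying. It is
only a sketch, assuming the standard reconnect/retry backoff settings are what govern the loop
that is producing these log messages; the group id, deserializers and backoff values are
placeholders:

import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class SlowRetryConsumerFactory {

    public static KafkaConsumer<String, String> build() {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,
                "confluent-0.confluent.mynamespace.svc.cluster.local:9091");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "metrology-writer"); // placeholder group id
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        // Connection attempts back off from reconnect.backoff.ms (default 50 ms)
        // up to reconnect.backoff.max.ms (default 1000 ms); raising both should
        // space out the reconnect attempts, and therefore the logged stack traces,
        // though it will not stop them altogether.
        props.put(ConsumerConfig.RECONNECT_BACKOFF_MS_CONFIG, "5000");
        props.put(ConsumerConfig.RECONNECT_BACKOFF_MAX_MS_CONFIG, "60000");

        // retry.backoff.ms (default 100 ms) spaces out retried requests such as
        // metadata refreshes.
        props.put(ConsumerConfig.RETRY_BACKOFF_MS_CONFIG, "5000");

        return new KafkaConsumer<>(props);
    }
}

For the last bullet, if the "Error connecting to node" message is logged at WARN by the
org.apache.kafka.clients.NetworkClient logger (which is what the stack trace suggests, though I
haven't checked the source), turning that logger down to ERROR in the logging configuration would
presumably suppress these messages entirely, at the cost of hiding other connection warnings.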

Error connecting to node confluent-0.confluent.mynamespace.svc.cluster.local:9091 (id: 0 rack: null)
java.io.IOException: Can't resolve address: confluent-0.confluent.mynamespace.svc.cluster.local:9091
    at org.apache.kafka.common.network.Selector.doConnect(Selector.java:235) ~[kafka-clients-2.0.0.jar:?]
    at org.apache.kafka.common.network.Selector.connect(Selector.java:214) ~[kafka-clients-2.0.0.jar:?]
    at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:864) [kafka-clients-2.0.0.jar:?]
    at org.apache.kafka.clients.NetworkClient.access$700(NetworkClient.java:64) [kafka-clients-2.0.0.jar:?]
    at org.apache.kafka.clients.NetworkClient$DefaultMetadataUpdater.maybeUpdate(NetworkClient.java:1035) [kafka-clients-2.0.0.jar:?]
    at org.apache.kafka.clients.NetworkClient$DefaultMetadataUpdater.maybeUpdate(NetworkClient.java:920) [kafka-clients-2.0.0.jar:?]
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:508) [kafka-clients-2.0.0.jar:?]
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:271) [kafka-clients-2.0.0.jar:?]
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:242) [kafka-clients-2.0.0.jar:?]
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:233) [kafka-clients-2.0.0.jar:?]
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitMetadataUpdate(ConsumerNetworkClient.java:161) [kafka-clients-2.0.0.jar:?]
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:243) [kafka-clients-2.0.0.jar:?]
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:314) [kafka-clients-2.0.0.jar:?]
    at org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1218) [kafka-clients-2.0.0.jar:?]
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1181) [kafka-clients-2.0.0.jar:?]
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1115) [kafka-clients-2.0.0.jar:?]
    at com.origamienergy.etpu.nodes.md.MetrologyWriteWorker.run(MetrologyWriteWorker.java:51) [tiger-v2.5-0-g0fd0b8f.jar:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_212]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_212]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_212]
Caused by: java.nio.channels.UnresolvedAddressException
    at sun.nio.ch.Net.checkAddress(Net.java:101) ~[?:1.8.0_212]
    at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:622) ~[?:1.8.0_212]
    at org.apache.kafka.common.network.Selector.doConnect(Selector.java:233) ~[kafka-clients-2.0.0.jar:?]
    ... 19 more
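
On the crash-out approach: since poll() appears to swallow this exception, one workaround might be
to resolve the broker hostname myself at startup (and perhaps again after repeated failures) and
exit if it never resolves, so that Kubernetes restarts the pod with its own backoff. A rough sketch
of what I mean, with the hostname taken from the error above and the attempt count and sleep just
placeholders:

import java.net.InetAddress;
import java.net.UnknownHostException;

public class BootstrapDnsCheck {

    // Fail fast if the broker hostname cannot be resolved, so Kubernetes
    // restarts the pod with its own backoff instead of the client spinning
    // in a tight retry loop inside poll().
    public static void requireResolvable(String host, int attempts, long sleepMs)
            throws InterruptedException {
        for (int i = 0; i < attempts; i++) {
            try {
                InetAddress.getByName(host);
                return; // resolvable: carry on and create the consumer
            } catch (UnknownHostException e) {
                Thread.sleep(sleepMs); // pause before the next attempt
            }
        }
        System.err.println("Bootstrap broker " + host + " is not resolvable, exiting");
        System.exit(1); // non-zero exit: let Kubernetes handle restart and backoff
    }

    public static void main(String[] args) throws InterruptedException {
        // Hypothetical usage before constructing the KafkaConsumer:
        requireResolvable("confluent-0.confluent.mynamespace.svc.cluster.local", 5, 10_000L);
        // ... build the consumer and run the poll loop here ...
    }
}

This obviously only covers the DNS case, not a broker that resolves but is down, but it would at
least stop the endless stack traces in this particular failure mode.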

Tim Ward

