kafka-users mailing list archives

From Cosmin Marginean <cosmargin...@gmail.com>
Subject Re: Bizarre crash when creating a consumer
Date Fri, 08 Jan 2016 23:06:08 GMT
Just for posterity: what happened here was an issue with the hostname.

ERROR 2016-01-08 22:02:09,675 [main] [none] c.k.messaging.kafka.ConsumerGroup: ip-10-100-102-52: ip-10-100-102-52: unknown error
! java.net.UnknownHostException: ip-10-100-102-52: unknown error
! at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method) ~[na:1.8.0_65]
! at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928) ~[na:1.8.0_65]
! at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323) ~[na:1.8.0_65]
! at java.net.InetAddress.getLocalHost(InetAddress.java:1500) ~[na:1.8.0_65]
! ... 63 common frames omitted
! Causing: java.net.UnknownHostException: ip-10-100-102-52: ip-10-100-102-52: unknown error
! at java.net.InetAddress.getLocalHost(InetAddress.java:1505) ~[na:1.8.0_65]
! at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:119) ~[k2-app-1.0-RC52.jar:na]
! at kafka.javaapi.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:66) ~[k2-app-1.0-RC52.jar:na]
! at kafka.javaapi.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:69) ~[k2-app-1.0-RC52.jar:na]
! at kafka.consumer.Consumer$.createJavaConsumerConnector(ConsumerConnector.scala:105) ~[k2-app-1.0-RC52.jar:na]
! at kafka.consumer.Consumer.createJavaConsumerConnector(ConsumerConnector.scala) ~[k2-app-1.0-RC52.jar:na]



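This is the error on the client. In the stack trace above, ip-10-100-102-52 is the hostname
of the client connecting to Zookeeper, and the exception is thrown by
InetAddress.getLocalHost(), called from the ZookeeperConsumerConnector constructor. Setting
the hostname correctly, so that it resolves on the client machine, fixes this. I'm still not
sure why the client hostname would be a problem here, but at least it’s a lesson learnt.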
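
For anyone hitting the same symptom, a quick way to confirm it is to run the JDK call from the stack trace on its own, on the affected machine. The following is only a minimal sketch: the class name LocalHostCheck and the Lookup interface are made up for illustration and are not part of Kafka. The only real API it relies on is java.net.InetAddress.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical diagnostic helper, not part of Kafka.
class LocalHostCheck {

    /** A hostname lookup that may fail the same way InetAddress.getLocalHost() does. */
    interface Lookup {
        InetAddress get() throws UnknownHostException;
    }

    /** Runs the lookup and turns the outcome into a one-line report. */
    static String diagnose(Lookup lookup) {
        try {
            InetAddress address = lookup.get();
            return "OK: " + address.getHostName() + " -> " + address.getHostAddress();
        } catch (UnknownHostException e) {
            // This is the exception type seen in the stack trace above.
            return "FAILED: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Same JDK call that appears in the stack trace above.
        System.out.println(diagnose(InetAddress::getLocalHost));
    }
}
```

If this prints a FAILED line on the client machine, the local hostname does not resolve there, and the old consumer connector would presumably fail in the same place before it ever talks to Zookeeper.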

(Probably a combination of factors caused this exception to be completely swallowed, but I
think that’s a different topic)

Cos  


On Friday, 8 January 2016 at 11:16, Cosmin Marginean wrote:

> Hi Marko, this seems to have solved this. Dealing with another issue now,
> which I’ll report separately.
> Thank you for your help!
>  
> Cheers
> Cos
>  
>  
> On Friday, 8 January 2016 at 09:27, Cosmin Marginean wrote:
>  
> > Hi Marko, I will migrate the code and also change the timeout. Thanks for
> > your suggestions. Will post a status once I’ve tested.
> >  
> > Cheers
> > Cos
> >  
> >  
> > On Thursday, 7 January 2016 at 22:59, Marko Bonaći wrote:
> >  
> > > Actually, why don't you use the same code as outlined here (that includes
> > > timeout in props):
> > > http://kafka.apache.org/090/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html
> > >  
> > > Marko Bonaći
> > > Monitoring | Alerting | Anomaly Detection | Centralized Log Management
> > > Solr & Elasticsearch Support
> > > Sematext <http://sematext.com/> | Contact
> > > <http://sematext.com/about/contact.html>
> > >  
> > > On Thu, Jan 7, 2016 at 11:55 PM, Marko Bonaći <marko.bonaci@sematext.com> wrote:
> > >  
> > > > Hi Cosmin,
> > > > do you have default server configuration on these new nodes you're setting
> > > > up?
> > > > I'd check consumer's socket.timeout.ms, maybe someone set it to 30
> > > > instead of 30 000 :)
> > > > Speaking from my own experience (I had the same symptom and this turned
> > > > out to be the cause).
> > > >  
> > > > Marko Bonaći
> > > >  
> > > > On Thu, Jan 7, 2016 at 11:23 PM, Cosmin Marginean <cosmarginean@gmail.com> wrote:
> > > >  
> > > > > Hi
> > > > >  
> > > > > I have a straightforward piece of code that creates a consumer (Kafka
> > > > > 0.9.0.0).
> > > > >  
> > > > > Properties props = new Properties();
> > > > > props.put("zookeeper.connect", zookeeperServers);
> > > > > props.put(org.apache.kafka.clients.consumer.ConsumerConfig.GROUP_ID_CONFIG, groupId);
> > > > > log.info("Starting consumer group for topic {} and group ID {}. Zookeeper servers: {}", topic, groupId, zookeeperServers);
> > > > > consumer = kafka.consumer.Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
> > > > > log.info("Consumer group started for topic {} and group ID {}", topic, groupId);
> > > > >  
> > > > > We’ve run this countless times without any issues, but now we’re
> > > > > deploying a new environment (AWS, just like the ones before) and it
> > > > > appears that the client Java process dies entirely (without any logs,
> > > > > crash report, etc.). This happens right after logging
> > > > > “Starting consumer group...”, so presumably when it calls
> > > > > createJavaConsumerConnector.
> > > > >  
> > > > > Admittedly, this might be “environmental”, but even though we
> > > > > triple-checked everything (network setup, Kafka logs, Zookeeper logs,
> > > > > etc.), we haven’t identified anything suspicious yet. So what I'd like
> > > > > to know is whether there’s a way to enable further Kafka
> > > > > diagnostics/logging. Attached (trace-logging.txt) is the log output
> > > > > after turning everything to TRACE. At the top you can see the message
> > > > > “Starting consumer…”, but nothing really suspicious as far as I can tell.
> > > > >  
> > > > >  
> > > > > As an additional piece of information, Zookeeper does report the
> > > > > following when this happens:
> > > > >  
> > > > > 2016-01-07 21:58:44,763 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception
> > > > > EndOfStreamException: Unable to read additional data from client sessionid 0x1521e14797c0001, likely client has closed socket
> > > > > at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> > > > > at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
> > > > > at java.lang.Thread.run(Thread.java:745)
> > > > > 2016-01-07 21:58:44,764 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /10.100.101.159:41613 which had sessionid 0x1521e14797c0001
> > > > >  
> > > > >  
> > > > > Any suggestions would be appreciated.
> > > > >  
> > > > > Thank you
> > > > >  
> > > > > Cosmin  
> >  
>  

