kafka-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Steven Wu <stevenz...@gmail.com>
Subject error handling with high-level consumer
Date Wed, 04 Feb 2015 17:55:01 GMT
Hi,

We have observed these two exceptions with consumer *iterator.next()*
recently. want to ask how should we handle them properly.

*1) CRC corruption*
Message is corrupt (stored crc = 433657556, computed crc = 3265543163)

I assume in this case we should just catch it and move on to the next msg?
any other iterator/consumer exception we should catch and handle?


*2) Unrecoverable consumer erorr with "Iterator is in failed state"*

yesterday, one of our kafka consumers got stuck with very large maxalg and
was throwing the following exception.

2015-02-04 08:35:19,841 ERROR KafkaConsumer-0 KafkaConsumer - Exception on
consuming kafka with topic: <topic_name>
java.lang.IllegalStateException: Iterator is in failed state
at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:54)
at kafka.utils.IteratorTemplate.next(IteratorTemplate.scala:38)
at kafka.consumer.ConsumerIterator.next(ConsumerIterator.scala:46)
at com.netflix.suro.input.kafka.KafkaConsumer$1.run(KafkaConsumer.java:103)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

we had a surge of traffic of <topic_name>, so I guess the traffic storm
caused the problem. I tried to restart a few consumer instances but after
rebalancing, another instance got assigned the problematic partitions and
got stuck again with the above errors.

We decided to drop messages, stop all consumer instances, reset all offset
by deleting zk entries and restarted them, the problem went away.

Producer version is kafka_2.8.2-0.8.1.1 with snappy-java-1.0.5
Consumer version is kafka_2.9.2-0.8.2-beta with snappy-java-1.1.1.6

We googled this issue but this was already fixed long time ago on 0.7.x.
any idea? is mismatched snappy version the culpit? is it a bug in
0.8.2-beta?


Thanks,
Steven

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message