kafka-users mailing list archives

From Marina <ppi...@yahoo.com.INVALID>
Subject Re: Experiences with corrupted messages
Date Fri, 02 Oct 2015 12:57:46 GMT
Hi Lance,

I'm very interested in your analysis of handling corrupt messages in the high-level consumer
as well. We have also experienced some unexplained "deaths" of high-level consumers, though
very rarely, and we have not yet figured out why they died. Now I wonder if this could be due
to such corrupted messages.... When you say "Your only other recourse is to iterate past the
problem offset" - what exactly do you mean?
1) Do you mean manually updating the current offset in ZooKeeper (if ZK offset storage is
used)? What if the new Kafka-based offset storage is used?
2) Or do you mean skipping the message when iterating over events in the consumer code - when
reading Kafka's stream?
            ConsumerIterator<byte[], byte[]> iter = kafkaStream.iterator();
            while (iter.hasNext()) {
                MessageAndMetadata<byte[], byte[]> msg = iter.next();
                // --- skip bad message here somehow?
            }
I would think that if you can get the message inside the while{} loop, you are already past
the point at which the consumer dies on a corrupt message.... is that not the case?

Thanks!
Marina

[Sorry, I did not mean to hijack the thread - but I think it is important to understand
how to skip corrupted messages in both use cases....]

      From: Lance Laursen <llaursen@rubiconproject.com>
 To: users@kafka.apache.org 
 Sent: Thursday, October 1, 2015 4:49 PM
 Subject: Re: Experiences with corrupted messages
   
Hey Jörg,

Unfortunately when the high level consumer hits a corrupt message, it
enters an invalid state and closes. The only way around this is to iterate
your offset by 1 in order to skip the corrupt message. This is currently
not automated. You can catch this exception if you are using the simple
consumer client, but unfortunately mirrormaker uses the high level client.
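Roughly, the catch-and-skip pattern looks like this - a minimal self-contained sketch, with a fake in-memory log and decoder standing in for the Kafka fetch (the actual exception type depends on your client version):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SkipCorrupt {
    // Stand-in decoder: treats a null record as "corrupt".
    static String decode(byte[] raw) {
        if (raw == null) throw new IllegalStateException("corrupt record");
        return new String(raw);
    }

    public static void main(String[] args) {
        // Fake partition log; the record at offset 1 is corrupt.
        List<byte[]> log = Arrays.asList("a".getBytes(), null, "c".getBytes());
        long offset = 0;
        List<String> consumed = new ArrayList<>();
        while (offset < log.size()) {
            try {
                consumed.add(decode(log.get((int) offset)));
                offset++;
            } catch (IllegalStateException e) {
                offset++; // advance past the corrupt record instead of dying
            }
        }
        System.out.println(consumed); // [a, c]
    }
}
```

With the simple consumer you control the fetch offset yourself, so this kind of skip is possible; the high-level consumer gives you no such hook.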

There have recently been some bugs where producers using snappy compression write
corrupt messages, but this does not seem to be the same as your problem.

Does MM stop on the exact same message each time (
https://cwiki.apache.org/confluence/display/KAFKA/System+Tools#SystemTools-ConsumerOffsetChecker
)? I would suggest triple-checking that your configurations are the same
across all DCs (you mentioned that MM mirrors successfully to another DC
with no problem), as well as examining the problem message to see if you can
find anything different about it compared to the others (See:
https://cwiki.apache.org/confluence/display/KAFKA/System+Tools#SystemTools-SimpleConsumerShell
). Your only other recourse is to iterate past the problem offset.
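For reference, the invocations look something like this (0.8.x-era tool names; the group, topic, host, and offset values below are placeholders, so substitute your own):

```shell
# Find where the consumer group is stuck (lag pinpoints the problem partition):
bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker \
  --zookeeper zk1:2181 --group mirrormaker-group --topic mytopic

# Dump the suspect message at that offset for inspection:
bin/kafka-run-class.sh kafka.tools.SimpleConsumerShell \
  --broker-list broker1:9092 --topic mytopic --partition 0 \
  --offset 12345 --max-messages 1

# For ZooKeeper-stored offsets, skip the bad message by setting the offset
# one past it (stop the consumer first; path and values are placeholders):
bin/zookeeper-shell.sh zk1:2181 \
  set /consumers/mirrormaker-group/offsets/mytopic/0 12346
```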



On Thu, Oct 1, 2015 at 1:22 AM, Jörg Wagner <joerg.wagner1@1und1.de> wrote:

> Hey everyone,
>
> I've been having some issues with corrupted messages and mirrormaker as I
> wrote previously. Since there was no feedback, I want to ask a new question:
>
> Did you ever have corrupted messages in kafka? Did things break? How did
> you recover or work around that?
>
> Thanks
> Jörg
>

  