kafka-users mailing list archives

From Alexey Sverdelov <alexey.sverde...@googlemail.com>
Subject Re: Experiences with corrupted messages
Date Mon, 05 Oct 2015 10:04:45 GMT
Hi Marina,

this is how I "fixed" this problem:
http://stackoverflow.com/questions/32904383/apache-kafka-with-high-level-consumer-skip-corrupted-messages/32945841

This is a workaround, and I hope it will be properly fixed in one of the
next Kafka releases.
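As a rough illustration of the catch-and-skip pattern from that answer, here is a minimal, self-contained sketch. It uses a plain Java iterator as a stand-in for the real Kafka consumer stream; `CorruptMessageException` and the message values are invented for the demo. Note that the real high-level `ConsumerIterator` can remain in a failed state after such an exception, which is exactly what the linked workaround has to deal with:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class SkipCorruptDemo {

    // Stand-in for whatever exception the consumer throws on a bad message.
    static class CorruptMessageException extends RuntimeException {}

    // Simulated message stream: throws when it hits a "corrupt" element,
    // mimicking a consumer iterator hitting a corrupt message.
    static Iterator<String> stream(List<String> msgs) {
        Iterator<String> raw = msgs.iterator();
        return new Iterator<String>() {
            public boolean hasNext() { return raw.hasNext(); }
            public String next() {
                String m = raw.next();
                if (m.equals("CORRUPT")) throw new CorruptMessageException();
                return m;
            }
        };
    }

    // The workaround pattern: catch the per-message failure and keep
    // iterating instead of letting the whole consumer die.
    static List<String> consumeSkippingCorrupt(List<String> msgs) {
        List<String> processed = new ArrayList<>();
        Iterator<String> it = stream(msgs);
        while (it.hasNext()) {
            try {
                processed.add(it.next());
            } catch (CorruptMessageException e) {
                // Skip the bad message and continue with the next one.
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        // prints [a, b]
        System.out.println(consumeSkippingCorrupt(List.of("a", "CORRUPT", "b")));
    }
}
```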

Have a nice day,
Alexey

On Fri, Oct 2, 2015 at 2:57 PM, Marina <ppine7@yahoo.com.invalid> wrote:

> Hi Lance, I'm very interested in your analysis of handling corrupt
> messages in the high-level consumer as well.
> We have also experienced some unexplained "deaths" of high-level
> consumers, although very rarely, and we have not yet figured out why
> they died. Now I wonder whether this could be due to such corrupted
> messages. When you say "Your only other recourse is to iterate past the
> problem offset" - what exactly do you mean?
> 1) Do you mean manually updating the current offset in ZooKeeper (if ZK
> storage is used)? What if the new Kafka-based offset storage is used?
> 2) Or do you mean skipping the message when iterating over events in
> the consumer code, when reading Kafka's stream?
>             ConsumerIterator<byte[], byte[]> iter = kafkaStream.iterator();
>             while (iter.hasNext()) {
>                 // --- skip the bad message here somehow?
>             }
> I would think that if you can get the message inside the while {} loop,
> you are already past the point at which the consumer dies on a corrupt
> message... is that not the case?
> Thanks! Marina
> [Sorry, I did not mean to hijack the thread - but I think it is
> important to understand how to skip corrupted messages for both use
> cases.]
>
>       From: Lance Laursen <llaursen@rubiconproject.com>
>  To: users@kafka.apache.org
>  Sent: Thursday, October 1, 2015 4:49 PM
>  Subject: Re: Experiences with corrupted messages
>
> Hey Jörg,
>
> Unfortunately when the high level consumer hits a corrupt message, it
> enters an invalid state and closes. The only way around this is to iterate
> your offset by 1 in order to skip the corrupt message. This is currently
> not automated. You can catch this exception if you are using the simple
> consumer client, but unfortunately mirrormaker uses the high level client.
>
> There have been some corrupt producer message bugs related to using snappy
> compression recently, but this does not seem to be the same as your
> problem.
>
> Does MM stop on the exact same message each time (
>
> https://cwiki.apache.org/confluence/display/KAFKA/System+Tools#SystemTools-ConsumerOffsetChecker
> )? I would suggest triple-checking that your configurations are the same
> across all DCs (you mentioned that MM mirrors successfully to another DC
> with no problem), as well as examining the problem message to see if you
> can find anything different about it compared to the others (see:
>
> https://cwiki.apache.org/confluence/display/KAFKA/System+Tools#SystemTools-SimpleConsumerShell
> ). Your only other recourse is to iterate past the problem offset.
>
>
>
> On Thu, Oct 1, 2015 at 1:22 AM, Jörg Wagner <joerg.wagner1@1und1.de>
> wrote:
>
> > Hey everyone,
> >
> > I've been having some issues with corrupted messages and mirrormaker as I
> > wrote previously. Since there was no feedback, I want to ask a new
> question:
> >
> > Did you ever have corrupted messages in kafka? Did things break? How did
> > you recover or work around that?
> >
> > Thanks
> > Jörg
> >
>
>
>
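For reference, the "iterate past the problem offset" step Lance describes could look roughly like the following, assuming the consumer group commits offsets to ZooKeeper (the group name, topic, partition, and offset value below are placeholders; stop the consumer first, or its next commit will overwrite the change):

```shell
# Inspect current offsets and lag for the stuck consumer group.
bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker \
  --zookeeper localhost:2181 --group my-group --topic my-topic

# With ZooKeeper-based offset storage, bump the offset for the affected
# partition to one past the corrupt message.
bin/zookeeper-shell.sh localhost:2181 \
  set /consumers/my-group/offsets/my-topic/0 12346
```

For Kafka-based offset storage there is no equivalent manual edit in this release line; the workaround there is to skip the message in consumer code, as discussed above.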
