samza-dev mailing list archives

From Yi Pan <nickpa...@gmail.com>
Subject Re: Reporting deserialization error in StreamTask
Date Fri, 11 Mar 2016 21:12:28 GMT
Hi, Jack,

There have been similar requests before, such as SAMZA-427. With the fix for
SAMZA-59, we also added metrics that report the count of deserialization
errors. If you want the actual message that caused the error to be
reported, that needs a different mechanism. The options are:
1) Write the whole message to the log. The issue is that if a burst of
malformed messages comes in, it could flood the log.
2) Optionally allow the user to configure an error topic as an escape
channel, and write those messages to it as raw bytes (with an upper limit
on size). The actual message can then be read from the error topic for
debugging and analysis.
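
To make option 2) concrete, here is a rough sketch in plain Java. It is not
existing Samza code: the names ErrorChannelSketch, SafeDeserializer,
maxErrorBytes and errorSink are all hypothetical, the integer parser only
stands in for a real serde such as JsonSerde, and the sink stands in for a
producer writing to the configured error topic.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical sketch of an "error topic" escape channel; not a Samza API.
final class ErrorChannelSketch {

    static final class SafeDeserializer<T> {
        private final Function<byte[], T> deserializer; // stand-in for a serde
        private final Consumer<byte[]> errorSink;       // stand-in for the error topic
        private final int maxErrorBytes;                // upper limit on forwarded bytes
        private long errorCount = 0;                    // mirrors an error-count metric

        SafeDeserializer(Function<byte[], T> deserializer,
                         Consumer<byte[]> errorSink,
                         int maxErrorBytes) {
            this.deserializer = deserializer;
            this.errorSink = errorSink;
            this.maxErrorBytes = maxErrorBytes;
        }

        // Returns the decoded value, or null if the message was malformed.
        // Malformed messages are counted and forwarded, truncated, as raw bytes.
        T deserializeOrNull(byte[] raw) {
            try {
                return deserializer.apply(raw);
            } catch (RuntimeException e) {
                errorCount++;
                int keep = Math.min(raw.length, maxErrorBytes);
                errorSink.accept(Arrays.copyOf(raw, keep));
                return null;
            }
        }

        long errorCount() {
            return errorCount;
        }
    }

    // Toy deserializer: throws NumberFormatException on malformed input.
    static Integer parseInt(byte[] raw) {
        return Integer.parseInt(new String(raw, StandardCharsets.UTF_8).trim());
    }

    public static void main(String[] args) {
        List<byte[]> errorTopic = new ArrayList<>();
        SafeDeserializer<Integer> serde =
                new SafeDeserializer<>(ErrorChannelSketch::parseInt, errorTopic::add, 4);

        System.out.println(serde.deserializeOrNull("42".getBytes(StandardCharsets.UTF_8)));
        System.out.println(serde.deserializeOrNull("not-a-number".getBytes(StandardCharsets.UTF_8)));
        System.out.println("errors=" + serde.errorCount()
                + ", forwarded bytes=" + errorTopic.get(0).length);
    }
}
```

The task keeps running on a bad message, the error is still counted, and
the size cap keeps a burst of large malformed messages from overwhelming
the error topic.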

I personally prefer option 2). Does it serve your use case as well?

-Yi

On Fri, Mar 11, 2016 at 11:47 AM, Jack Huang <jackhuang@machinezone.com>
wrote:

> Hi all,
>
> I have a StreamTask that uses JsonSerde to parse the input from a Kafka
> topic. I notice that when the input is not valid JSON, the task fails
> with an exception on JSON parsing. If I use
>
> task.drop.deserialization.errors=true
>
>
> in the configuration, then the malformed JSON will be dropped silently and
> the task goes on to the next message. Is there a way for the task to report
> the deserialization error but not fail?
>
> Thanks,
>
> Jack Huang
>
