flink-issues mailing list archives

From "Rinat Sharipov (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (FLINK-10525) Deserialization schema, skip data, that couldn't be properly deserialized
Date Wed, 10 Oct 2018 14:57:00 GMT

     [ https://issues.apache.org/jira/browse/FLINK-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rinat Sharipov updated FLINK-10525:
-----------------------------------
    Description: 
Hi mates, according to the contract of *org.apache.flink.api.common.serialization.DeserializationSchema*,
a schema should return *null* when the content cannot be deserialized.

But in most cases (e.g. *org.apache.flink.formats.avro.AvroDeserializationSchema*) the method
fails if the data is corrupted.

We’ve implemented our own SerDe class that returns *null* if the data doesn’t satisfy the Avro
schema, but it’s rather hard to maintain this functionality when migrating to the latest
Flink version.

I think it would be useful if Flink supported optionally skipping failed records in the
Avro and other deserializers shipped in its source code.

  was:
Hi mates, in accordance with the contract of org.apache.flink.formats.avro.DeserializationSchema,
it should return *null* value, when content couldn’t be deserialized.
But in most cases (for example org.apache.flink.formats.avro.AvroDeserializationSchema) method
fails if data is corrupted. 
 
We’ve implemented our own SerDe class, that returns null, if data doesn’t satisfy avro
schema, but it’s rather hard to maintain this functionality during migration to the latest
Flink version.


> Deserialization schema, skip data, that couldn't be properly deserialized
> -------------------------------------------------------------------------
>
>                 Key: FLINK-10525
>                 URL: https://issues.apache.org/jira/browse/FLINK-10525
>             Project: Flink
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Rinat Sharipov
>            Priority: Minor
>
> Hi mates, according to the contract of *org.apache.flink.api.common.serialization.DeserializationSchema*,
> a schema should return *null* when the content cannot be deserialized.
> But in most cases (e.g. *org.apache.flink.formats.avro.AvroDeserializationSchema*) the method
> fails if the data is corrupted.
>
> We’ve implemented our own SerDe class that returns *null* if the data doesn’t satisfy
> the Avro schema, but it’s rather hard to maintain this functionality when migrating to the
> latest Flink version.
> I think it would be useful if Flink supported optionally skipping failed records
> in the Avro and other deserializers shipped in its source code.
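
The "optional skip of failed records" requested above could be sketched as a small decorator around a strict deserializer. This is only an illustration under stated assumptions, not Flink code: the `Deserializer` interface is a hypothetical local stand-in for the `deserialize(byte[])` method of `DeserializationSchema`, declared here so the example compiles without Flink on the classpath, and `SkipOnFailureDeserializer`, `SkipOnFailureDemo` and `skippedCount` are made-up names. It also assumes the caller drops `null` results.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the deserialize(byte[]) method of Flink's
// DeserializationSchema; declared locally so the sketch is self-contained.
interface Deserializer<T> {
    T deserialize(byte[] message) throws IOException;
}

// Decorator: delegates to a strict deserializer and returns null instead of
// failing, so the caller can skip the record.
final class SkipOnFailureDeserializer<T> implements Deserializer<T> {
    private final Deserializer<T> delegate;
    private long skipped; // number of records that failed to deserialize

    SkipOnFailureDeserializer(Deserializer<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public T deserialize(byte[] message) {
        try {
            return delegate.deserialize(message);
        } catch (IOException | RuntimeException e) {
            skipped++;
            return null; // null signals "could not be deserialized"
        }
    }

    long skippedCount() {
        return skipped;
    }
}

final class SkipOnFailureDemo {
    // Strict parser standing in for a format-specific schema (e.g. an Avro one):
    // it throws on corrupt input.
    static Integer parseInt(byte[] message) throws IOException {
        try {
            return Integer.parseInt(new String(message, StandardCharsets.UTF_8));
        } catch (NumberFormatException e) {
            throw new IOException("corrupt record", e);
        }
    }

    public static void main(String[] args) {
        SkipOnFailureDeserializer<Integer> schema =
                new SkipOnFailureDeserializer<>(SkipOnFailureDemo::parseInt);
        List<Integer> out = new ArrayList<>();
        for (String raw : new String[] {"1", "oops", "3"}) {
            Integer value = schema.deserialize(raw.getBytes(StandardCharsets.UTF_8));
            if (value != null) {
                out.add(value); // the caller drops null results
            }
        }
        System.out.println(out + " skipped=" + schema.skippedCount()); // prints "[1, 3] skipped=1"
    }
}
```

Keeping the skip logic in a wrapper rather than in each format-specific class would let the behaviour stay optional, and a counter like `skippedCount` makes silently dropped records visible.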



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
