flink-issues mailing list archives

From "Rinat Sharipov (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (FLINK-10525) Deserialization schema, skip data, that couldn't be properly deserialized
Date Wed, 10 Oct 2018 14:58:00 GMT

     [ https://issues.apache.org/jira/browse/FLINK-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rinat Sharipov updated FLINK-10525:
-----------------------------------
    Description: 
Hi mates, according to the contract of *org.apache.flink.api.common.serialization.DeserializationSchema*,
its deserialize method should return *null* when the content cannot be deserialized.

But in most cases (e.g. *org.apache.flink.formats.avro.AvroDeserializationSchema*) the method
fails if the data is corrupted.

We’ve implemented our own SerDe class that returns *null* if the data doesn’t satisfy the Avro
schema, but it’s rather hard to maintain this functionality when migrating to the latest
Flink version.

I think it would be a useful feature if Flink supported optionally skipping failed records
in the Avro and other deserializers.

  was:
Hi mates, according to the contract of *org.apache.flink.api.common.serialization.DeserializationSchema*,
its deserialize method should return *null* when the content cannot be deserialized.

But in most cases (e.g. *org.apache.flink.formats.avro.AvroDeserializationSchema*) the method
fails if the data is corrupted.

We’ve implemented our own SerDe class that returns *null* if the data doesn’t satisfy the Avro
schema, but it’s rather hard to maintain this functionality when migrating to the latest
Flink version.

I think it would be useful if Flink supported optionally skipping failed records in the
Avro and other deserializers in the source code.


> Deserialization schema, skip data, that couldn't be properly deserialized
> -------------------------------------------------------------------------
>
>                 Key: FLINK-10525
>                 URL: https://issues.apache.org/jira/browse/FLINK-10525
>             Project: Flink
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Rinat Sharipov
>            Priority: Minor
>
> Hi mates, according to the contract of *org.apache.flink.api.common.serialization.DeserializationSchema*,
> its deserialize method should return *null* when the content cannot be deserialized.
>
> But in most cases (e.g. *org.apache.flink.formats.avro.AvroDeserializationSchema*) the method
> fails if the data is corrupted.
>
> We’ve implemented our own SerDe class that returns *null* if the data doesn’t satisfy the
> Avro schema, but it’s rather hard to maintain this functionality when migrating to the
> latest Flink version.
>
> I think it would be a useful feature if Flink supported optionally skipping failed
> records in the Avro and other deserializers.
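
For illustration only, the kind of wrapper described in the issue could look roughly like the
sketch below. It is not the reporter's SerDe class and not Flink code. To stay self-contained it
uses a hypothetical stand-in interface, SimpleDeserializationSchema, that mirrors only the
deserialize(byte[]) method of DeserializationSchema. The names SkipOnFailureSchema and
skippedCount are likewise assumptions made for this example.

```java
import java.io.IOException;
import java.util.Objects;

/**
 * Hypothetical stand-in for Flink's DeserializationSchema, reduced to the
 * single method that matters here. It is NOT the real Flink interface.
 */
interface SimpleDeserializationSchema<T> {
    T deserialize(byte[] message) throws IOException;
}

/**
 * Wraps another schema and returns null instead of failing when a record
 * cannot be deserialized, so the caller can skip the record.
 */
final class SkipOnFailureSchema<T> implements SimpleDeserializationSchema<T> {
    private final SimpleDeserializationSchema<T> delegate;
    private long skipped; // number of records that failed to deserialize

    SkipOnFailureSchema(SimpleDeserializationSchema<T> delegate) {
        this.delegate = Objects.requireNonNull(delegate, "delegate");
    }

    @Override
    public T deserialize(byte[] message) {
        try {
            return delegate.deserialize(message);
        } catch (IOException | RuntimeException e) {
            // Corrupted or schema-incompatible record: count it and signal
            // "skip" with null, as the contract quoted in the issue describes.
            skipped++;
            return null;
        }
    }

    long skippedCount() {
        return skipped;
    }
}
```

A wrapper written against Flink's actual interface would also have to delegate isEndOfStream and
getProducedType to the wrapped schema. Whether a null result is really dropped depends on the
source that consumes the schema.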



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
