flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From GitBox <...@apache.org>
Subject [GitHub] [flink] EugeneYushin commented on issue #8187: [FLINK-12197] [Formats] Avro row deser for Confluent binary format
Date Tue, 23 Apr 2019 08:43:39 GMT
EugeneYushin commented on issue #8187: [FLINK-12197] [Formats] Avro row deser for Confluent
binary format
URL: https://github.com/apache/flink/pull/8187#issuecomment-485708497
 
 
   Hi @dawidwys. Thanks for a review. Please find my answers below.
   >  Why do we need a whole duplicated `DeserializationSchema`? Can't we just wrap the
original `AvroDeserializationSchema` and just introduce a wrapper that converts the output
of the schema to `Row`?
   
   I've covered this in feature description briefly: 
   > Use the same approach as for AvroRowDeserializationSchema, extending AbstractDeserializationSchema
instead of AvroDeserializationSchema cause' the latter works with Generic/SpecificRecord abstractions
from avro libs
   
   Here's a more detailed explanation.
   `AvroDeserializationSchema` doesn't provide any methods which can be re-used in case of
`Row` passed as an input class:
   - [constructor](https://github.com/apache/flink/blob/master/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/AvroDeserializationSchema.java#L98)
is intended to work with avro SpecificRecord/GenericRecord classes
   ```
   * @param recordClazz class to which deserialize. Should be one of:
   *                    {@link org.apache.avro.specific.SpecificRecord},
   *                    {@link org.apache.avro.generic.GenericRecord}.
   ```
   - [deserialize()](https://github.com/apache/flink/blob/master/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/AvroDeserializationSchema.java#L126)
method can't be used as a wrapper cause':
   -- this method is parametrized with return type, so parents' `AvroDeserializationSchema.deserialize`
should return Row, while it's intended to return avro SpecificRecord/GenericRecord
   -- it sets reader schema only
   -- confluent avro binary format includes schema id in record, which should be used during
decoding to set writer schema as well
   - [getProducedType()](https://github.com/apache/flink/blob/master/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/AvroDeserializationSchema.java#L165)
deals only with avro SpecificRecord/GenericRecord classes
   - the same concerns I have if to choose `RegistryAvroDeserializationSchema` instead of
`AvroDeserializationSchema` for extending
   
   Please, let me know if this makes sense to you.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

Mime
View raw message