mrunit-dev mailing list archives

From "Nicolas Dalsass (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MRUNIT-197) Problems using Avro with MRUnit
Date Thu, 18 Sep 2014 09:31:33 GMT

     [ https://issues.apache.org/jira/browse/MRUNIT-197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Dalsass updated MRUNIT-197:
-----------------------------------
    Attachment: fix_avro_serialization.patch

Hi,

There is currently another bug when using Avro objects with MRUnit, which prevents them from
being used at all. The problem is in Serialization: Avro internally places a proxy (an encoder)
in front of the outputBuffer, and only writes to the buffer when the encoder is closed.

This patch closes the serializer before the outputBuffer is read again, which fixes the
problem for Avro without, I think, affecting other serializations. (I checked with the classic
Hadoop serialization.)
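
To illustrate the buffering behaviour described above, here is a minimal sketch that uses only
JDK classes. It is not Avro or MRUnit code: BufferedOutputStream stands in for the encoder, and
the class and method names (BufferedCopyDemo, bytesVisibleBeforeClose, bytesVisibleAfterClose)
are made up for this example. It shows that a small payload written through a buffering proxy
does not appear in the underlying buffer until the proxy is closed.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

public class BufferedCopyDemo {

    // Reads the underlying buffer while the proxy is still open.
    // The payload is smaller than the proxy's 8192-byte buffer, so nothing
    // has been pushed down to outputBuffer when its size is read.
    static int bytesVisibleBeforeClose(byte[] payload) {
        ByteArrayOutputStream outputBuffer = new ByteArrayOutputStream();
        try (OutputStream proxy = new BufferedOutputStream(outputBuffer, 8192)) {
            proxy.write(payload);
            return outputBuffer.size(); // evaluated before the proxy is closed
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Closes the proxy first, then reads the underlying buffer.
    // Closing flushes the pending bytes into outputBuffer.
    static int bytesVisibleAfterClose(byte[] payload) {
        ByteArrayOutputStream outputBuffer = new ByteArrayOutputStream();
        try (OutputStream proxy = new BufferedOutputStream(outputBuffer, 8192)) {
            proxy.write(payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return outputBuffer.size();
    }

    public static void main(String[] args) {
        byte[] payload = {1, 2, 3, 4};
        System.out.println("before close: " + bytesVisibleBeforeClose(payload));
        System.out.println("after close:  " + bytesVisibleAfterClose(payload));
    }
}
```

Reading the buffer only after the close is the same ordering the patch applies to the serializer.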

> Problems using Avro with MRUnit
> -------------------------------
>
>                 Key: MRUNIT-197
>                 URL: https://issues.apache.org/jira/browse/MRUNIT-197
>             Project: MRUnit
>          Issue Type: Bug
>    Affects Versions: 1.0.0
>            Reporter: Matthew Hayes
>         Attachments: MemberEventCountUnitTest.java, fix_avro_serialization.patch
>
>
> I'm not able to use MRUnit with Avro in a particular use case.  See the exception below.
> I've attached a sample test that demonstrates the problem.
> When the input is just a plain integer it works fine.  However, if the input is a record
> that contains an integer it doesn't work.  I stepped through the code with a debugger to try
> to understand what is going on.  In the Serialization class's copy method, the serializer
> it gets on this line is wrong:
> serializer = (Serializer<Object>) serializationFactory
>           .getSerializer(clazz);
> When I look at the schema within this object it is "int" instead of the record's schema.
> {noformat}
> java.lang.ClassCastException: org.apache.avro.generic.GenericData$Record cannot be cast to java.lang.Number
> 	at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:78)
> 	at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
> 	at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
> 	at org.apache.avro.hadoop.io.AvroSerializer.serialize(AvroSerializer.java:104)
> 	at org.apache.avro.hadoop.io.AvroSerializer.serialize(AvroSerializer.java:46)
> 	at org.apache.hadoop.mrunit.internal.io.Serialization.copy(Serialization.java:74)
> 	at org.apache.hadoop.mrunit.internal.io.Serialization.copy(Serialization.java:91)
> 	at org.apache.hadoop.mrunit.internal.io.Serialization.copyWithConf(Serialization.java:104)
> 	at org.apache.hadoop.mrunit.TestDriver.copy(TestDriver.java:608)
> 	at org.apache.hadoop.mrunit.TestDriver.copyPair(TestDriver.java:612)
> 	at org.apache.hadoop.mrunit.MapDriverBase.addInput(MapDriverBase.java:118)
> 	at org.apache.hadoop.mrunit.MapDriverBase.withInput(MapDriverBase.java:207)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
