mrunit-user mailing list archives

From Hien Luu <h...@yahoo.com>
Subject Re: Partial Avro MapReduce job
Date Thu, 04 Jul 2013 15:00:10 GMT
It is possible to set all these Avro-related configurations in MRUnit, so I am not sure a patch
is necessary. What would be nice to have is documentation pointing out which configurations
are needed. Surprisingly, I couldn't find a FAQ link on http://mrunit.apache.org/

Hien

________________________________
 From: Dan Filimon <dangeorge.filimon@gmail.com>
To: user@mrunit.apache.org; Hien Luu <hluu@yahoo.com> 
Sent: Wednesday, July 3, 2013 11:28 PM
Subject: Re: Partial Avro MapReduce job
 


Well, from what I understand, all of these are set with setStrings() in the Configuration,
meaning they can hold multiple values.

- io.serializations has to also include AvroSerialization.class.getName() in addition to the
existing serialization (which is for Writables).
- avro.serialization.key.writer.schema and avro.serialization.value.writer.schema also have
multiple options and the values here should contain the schemas of the different datums inside
the AvroKeys and AvroValues.

So, if you have a job that has 3 types of AvroKeys: AvroKey<Integer>, AvroKey<CustomRecord>,
AvroKey<List<Double>>, the value of avro.serialization.key.writer.schema should
be an array of Strings with the respective schemas of Integer, CustomRecord and List<Double>.
It should be the same for AvroValues.
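
A minimal sketch of the shape of these settings, written so it runs without Hadoop, Avro, or MRUnit on the classpath. A plain Map stands in for the Configuration, and the class name SerializationConfigSketch, the helper addSerialization, and the single "int" schema are hypothetical names and values chosen for illustration. The fully qualified class names are assumptions written out as string literals. In a real test the same values would go through mapDriver.getConfiguration(), and only one key schema is shown here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Stand-in sketch: a plain Map plays the role of Hadoop's Configuration so the
 * example runs without external libraries. All names here are illustrative.
 */
final class SerializationConfigSketch {
    static final String IO_SERIALIZATIONS = "io.serializations";
    static final String KEY_WRITER_SCHEMA = "avro.serialization.key.writer.schema";

    // Assumed class names, written as literals; a real test would use
    // WritableSerialization.class.getName() and AvroSerialization.class.getName().
    static final String WRITABLE_SERIALIZATION =
            "org.apache.hadoop.io.serializer.WritableSerialization";
    static final String AVRO_SERIALIZATION =
            "org.apache.avro.hadoop.io.AvroSerialization";

    /** Appends a class name to the comma-separated list, skipping duplicates. */
    static void addSerialization(Map<String, String> conf, String className) {
        String current = conf.getOrDefault(IO_SERIALIZATIONS, "");
        List<String> names = new ArrayList<>();
        for (String name : current.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                names.add(trimmed);
            }
        }
        if (!names.contains(className)) {
            names.add(className);
        }
        conf.put(IO_SERIALIZATIONS, String.join(",", names));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        // Start from the existing serialization for Writables.
        conf.put(IO_SERIALIZATIONS, WRITABLE_SERIALIZATION);
        // Keep it and add the Avro serialization next to it.
        addSerialization(conf, AVRO_SERIALIZATION);
        // JSON schema of the datum inside the AvroKey; "int" is a primitive Avro schema.
        conf.put(KEY_WRITER_SCHEMA, "\"int\"");
        conf.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

The point of the helper is that the Writable serialization stays in the list while the Avro one is added, so a mapper that reads AvroKeys and writes Writables can resolve both.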

What do you think about adding support for configuring these in MRUnit?
I could come up with a patch... :)



On Wed, Jul 3, 2013 at 6:48 PM, Hien Luu <hluu@yahoo.com> wrote:

>Good to hear you got it working.
>
>
>I need to find out more information about these properties:
>
>
>io.serializations, avro.serialization.key.writer.schema, and avro.serialization.value.writer.schema
>
>
>
>Hien
>
>
>
>________________________________
> From: Dan Filimon <dangeorge.filimon@gmail.com>
>To: user@mrunit.apache.org; Hien Luu <hluu@yahoo.com> 
>Sent: Wednesday, July 3, 2013 2:06 AM
>
>Subject: Re: Partial Avro MapReduce job
> 
>
>
>Yes, thank you! That worked. I also needed to register the Avro serialization like here:
>https://github.com/Lab41/etl-by-example/wiki/Testing
>
>
>
>
>On Tue, Jul 2, 2013 at 9:57 PM, Hien Luu <hluu@yahoo.com> wrote:
>
>>Hi Dan,
>>
>>
>>I think the following setting is needed to solve the issue that you ran into:
>>
>>
>>mapDriver.getConfiguration().setStrings("avro.serialization.key.writer.schema",
>>                                            <the JSON Avro schema for your AvroKey>);
>>
>>
>>I have been trying to find more documentation about the property "avro.serialization.key.writer.schema".
>>
>>Hien
>>
>>
>>
>>
>>
>>________________________________
>> From: Dan Filimon <dangeorge.filimon@gmail.com>
>>To: user@mrunit.apache.org; Hien Luu <hluu@yahoo.com> 
>>Sent: Tuesday, July 2, 2013 9:17 AM
>>Subject: Re: Partial Avro MapReduce job
>> 
>>
>>
>>Hi Hien!
>>
>>
>>I saw that answer but it seems like it's for a different kind of exception. Did you really have the same problem?
>>Thanks!
>>
>>
>>
>>On Tue, Jul 2, 2013 at 6:39 PM, Hien Luu <hluu@yahoo.com> wrote:
>>
>>>I ran into the same issue and the answer is at http://stackoverflow.com/questions/15230482/mrunit-with-avro-nullpointerexception-in-serialization.
>>>
>>>
>>>It would be great if this answer were added to the MRUnit FAQ or to one of the tutorials.
>>>
>>>
>>>Hien
>>>
>>>
>>>
>>>________________________________
>>> From: Dan Filimon <dangeorge.filimon@gmail.com>
>>>To: user@mrunit.apache.org 
>>>Sent: Tuesday, July 2, 2013 7:47 AM
>>>Subject: Partial Avro MapReduce job
>>> 
>>>
>>>
>>>Hi!
>>>
>>>I've been looking online for a way of testing my job and came across what seemed to be some promising leads [1].
>>>
>>>I can't find anything exactly suitable to my case: I'm consuming a custom AvroKey and outputting IntWritable, VectorWritable from the mapper.
>>>
>>>I'm getting the following error:
>>>java.lang.IllegalStateException: No applicable class implementing Serialization in conf at io.serializations for class org.apache.avro.mapred.AvroKey
>>>
>>>
>>>And worryingly, .withConfiguration() is now deprecated in MRUnit, so [1] doesn't seem to be 100% up to date.
>>>Any ideas?
>>>
>>>Thanks!
>>>
>>>
>>>[1] https://cwiki.apache.org/confluence/display/MRUNIT/MRUnit+with+Avro
>>>
>>>
>>
>>
>>
>
>
>