mrunit-user mailing list archives

From Dan Filimon <dangeorge.fili...@gmail.com>
Subject Re: Partial Avro MapReduce job
Date Thu, 04 Jul 2013 06:28:04 GMT
Well, from what I understand, all of these are set with setStrings() on the
Configuration, meaning they can hold multiple values.

- io.serializations also has to include AvroSerialization.class.getName() in
addition to the existing serializations (which handle Writables).
- avro.serialization.key.writer.schema and
avro.serialization.value.writer.schema likewise take multiple values; they
should contain the schemas of the different datums inside the AvroKeys and
AvroValues.

So, if you have a job with three types of AvroKey (AvroKey<Integer>,
AvroKey<CustomRecord>, and AvroKey<List<Double>>), the value of
avro.serialization.key.writer.schema should be an array of Strings holding
the respective schemas of Integer, CustomRecord, and List<Double>.
It should be the same for AvroValues.
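Concretely, what I have in mind looks roughly like the sketch below, placed in a test's setup method after the MapDriver is created. This is untested: AvroSerialization here is assumed to be the one from avro-mapred's org.apache.avro.hadoop.io package, and CustomRecord stands for a hypothetical Avro-generated class with a SCHEMA$ field.

```java
import org.apache.avro.Schema;
import org.apache.avro.hadoop.io.AvroSerialization;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.serializer.WritableSerialization;

// Inside the MRUnit test's setup, after mapDriver is created:
Configuration conf = mapDriver.getConfiguration();

// Keep the existing Writable serialization and register the Avro one
// next to it, so both kinds of keys/values can be (de)serialized.
conf.setStrings("io.serializations",
    WritableSerialization.class.getName(),
    AvroSerialization.class.getName());

// One schema string per AvroKey datum type used by the job:
// Integer, CustomRecord (hypothetical generated class), List<Double>.
conf.setStrings("avro.serialization.key.writer.schema",
    Schema.create(Schema.Type.INT).toString(),
    CustomRecord.SCHEMA$.toString(),
    Schema.createArray(Schema.create(Schema.Type.DOUBLE)).toString());
```

The same setStrings() call with avro.serialization.value.writer.schema would cover the AvroValue side.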

What do you think about adding support for configuring these in MRUnit?
I could come up with a patch... :)


On Wed, Jul 3, 2013 at 6:48 PM, Hien Luu <hluu@yahoo.com> wrote:

> Good to hear you got it working.
>
> I need to find out more information about these properties:
>
> *io.serializations*, *avro.serialization.key.writer.schema*, and
> *avro.serialization.value.writer.schema*
>
> Hien
>
>   ------------------------------
>  *From:* Dan Filimon <dangeorge.filimon@gmail.com>
> *To:* user@mrunit.apache.org; Hien Luu <hluu@yahoo.com>
> *Sent:* Wednesday, July 3, 2013 2:06 AM
>
> *Subject:* Re: Partial Avro MapReduce job
>
> Yes, thank you! That worked. I also needed to register the Avro
> serialization like here:
> https://github.com/Lab41/etl-by-example/wiki/Testing
>
>
> On Tue, Jul 2, 2013 at 9:57 PM, Hien Luu <hluu@yahoo.com> wrote:
>
> Hi Dan,
>
> I think the following setting is needed to solve the issue that you ran
> into:
>
>
> mapDriver.getConfiguration().setStrings(
>     "avro.serialization.key.writer.schema",
>     <the JSON Avro schema for your AvroKey>);
>
> I have been trying to find more documentation about the property
> "avro.serialization.key.writer.schema".
>
> Hien
>
>
>   ------------------------------
>  *From:* Dan Filimon <dangeorge.filimon@gmail.com>
> *To:* user@mrunit.apache.org; Hien Luu <hluu@yahoo.com>
> *Sent:* Tuesday, July 2, 2013 9:17 AM
> *Subject:* Re: Partial Avro MapReduce job
>
> Hi Hien!
>
> I saw that answer but it seems like it's for a different kind of
> exception. Did you really have the same problem?
> Thanks!
>
>
> On Tue, Jul 2, 2013 at 6:39 PM, Hien Luu <hluu@yahoo.com> wrote:
>
> I ran into the same issue and the answer is at
> http://stackoverflow.com/questions/15230482/mrunit-with-avro-nullpointerexception-in-serialization
> .
>
> It would be great if this answer were added to the MRUnit FAQ or one of
> the tutorials.
>
> Hien
>
>   ------------------------------
>  *From:* Dan Filimon <dangeorge.filimon@gmail.com>
> *To:* user@mrunit.apache.org
> *Sent:* Tuesday, July 2, 2013 7:47 AM
> *Subject:* Partial Avro MapReduce job
>
> Hi!
>
> I've been looking online for a way of testing my job and came across what
> seemed to be some promising leads. [1]
>
> I can't find anything exactly suited to my case: I'm consuming a custom
> AvroKey and outputting IntWritable, VectorWritable from the mapper.
>
> I'm getting the following error:
> java.lang.IllegalStateException: No applicable class implementing
> Serialization in conf at io.serializations for class
> org.apache.avro.mapred.AvroKey
>
> And worryingly, .withConfiguration() is now deprecated in MRUnit, so [1]
> doesn't seem to be 100% up to date.
> Any ideas?
>
> Thanks!
>
> [1] https://cwiki.apache.org/confluence/display/MRUNIT/MRUnit+with+Avro
>
