drill-dev mailing list archives

From mufy <mufeed.us...@gmail.com>
Subject How to set multiple Avro schemas with AvroParquetOutputFormat?
Date Wed, 01 Oct 2014 02:52:45 GMT
I have this question from one of our developers and was hoping to get some input.

In my MapReduce job, I'm using AvroParquetOutputFormat to write Parquet
files using an Avro schema.

The application logic requires the Reducer to create multiple types of
files, and each file type has its own Avro schema.

The class AvroParquetOutputFormat has a static method, setSchema(), to set
the Avro schema of the output. Looking at the code, AvroParquetOutputFormat
delegates to AvroWriteSupport.setSchema(), which is also a static method.
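
To illustrate the concern, here is a minimal sketch of why a static, job-wide setter would allow only one schema per job. It does not use the Parquet or Hadoop classes: a plain Map stands in for the job configuration, and the key name, class name and method bodies are hypothetical. It assumes the real setter stores the schema as a single configuration entry.

```java
import java.util.HashMap;
import java.util.Map;

public class SingleSchemaSketch {
    // Hypothetical key name, for illustration only.
    static final String SCHEMA_KEY = "example.avro.schema";

    // Stand-in for a static setSchema(): writes the schema JSON into
    // the one configuration shared by the whole job.
    static void setSchema(Map<String, String> jobConf, String schemaJson) {
        jobConf.put(SCHEMA_KEY, schemaJson);
    }

    // Stand-in for what the write side would read back at task time.
    static String getSchema(Map<String, String> jobConf) {
        return jobConf.get(SCHEMA_KEY);
    }

    public static void main(String[] args) {
        Map<String, String> jobConf = new HashMap<>();
        setSchema(jobConf, "{\"type\":\"record\",\"name\":\"A\",\"fields\":[]}");
        // The second call overwrites the first: there is only one slot.
        setSchema(jobConf, "{\"type\":\"record\",\"name\":\"B\",\"fields\":[]}");
        System.out.println(getSchema(jobConf));
    }
}
```

Under that assumption, every output file written by the job would see whichever schema was set last, which is the limitation described above.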

Without extending AvroWriteSupport and hacking the logic, is there a
simpler way to achieve multiple Avro schema output from
AvroParquetOutputFormat in a single MR job?

Mufeed Usman
My LinkedIn <http://www.linkedin.com/pub/mufeed-usman/28/254/400> | My
Social Cause <http://www.vision2016.org.in/> | My Blogs : LiveJournal
