spark-user mailing list archives

From Raghavendra Pandey <raghavendra.pan...@gmail.com>
Subject Re: Spark SQL: Storing AVRO Schema in Parquet
Date Fri, 09 Jan 2015 07:05:35 GMT
I came across this: http://zenfractal.com/2013/08/21/a-powerful-big-data-trio/.
You can take a look.

On Fri Jan 09 2015 at 12:08:49 PM Raghavendra Pandey <
raghavendra.pandey@gmail.com> wrote:

> I have a similar requirement: I want to push Avro data into
> Parquet. But it seems you have to do it on your own. The parquet-mr
> project does this using Hadoop. I am trying to write a Spark job to do
> something similar.
>
> On Fri, Jan 9, 2015 at 3:20 AM, Jerry Lam <chilinglam@gmail.com> wrote:
>
>> Hi spark users,
>>
>> I'm using Spark SQL to create Parquet files on HDFS. I would like to
>> store the Avro schema in the Parquet metadata so that non-Spark SQL
>> applications can marshal the data with the Avro Parquet reader without
>> needing a separate Avro schema. Currently, schemaRDD.saveAsParquetFile
>> does not allow this. Is there another API that lets me do it?
>>
>> Best Regards,
>>
>> Jerry
>>
>
>
