spark-user mailing list archives

From Sadhan Sood <sadhan.s...@gmail.com>
Subject Re: Adding partitions to parquet data
Date Thu, 20 Nov 2014 20:02:34 GMT
Ah awesome, thanks!!

On Thu, Nov 20, 2014 at 3:01 PM, Michael Armbrust <michael@databricks.com>
wrote:

> In 1.2 by default we use Spark parquet support instead of Hive when the
> SerDe contains the word "Parquet".  This should work with hive partitioning.
>
> On Thu, Nov 20, 2014 at 10:33 AM, Sadhan Sood <sadhan.sood@gmail.com>
> wrote:
>
>> We are loading Parquet data as temp tables, but we are wondering if there
>> is a way to add a partition to the data without going through Hive (we
>> still want to use Spark's Parquet SerDe rather than Hive's). The data looks
>> like this:
>>
>> /date1/file1, /date1/file2, ..., /date2/file1,
>> /date2/file2, ..., /daten/filem
>>
>> and we are loading it like:
>> val parquetFileRDD = sqlContext.parquetFile("<comma-separated parquet file
>> names>")
>>
>> but it would be nice to be able to add a partition and provide the date as
>> a query parameter.
>>
>
>
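The approach Michael describes can be sketched roughly as follows: declare the
data as a Hive-partitioned Parquet table, register each date directory as a
partition, and filter on the partition column in the query. This is only a
sketch under assumptions. The table name `events`, its columns, and the
partition column `dt` are hypothetical (the thread does not give the schema),
and `STORED AS PARQUET` assumes a Spark build against Hive 0.13 or later.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object ParquetPartitionsSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("parquet-partitions-sketch"))
    // HiveContext gives access to the metastore and HiveQL DDL.
    val hiveContext = new HiveContext(sc)

    // Hypothetical schema: replace the columns with those of the real files.
    hiveContext.sql(
      """CREATE EXTERNAL TABLE IF NOT EXISTS events (id BIGINT, payload STRING)
        |PARTITIONED BY (dt STRING)
        |STORED AS PARQUET""".stripMargin)

    // One partition per date directory from the layout in the question.
    Seq("date1", "date2").foreach { d =>
      hiveContext.sql(
        s"ALTER TABLE events ADD IF NOT EXISTS PARTITION (dt='$d') LOCATION '/$d'")
    }

    // The date is now a query parameter via the partition column.
    hiveContext.sql("SELECT COUNT(*) FROM events WHERE dt = 'date1'")
      .collect()
      .foreach(println)

    sc.stop()
  }
}
```

Note that this still uses the Hive metastore to record the partitions, so it
does not avoid Hive entirely. Per the reply above, the Parquet files themselves
would be read by Spark's own Parquet support in 1.2.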
