
From Cheng Lian <lian.cs....@gmail.com>
Subject Re: Spark SQL, Hive & Parquet data types
Date Mon, 23 Feb 2015 12:25:53 GMT
Yes, we've recently improved ParquetRelation2 quite a bit. Spark SQL now uses 
its own Parquet support to read partitioned Parquet tables declared in the 
Hive metastore; writing to partitioned tables is the only case not yet 
covered. These improvements will be included in Spark 1.3.0.
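
For illustration, here's a minimal sketch of that read path, assuming a 
hypothetical partitioned Parquet table `events` (partitioned by `dt`) 
registered in the Hive metastore; the useDataSourceApi flag is on by default 
and is only set explicitly here for clarity:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    val sc = new SparkContext(new SparkConf().setAppName("parquet-read-sketch"))
    val hiveContext = new HiveContext(sc)

    // On by default in 1.3.0; when false, Spark falls back to Hive's
    // SerDe-based read path instead of its native Parquet support.
    hiveContext.setConf("spark.sql.parquet.useDataSourceApi", "true")

    // `events` is a hypothetical Parquet table in the Hive metastore,
    // partitioned by `dt`; this read goes through ParquetRelation2.
    val df = hiveContext.sql("SELECT * FROM events WHERE dt = '2015-02-20'")
    df.show()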

I've just created SPARK-5948 to track writing to partitioned Parquet tables.
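
For reference, the kind of write SPARK-5948 covers looks like the sketch 
below; in 1.3.0 it still goes through Hive's write path rather than Spark 
SQL's native Parquet support (the `events` and `staging_events` tables and 
their columns are hypothetical):

    // A write into a partitioned Parquet table; SPARK-5948 tracks
    // moving statements like this onto the native Parquet writer.
    hiveContext.sql(
      "INSERT INTO TABLE events PARTITION (dt = '2015-02-21') " +
      "SELECT id, name FROM staging_events")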

Cheng

On 2/20/15 10:58 PM, The Watcher wrote:
>>
>>     1. In Spark 1.3.0, timestamp support was added; also, Spark SQL uses
>>     its own Parquet support to handle both the read and write paths when
>>     dealing with Parquet tables declared in the Hive metastore, as long
>>     as you're not writing to a partitioned table. So yes, you can.
>>
> Ah, I had missed the part about the table being partitioned or not. Is this
> related to the work being done on ParquetRelation2?
>
> We will indeed be writing to a partitioned table: does neither the read nor
> the write path go through Spark SQL's Parquet support in that case? Is there
> a JIRA/PR I can monitor to see when this would change?
>
> Thanks
>

