spark-user mailing list archives

From 李铖 <lidali...@gmail.com>
Subject Re: Should I do spark-sql query on HDFS or apache hive?
Date Wed, 18 Mar 2015 00:10:14 GMT
Do you mean that, for Spark SQL, Parquet is faster than the Hive format,
and the Hive format is faster than loading files from HDFS?

: )

2015-03-18 1:23 GMT+08:00 Michael Armbrust <michael@databricks.com>:

> The performance has more to do with the particular format you are using,
> not where the metadata is coming from. Even Hive tables are usually read
> from files on HDFS.
>
> You should probably use HiveContext, as its query language is more
> powerful than SQLContext's. Also, Parquet is usually the faster data
> format for Spark SQL.
>
> On Tue, Mar 17, 2015 at 3:41 AM, 李铖 <lidaling1@gmail.com> wrote:
>
>> Hi, everybody.
>>
>> I am new to Spark. I want to run interactive SQL queries using Spark
>> SQL. Spark SQL can run on top of Hive or load files from HDFS.
>>
>> Which is better or faster?
>>
>> Thanks.
>>
>
>
