spark-user mailing list archives

From "Xu (Simon) Chen" <xche...@gmail.com>
Subject Re: cache spark sql parquet file in memory?
Date Sat, 07 Jun 2014 21:23:55 GMT
Is there a way to start tachyon on top of a yarn cluster?
On Jun 7, 2014 2:11 PM, "Marek Wiewiorka" <marek.wiewiorka@gmail.com> wrote:

> I was also thinking of using Tachyon to store parquet files - maybe
> tomorrow I will give it a try as well.
>
>
> 2014-06-07 20:01 GMT+02:00 Michael Armbrust <michael@databricks.com>:
>
>> Not a stupid question!  I would like to be able to do this.  For now, you
>> might try writing the data to tachyon <http://tachyon-project.org/>
>> instead of HDFS.  This is untested, though, so please report any issues
>> you run into.
>>
>> Michael
>>
>>
>> On Fri, Jun 6, 2014 at 8:13 PM, Xu (Simon) Chen <xchenum@gmail.com>
>> wrote:
>>
>>> This might be a stupid question... but it seems that saveAsParquetFile()
>>> writes everything back to HDFS. I am wondering if it is possible to cache
>>> parquet-format intermediate results in memory, thereby making Spark
>>> SQL queries faster.
>>>
>>> Thanks.
>>> -Simon
>>>
>>
>>
>
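For reference, Michael's suggestion above can be sketched roughly as follows. This is an untested sketch against the Spark 1.0-era SQL API (SchemaRDD), assuming a running Tachyon master; the `tachyon-master:19998` host/port and the table/path names are placeholders, not values from this thread.

```scala
// Assumes a spark-shell session where `sc` (SparkContext) already exists
// and a Tachyon master is reachable at tachyon-master:19998 (placeholder).
import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)
import sqlContext._

// Write the intermediate result as parquet to Tachyon instead of HDFS,
// so the bytes live in Tachyon's memory-backed storage.
val result = sql("SELECT * FROM some_table")
result.saveAsParquetFile("tachyon://tachyon-master:19998/tmp/result.parquet")

// Reload the parquet data from Tachyon and query it again.
val cached = parquetFile("tachyon://tachyon-master:19998/tmp/result.parquet")
cached.registerAsTable("result_cached")
sql("SELECT COUNT(*) FROM result_cached").collect()
```

The only change from a plain HDFS workflow is the `tachyon://` URI scheme; Spark resolves it through the Hadoop FileSystem API, which is why no other code changes are needed. As Michael notes, this path was untested at the time.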
