spark-dev mailing list archives

From Mridul Muralidharan <mri...@gmail.com>
Subject Re: on shark, is tachyon less efficient than memory_only cache strategy ?
Date Tue, 08 Jul 2014 16:50:38 GMT
You are ignoring serde (serialization/deserialization) costs :-)

- Mridul
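
One way to read the serde point, as a minimal sketch rather than Shark's or Tachyon's actual code path: if the cache holds serialized bytes, every query has to decode them before it can work on the rows, while an on-heap cache of live objects skips that step. The row layout, the function names, and the use of `pickle` as a stand-in serializer are all assumptions made for illustration.

```python
import pickle

# Hypothetical rows standing in for one cached table partition.
rows = [(i, f"name-{i}", i * 0.5) for i in range(1000)]


def sum_ids_from_objects(cached_rows):
    """On-heap style: the query reads live objects directly."""
    return sum(r[0] for r in cached_rows)


# Off-heap style: the store only holds bytes (pickle is a stand-in serializer).
serialized = pickle.dumps(rows)


def sum_ids_from_bytes(blob):
    """Serialized style: each query pays a decode step before reading."""
    decoded = pickle.loads(blob)  # the serde cost, paid on every call
    return sum(r[0] for r in decoded)


# Both paths give the same answer; only the work per query differs.
assert sum_ids_from_objects(rows) == sum_ids_from_bytes(serialized)
```

How large the decode step is relative to the query itself depends on the format and the workload, so this only shows where the cost sits, not how big it is.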

On Tue, Jul 8, 2014 at 8:48 PM, Aaron Davidson <ilikerps@gmail.com> wrote:
> Tachyon should only be marginally less performant than memory_only, because
> we mmap the data from Tachyon's ramdisk. We do not have to, say, transfer
> the data over a pipe from Tachyon; we can directly read from the buffers in
> the same way that Shark reads from its in-memory columnar format.
>
>
>
> On Tue, Jul 8, 2014 at 1:18 AM, qingyang li <liqingyang1985@gmail.com>
> wrote:
>
>> Hi, when I create a table, I can choose the cache strategy using
>> shark.cache.
>> I think "shark.cache=memory_only" means the data is managed by Spark and
>> lives in the same JVM as the executor, while "shark.cache=tachyon"
>> means the data is managed by Tachyon, which is off-heap, so the data is not
>> in the same JVM as the executor and Spark has to load it from Tachyon for
>> each SQL query. So, is Tachyon less efficient than the memory_only cache
>> strategy?
>> If yes, can we let Spark load all the data from Tachyon once for all SQL
>> queries? I would like to use the Tachyon cache strategy because Tachyon is
>> more highly available (HA) than memory_only.
>>
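
The mmap argument in Aaron's reply can be sketched as follows. This is an illustration under stated assumptions, not Tachyon's client code: a temporary file stands in for a file on Tachyon's ramdisk, the data is a made-up column of 64-bit integers, and `read_value` is a hypothetical helper. The point it shows is that values are decoded in place from the mapped buffer, with no explicit read loop or pipe transfer into an intermediate buffer.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical column of 1000 little-endian int64 values.
values = list(range(1000))

# A temporary file stands in for a file on a ramdisk (assumption).
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(struct.pack("<1000q", *values))


def read_value(mapped, index):
    """Decode one int64 directly from the mapped buffer at the given index."""
    return struct.unpack_from("<q", mapped, index * 8)[0]


with open(path, "rb") as f:
    # Map the whole file read-only; pages are served from the OS page cache.
    mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    try:
        total = sum(read_value(mapped, i) for i in range(len(values)))
    finally:
        mapped.close()

os.remove(path)
```

Fixed-width values like these can be read straight from the mapping; variable-length or object-valued data would still need a decode step, which is where the serde cost raised in the reply comes in.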
