spark-user mailing list archives

From shahab <shahab.mok...@gmail.com>
Subject Re: why "Shuffle Write" is not zero when everything is cached and there is enough memory?
Date Mon, 30 Mar 2015 11:01:35 GMT
Thanks, Saisai. I will try your solution, but I still don't understand why
the filesystem should be used when there is plenty of memory available!



On Mon, Mar 30, 2015 at 11:22 AM, Saisai Shao <sai.sai.shao@gmail.com>
wrote:

> Shuffle write will ultimately spill the data to the file system as a bunch
> of files. If you want to avoid disk writes, you can mount a ramdisk and
> point "spark.local.dir" at it. The shuffle output will then be written to a
> memory-based FS and will not introduce disk IO.
>
> Thanks
> Jerry
>
> 2015-03-30 17:15 GMT+08:00 shahab <shahab.mokari@gmail.com>:
>
>> Hi,
>>
>> I was looking at the Spark UI, Executors tab, and I noticed that I have
>> 597 MB under "Shuffle Write" while I am using a cached temp table and
>> Spark has 2 GB of free memory (the number under Memory Used is 597 MB /
>> 2.6 GB)?!
>>
>> Shouldn't Shuffle Write be zero, with all (map/reduce) tasks done in
>> memory?
>>
>> best,
>>
>> /Shahab
>>
>
>
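For anyone reading the archive: a minimal sketch of the ramdisk approach
Jerry describes, assuming Spark 1.x and a tmpfs mounted at /mnt/ramdisk (the
mount point and app name are hypothetical, not from the thread):

    // Assumes a memory-backed filesystem was mounted beforehand, e.g.
    // (as root):
    //   mkdir -p /mnt/ramdisk
    //   mount -t tmpfs -o size=2g tmpfs /mnt/ramdisk

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("ShuffleOnRamdisk")
      // Redirect shuffle spill files to the memory-backed mount instead
      // of local disk. Shuffle still goes through the filesystem layer,
      // but tmpfs keeps the data in RAM, so no disk IO is incurred.
      .set("spark.local.dir", "/mnt/ramdisk")

    val sc = new SparkContext(conf)

Note that spark.local.dir must be set before the SparkContext is created;
setting it on a running application has no effect.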
