spark-dev mailing list archives

From Reynold Xin <r...@databricks.com>
Subject Re: Off-heap storage and dynamic allocation
Date Tue, 03 Nov 2015 16:53:16 GMT
I don't think there is any special handling w.r.t. Tachyon vs. on-heap
caching. As a matter of fact, I think the current off-heap caching
implementation is pretty bad, because:

1. There is no namespace sharing in off-heap mode
2. Similar to 1, you cannot recover the off-heap memory once the Spark driver
   or an executor crashes
3. It requires expensive serialization to go off-heap

It would've been simpler to just treat Tachyon as a normal file system, and
use it that way to at least satisfy 1 and 2, and also substantially
simplify the internals.
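
A minimal sketch of the "treat Tachyon as a normal file system" idea, assuming
PySpark and a Tachyon master at a hypothetical address. The helper names and
the `spark-cache` directory layout are made up for illustration; only
`RDD.saveAsPickleFile` and `SparkContext.pickleFile` are standard PySpark
calls. Because the data is written to a path rather than held in an executor's
block manager, it stays readable after an executor or driver goes away, which
is what points 1 and 2 ask for.

```python
# Hypothetical Tachyon master address; 19998 is Tachyon's default master port.
TACHYON_BASE = "tachyon://tachyon-master:19998"


def tachyon_path(name, base=TACHYON_BASE):
    """Build a file-system URI for a named dataset (illustrative layout)."""
    return "{}/spark-cache/{}".format(base.rstrip("/"), name.strip("/"))


def save_dataset(rdd, name):
    """Write the RDD out as regular files instead of caching it off-heap.

    The output lives in the file system, so it outlives the executors
    (and the driver) that produced it.
    """
    path = tachyon_path(name)
    rdd.saveAsPickleFile(path)
    return path


def load_dataset(sc, name):
    """Read the dataset back, possibly from a different Spark application."""
    return sc.pickleFile(tachyon_path(name))
```

This still pays the serialization cost from point 3, and it assumes the
Tachyon client is available to Spark so that `tachyon://` paths resolve.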




On Tue, Nov 3, 2015 at 7:59 AM, Justin Uang <justin.uang@gmail.com> wrote:

> Yup, but I'm wondering what happens when an executor does get removed while
> we're using Tachyon. Will the cached data still be available, given that
> with off-heap storage the data isn't stored in the executor?
>
> On Tue, Nov 3, 2015 at 4:57 PM Ryan Williams <ryan.blake.williams@gmail.com>
> wrote:
>
>> fwiw, I think that having cached RDD partitions prevents executors from
>> being removed under dynamic allocation by default; see SPARK-8958
>> <https://issues.apache.org/jira/browse/SPARK-8958>. The
>> "spark.dynamicAllocation.cachedExecutorIdleTimeout" config
>> <http://spark.apache.org/docs/latest/configuration.html#dynamic-allocation>
>> controls this.
>>
>> On Fri, Oct 30, 2015 at 12:14 PM Justin Uang <justin.uang@gmail.com>
>> wrote:
>>
>>> Hey guys,
>>>
>>> According to the docs for 1.5.1, when an executor is removed under dynamic
>>> allocation, the cached data is gone. If I use off-heap storage like
>>> Tachyon, conceptually this issue goes away, but is the cached data still
>>> available in practice? This would be great because then we would be able
>>> to set spark.dynamicAllocation.cachedExecutorIdleTimeout to be quite
>>> small.
>>>
>>> ==================
>>> In addition to writing shuffle files, executors also cache data either
>>> on disk or in memory. When an executor is removed, however, all cached data
>>> will no longer be accessible. There is currently not yet a solution for
>>> this in Spark 1.2. In future releases, the cached data may be preserved
>>> through an off-heap storage similar in spirit to how shuffle files are
>>> preserved through the external shuffle service.
>>> ==================
>>>
>>
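
For reference, the dynamic allocation settings mentioned in this thread can be
sketched as `spark-submit` flags. The property names are the documented Spark
ones; the helper functions and the timeout values are illustrative
assumptions, not recommendations.

```python
def dynamic_allocation_conf(idle_timeout="60s", cached_idle_timeout=None):
    """Return Spark properties for dynamic allocation as a dict.

    If cached_idle_timeout is None the property is left unset, which per
    the thread means executors holding cached partitions are not removed.
    """
    conf = {
        "spark.dynamicAllocation.enabled": "true",
        # Dynamic allocation relies on the external shuffle service.
        "spark.shuffle.service.enabled": "true",
        "spark.dynamicAllocation.executorIdleTimeout": idle_timeout,
    }
    if cached_idle_timeout is not None:
        conf["spark.dynamicAllocation.cachedExecutorIdleTimeout"] = cached_idle_timeout
    return conf


def to_submit_args(conf):
    """Flatten a properties dict into `--conf key=value` arguments."""
    args = []
    for key in sorted(conf):
        args += ["--conf", "{}={}".format(key, conf[key])]
    return args
```

For example, `to_submit_args(dynamic_allocation_conf(cached_idle_timeout="120s"))`
yields the flags for a job that lets idle executors with cached data be
released after two minutes, at which point their cached blocks are lost.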
