spark-dev mailing list archives

From Justin Uang <justin.u...@gmail.com>
Subject Re: Off-heap storage and dynamic allocation
Date Tue, 03 Nov 2015 15:59:26 GMT
Yup, but I'm wondering what happens when an executor does get removed while
we're using Tachyon. Will the cached data still be available, given that
with off-heap storage the data isn't stored in the executor?
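
For reference, here is a rough sketch of the settings in question, rendered
as spark-submit --conf flags. The config keys are the ones from the Spark
docs; the timeout values and the to_submit_args helper are just assumptions
for illustration, and this says nothing about whether the off-heap blocks
actually survive executor removal.

```python
# Sketch only: dynamic-allocation settings discussed in this thread.
# The timeout values below are illustrative assumptions, not recommendations.
conf = {
    "spark.dynamicAllocation.enabled": "true",
    # Dynamic allocation requires the external shuffle service.
    "spark.shuffle.service.enabled": "true",
    # Idle timeout for executors that hold no cached blocks.
    "spark.dynamicAllocation.executorIdleTimeout": "60s",
    # Idle timeout for executors that do hold cached blocks; the hope is
    # that this could be kept small if off-heap data outlives the executor.
    "spark.dynamicAllocation.cachedExecutorIdleTimeout": "120s",
}


def to_submit_args(settings):
    """Hypothetical helper: turn a settings dict into spark-submit arguments."""
    args = []
    for key, value in sorted(settings.items()):
        args += ["--conf", "{}={}".format(key, value)]
    return args


if __name__ == "__main__":
    print(" ".join(to_submit_args(conf)))
```

The same key/value pairs could equally go into spark-defaults.conf.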

On Tue, Nov 3, 2015 at 4:57 PM Ryan Williams <ryan.blake.williams@gmail.com>
wrote:

> fwiw, I think that having cached RDD partitions prevents executors from
> being removed under dynamic allocation by default; see SPARK-8958
> <https://issues.apache.org/jira/browse/SPARK-8958>. The
> "spark.dynamicAllocation.cachedExecutorIdleTimeout" config
> <http://spark.apache.org/docs/latest/configuration.html#dynamic-allocation>
> controls this.
>
> On Fri, Oct 30, 2015 at 12:14 PM Justin Uang <justin.uang@gmail.com>
> wrote:
>
>> Hey guys,
>>
>> According to the docs for 1.5.1, when an executor is removed for dynamic
>> allocation, the cached data is gone. If I use off-heap storage like
>> tachyon, conceptually there isn't this issue anymore, but is the cached
>> data still available in practice? This would be great because then we would
>> be able to set spark.dynamicAllocation.cachedExecutorIdleTimeout to be
>> quite small.
>>
>> ==================
>> In addition to writing shuffle files, executors also cache data either on
>> disk or in memory. When an executor is removed, however, all cached data
>> will no longer be accessible. There is currently not yet a solution for
>> this in Spark 1.2. In future releases, the cached data may be preserved
>> through an off-heap storage similar in spirit to how shuffle files are
>> preserved through the external shuffle service.
>> ==================
>>
>
