spark-user mailing list archives

From Mark Hamstra <>
Subject Re: is spark.cleaner.ttl safe?
Date Tue, 11 Mar 2014 21:01:18 GMT
Actually, TD's work-in-progress is probably more what you want:

On Tue, Mar 11, 2014 at 1:58 PM, Michael Allman <> wrote:

> Hello,
> I've been trying to run an iterative spark job that spills 1+ GB to disk
> per iteration on a system with limited disk space. I believe there's enough
> space if spark would clean up unused data from previous iterations, but as
> it stands the number of iterations I can run is limited by available disk
> space.
> I found a thread on the usage of spark.cleaner.ttl on the old Spark Users
> Google group here:
> I think this setting may be what I'm looking for; however, the cleaner
> seems to delete data that's still in use. The effect is that I get bizarre
> exceptions from Spark complaining about missing broadcast data, or
> ArrayIndexOutOfBoundsException. When is spark.cleaner.ttl safe to use? Is
> it supposed to delete in-use data, or is this a bug/shortcoming?
> Cheers,
> Michael
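
For reference, the TTL-based cleaner under discussion is enabled through a single Spark configuration property. A minimal sketch of a spark-defaults.conf entry (the 3600-second value is illustrative only, not a recommendation from this thread):

```
# spark-defaults.conf -- enable periodic cleanup of old metadata (sketch)
# Note: entries older than the TTL are deleted on a timer, not by
# reference counting, so -- as this thread reports -- data that is
# still in use can be removed. The TTL must comfortably exceed the
# lifetime of any RDD, shuffle, or broadcast the job still needs.
spark.cleaner.ttl    3600
```

The same property can be set programmatically via `SparkConf.set("spark.cleaner.ttl", "3600")` or on the command line with `--conf`.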
