spark-user mailing list archives

From Gourav Sengupta <gourav.sengu...@gmail.com>
Subject Re: No space left on device
Date Wed, 22 Aug 2018 10:45:35 GMT
Hi,

that was just one of the options, and not the first one. Is there any
chance of trying the other options mentioned as well? For example, pointing
the shuffle storage area to a location with more space?
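
For instance, shuffle spill files land under "spark.local.dir", so repointing it (and the driver's tmpdir) at a larger volume at submit time would look roughly like this; /mnt/bigdisk is a placeholder path, not one from this thread:

```shell
# Point Spark's scratch space (shuffle spill, block manager files)
# at a volume with more free space. /mnt/bigdisk is an example path.
spark-submit \
  --conf spark.local.dir=/mnt/bigdisk/spark-tmp \
  --conf "spark.driver.extraJavaOptions=-Djava.io.tmpdir=/mnt/bigdisk/spark-tmp" \
  your-app.jar
```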

Regards,
Gourav Sengupta

On Wed, Aug 22, 2018 at 11:15 AM Vitaliy Pisarev <
vitaliy.pisarev@biocatch.com> wrote:

> Documentation says that 'spark.shuffle.memoryFraction' was deprecated,
> but it doesn't say what to use instead. Any idea?
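
(For what it is worth, since Spark 1.6 the old shuffle/storage fractions were superseded by the unified memory manager, governed by "spark.memory.fraction". A rough sizing sketch of the unified region, assuming the documented formula (heap - 300MB) * spark.memory.fraction and using the 12g executor heap from the config below:)

```python
# Sketch of Spark's unified memory sizing (Spark 1.6+).
# Numbers are illustrative, taken from spark.executor.memory=12g.
RESERVED_MB = 300          # Spark's reserved system memory
heap_mb = 12 * 1024        # spark.executor.memory=12g
memory_fraction = 0.6      # spark.memory.fraction default

# Region shared by execution (shuffle) and storage (cache), in MB
unified_mb = (heap_mb - RESERVED_MB) * memory_fraction
print(round(unified_mb, 1))  # 7192.8
```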
>
> On Wed, Aug 22, 2018 at 9:36 AM, Gourav Sengupta <
> gourav.sengupta@gmail.com> wrote:
>
>> Hi,
>>
>> The best part about Spark is that the error message shows you which
>> configuration to tweak. If you are using EMR, check that "spark.local.dir"
>> points to the right location on the cluster. If a disk is mounted on all
>> the nodes with a common path (you can do that easily in EMR), you can
>> point the configuration at that disk and thereby overcome the issue.
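
On EMR, the cluster-wide way to set "spark.local.dir" is a configuration classification supplied at cluster creation; a sketch of the JSON, where the mount path is an assumed example (EMR attaches instance-store volumes under /mnt by default):

```json
[
  {
    "Classification": "spark-defaults",
    "Properties": {
      "spark.local.dir": "/mnt/spark-scratch"
    }
  }
]
```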
>>
>> On another note, also try to see why so much data is being written to
>> disk. Is there too much shuffle? Can you increase the shuffle memory, as
>> the error message suggests, using "spark.shuffle.memoryFraction"?
>>
>> By any chance have you switched from caching to persisting the data frames?
>>
>>
>> Regards,
>> Gourav Sengupta
>>
>>
>>
>> On Tue, Aug 21, 2018 at 12:04 PM Vitaliy Pisarev <
>> vitaliy.pisarev@biocatch.com> wrote:
>>
>>> The other time I encountered this, I solved it by throwing more
>>> resources at it (a stronger cluster). I was never able to understand the
>>> root cause though, so I would be happy to hear deeper insight as well.
>>>
>>> On Mon, Aug 20, 2018 at 7:08 PM, Steve Lewis <lordjoe2000@gmail.com>
>>> wrote:
>>>
>>>>
>>>> We are trying to run a job that previously ran on Spark 1.3 on a
>>>> different cluster. The job was converted to Spark 2.3 and this is a new
>>>> cluster.
>>>>
>>>>     The job dies after completing about a half dozen stages with
>>>>
>>>> java.io.IOException: No space left on device
>>>>
>>>>
>>>>    It appears that the nodes are using local storage as tmp.
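
One generic way to confirm that (a shell diagnostic sketch, not specific to this cluster) is to watch free space and leftover shuffle files under the configured local dir on each worker:

```shell
# Check free space on Spark's scratch directory on a worker node.
# The path below is the spark.local.dir value from this job's config.
df -h /scratch/home/int/eva/zorzan/sparktmp
# Find the largest leftover shuffle/spill directories:
du -sh /scratch/home/int/eva/zorzan/sparktmp/* 2>/dev/null | sort -rh | head
```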
>>>>
>>>>
>>>>     I could use help diagnosing the issue and how to fix it.
>>>>
>>>>
>>>> Here are the spark conf properties
>>>>
>>>> Spark Conf Properties
>>>> spark.driver.extraJavaOptions=-Djava.io.tmpdir=/scratch/home/int/eva/zorzan/sparktmp/
>>>> spark.master=spark://10.141.0.34:7077
>>>> spark.mesos.executor.memoryOverhead=3128
>>>> spark.shuffle.consolidateFiles=true
>>>> spark.shuffle.spill=false
>>>> spark.app.name=Anonymous
>>>> spark.shuffle.manager=sort
>>>> spark.storage.memoryFraction=0.3
>>>> spark.jars=file:/home/int/eva/zorzan/bin/SparkHydraV2-master/HydraSparkBuilt.jar
>>>> spark.ui.killEnabled=true
>>>> spark.shuffle.spill.compress=true
>>>> spark.shuffle.sort.bypassMergeThreshold=100
>>>> com.lordjoe.distributed.marker_property=spark_property_set
>>>> spark.executor.memory=12g
>>>> spark.mesos.coarse=true
>>>> spark.shuffle.memoryFraction=0.4
>>>> spark.serializer=org.apache.spark.serializer.KryoSerializer
>>>> spark.kryo.registrator=com.lordjoe.distributed.hydra.HydraKryoSerializer
>>>> spark.default.parallelism=360
>>>> spark.io.compression.codec=lz4
>>>> spark.reducer.maxMbInFlight=128
>>>> spark.hadoop.validateOutputSpecs=false
>>>> spark.submit.deployMode=client
>>>> spark.local.dir=/scratch/home/int/eva/zorzan/sparktmp
>>>> spark.shuffle.file.buffer.kb=1024
>>>>
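
Several of the properties above are Spark 1.x-era names that 2.3 ignores or has renamed; a hedged mapping, worth double-checking against the 2.3 configuration docs:

```properties
# spark.shuffle.memoryFraction=0.4    superseded by unified memory (spark.memory.fraction)
# spark.storage.memoryFraction=0.3    superseded by spark.memory.storageFraction
# spark.shuffle.spill=false           ignored since 1.6 (spilling is always enabled)
# spark.shuffle.consolidateFiles=true removed with the hash shuffle manager
# spark.reducer.maxMbInFlight=128     renamed:
spark.reducer.maxSizeInFlight=128m
# spark.shuffle.file.buffer.kb=1024   renamed:
spark.shuffle.file.buffer=1024k
```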
>>>>
>>>>
>>>> --
>>>> Steven M. Lewis PhD
>>>> 4221 105th Ave NE
>>>> Kirkland, WA 98033
>>>> 206-384-1340 (cell)
>>>> Skype lordjoe_com
>>>>
>>>>
>>>
>
