spark-user mailing list archives

From Akhil Das <ak...@sigmoidanalytics.com>
Subject Re: setting heap space
Date Mon, 13 Oct 2014 08:00:32 GMT
Like this:

import org.apache.spark.storage.StorageLevel
val rdd = sc.parallelize(1 to 1000000).persist(StorageLevel.MEMORY_AND_DISK_SER)
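
If you're on PySpark (the Py4JJavaError below suggests you are), here's
a minimal sketch of the same thing: import the StorageLevel object and
pass it to persist(), instead of passing the string
"StorageLevel.MEMORY_AND_DISK_SER" in the conf (matrix is from your own
snippet):

from pyspark import SparkConf, SparkContext
from pyspark import StorageLevel

conf = SparkConf()  # your existing .set(...) calls go here
sc = SparkContext(conf=conf)

# pass the StorageLevel object itself, not its name as a string
rdd = sc.parallelize(matrix).persist(StorageLevel.MEMORY_AND_DISK_SER)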

Thanks
Best Regards

On Mon, Oct 13, 2014 at 12:50 PM, Chengi Liu <chengi.liu.86@gmail.com>
wrote:

> Cool.. Thanks.. And one last question..
> conf = SparkConf.set(....).set(...)
> matrix = get_data(..)
> rdd = sc.parallelize(matrix) # heap error here...
> How and where do I set the storage level? It seems like conf is the
> wrong place to set this up, as I get this error:
> py4j.protocol.Py4JJavaError: An error occurred while calling
> None.org.apache.spark.api.java.JavaSparkContext.
> : java.lang.IllegalArgumentException: For input string:
> "StorageLevel.MEMORY_AND_DISK_SER"
> Thanks for all the help
>
> On Mon, Oct 13, 2014 at 12:15 AM, Akhil Das <akhil@sigmoidanalytics.com>
> wrote:
>
>> You can set it like this:
>>
>> sparkConf.set("spark.executor.extraJavaOptions", " -XX:+UseCompressedOops
>> -XX:+UseConcMarkSweepGC -XX:+AggressiveOpts -XX:FreqInlineSize=300
>> -XX:MaxInlineSize=300 ")
>>
>> Here's a benchmark example
>> <https://github.com/tdas/spark-streaming-benchmark/blob/bd591dbe9e2836d9a72b87c3e63e30ffd908dfd6/Benchmark.scala#L30>
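>>
>> In PySpark the equivalent is a plain SparkConf().set() call. A rough
>> sketch (the GC flags here are illustrative; tune them for your job):
>>
>> from pyspark import SparkConf
>>
>> conf = SparkConf().set("spark.executor.extraJavaOptions",
>>                        "-XX:+UseCompressedOops -XX:+UseConcMarkSweepGC")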
>>
>> Thanks
>> Best Regards
>>
>> On Mon, Oct 13, 2014 at 12:36 PM, Chengi Liu <chengi.liu.86@gmail.com>
>> wrote:
>>
>>> Hi Akhil,
>>>   Thanks for the response..
>>> Another query... do you know how to use the
>>> "spark.executor.extraJavaOptions" option?
>>> SparkConf.set("spark.executor.extraJavaOptions", "what value should
>>> go in here")?
>>> I am trying to find an example but cannot seem to find one..
>>>
>>>
>>> On Mon, Oct 13, 2014 at 12:03 AM, Akhil Das <akhil@sigmoidanalytics.com>
>>> wrote:
>>>
>>>> A few things to keep in mind:
>>>> - I believe driver memory should not exceed executor memory
>>>> - Tune spark.storage.memoryFraction (the default is 0.6)
>>>> - Consider spark.rdd.compress (the default is false)
>>>> - Always specify the level of parallelism when doing a groupBy,
>>>> reduceBy, join, sortBy, etc.
>>>> - If you don't have enough memory and the data is huge, set the
>>>> storage level to MEMORY_AND_DISK_SER (see the sketch below)
>>>>
>>>> You can read more here.
>>>> <http://spark.apache.org/docs/1.0.0/tuning.html>
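>>>>
>>>> A rough sketch of those knobs in PySpark (the values and the pairs
>>>> RDD here are illustrative, not recommendations):
>>>>
>>>> from pyspark import SparkConf, SparkContext, StorageLevel
>>>>
>>>> conf = (SparkConf()
>>>>         .set("spark.storage.memoryFraction", "0.6")
>>>>         .set("spark.rdd.compress", "true"))
>>>> sc = SparkContext(conf=conf)
>>>>
>>>> pairs = sc.parallelize([("a", 1), ("b", 2), ("a", 3)])
>>>> # explicit level of parallelism on the shuffle
>>>> counts = pairs.reduceByKey(lambda a, b: a + b, numPartitions=200)
>>>>
>>>> # spill to disk instead of failing when memory runs out
>>>> counts.persist(StorageLevel.MEMORY_AND_DISK_SER)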
>>>>
>>>> Thanks
>>>> Best Regards
>>>>
>>>> On Sun, Oct 12, 2014 at 10:28 PM, Chengi Liu <chengi.liu.86@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>   I am trying to use Spark but I am having a hard time configuring
>>>>> the SparkConf...
>>>>> My current conf is:
>>>>> conf = SparkConf().set("spark.executor.memory", "10g") \
>>>>>     .set("spark.akka.frameSize", "100000000") \
>>>>>     .set("spark.driver.memory", "16g")
>>>>>
>>>>> but I still see the Java heap space error:
>>>>> 14/10/12 09:54:50 ERROR Executor: Exception in task 3.0 in stage 0.0
>>>>> (TID 3)
>>>>> java.lang.OutOfMemoryError: Java heap space
>>>>> at com.esotericsoftware.kryo.io.Input.readBytes(Input.java:296)
>>>>> at
>>>>> com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ByteArraySerializer.read(DefaultArraySerializers.java:35)
>>>>> at
>>>>> com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ByteArraySerializer.read(DefaultArraySerializers.java:18)
>>>>> at com.esotericsoftware.kryo.Kryo.readObjectOrNull(Kryo.java:699)
>>>>> at
>>>>> com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:332)
>>>>> at
>>>>> com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:293)
>>>>> at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
>>>>> at
>>>>> com.twitter.chill.WrappedArraySerializer.read(WrappedArraySerializer.scala:34)
>>>>> at
>>>>> com.twitter.chill.WrappedArraySerializer.read(WrappedArraySerializer.scala:21)
>>>>> at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
>>>>> at org.apache.spark.serializer.KryoDeserializationStream.readO
>>>>>
>>>>>
>>>>> What's the right way to turn these knobs, and what other knobs can
>>>>> I play with?
>>>>> Thanks
>>>>>
>>>>
>>>>
>>>
>>
>
