spark-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: About memory leak in spark 1.4.1
Date Tue, 04 Aug 2015 14:28:28 GMT
w.r.t. spark.deploy.spreadOut, here is the scaladoc:

  // As a temporary workaround before better ways of configuring memory, we
  // allow users to set a flag that will perform round-robin scheduling across
  // the nodes (spreading out each app among all the nodes) instead of trying
  // to consolidate each app onto a small # of nodes.
  private val spreadOutApps = conf.getBoolean("spark.deploy.spreadOut", true)
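
FWIW, a minimal sketch (not tied to your setup) of how the flag resolves
through SparkConf; the default is true, so it only changes behavior when
explicitly set to false, e.g. via -Dspark.deploy.spreadOut=false in
SPARK_MASTER_OPTS:

  import org.apache.spark.SparkConf

  val conf = new SparkConf()
  conf.set("spark.deploy.spreadOut", "false")
  // same lookup the Master does; defaults to true when the key is unset
  conf.getBoolean("spark.deploy.spreadOut", true)  // false here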

Cheers

On Tue, Aug 4, 2015 at 4:13 AM, Igor Berman <igor.berman@gmail.com> wrote:

> sorry, can't disclose info about my prod cluster
>
> Nothing jumps to mind regarding your config.
> We don't use lz4 compression, and I don't know what spark.deploy.spreadOut
> is (there is no documentation regarding it).
>
> If you are sure that you don't have a memory leak in your business logic, I
> would try resetting each property to its default (or just removing it from
> your config) and running your job again to see whether one of them is
> somehow connected.
>
> My config (nothing special, really):
> spark.shuffle.consolidateFiles true
> spark.speculation false
> spark.executor.extraJavaOptions -XX:+UseStringCache
> -XX:+UseCompressedStrings -XX:+PrintGC -XX:+PrintGCDetails
> -XX:+PrintGCTimeStamps -Xloggc:gc.log -verbose:gc
> spark.executor.logs.rolling.maxRetainedFiles 1000
> spark.executor.logs.rolling.strategy time
> spark.worker.cleanup.enabled true
> spark.logConf true
> spark.rdd.compress true
>
>
>
>
>
> On 4 August 2015 at 12:59, Sea <261810726@qq.com> wrote:
>
>> How many machines are there in your standalone cluster?
>> I am not using tachyon.
>>
>> GC cannot help me... Can anyone help?
>>
>> my configuration:
>>
>> spark.deploy.spreadOut false
>> spark.eventLog.enabled true
>> spark.executor.cores 24
>>
>> spark.ui.retainedJobs 10
>> spark.ui.retainedStages 10
>> spark.history.retainedApplications 5
>> spark.deploy.retainedApplications 10
>> spark.deploy.retainedDrivers  10
>> spark.streaming.ui.retainedBatches 10
>> spark.sql.thriftserver.ui.retainedSessions 10
>> spark.sql.thriftserver.ui.retainedStatements 100
>>
>> spark.file.transferTo false
>> spark.driver.maxResultSize 4g
>> spark.sql.hive.metastore.jars=/spark/spark-1.4.1/hive/*
>>
>> spark.eventLog.dir                hdfs://mycluster/user/spark/historylog
>> spark.history.fs.logDirectory     hdfs://mycluster/user/spark/historylog
>>
>> spark.driver.extraClassPath=/spark/spark-1.4.1/extlib/*
>> spark.executor.extraClassPath=/spark/spark-1.4.1/extlib/*
>>
>> spark.sql.parquet.binaryAsString true
>> spark.serializer        org.apache.spark.serializer.KryoSerializer
>> spark.kryoserializer.buffer 32
>> spark.kryoserializer.buffer.max 256
>> spark.shuffle.consolidateFiles true
>> spark.io.compression.codec org.apache.spark.io.LZ4CompressionCodec
>>
>>
>>
>>
>>
>> ------------------ Original Message ------------------
>> *From:* "Igor Berman" <igor.berman@gmail.com>;
>> *Sent:* Monday, August 3, 2015, 7:56 PM
>> *To:* "Sea" <261810726@qq.com>;
>> *Cc:* "Barak Gitsis" <barakg@similarweb.com>; "Ted Yu" <yuzhihong@gmail.com>;
>> "user@spark.apache.org" <user@spark.apache.org>; "rxin" <rxin@databricks.com>;
>> "joshrosen" <joshrosen@databricks.com>; "davies" <davies@databricks.com>;
>> *Subject:* Re: About memory leak in spark 1.4.1
>>
>> In general, what is your configuration? Use --conf "spark.logConf=true".
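>> (A minimal sketch, if you prefer to set it in code rather than on the
>> command line; the app name is just a placeholder:)
>>
>>   import org.apache.spark.{SparkConf, SparkContext}
>>
>>   val conf = new SparkConf()
>>     .setAppName("logconf-check")   // placeholder name
>>     .set("spark.logConf", "true")  // logs the effective SparkConf at INFO when the context starts
>>   val sc = new SparkContext(conf)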
>>
>> We have 1.4.1 in a production standalone cluster and haven't experienced
>> what you are describing.
>> Can you verify in the web UI (on the configuration page) that Spark indeed
>> got your 50g-per-executor limit?
>>
>> Might you be using off-heap storage (Tachyon)?
>>
>>
>> On 3 August 2015 at 04:58, Sea <261810726@qq.com> wrote:
>>
>>> "spark uses a lot more than heap memory, it is the expected behavior."
>>>  It didn't exist in spark 1.3.x
>>> What does "a lot more than" means?  It means that I lose control of it!
>>> I try to  apply 31g, but it still grows to 55g and continues to grow!!!
>>> That is the point!
>>> I have tried set memoryFraction to 0.2,but it didn't help.
>>> I don't know whether it will still exist in the next release 1.5, I wish
>>> not.
>>>
>>>
>>>
>>> ------------------ Original Message ------------------
>>> *From:* "Barak Gitsis" <barakg@similarweb.com>;
>>> *Sent:* Sunday, August 2, 2015, 9:55 PM
>>> *To:* "Sea" <261810726@qq.com>; "Ted Yu" <yuzhihong@gmail.com>;
>>> *Cc:* "user@spark.apache.org" <user@spark.apache.org>; "rxin" <rxin@databricks.com>;
>>> "joshrosen" <joshrosen@databricks.com>; "davies" <davies@databricks.com>;
>>> *Subject:* Re: About memory leak in spark 1.4.1
>>>
>>> Spark uses a lot more than heap memory; it is the expected behavior.
>>> In 1.4, off-heap memory usage is supposed to grow in comparison to 1.3.
>>>
>>> Better to use as little memory as you can for the heap, and since you are
>>> not fully utilizing it anyway, it is safe for you to reduce it.
>>> memoryFraction helps you optimize heap usage for your data/application
>>> profile while keeping it tight.
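>>>
>>> (A rough sketch of what that looks like; the numbers are illustrative,
>>> not a recommendation for your workload:)
>>>
>>>   import org.apache.spark.SparkConf
>>>
>>>   val conf = new SparkConf()
>>>     .set("spark.executor.memory", "30g")          // smaller heap than the 50g you give now
>>>     .set("spark.storage.memoryFraction", "0.3")   // 1.x setting: shrink the cache share if you cache little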
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Sun, Aug 2, 2015 at 12:54 PM Sea <261810726@qq.com> wrote:
>>>
>>>> spark.storage.memoryFraction applies to heap memory, but in my situation
>>>> the memory used is more than the heap!
>>>>
>>>> Does anyone else use spark 1.4.1 in production?
>>>>
>>>>
>>>> ------------------ Original Message ------------------
>>>> *From:* "Ted Yu" <yuzhihong@gmail.com>;
>>>> *Sent:* Sunday, August 2, 2015, 5:45 PM
>>>> *To:* "Sea" <261810726@qq.com>;
>>>> *Cc:* "Barak Gitsis" <barakg@similarweb.com>; "user@spark.apache.org" <user@spark.apache.org>;
>>>> "rxin" <rxin@databricks.com>; "joshrosen" <joshrosen@databricks.com>;
>>>> "davies" <davies@databricks.com>;
>>>> *Subject:* Re: About memory leak in spark 1.4.1
>>>>
>>>> http://spark.apache.org/docs/latest/tuning.html does mention
>>>> spark.storage.memoryFraction in two places; one is under the Cache Size
>>>> Tuning section.
>>>>
>>>> FYI
>>>>
>>>> On Sun, Aug 2, 2015 at 2:16 AM, Sea <261810726@qq.com> wrote:
>>>>
>>>>> Hi, Barak
>>>>>     It is OK with spark 1.3.0; the problem is with spark 1.4.1.
>>>>>     I don't think spark.storage.memoryFraction will make any difference,
>>>>> because it is still heap memory.
>>>>>
>>>>>
>>>>> ------------------ Original Message ------------------
>>>>> *From:* "Barak Gitsis" <barakg@similarweb.com>;
>>>>> *Sent:* Sunday, August 2, 2015, 4:11 PM
>>>>> *To:* "Sea" <261810726@qq.com>; "user" <user@spark.apache.org>;
>>>>> *Cc:* "rxin" <rxin@databricks.com>; "joshrosen" <joshrosen@databricks.com>;
>>>>> "davies" <davies@databricks.com>;
>>>>> *Subject:* Re: About memory leak in spark 1.4.1
>>>>>
>>>>> Hi,
>>>>> Reducing spark.storage.memoryFraction did the trick for me. The heap
>>>>> doesn't get filled because part of it is reserved.
>>>>> My reasoning is:
>>>>> I give the executor all the memory I can give it, so that makes it a
>>>>> boundary. From there I try to make the best use of memory I can.
>>>>> storage.memoryFraction is, in a sense, user data space; the rest can be
>>>>> used by the system.
>>>>> If you don't have so much data that you MUST store in memory for
>>>>> performance, better to give spark more space.
>>>>> I ended up setting it to 0.3.
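>>>>>
>>>>> (Back-of-the-envelope numbers, assuming a 50g executor heap like yours:)
>>>>>
>>>>>   val executorHeapGb  = 50.0                      // spark.executor.memory
>>>>>   val storageFraction = 0.3                       // spark.storage.memoryFraction
>>>>>   val cacheGb = executorHeapGb * storageFraction  // ~15g reserved for cached data
>>>>>   val restGb  = executorHeapGb - cacheGb          // ~35g left for everything else on the heap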
>>>>>
>>>>> All that said, this is on spark 1.3 on our cluster.
>>>>>
>>>>> hope that helps
>>>>>
>>>>> On Sat, Aug 1, 2015 at 5:43 PM Sea <261810726@qq.com> wrote:
>>>>>
>>>>>> Hi, all
>>>>>> I upgraded spark to 1.4.1 and many applications failed... I find that
>>>>>> the heap memory is not full, but the CoarseGrainedExecutorBackend
>>>>>> process takes more memory than I expect, and it keeps increasing as
>>>>>> time goes on, finally exceeding the server's limit, and then the
>>>>>> worker dies...
>>>>>>
>>>>>> Can anyone help?
>>>>>>
>>>>>> Mode: standalone
>>>>>>
>>>>>> spark.executor.memory 50g
>>>>>>
>>>>>> 25583 xiaoju    20   0 75.5g  55g  28m S 1729.3 88.1   2172:52 java
>>>>>>
>>>>>> 55g is more than the 50g I requested.
>>>>>>
>>>>>> --
>>>>> *-Barak*
>>>>>
>>>>
>>>> --
>>> *-Barak*
>>>
>>
>>
>
