spark-user mailing list archives

From "Sea" <261810...@qq.com>
Subject Re: About memory leak in spark 1.4.1
Date Wed, 05 Aug 2015 13:10:45 GMT
No one helped me... so I helped myself: I split the cluster into two clusters, one on 1.4.1 and one on 1.3.0.




------------------ Original Message ------------------
From: "Ted Yu" <yuzhihong@gmail.com>
Sent: Tuesday, August 4, 2015, 10:28 PM
To: "Igor Berman" <igor.berman@gmail.com>
Cc: "Sea" <261810726@qq.com>; "Barak Gitsis" <barakg@similarweb.com>; "user@spark.apache.org" <user@spark.apache.org>; "rxin" <rxin@databricks.com>; "joshrosen" <joshrosen@databricks.com>; "davies" <davies@databricks.com>
Subject: Re: About memory leak in spark 1.4.1



w.r.t. spark.deploy.spreadOut, here is the scaladoc:

  // As a temporary workaround before better ways of configuring memory, we allow users to set
  // a flag that will perform round-robin scheduling across the nodes (spreading out each app
  // among all the nodes) instead of trying to consolidate each app onto a small # of nodes.
  private val spreadOutApps = conf.getBoolean("spark.deploy.spreadOut", true)
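
(For context on how that flag would normally be changed: it is read by the standalone Master when it starts, not by individual applications, so setting it in an app's SparkConf would have no effect. A minimal sketch, assuming the usual spark-env.sh mechanism on the master host; the file path and mechanism are assumptions, not something confirmed in this thread:)

  # conf/spark-env.sh on the standalone master host (assumed setup)
  SPARK_MASTER_OPTS="-Dspark.deploy.spreadOut=false"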



Cheers


On Tue, Aug 4, 2015 at 4:13 AM, Igor Berman <igor.berman@gmail.com> wrote:
Sorry, I can't disclose info about my prod cluster.
Nothing jumps to mind regarding your config.
We don't use lz4 compression, and I don't know what spark.deploy.spreadOut is (there is no documentation regarding it).


If you are sure that you don't have a memory leak in your business logic, I would try to reset each property to its default (or just remove it from your config) and run your job to see if one of them is somehow connected.



my config (nothing special really):
spark.shuffle.consolidateFiles true
spark.speculation false

spark.executor.extraJavaOptions -XX:+UseStringCache -XX:+UseCompressedStrings -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log -verbose:gc
spark.executor.logs.rolling.maxRetainedFiles 1000
spark.executor.logs.rolling.strategy time
spark.worker.cleanup.enabled true
spark.logConf true
spark.rdd.compress true

On 4 August 2015 at 12:59, Sea <261810726@qq.com> wrote:
How many machines are there in your standalone cluster?

I am not using tachyon.


GC cannot help me... Can anyone help?


my configuration:


spark.deploy.spreadOut false
spark.eventLog.enabled true
spark.executor.cores 24


spark.ui.retainedJobs 10
spark.ui.retainedStages 10
spark.history.retainedApplications 5
spark.deploy.retainedApplications 10
spark.deploy.retainedDrivers  10
spark.streaming.ui.retainedBatches 10
spark.sql.thriftserver.ui.retainedSessions 10
spark.sql.thriftserver.ui.retainedStatements 100



spark.file.transferTo false
spark.driver.maxResultSize 4g
spark.sql.hive.metastore.jars=/spark/spark-1.4.1/hive/*


spark.eventLog.dir                hdfs://mycluster/user/spark/historylog
spark.history.fs.logDirectory     hdfs://mycluster/user/spark/historylog



spark.driver.extraClassPath=/spark/spark-1.4.1/extlib/*
spark.executor.extraClassPath=/spark/spark-1.4.1/extlib/*



spark.sql.parquet.binaryAsString true
spark.serializer        org.apache.spark.serializer.KryoSerializer
spark.kryoserializer.buffer 32
spark.kryoserializer.buffer.max 256
spark.shuffle.consolidateFiles true
spark.io.compression.codec org.apache.spark.io.LZ4CompressionCodec
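
(As a side note on the two kryoserializer buffer values above: a minimal sketch of the same serializer settings expressed through SparkConf with explicit size suffixes; the suffixes and the programmatic style are illustrative assumptions, not taken from this config:)

  import org.apache.spark.SparkConf

  // Sketch only: Kryo serializer settings with explicit units, so there is
  // no ambiguity about whether the buffer is 32 KB and the max 256 MB.
  val conf = new SparkConf()
    .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    .set("spark.kryoserializer.buffer", "32k")
    .set("spark.kryoserializer.buffer.max", "256m")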

------------------ Original Message ------------------
From: "Igor Berman" <igor.berman@gmail.com>
Sent: Monday, August 3, 2015, 7:56 PM
To: "Sea" <261810726@qq.com>
Cc: "Barak Gitsis" <barakg@similarweb.com>; "Ted Yu" <yuzhihong@gmail.com>; "user@spark.apache.org" <user@spark.apache.org>; "rxin" <rxin@databricks.com>; "joshrosen" <joshrosen@databricks.com>; "davies" <davies@databricks.com>
Subject: Re: About memory leak in spark 1.4.1





In general, what is your configuration? Use --conf "spark.logConf=true".
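
(For completeness, one way to pass that flag at submit time; the class and jar names below are placeholders, not from this thread:)

  spark-submit --conf "spark.logConf=true" --class com.example.MyApp /path/to/myapp.jar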



We have 1.4.1 in a production standalone cluster and haven't experienced what you are describing. Can you verify in the web UI that Spark indeed got your 50g per-executor limit? I mean on the configuration page.


Might you be using off-heap storage (Tachyon)?




On 3 August 2015 at 04:58, Sea <261810726@qq.com> wrote:
"spark uses a lot more than heap memory, it is the expected behavior."  It didn't exist in
spark 1.3.x
What does "a lot more than" means?  It means that I lose control of it!
I try to  apply 31g, but it still grows to 55g and continues to grow!!! That is the point!
I have tried set memoryFraction to 0.2,but it didn't help.
I don't know whether it will still exist in the next release 1.5, I wish not.






------------------ Original Message ------------------
From: "Barak Gitsis" <barakg@similarweb.com>
Sent: Sunday, August 2, 2015, 9:55 PM
To: "Sea" <261810726@qq.com>; "Ted Yu" <yuzhihong@gmail.com>
Cc: "user@spark.apache.org" <user@spark.apache.org>; "rxin" <rxin@databricks.com>; "joshrosen" <joshrosen@databricks.com>; "davies" <davies@databricks.com>
Subject: Re: About memory leak in spark 1.4.1





Spark uses a lot more than heap memory; it is the expected behavior. In 1.4, off-heap memory usage is supposed to grow in comparison to 1.3.


Better to use as little memory as you can for heap, and since you are not fully utilizing it already, it is safe for you to reduce it.
memoryFraction helps you optimize heap usage for your data/application profile while keeping it tight.
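
(In config-file terms, in the same style as the property listings earlier in this thread, that advice amounts to something like the following; the 30g figure is only an illustrative smaller heap, not a value recommended anywhere in this thread:)

  spark.executor.memory          30g
  spark.storage.memoryFraction   0.2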



On Sun, Aug 2, 2015 at 12:54 PM Sea <261810726@qq.com> wrote:

spark.storage.memoryFraction is within heap memory, but my situation is that the memory used is more than the heap memory!


Does anyone else use spark 1.4.1 in production?




------------------ Original Message ------------------
From: "Ted Yu" <yuzhihong@gmail.com>
Sent: Sunday, August 2, 2015, 5:45 PM
To: "Sea" <261810726@qq.com>
Cc: "Barak Gitsis" <barakg@similarweb.com>; "user@spark.apache.org" <user@spark.apache.org>; "rxin" <rxin@databricks.com>; "joshrosen" <joshrosen@databricks.com>; "davies" <davies@databricks.com>
Subject: Re: About memory leak in spark 1.4.1




http://spark.apache.org/docs/latest/tuning.html does mention spark.storage.memoryFraction in two places.
One is under the Cache Size Tuning section.


FYI


On Sun, Aug 2, 2015 at 2:16 AM, Sea <261810726@qq.com> wrote:
Hi, Barak
    It is OK with spark 1.3.0; the problem is with spark 1.4.1.
    I don't think spark.storage.memoryFraction will make any difference, because it is still within heap memory.




------------------ Original Message ------------------
From: "Barak Gitsis" <barakg@similarweb.com>
Sent: Sunday, August 2, 2015, 4:11 PM
To: "Sea" <261810726@qq.com>; "user" <user@spark.apache.org>
Cc: "rxin" <rxin@databricks.com>; "joshrosen" <joshrosen@databricks.com>; "davies" <davies@databricks.com>
Subject: Re: About memory leak in spark 1.4.1



Hi, reducing spark.storage.memoryFraction did the trick for me. The heap doesn't get filled because that fraction is reserved.
My reasoning is:
I give the executor all the memory I can give it, so that makes it a boundary.
From there I try to make the best use of the memory I can. storage.memoryFraction is, in a sense, user data space. The rest can be used by the system.
If you don't have so much data that you MUST store it in memory for performance, better to give spark more space.
I ended up setting it to 0.3.
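
(As a rough back-of-the-envelope check, using the 50g executor heap from earlier in the thread and assuming the static memory manager of Spark 1.3/1.4 with its default spark.storage.safetyFraction of 0.9:)

  // Approximate cap on the storage (cache) region of the executor heap.
  val executorHeapGb = 50.0
  val memoryFraction = 0.3
  val safetyFraction = 0.9  // assumed default spark.storage.safetyFraction
  val storageCapGb   = executorHeapGb * memoryFraction * safetyFraction  // ~13.5 GB for cached data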


All that said, this is on spark 1.3 on a cluster.


hope that helps


On Sat, Aug 1, 2015 at 5:43 PM Sea <261810726@qq.com> wrote:

Hi, all
I upgraded spark to 1.4.1, and many applications failed... I find that the heap memory is not full, but the CoarseGrainedExecutorBackend process takes more memory than I expect, and it keeps increasing as time goes on, finally exceeding the server's limit, and the worker dies.....


Can anyone help?


Mode: standalone


spark.executor.memory 50g


25583 xiaoju    20   0 75.5g  55g  28m S 1729.3 88.1   2172:52 java


55g is more than the 50g I requested.
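
(For what it's worth, one way to see where the non-heap part of that 55g goes is the JVM's Native Memory Tracking; a sketch, assuming the executors run on JDK 8, where NMT and jcmd are available, and using the 25583 pid from the top output above:)

  # add to spark.executor.extraJavaOptions and restart the executors (assumed step)
  -XX:NativeMemoryTracking=summary

  # then, on the worker host, query the CoarseGrainedExecutorBackend process
  jcmd 25583 VM.native_memory summary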



-- 

-Barak