spark-user mailing list archives

From "Zhanfeng Huo" <huozhanf...@gmail.com>
Subject SparkSql OutOfMemoryError
Date Tue, 28 Oct 2014 09:33:48 GMT
Hi, friends:

I use Spark SQL (Spark 1.1) to query data in Hive 0.12, and the job fails with an OutOfMemoryError when the data is large.
How should I tune it?

spark-defaults.conf:

    spark.shuffle.consolidateFiles true 
    spark.shuffle.manager SORT
    spark.akka.threads 4
    spark.sql.inMemoryColumnarStorage.compressed true
    spark.io.compression.codec lz4

Launch command:

    ./spark-sql cluster --master spark://master:7077 \
        --conf spark.akka.frameSize=2000 \
        --executor-memory 30g \
        --total-executor-cores 300 \
        --driver-class-path /spark/default/lib/mysql-connector-java-5.1.21.jar

Statements run at the spark-sql prompt:

    SET spark.serializer=org.apache.spark.serializer.KryoSerializer; 
    SET spark.sql.shuffle.partitions=10000;
    select sum(uv) as suv, sum(vv) as svv
    from (select 1 as uv, count(1) as vv from test ta group by cookieid) t1;
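To make the intent of the query concrete, here is a small Python sketch of what it computes. It is not Spark code, and the function name `summarize` and the sample rows are made up for illustration. The inner query emits one row per distinct cookieid with uv = 1 and vv = the number of rows for that cookieid; the outer query sums both columns, so suv is the number of distinct cookieid values and svv is the total row count.

```python
from collections import Counter

def summarize(cookie_ids):
    """Mimic the nested query on a plain list of cookieid values."""
    # Inner query: group by cookieid, emit (uv = 1, vv = count(1)) per group.
    per_cookie = [(1, n) for n in Counter(cookie_ids).values()]
    # Outer query: sum both columns over all groups.
    suv = sum(uv for uv, _ in per_cookie)
    svv = sum(vv for _, vv in per_cookie)
    return suv, svv

# Hypothetical cookieid column of the `test` table.
rows = ["c1", "c2", "c1", "c3", "c1"]
print(summarize(rows))  # (3, 5): 3 distinct cookieids, 5 rows in total
```

The group-by on cookieid is the step that is shuffled across the 10000 partitions set above.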




14/10/28 14:42:52 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkDriver-akka.actor.default-dispatcher-14] shutting down ActorSystem [sparkDriver]
java.lang.OutOfMemoryError: Java heap space 
at com.google.protobuf_spark.ByteString.copyFrom(ByteString.java:90) 
at com.google.protobuf_spark.ByteString.copyFrom(ByteString.java:99) 
at akka.remote.MessageSerializer$.serialize(MessageSerializer.scala:36) 
at akka.remote.EndpointWriter$$anonfun$akka$remote$EndpointWriter$$serializeMessage$1.apply(Endpoint.scala:672)
at akka.remote.EndpointWriter$$anonfun$akka$remote$EndpointWriter$$serializeMessage$1.apply(Endpoint.scala:672)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at akka.remote.EndpointWriter.akka$remote$EndpointWriter$$serializeMessage(Endpoint.scala:671)
at akka.remote.EndpointWriter$$anonfun$7.applyOrElse(Endpoint.scala:559) 
at akka.remote.EndpointWriter$$anonfun$7.applyOrElse(Endpoint.scala:544) 
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33) 
at akka.actor.FSM$class.processEvent(FSM.scala:595) 
at akka.remote.EndpointWriter.processEvent(Endpoint.scala:443) 
at akka.actor.FSM$class.akka$actor$FSM$$processMsg(FSM.scala:589) 
at akka.actor.FSM$$anonfun$receive$1.applyOrElse(FSM.scala:583) 
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498) 
at akka.actor.ActorCell.invoke(ActorCell.scala:456) 
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237) 
at akka.dispatch.Mailbox.run(Mailbox.scala:219) 
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) 
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) 
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) 
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) 
14/10/28 14:42:55 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkDriver-akka.actor.default-dispatcher-36] shutting down ActorSystem [sparkDriver]
java.lang.OutOfMemoryError: Java heap space


Zhanfeng Huo