spark-user mailing list archives

From Iain Cundy <Iain.Cu...@amdocs.com>
Subject RE: [MARKETING] Re: mapwithstate Hangs with Error cleaning broadcast
Date Tue, 15 Mar 2016 18:30:55 GMT
Hi Manas

I saw a very similar problem while using mapWithState: a timeout on a BlockManager remove leading
to a stall.

In my case it only occurred when there was a big backlog of micro-batches combined with a
shortage of memory, and the adding and removing of blocks between new and old tasks was interleaved.
I don't really know what caused it. Once I fixed the problem that was causing the backlog
– in my case state compaction not working with Kryo in 1.6.0 (with a Kryo workaround rather
than a patch) – I've never seen it again.
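For reference, a workaround of that general shape would go in spark-defaults.conf (or the equivalent SparkConf calls). The property keys below are standard Spark 1.6 configuration keys, but the class names are placeholders – the exact types to register depend on your job, so treat this as a sketch rather than the fix I applied:

```properties
# Use Kryo for serialization, but don't fail hard on unregistered classes
spark.serializer                org.apache.spark.serializer.KryoSerializer
spark.kryo.registrationRequired false
# Register your own key/value/state classes (placeholder names - substitute your types)
spark.kryo.classesToRegister    com.example.MyKey,com.example.MyState
```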

So if you’ve got a backlog or other issue to fix maybe you’ll get lucky too ☺.

Cheers
Iain

From: manas kar [mailto:poorinspark@gmail.com]
Sent: 15 March 2016 14:49
To: Ted Yu
Cc: user
Subject: [MARKETING] Re: mapwithstate Hangs with Error cleaning broadcast

I am using Spark 1.6.
I am not using any broadcast variable myself.
The broadcast variable in the error is probably used internally by mapWithState's state management.

...Manas
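For readers unfamiliar with the operator: mapWithState keeps per-key state across micro-batches, applying a user-supplied update function to each (key, value) together with that key's existing state. A minimal pure-Scala model of that semantics (a conceptual sketch with no Spark dependency – not Spark's actual implementation):

```scala
// Conceptual model: fold each micro-batch's (key, value) pairs into a
// per-key state map, the way a mapWithState update function sees them.
def runBatches(
    batches: Seq[Seq[(String, Int)]],
    update: (Int, Option[Int]) => Int): Map[String, Int] =
  batches.foldLeft(Map.empty[String, Int]) { (state, batch) =>
    batch.foldLeft(state) { case (s, (key, value)) =>
      s.updated(key, update(value, s.get(key)))
    }
  }

// A running-count update function, analogous to one you would pass
// to StateSpec.function in the real API
val count = (value: Int, state: Option[Int]) => state.getOrElse(0) + value

val finalState = runBatches(
  Seq(Seq("a" -> 1, "b" -> 2), Seq("a" -> 3)), count)
// finalState: Map(a -> 4, b -> 2)
```

In real Spark code the state lives in partitioned state maps on the executors, which is where the broadcast and block activity in this thread comes from.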

On Tue, Mar 15, 2016 at 10:40 AM, Ted Yu <yuzhihong@gmail.com> wrote:
Which version of Spark are you using ?

Can you show the code snippet w.r.t. broadcast variable ?

Thanks

On Tue, Mar 15, 2016 at 6:04 AM, manasdebashiskar <poorinspark@gmail.com> wrote:
Hi,
 I have a streaming application that takes data from a Kafka topic and uses
mapWithState.
 After a couple of hours of smooth running, the application appears to have
stalled.
The batch seems to be stuck after the following error popped up.
Has anyone seen this error, or does anyone know what causes it?
14/03/2016 21:41:13,295 ERROR org.apache.spark.ContextCleaner: 95 - Error cleaning broadcast 7456
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. This timeout is controlled by spark.rpc.askTimeout
        at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
        at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
        at org.apache.spark.storage.BlockManagerMaster.removeBroadcast(BlockManagerMaster.scala:136)
        at org.apache.spark.broadcast.TorrentBroadcast$.unpersist(TorrentBroadcast.scala:228)
        at org.apache.spark.broadcast.TorrentBroadcastFactory.unbroadcast(TorrentBroadcastFactory.scala:45)
        at org.apache.spark.broadcast.BroadcastManager.unbroadcast(BroadcastManager.scala:67)
        at org.apache.spark.ContextCleaner.doCleanupBroadcast(ContextCleaner.scala:233)
        at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1$$anonfun$apply$mcV$sp$2.apply(ContextCleaner.scala:189)
        at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1$$anonfun$apply$mcV$sp$2.apply(ContextCleaner.scala:180)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:180)
        at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1180)
        at org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:173)
        at org.apache.spark.ContextCleaner$$anon$3.run(ContextCleaner.scala:68)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
        at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
        at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
        at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
        at scala.concurrent.Await$.result(package.scala:107)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
        ... 12 more
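The error message itself names the relevant knob: the broadcast-cleanup RPC gave up after the default 120 seconds. Raising the timeout does not fix whatever is causing the backlog, but it can keep the ContextCleaner from erroring out while you investigate. A hedged stopgap using standard Spark configuration keys:

```properties
# spark-defaults.conf (or SparkConf.set) - a stopgap, not a fix:
# give the BlockManager remove RPC more time before the cleaner errors out
spark.rpc.askTimeout                      600s
# optionally make cleanup non-blocking so a slow remove can't stall a batch
spark.cleaner.referenceTracking.blocking  false
```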




--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/mapwithstate-Hangs-with-Error-cleaning-broadcast-tp26500.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


