spark-user mailing list archives

From Grega Kešpret <gr...@celtra.com>
Subject Re: Worker hangs with 100% CPU in Standalone cluster
Date Thu, 16 Jan 2014 13:58:22 GMT
Just to follow up: we have since traced the problem to our application
code, not Spark. In some cases there was an infinite loop in the Scala
HashTable linear probing algorithm, where an element's next() pointed at
itself. It was probably caused by incorrect hashCode() and equals() methods
on the object we were storing.
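
For readers who hit the same symptom, here is a minimal sketch of a key class whose equals() and hashCode() agree with each other and depend only on immutable fields. It is written in Java because the contract is the same for any JVM hash table. The class and field names (EventKey, campaignId, creativeId) are hypothetical; the class involved in this thread was not posted, and this sketch does not reproduce the hang itself.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical key type for illustration; not the class from the thread.
final class EventKey {
    // Final fields: the hash cannot change while the key sits in a table.
    private final String campaignId;
    private final int creativeId;

    public EventKey(String campaignId, int creativeId) {
        this.campaignId = campaignId;
        this.creativeId = creativeId;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof EventKey)) return false;
        EventKey that = (EventKey) other;
        return creativeId == that.creativeId
            && Objects.equals(campaignId, that.campaignId);
    }

    // Hashes exactly the fields equals() compares, so equal keys hash alike.
    @Override
    public int hashCode() {
        return Objects.hash(campaignId, creativeId);
    }

    public static void main(String[] args) {
        Map<EventKey, Integer> counts = new HashMap<>();
        // Two distinct but equal instances must land on the same entry.
        counts.merge(new EventKey("c-1", 7), 1, Integer::sum);
        counts.merge(new EventKey("c-1", 7), 1, Integer::sum);
        System.out.println(counts.get(new EventKey("c-1", 7))); // prints 2
    }
}
```

If equals() and hashCode() were computed from different fields, or from a field that is mutated after insertion, the two merge calls above could end up as separate entries and later lookups could miss them.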

Milos, we also have the Master node separate from the Worker nodes. Could
someone from the Spark team comment on that?

Grega
--
*Grega Kešpret*
Analytics engineer

Celtra — Rich Media Mobile Advertising
celtra.com <http://www.celtra.com/> |
@celtramobile<http://www.twitter.com/celtramobile>


On Thu, Jan 16, 2014 at 2:46 PM, Milos Nikolic <milos.nikolic83@gmail.com> wrote:

> Hello,
>
> I’m facing the same (or similar) problem. In my case, the last two tasks
> hang in a map function following sc.sequenceFile(…). It happens from time
> to time (more often with TorrentBroadcast than HttpBroadcast) and after
> restarting it works fine.
>
> The problem always happens on the same node — on the node that plays the
> roles of the master and one worker. Once this node becomes master-only
> (i.e., I removed this node from conf/slaves), the problem is gone.
>
> Does that mean that the master and workers have to be on separate nodes?
>
> Best,
> Milos
>
>
> On Jan 6, 2014, at 5:44 PM, Grega Kešpret <grega@celtra.com> wrote:
>
> Hi,
>
> Several times a day, we see one worker in a Standalone cluster hang at
> 100% CPU on the last task and fail to proceed. After we restart the job,
> it completes successfully.
>
> We are using Spark v0.8.1-incubating.
>
> Attached please find jstack logs of Worker
> and CoarseGrainedExecutorBackend JVM processes.
>
> Grega
> --
> *Grega Kešpret*
> Analytics engineer
>
> Celtra — Rich Media Mobile Advertising
> celtra.com <http://www.celtra.com/> | @celtramobile<http://www.twitter.com/celtramobile>
>  <logs.zip>
>
>
>
