spark-user mailing list archives

From Aaron Davidson <ilike...@gmail.com>
Subject Re: Akka disassociation on Java SE Embedded
Date Tue, 27 May 2014 17:47:51 GMT
Spark should effectively turn Akka's failure detector off, because we
historically had problems with GCs and other issues causing
disassociations. The only thing that should cause these messages nowadays
is if the TCP connection (which Akka sustains between Actor Systems on
different machines) actually drops. TCP connections are pretty resilient,
so one common cause of this is actual Executor failure -- recently, I have
experienced a similar-sounding problem due to my machine's OOM killer
terminating my Executors, such that they didn't produce any error output.
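
One quick way to check for that failure mode is to look at the kernel log on each worker machine. This is a generic diagnostic, not something Spark-specific; the exact message wording varies by kernel version:

```shell
# Check whether the kernel's OOM killer terminated any processes.
# Executors killed this way exit abruptly without writing error output,
# so the only trace is usually in the kernel log.
dmesg | grep -iE 'killed process|out of memory'
```

If a `java` process shows up there around the time of the disassociation, the executor was OOM-killed rather than failing on its own.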


On Thu, May 22, 2014 at 9:19 AM, Chanwit Kaewkasi <chanwit@gmail.com> wrote:

> Hi all,
>
> On an ARM cluster, I have been testing a wordcount program with JRE 7
> and everything is OK. But when changing to the embedded version of
> Java SE (Oracle's eJRE), the same program cannot complete all
> computing stages.
>
> It fails with many Akka disassociations.
>
> - I've been trying to increase Akka's timeouts, but I'm still stuck and
> not sure of the right way to do so. (I suspect stop-the-world GC pauses
> are causing this.)
>
> - Another question: how can I properly turn on Akka's logging to find
> the root cause of this disassociation problem, in case my guess about
> GC is wrong?
>
> Best regards,
>
> -chanwit
>
> --
> Chanwit Kaewkasi
> linkedin.com/in/chanwit
>
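
For the two questions above, a sketch of the relevant `spark-defaults.conf` entries follows. The property names are from the Spark 1.0-era configuration page (Akka-based RPC); verify them against the docs for your exact Spark version before relying on them, and treat the values as illustrative starting points, not recommendations:

```
# Sketch for spark-defaults.conf (Spark 1.0-era property names -- verify
# against your version's configuration docs before use).
spark.akka.timeout              300     # communication timeout, seconds
spark.akka.heartbeat.pauses     600     # tolerated heartbeat pause, seconds
spark.akka.heartbeat.interval   1000    # heartbeat interval, seconds
spark.akka.logLifecycleEvents   true    # log Akka lifecycle events such as
                                        # association/disassociation
```

Setting `spark.akka.logLifecycleEvents` to `true` should surface the disassociation events in the logs, which helps distinguish a GC-induced pause from an actual TCP drop or executor death.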
