flink-user mailing list archives

From Chesnay Schepler <ches...@apache.org>
Subject Re: java.lang.Exception: TaskManager was lost/killed
Date Mon, 09 Apr 2018 19:48:16 GMT
We will need more information to offer any solution. The exception 
simply means that a TaskManager shut down, for which there are a myriad 
of possible explanations.

Please have a look at the TaskManager logs; they may contain a hint as 
to why it shut down.
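
As a rough starting point, a minimal sketch like the one below could scan
the TaskManager logs for the memory-related errors mentioned in this
thread (heap and Metaspace exhaustion). The matched strings are standard
JVM OutOfMemoryError messages; the labels, the function names and the
choice of patterns are assumptions made for illustration, not anything
Flink provides.

```python
import re
import sys
from collections import Counter

# Hypothetical pattern table. The labels are illustrative; the matched
# strings are the standard JVM OutOfMemoryError messages.
PATTERNS = {
    "heap exhausted": re.compile(r"OutOfMemoryError: Java heap space"),
    "metaspace exhausted": re.compile(r"OutOfMemoryError: Metaspace"),
    "compressed class space exhausted": re.compile(
        r"OutOfMemoryError: Compressed class space"
    ),
}


def scan_log_lines(lines):
    """Count how many lines match each known failure pattern."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


def scan_log_file(path):
    """Scan one log file, tolerating undecodable bytes."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return scan_log_lines(handle)


if __name__ == "__main__":
    # Usage: python scan_tm_logs.py <taskmanager log files...>
    for log_path in sys.argv[1:]:
        print(log_path, dict(scan_log_file(log_path)))
```

An empty result does not rule out a memory problem: if the operating
system killed the process, the TaskManager log may simply stop without
any error, and the system logs are the place to look.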

On 09.04.2018 16:01, Javier Lopez wrote:
> Hi,
>
> "are you moving the job  jar to  the ~/flink-1.4.2/lib path ?  " -> 
> Yes, to every node in the cluster.
>
> On 9 April 2018 at 15:37, miki haiat <miko5054@gmail.com> wrote:
>
>     Javier
>     "adding the jar file to the /lib path of every task manager"
>     Are you moving the job jar to the ~/flink-1.4.2/lib path?
>
>     On Mon, Apr 9, 2018 at 12:23 PM, Javier Lopez
>     <javier.lopez@zalando.de> wrote:
>
>         Hi,
>
>         We had the same metaspace problem. It was solved by adding the
>         jar file to the /lib path of every task manager, as explained
>         here:
>         https://ci.apache.org/projects/flink/flink-docs-release-1.4/monitoring/debugging_classloading.html#avoiding-dynamic-classloading
>         We also added these Java options:
>         "-XX:CompressedClassSpaceSize=100M -XX:MaxMetaspaceSize=300M
>         -XX:MetaspaceSize=200M"
>
>         From time to time we have the same problem with TaskManagers
>         disconnecting, but the logs are not useful. We are using 1.3.2.
>
>         On 9 April 2018 at 10:41, Alexander Smirnov
>         <alexander.smirnoff@gmail.com> wrote:
>
>             I've seen a similar problem, but the cause was not heap
>             size but Metaspace.
>             It was caused by a job restarting in a loop. It looks like
>             Flink loads a new instance of the classes on each restart
>             and very soon runs out of metaspace.
>
>             I've created a JIRA issue for this problem, but got no
>             response from the development team on it:
>             https://issues.apache.org/jira/browse/FLINK-9132
>
>
>             On Mon, Apr 9, 2018 at 11:36 AM 王凯 <wangkaibg@163.com>
>             wrote:
>
>                 Thanks a lot, I will try it.
>
>                 On 2018-04-09 00:06:02, "TechnoMage"
>                 <mlatta@technomage.com> wrote:
>
>                     I have seen this when my task manager ran out of
>                     RAM. Increase the heap size.
>
>                     flink-conf.yaml:
>                     taskmanager.heap.mb
>                     jobmanager.heap.mb
>
>                     Michael
>
>>                     On Apr 8, 2018, at 2:36 AM, 王凯
>>                     <wangkaibg@163.com> wrote:
>>
>>                     <QQ图片20180408163927.png>
>>                     Hi all, I recently ran into a problem: the job runs
>>                     well at startup, but after running for a long time
>>                     the exception shown above appears. How can I
>>                     resolve it?
>>
>>
>>
>
>
>
>
>
>

