flink-user mailing list archives

From Javier Lopez <javier.lo...@zalando.de>
Subject Re: Re: java.lang.Exception: TaskManager was lost/killed
Date Mon, 09 Apr 2018 14:01:38 GMT
Hi,

"Are you moving the job jar to the ~/flink-1.4.2/lib path?" -> Yes, to
every node in the cluster.

On 9 April 2018 at 15:37, miki haiat <miko5054@gmail.com> wrote:

> Javier
> "adding the jar file to the /lib path of every task manager"
> Are you moving the job jar to the ~/flink-1.4.2/lib path?
>
> On Mon, Apr 9, 2018 at 12:23 PM, Javier Lopez <javier.lopez@zalando.de>
> wrote:
>
>> Hi,
>>
>> We had the same metaspace problem. It was solved by adding the jar file
>> to the /lib path of every task manager, as explained here:
>> https://ci.apache.org/projects/flink/flink-docs-release-1.4/monitoring/debugging_classloading.html#avoiding-dynamic-classloading
>> We also added these Java options: "-XX:CompressedClassSpaceSize=100M
>> -XX:MaxMetaspaceSize=300M -XX:MetaspaceSize=200M"
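>>
>> A minimal sketch of how such options might be set, assuming they are
>> passed through the env.java.opts key in flink-conf.yaml (this message
>> does not say how the options were actually supplied):
>>
>> ```yaml
>> # flink-conf.yaml (assumed location; the thread does not confirm it)
>> # JVM options quoted in this message, passed to the Flink JVM processes
>> env.java.opts: "-XX:CompressedClassSpaceSize=100M -XX:MaxMetaspaceSize=300M -XX:MetaspaceSize=200M"
>> ```
>>
>> The sizes are the ones reported here; suitable values depend on how many
>> classes the job and its dependencies load.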
>>
>> From time to time we have the same problem with TaskManagers
>> disconnecting, but the logs are not useful. We are using 1.3.2.
>>
>> On 9 April 2018 at 10:41, Alexander Smirnov
>> <alexander.smirnoff@gmail.com> wrote:
>>
>>> I've seen a similar problem, but the cause was not heap size but
>>> Metaspace. It was caused by a job restarting in a loop. It looks like
>>> Flink loads a new instance of the classes on each restart, and very soon
>>> it runs out of Metaspace.
>>>
>>> I've created a JIRA issue for this problem, but got no response from the
>>> development team on it: https://issues.apache.org/jira/browse/FLINK-9132
>>>
>>>
>>> On Mon, Apr 9, 2018 at 11:36 AM 王凯 <wangkaibg@163.com> wrote:
>>>
>>>> Thanks a lot, I will try it.
>>>>
>>>> At 2018-04-09 00:06:02, "TechnoMage" <mlatta@technomage.com> wrote:
>>>>
>>>> I have seen this when my task manager ran out of RAM.  Increase the
>>>> heap size.
>>>>
>>>> flink-conf.yaml:
>>>> taskmanager.heap.mb
>>>> jobmanager.heap.mb
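>>>>
>>>> A minimal sketch of those two keys with values filled in. The numbers
>>>> are illustrative assumptions, not recommendations from this thread;
>>>> size them to the memory actually available on each node:
>>>>
>>>> ```yaml
>>>> # flink-conf.yaml
>>>> # Heap sizes in megabytes (example values only)
>>>> taskmanager.heap.mb: 4096
>>>> jobmanager.heap.mb: 1024
>>>> ```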
>>>>
>>>> Michael
>>>>
>>>> On Apr 8, 2018, at 2:36 AM, 王凯 <wangkaibg@163.com> wrote:
>>>>
>>>> <QQ图片20180408163927.png>
>>>> Hi all, recently I ran into a problem: the job runs well at startup,
>>>> but after running for a long time the exception shown above appears.
>>>> How can I resolve it?
>>>>
>>>
>>
>
