flink-user mailing list archives

From miki haiat <miko5...@gmail.com>
Subject Temporary failure in name resolution
Date Tue, 03 Apr 2018 06:26:30 GMT
I tried to run Flink both on Kubernetes and as a standalone HA cluster, and in
both cases the TaskManager was lost/killed after a few hours or days.
I'm using Ubuntu and Flink 1.4.2.

This is part of the log; I have also attached the full log.

>
> org.tlv.esb.StreamingJob$EsbTraceEvictor@20ffca60, WindowedStream.apply(WindowedStream.java:1061)) -> Sink: Unnamed (1/1) (91b27853aa30be93322d9c516ec266bf) switched from RUNNING to FAILED.
> java.lang.Exception: TaskManager was lost/killed: 6dc6cd5c15588b49da39a31b6480b2e3 @ beam2 (dataPort=42587)
> at org.apache.flink.runtime.instance.SimpleSlot.releaseSlot(SimpleSlot.java:217)
> at org.apache.flink.runtime.instance.SlotSharingGroupAssignment.releaseSharedSlot(SlotSharingGroupAssignment.java:523)
> at org.apache.flink.runtime.instance.SharedSlot.releaseSlot(SharedSlot.java:192)
> at org.apache.flink.runtime.instance.Instance.markDead(Instance.java:167)
> at org.apache.flink.runtime.instance.InstanceManager.unregisterTaskManager(InstanceManager.java:212)
> at org.apache.flink.runtime.jobmanager.JobManager.org$apache$flink$runtime$jobmanager$JobManager$$handleTaskManagerTerminated(JobManager.scala:1198)
> at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1.applyOrElse(JobManager.scala:1096)
> at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
> at org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:49)
> at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
> at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
> at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
> at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
> at org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
> at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
> at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:122)
> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
> at akka.actor.dungeon.DeathWatch$class.receivedTerminated(DeathWatch.scala:46)
> at akka.actor.ActorCell.receivedTerminated(ActorCell.scala:374)
> at akka.actor.ActorCell.autoReceiveMessage(ActorCell.scala:511)
> at akka.actor.ActorCell.invoke(ActorCell.scala:494)
> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
> at akka.dispatch.Mailbox.run(Mailbox.scala:224)
> at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 2018-04-02 13:09:01,727 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Job Flink Streaming esb correlate msg (0db04ff29124f59a123d4743d89473ed) switched from state RUNNING to FAILING.
> java.lang.Exception: TaskManager was lost/killed: 6dc6cd5c15588b49da39a31b6480b2e3 @ beam2 (dataPort=42587)
> at org.apache.flink.runtime.instance.SimpleSlot.releaseSlot(SimpleSlot.java:217)
> at org.apache.flink.runtime.instance.SlotSharingGroupAssignment.releaseSharedSlot(SlotSharingGroupAssignment.java:523)
> at org.apache.flink.runtime.instance.SharedSlot.releaseSlot(SharedSlot.java:192)
> at org.apache.flink.runtime.instance.Instance.markDead(Instance.java:167)
> at org.apache.flink.runtime.instance.InstanceManager.unregisterTaskManager(InstanceManager.java:212)
> at org.apache.flink.runtime.jobmanager.JobManager.org$apache$flink$runtime$jobmanager$JobManager$$handleTaskManagerTerminated(JobManager.scala:1198)
> at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1.applyOrElse(JobManager.scala:1096)
> at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
> at org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:49)
> at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
> at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
> at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
> at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
> at org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
> at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
> at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:122)
> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
> at akka.actor.dungeon.DeathWatch$class.receivedTerminated(DeathWatch.scala:46)
> at akka.actor.ActorCell.receivedTerminated(ActorCell.scala:374)
> at akka.actor.ActorCell.autoReceiveMessage(ActorCell.scala:511)
> at akka.actor.ActorCell.invoke(ActorCell.scala:494)
> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
> at akka.dispatch.Mailbox.run(Mailbox.scala:224)
> at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 2018-04-02 13:09:01,737 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom Source (1/1) (a10c25c2d3de57d33828524938fcfcc2) switched from RUNNING to CANCELING.
