flink-issues mailing list archives

From "Till Rohrmann (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (FLINK-10482) java.lang.IllegalArgumentException: Negative number of in progress checkpoints
Date Fri, 07 Dec 2018 10:46:00 GMT

     [ https://issues.apache.org/jira/browse/FLINK-10482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Till Rohrmann resolved FLINK-10482.
-----------------------------------
       Resolution: Fixed
    Fix Version/s: 1.5.6

Fixed via
1.8.0: 114cb2cabe5a2c236f02089675a57ec44bb1e4bd
1.7.1: f19bc72e910615a4d122f2fe3777fde6774bc001
1.6.3: 0664e02cad06a4494293e5f350554f1c1279d936
1.5.6: ddbef1be626a60d2d9a6c7c162c806a5d2953818

> java.lang.IllegalArgumentException: Negative number of in progress checkpoints
> ------------------------------------------------------------------------------
>
>                 Key: FLINK-10482
>                 URL: https://issues.apache.org/jira/browse/FLINK-10482
>             Project: Flink
>          Issue Type: Bug
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.6.1
>            Reporter: Julio Biason
>            Assignee: Andrey Zagrebin
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.5.6, 1.6.3, 1.8.0, 1.7.1
>
>
> Recently I found the following log on my JobManager log:
> {noformat}
> 2018-10-02 17:44:50,090 [flink-akka.actor.default-dispatcher-4117] ERROR org.apache.flink.runtime.rest.handler.job.JobDetailsHandler - Implementation error: Unhandled exception.
> java.lang.IllegalArgumentException: Negative number of in progress checkpoints
>          at org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:139)
>          at org.apache.flink.runtime.checkpoint.CheckpointStatsCounts.<init>(CheckpointStatsCounts.java:72)
>          at org.apache.flink.runtime.checkpoint.CheckpointStatsCounts.createSnapshot(CheckpointStatsCounts.java:177)
>          at org.apache.flink.runtime.checkpoint.CheckpointStatsTracker.createSnapshot(CheckpointStatsTracker.java:166)
>          at org.apache.flink.runtime.executiongraph.ExecutionGraph.getCheckpointStatsSnapshot(ExecutionGraph.java:553)
>          at org.apache.flink.runtime.executiongraph.ArchivedExecutionGraph.createFrom(ArchivedExecutionGraph.java:340)
>          at org.apache.flink.runtime.jobmaster.JobMaster.requestJob(JobMaster.java:923)
>          at sun.reflect.GeneratedMethodAccessor101.invoke(Unknown Source)
>          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>          at java.lang.reflect.Method.invoke(Method.java:498)
>          at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:247)
>          at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:162)
>          at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:70)
>          at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.onReceive(AkkaRpcActor.java:142)
>          at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.onReceive(FencedAkkaRpcActor.java:40)
>          at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:165)
>          at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
>          at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
>          at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
>          at akka.actor.ActorCell.invoke(ActorCell.scala:495)
>          at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
>          at akka.dispatch.Mailbox.run(Mailbox.scala:224)
>          at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
>          at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>          at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>          at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>          at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {noformat}
> Related: the job details don't appear; the screen shows only the skeleton, with no information (such as the pipeline, subtasks, etc.).
> One thing that may have caused this: the job was failing because of an uncaught exception in our code, and during one of its restarts I issued a "flink cancel <jobid>". The job was cancelled, but the JobManager interface took a very long time to mark the slots as available again.
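
The trace above shows the exception being raised while a statistics snapshot is constructed, not at the point where the counter was changed. The following is a minimal sketch of that pattern only. It is not Flink's CheckpointStatsCounts code: the class name CheckpointCountsSketch, its methods, and the double-termination scenario are all assumptions made for illustration, and this message does not state the actual root cause.

```java
// Hypothetical sketch; NOT Flink's actual CheckpointStatsCounts implementation.
final class CheckpointCountsSketch {
    private int inProgress;
    private int completed;
    private int failed;

    /** Immutable view of the counter, validated when it is constructed. */
    static final class Snapshot {
        final int inProgress;

        Snapshot(int inProgress) {
            // Same message as in the log; the check fires at snapshot time,
            // long after the bookkeeping mistake happened.
            if (inProgress < 0) {
                throw new IllegalArgumentException(
                        "Negative number of in progress checkpoints");
            }
            this.inProgress = inProgress;
        }
    }

    void start() { inProgress++; }

    void complete() { inProgress--; completed++; }

    void fail() { inProgress--; failed++; }

    Snapshot createSnapshot() { return new Snapshot(inProgress); }

    public static void main(String[] args) {
        CheckpointCountsSketch counts = new CheckpointCountsSketch();
        counts.start();
        counts.complete();
        // Assumed scenario: the same checkpoint is reported as terminated twice.
        counts.fail();
        try {
            counts.createSnapshot();
        } catch (IllegalArgumentException e) {
            System.out.println("Snapshot rejected: " + e.getMessage());
        }
    }
}
```

In this sketch every later request for a snapshot fails until the counter is corrected, which would be consistent with a job details view that stays empty, although the message does not confirm that this is what happened.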



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
