hadoop-mapreduce-dev mailing list archives

From "Vinod Kumar Vavilapalli (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (MAPREDUCE-5703) Job client gets failure though RM side job execution result is FINISHED and SUCCEEDED
Date Sat, 09 May 2015 00:42:02 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-5703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli resolved MAPREDUCE-5703.
------------------------------------------------
    Resolution: Duplicate

I have run into the same problem many times before.
 - The AM fails to write the history file because it runs out of good DataNodes
to write to. The reports on this JIRA (a 1-node cluster and a 3-node cluster) point to this.
 - The reason the RM says the job succeeded while the JHS cannot confirm it is the same as in MAPREDUCE-5547.

Closing as a duplicate. Please respond if you disagree.

> Job client gets failure though RM side job execution result is FINISHED and SUCCEEDED
> -------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5703
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5703
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: client
>            Reporter: Ashutosh Jindal
>            Priority: Critical
>
> 1) Run an MR job.
> 2) After the reduce phase has completed, while the JHS file is being written, restart the DN.
> The RM shows the job as successful.
> The JHS doesn't have any info about the job.
> The job client gets an NPE and exits with code 255.
> java.io.IOException: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): java.lang.NullPointerException
> 	at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler.getTaskAttemptCompletionEvents(HistoryClientService.java:269)
> 	at org.apache.hadoop.mapreduce.v2.api.impl.pb.service.MRClientProtocolPBServiceImpl.getTaskAttemptCompletionEvents(MRClientProtocolPBServiceImpl.java:173)
> 	at org.apache.hadoop.yarn.proto.MRClientProtocol$MRClientProtocolService$2.callBlockingMethod(MRClientProtocol.java:283)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:929)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2080)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2076)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:396)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2074)
> 	at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:330)
> 	at org.apache.hadoop.mapred.ClientServiceDelegate.getTaskCompletionEvents(ClientServiceDelegate.java:382)
> 	at org.apache.hadoop.mapred.YARNRunner.getTaskCompletionEvents(YARNRunner.java:529)
> 	at org.apache.hadoop.mapreduce.Job$5.run(Job.java:668)
> 	at org.apache.hadoop.mapreduce.Job$5.run(Job.java:665)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:396)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> 	at org.apache.hadoop.mapreduce.Job.getTaskCompletionEvents(Job.java:665)
> 	at org.apache.hadoop.mapreduce.Job.monitorAndPrintJob(Job.java:1349)
> 	at org.apache.hadoop.mapred.JobClient$NetworkedJob.monitorAndPrintJob(JobClient.java:407)
> 	at org.apache.hadoop.mapred.JobClient.monitorAndPrintJob(JobClient.java:855)
> 	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:835)
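
The trace above shows the client failing inside monitorAndPrintJob while asking the JHS for completion events, even though the RM had already recorded the job as FINISHED and SUCCEEDED. As a rough illustration only (this is not the fix tracked in MAPREDUCE-5547), a wrapper script could cross-check the RM's view before treating exit code 255 as a job failure. The sketch below assumes the caller has already fetched the "app" object from the RM REST endpoint /ws/v1/cluster/apps/{appid} and reads only its "state" and "finalStatus" fields; the function name effective_outcome is hypothetical.

```python
# Hypothetical helper, not part of Hadoop: reconcile a failed client exit
# code with what the ResourceManager recorded for the application.

def effective_outcome(client_exit_code, rm_app):
    """Return 'SUCCEEDED', 'FAILED', or 'UNKNOWN'.

    rm_app is assumed to be the dict under the "app" key of the RM REST
    response for /ws/v1/cluster/apps/{appid}, or None if it could not be
    fetched. Only the "state" and "finalStatus" fields are consulted.
    """
    # A clean client exit needs no cross-check.
    if client_exit_code == 0:
        return "SUCCEEDED"

    # Without the RM's record there is nothing to reconcile against.
    if rm_app is None:
        return "UNKNOWN"

    state = rm_app.get("state")
    final_status = rm_app.get("finalStatus")

    # The case from this report: the client failed, but the RM says the
    # application finished and succeeded.
    if state == "FINISHED" and final_status == "SUCCEEDED":
        return "SUCCEEDED"

    # The RM itself recorded a failure or a kill.
    if state in ("FAILED", "KILLED") or final_status in ("FAILED", "KILLED"):
        return "FAILED"

    # Still running, or final status undefined: do not guess.
    return "UNKNOWN"
```

With the values from this report, effective_outcome(255, {"state": "FINISHED", "finalStatus": "SUCCEEDED"}) returns "SUCCEEDED", so the wrapper could log the client-side NPE as a history-lookup problem rather than a job failure.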



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
