hadoop-mapreduce-dev mailing list archives

From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (MAPREDUCE-2718) Job fails if AppMaster is killed
Date Fri, 16 Sep 2011 23:39:09 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-2718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy resolved MAPREDUCE-2718.
--------------------------------------

    Resolution: Not A Problem

> Job fails if AppMaster is killed
> --------------------------------
>
>                 Key: MAPREDUCE-2718
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2718
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>            Reporter: Amol Kekre
>             Fix For: 0.23.0
>
>
> Started a cluster. Submitted a sleep job with around 10000 maps and 1000 reduces.
> When 5000 maps had completed, killed the AppMaster.
> The RM web UI showed the application as failed.
> The job client, after retrying 50 times, failed with:
> {
> java.lang.reflect.UndeclaredThrowableException
>         at org.apache.hadoop.mapreduce.v2.api.impl.pb.client.MRClientProtocolPBClientImpl.getTaskAttemptCompletionEvents(MRClientProtocolPBClientImpl.java:161)
>         at org.apache.hadoop.mapred.ClientServiceDelegate.getTaskCompletionEvents(ClientServiceDelegate.java:254)
>         at org.apache.hadoop.mapred.YARNRunner.getTaskCompletionEvents(YARNRunner.java:520)
>         at org.apache.hadoop.mapreduce.Job.getTaskCompletionEvents(Job.java:540)
>         at org.apache.hadoop.mapreduce.Job.monitorAndPrintJob(Job.java:1130)
>         at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1084)
>         at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:259)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
>         at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:191)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
>         at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
>         at org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:111)
>         at org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:118)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:192)
> Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: Call to /98.137.103.174:42557 failed on connection exception: java.net.ConnectException: Connection refused
>         at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:96)
>         at $Proxy11.getTaskAttemptCompletionEvents(Unknown Source)
>         at org.apache.hadoop.mapreduce.v2.api.impl.pb.client.MRClientProtocolPBClientImpl.getTaskAttemptCompletionEvents(MRClientProtocolPBClientImpl.java:154)
>         ... 21 more
> Caused by: java.net.ConnectException: Call to /... failed on connection exception: java.net.ConnectException: Connection refused
>         at org.apache.hadoop.ipc.Client.wrapException(Client.java:1087)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1063)
>         at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:250)
>         at org.apache.hadoop.yarn.ipc.$Proxy10.call(Unknown Source)
>         at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:94)
>         ... 23 more
> Caused by: java.net.ConnectException: Connection refused
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:375)
>         at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:448)
>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:536)
>         at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:211)
>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:1196)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1040)
> }

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
