hadoop-mapreduce-dev mailing list archives

From "Gera Shegalov (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (MAPREDUCE-6129) Job failed due to counter out of limited in MRAppMaster
Date Thu, 16 Oct 2014 20:27:34 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-6129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Gera Shegalov resolved MAPREDUCE-6129.
--------------------------------------
    Resolution: Duplicate

[~kasha], [~coderplay], yes, this is a subset of MAPREDUCE-5875.

> Job failed due to counter out of limited in MRAppMaster
> -------------------------------------------------------
>
>                 Key: MAPREDUCE-6129
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6129
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: applicationmaster
>    Affects Versions: 3.0.0, 2.3.0, 2.5.0, 2.4.1, 2.5.1
>            Reporter: Min Zhou
>         Attachments: MAPREDUCE-6129.diff
>
>
> Lots of jobs on our cluster use more than 120 counters; those kinds of jobs fail with an exception like the one below:
> {noformat}
> 2014-10-15 22:55:43,742 WARN [Socket Reader #1 for port 45673] org.apache.hadoop.ipc.Server: Unable to read call parameters for client 10.180.216.12 on connection protocol org.apache.hadoop.mapred.TaskUmbilicalProtocol for rpcKind RPC_WRITABLE
> org.apache.hadoop.mapreduce.counters.LimitExceededException: Too many counters: 121 max=120
> 	at org.apache.hadoop.mapreduce.counters.Limits.checkCounters(Limits.java:103)
> 	at org.apache.hadoop.mapreduce.counters.Limits.incrCounters(Limits.java:110)
> 	at org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.readFields(AbstractCounterGroup.java:175)
> 	at org.apache.hadoop.mapred.Counters$Group.readFields(Counters.java:324)
> 	at org.apache.hadoop.mapreduce.counters.AbstractCounters.readFields(AbstractCounters.java:314)
> 	at org.apache.hadoop.mapred.TaskStatus.readFields(TaskStatus.java:489)
> 	at org.apache.hadoop.mapred.ReduceTaskStatus.readFields(ReduceTaskStatus.java:140)
> 	at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:285)
> 	at org.apache.hadoop.ipc.WritableRpcEngine$Invocation.readFields(WritableRpcEngine.java:157)
> 	at org.apache.hadoop.ipc.Server$Connection.processRpcRequest(Server.java:1802)
> 	at org.apache.hadoop.ipc.Server$Connection.processOneRpc(Server.java:1734)
> 	at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1494)
> 	at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:732)
> 	at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:606)
> 	at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:577)
> {noformat}
> The class org.apache.hadoop.mapreduce.counters.Limits loads the mapred-site.xml on the NodeManager node into a JobConf if it hasn't been initialized.
> If mapred-site.xml does not exist on the NodeManager node, or mapreduce.job.counters.max is not defined in that file, org.apache.hadoop.mapreduce.counters.Limits falls back to the default value of 120.
> Instead, we should read the user job's conf file rather than the config files on the NodeManager when checking counter limits.
> I will submit a patch later.
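The failure mode described above can be illustrated with a minimal, self-contained Java sketch of the lazy static-initialization pattern. The class and field names below are illustrative stand-ins, not the real Hadoop API: the point is that the first caller freezes the limit from the local (NodeManager) configuration, so a job that raised mapreduce.job.counters.max in its own conf still hits the default of 120.

```java
import java.util.HashMap;
import java.util.Map;

public class CounterLimitSketch {
    // Stand-in for a Configuration loaded from mapred-site.xml on the
    // NodeManager host; the job's own conf never reaches this code.
    static Map<String, Integer> nodeManagerConf = new HashMap<>();

    // Lazily initialized on first use, then frozen for the process.
    static Integer maxCounters = null;

    static int getMaxCounters() {
        if (maxCounters == null) {
            // Reads the local conf, falling back to the default of 120.
            maxCounters = nodeManagerConf.getOrDefault(
                "mapreduce.job.counters.max", 120);
        }
        return maxCounters;
    }

    static void checkCounters(int size) {
        if (size > getMaxCounters()) {
            throw new IllegalStateException(
                "Too many counters: " + size + " max=" + getMaxCounters());
        }
    }

    public static void main(String[] args) {
        // A job with 121 counters fails even if it set
        // mapreduce.job.counters.max=200 in its own job conf,
        // because the limit was frozen from the local conf.
        try {
            checkCounters(121);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Reading the limit from the user job's conf, as proposed, would make getMaxCounters() depend on the submitted job rather than on whatever mapred-site.xml happens to be present on the NodeManager host.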



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
