hadoop-yarn-dev mailing list archives

From 付庆午 <qingwu...@qunar.com>
Subject hadoop2.2.0 fairsharescheduler error
Date Thu, 28 Nov 2013 07:40:15 GMT
Dear all,
         Our Hadoop 2.2.0 cluster hit an error. Details:
         When many jobs were submitted to the ResourceManager at the same time, the app attempts
of some jobs were not put into the eventQueue of the FairScheduler. We have not been able to
reproduce the error, so I think it may be caused by a concurrency problem.
Normal logs:
2013-11-27 14:25:36,515 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService:
Application with id 1120 submitted by user root
2013-11-27 14:25:36,515 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
Storing application with id application_1384743376038_1120
2013-11-27 14:25:36,515 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger:
USER=root     IP=192.168.24.101       OPERATION=Submit Application Request    TARGET=ClientRMService
 RESULT=SUCCESS  APPID=application_1384743376038_1120
2013-11-27 14:25:36,515 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
application_1384743376038_1120 State change from NEW to NEW_SAVING
2013-11-27 14:25:36,515 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore:
Storing info for app: application_1384743376038_1120
2013-11-27 14:25:36,516 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
application_1384743376038_1120 State change from NEW_SAVING to SUBMITTED
2013-11-27 14:25:36,516 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService:
Registering app attempt : appattempt_1384743376038_1120_000001
2013-11-27 14:25:36,516 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
appattempt_1384743376038_1120_000001 State change from NEW to SUBMITTED
2013-11-27 14:25:36,516 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
Application Submission: appattempt_1384743376038_1120_000001, user: root, currently active:
2
2013-11-27 14:25:36,516 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
appattempt_1384743376038_1120_000001 State change from SUBMITTED to SCHEDULED
2013-11-27 14:25:36,516 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
application_1384743376038_1120 State change from SUBMITTED to ACCEPTED
2013-11-27 14:25:36,816 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable:
Node offered to app: application_1384743376038_1120 reserved: false


Abnormal logs: these logs do not contain the FairScheduler "Application Submission" line
(highlighted in red in the normal logs), e.g.: 2013-11-27 14:25:36,516 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Application Submission:
appattempt_1384743376038_1120_000001, user: root, currently active: 2

2013-11-27 14:27:01,391 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService:
Allocated new applicationId: 1122
2013-11-27 14:27:01,391 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService:
Allocated new applicationId: 1121
2013-11-27 14:27:02,252 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService:
Application with id 1121 submitted by user yangping.wu
2013-11-27 14:27:02,252 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService:
Application with id 1122 submitted by user yangping.wu
2013-11-27 14:27:02,252 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger:
USER=yangping.wu      IP=192.168.24.101 OPERATION=Submit Application Request    TARGET=ClientRMService
 RESULT=SUCCESS  APPID=application_1384743376038_1121
2013-11-27 14:27:02,252 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
Storing application with id application_1384743376038_1122
2013-11-27 14:27:02,252 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger:
USER=yangping.wu      IP=192.168.24.101 OPERATION=Submit Application Request    TARGET=ClientRMService
 RESULT=SUCCESS  APPID=application_1384743376038_1122
2013-11-27 14:27:02,252 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
application_1384743376038_1122 State change from NEW to NEW_SAVING
2013-11-27 14:27:02,252 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
Storing application with id application_1384743376038_1121
2013-11-27 14:27:02,252 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore:
Storing info for app: application_1384743376038_1122
2013-11-27 14:27:02,252 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
application_1384743376038_1121 State change from NEW to NEW_SAVING
2013-11-27 14:27:02,252 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore:
Storing info for app: application_1384743376038_1121
2013-11-27 14:27:02,252 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
application_1384743376038_1122 State change from NEW_SAVING to SUBMITTED
2013-11-27 14:27:02,253 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
application_1384743376038_1121 State change from NEW_SAVING to SUBMITTED
2013-11-27 14:27:02,253 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService:
Registering app attempt : appattempt_1384743376038_1122_000001
2013-11-27 14:27:02,253 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
appattempt_1384743376038_1122_000001 State change from NEW to SUBMITTED
2013-11-27 14:27:02,253 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService:
Registering app attempt : appattempt_1384743376038_1121_000001
2013-11-27 14:27:02,253 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
appattempt_1384743376038_1121_000001 State change from NEW to SUBMITTED
2013-11-27 14:27:02,258 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
Request for appInfo of unknown attemptappattempt_1384743376038_1122_000001
2013-11-27 14:27:02,258 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
Request for appInfo of unknown attemptappattempt_1384743376038_1121_000001
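
To check how many attempts are affected, the ResourceManager log can be scanned for attempts that have a "Registering app attempt" line but no FairScheduler "Application Submission" line. The following is only a minimal sketch, assuming the log is available as plain text; the function name `attempts_missing_submission` and the shortened sample are hypothetical, and the two patterns are copied from the log messages quoted in this message.

```python
import re

# Patterns copied from the log messages quoted in this message.
REGISTERED = re.compile(r"Registering app attempt : (appattempt_\d+_\d+_\d+)")
SUBMITTED = re.compile(r"Application Submission:\s+(appattempt_\d+_\d+_\d+)")


def attempts_missing_submission(log_text):
    """Return attempt IDs that were registered but have no
    FairScheduler 'Application Submission' line in the log text."""
    # Log lines may be wrapped, so collapse all whitespace runs before matching.
    flat = " ".join(log_text.split())
    registered = REGISTERED.findall(flat)
    submitted = set(SUBMITTED.findall(flat))
    # Keep first-seen order and drop duplicates.
    missing = []
    for attempt in registered:
        if attempt not in submitted and attempt not in missing:
            missing.append(attempt)
    return missing


# Shortened sample built from the normal and abnormal logs in this message.
sample = """
2013-11-27 14:25:36,516 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService:
Registering app attempt : appattempt_1384743376038_1120_000001
2013-11-27 14:25:36,516 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
Application Submission: appattempt_1384743376038_1120_000001, user: root, currently active:
2
2013-11-27 14:27:02,253 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService:
Registering app attempt : appattempt_1384743376038_1122_000001
2013-11-27 14:27:02,253 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService:
Registering app attempt : appattempt_1384743376038_1121_000001
"""

print(attempts_missing_submission(sample))
```

For the sample above, only attempts 1122 and 1121 are reported, which matches the two attempts that later produce the "Request for appInfo of unknown attempt" errors.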


I suspect that SchedulerEventDispatcher.handle() did not put the AppAttempt event into
its eventQueue. However, the eventQueue is a LinkedBlockingQueue, so I wonder whether this
could be a bug in JDK 1.7.
Can anybody help me? Thank you!
