Marcus Christie created AIRAVATA-2519:
-----------------------------------------
Summary: Email monitoring stopped, without errors
Key: AIRAVATA-2519
URL: https://issues.apache.org/jira/browse/AIRAVATA-2519
Project: Airavata
Issue Type: Bug
Components: GFac
Affects Versions: 0.18
Reporter: Marcus Christie
Assignee: Shameera Rathnayaka
Today at 12:16 am the EmailBasedMonitor just appears to have stopped working and died silently:
From Hipchat
{quote}
@marlon I didn't see any errors in the gfac log, its just that the last log messages from
the EmailBasedMonitor where it is processing emails occurs at 2017-09-20 00:16:49,447
{quote}
Here are the last messages from the EmailBasedMonitor in the logs
{noformat}
2017-09-20 00:16:25,815 [Thread-5] ERROR o.a.a.g.m.e.EmailBasedMonitor - FROM: root <root@ncsa.illinois.edu>
2017-09-20 00:16:25,815 [Thread-5] ERROR o.a.a.g.m.e.EmailBasedMonitor - TO: gw77jobs@scigap.org
2017-09-20 00:16:25,815 [Thread-5] ERROR o.a.a.g.m.e.EmailBasedMonitor - SUBJECT: Non-zero exit code for job 3231343
2017-09-20 00:16:41,930 [Thread-5] INFO o.a.a.g.m.e.EmailBasedMonitor - [EJM]: 5 job/s in job monitor map
2017-09-20 00:16:42,167 [Thread-5] INFO o.a.a.g.m.e.EmailBasedMonitor - [EJM]: Retrieving unseen emails
2017-09-20 00:16:42,913 [Thread-5] INFO o.a.a.g.m.e.EmailBasedMonitor - [EJM]: 75 new email/s received
2017-09-20 00:16:49,447 [Thread-5] ERROR o.a.a.g.m.e.p.PBSEmailParser - [EJM]: No matched found for content =>
PBS Job Id: 48.torque-server
Job Name: A746448754
Exec host: compute-1/0-3
An error has occurred processing your job, see below.
Post job file processing error; job 48.torque-server on host compute-1Unknown resource type
REJHOST=compute-1 MSG=Root
cannot open home directory '/home/grid_user' specified, errno=2 (No such file or directory)
-- Ignore if root squashing is enabled
2017-09-20 00:16:49,447 [Thread-5] INFO o.a.a.g.m.e.EmailBasedMonitor - Returned null for job id, message subject--> PBS JOB 48.torque-server
2017-09-20 00:16:49,447 [Thread-5] INFO o.a.a.g.m.e.EmailBasedMonitor - Returned null for job name, message subject --> PBS JOB 48.torque-server
{noformat}
If an error had been thrown, I think it would have been logged, since the [EmailBasedMonitor thread
catches and logs Throwable|https://github.com/apache/airavata/blob/d82842ddcf8ae3a57e74db6fb48e872da8df0a27/modules/gfac/gfac-impl/src/main/java/org/apache/airavata/gfac/monitor/email/EmailBasedMonitor.java#L243-L243].
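To illustrate the reasoning above, here is a minimal sketch of a polling loop wrapped in a catch-all for Throwable. It is not the Airavata code: the class name MonitorLoopSketch, the PollStep interface, the runLoop method, and the in-memory LOG list are all hypothetical, and placing the catch around the whole loop is an assumption. The point is only that with such a handler, any exception or error thrown during a pass should leave at least one ERROR line behind, so a monitor that stops with no such line probably did not stop because of a thrown error.

```java
import java.util.ArrayList;
import java.util.List;

public class MonitorLoopSketch {
    /** In-memory stand-in for a logger, so the sketch needs no logging framework. */
    public static final List<String> LOG = new ArrayList<>();

    /** Hypothetical stand-in for one "fetch unseen emails and parse them" pass. */
    public interface PollStep {
        void run() throws Exception;
    }

    /**
     * Runs the step up to maxIterations times and returns how many passes completed.
     * Assumption: the catch-all surrounds the whole loop, so the first Throwable
     * is logged and then ends the loop.
     */
    public static int runLoop(PollStep step, int maxIterations) {
        int completed = 0;
        try {
            for (int i = 0; i < maxIterations; i++) {
                step.run();
                completed++;
            }
        } catch (Throwable t) {
            // Anything thrown by a pass ends up here and is recorded.
            LOG.add("ERROR Caught a throwable: "
                    + t.getClass().getSimpleName() + ": " + t.getMessage());
        }
        return completed;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // The second pass fails; the failure is logged rather than lost.
        int ok = runLoop(() -> {
            if (++calls[0] == 2) {
                throw new IllegalStateException("parse failed");
            }
        }, 3);
        System.out.println(ok + " pass(es) completed, log=" + LOG);
    }
}
```

A handler like this only covers code that actually throws. A pass that blocks indefinitely, for example on a network read with no timeout, would produce no log line at all, which would also match the symptom reported here.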
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)