airavata-issues mailing list archives

From "Marcus Christie (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (AIRAVATA-2519) Email monitoring stopped, without errors
Date Fri, 22 Sep 2017 12:28:00 GMT

     [ https://issues.apache.org/jira/browse/AIRAVATA-2519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Christie reassigned AIRAVATA-2519:
-----------------------------------------

    Assignee:     (was: Marcus Christie)

> Email monitoring stopped, without errors
> ----------------------------------------
>
>                 Key: AIRAVATA-2519
>                 URL: https://issues.apache.org/jira/browse/AIRAVATA-2519
>             Project: Airavata
>          Issue Type: Bug
>          Components: GFac
>    Affects Versions: 0.18
>            Reporter: Marcus Christie
>
> Today at 12:16 am the EmailBasedMonitor just appears to have stopped working and died silently:
> From Hipchat
> {quote}
> @marlon I didn't see any errors in the gfac log, its just that the last log messages from the EmailBasedMonitor where it is processing emails occurs at 2017-09-20 00:16:49,447
> {quote}
> Here are the last messages from the EmailBasedMonitor in the logs
> {noformat}
> 2017-09-20 00:16:25,815 [Thread-5] ERROR o.a.a.g.m.e.EmailBasedMonitor  - FROM: root <root@ncsa.illinois.edu>
> 2017-09-20 00:16:25,815 [Thread-5] ERROR o.a.a.g.m.e.EmailBasedMonitor  - TO: gw77jobs@scigap.org
> 2017-09-20 00:16:25,815 [Thread-5] ERROR o.a.a.g.m.e.EmailBasedMonitor  - SUBJECT: Non-zero exit code for job 3231343
> 2017-09-20 00:16:41,930 [Thread-5] INFO  o.a.a.g.m.e.EmailBasedMonitor  - [EJM]: 5 job/s in job monitor map
> 2017-09-20 00:16:42,167 [Thread-5] INFO  o.a.a.g.m.e.EmailBasedMonitor  - [EJM]: Retrieving unseen emails
> 2017-09-20 00:16:42,913 [Thread-5] INFO  o.a.a.g.m.e.EmailBasedMonitor  - [EJM]: 75 new email/s received
> 2017-09-20 00:16:49,447 [Thread-5] ERROR o.a.a.g.m.e.p.PBSEmailParser  - [EJM]: No matched found for content =>
> PBS Job Id: 48.torque-server
> Job Name:   A746448754
> Exec host:  compute-1/0-3
> An error has occurred processing your job, see below.
> Post job file processing error; job 48.torque-server on host compute-1Unknown resource type  REJHOST=compute-1 MSG=Root cannot open home directory '/home/grid_user' specified, errno=2 (No such file or directory) -- Ignore if root squashing is enabled
> 2017-09-20 00:16:49,447 [Thread-5] INFO  o.a.a.g.m.e.EmailBasedMonitor  - Returned null for job id, message subject--> PBS JOB 48.torque-server
> 2017-09-20 00:16:49,447 [Thread-5] INFO  o.a.a.g.m.e.EmailBasedMonitor  - Returned null for job name, message subject --> PBS JOB 48.torque-server
> {noformat}
> If an error had been thrown, I think it would have been logged, since the [EmailBasedMonitor thread catches and logs Throwable|https://github.com/apache/airavata/blob/d82842ddcf8ae3a57e74db6fb48e872da8df0a27/modules/gfac/gfac-impl/src/main/java/org/apache/airavata/gfac/monitor/email/EmailBasedMonitor.java#L243-L243].
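
For readers unfamiliar with the pattern referred to above, here is a minimal sketch of a loop body that catches Throwable and turns it into a log line. It is not the Airavata code: the class MonitorLoopSketch, the PollStep interface, and runGuarded are hypothetical names invented for illustration, and the real EmailBasedMonitor may be structured quite differently. The sketch only shows why both an ordinary exception and a JVM Error raised inside such a guarded step would be expected to leave a trace in the log.

```java
import java.util.ArrayList;
import java.util.List;

public class MonitorLoopSketch {

    /** Hypothetical stand-in for one polling pass over the mailbox. */
    public interface PollStep {
        void run() throws Exception;
    }

    /**
     * Runs one step. Returns null on success, or the line that would be
     * logged if anything (Exception or Error) was thrown by the step.
     */
    public static String runGuarded(PollStep step) {
        try {
            step.run();
            return null;
        } catch (Throwable t) {
            // Catching Throwable covers Errors (e.g. OutOfMemoryError) as well as Exceptions.
            return "ERROR " + t.getClass().getSimpleName() + ": " + t.getMessage();
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        PollStep[] steps = {
            () -> { },                                                   // normal pass, nothing logged
            () -> { throw new IllegalStateException("parse failure"); }, // ordinary exception
            () -> { throw new OutOfMemoryError("simulated"); }           // JVM Error, simulated here
        };
        for (PollStep step : steps) {
            String line = runGuarded(step);
            if (line != null) {
                log.add(line);
            }
        }
        log.forEach(System.out::println);
    }
}
```

In this sketch every throwing step produces a log line, which matches the reasoning in the description: a thrown error inside the guarded section should not vanish silently.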



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
