james-server-dev mailing list archives

From "Stefano Bagnara (JIRA)" <server-...@james.apache.org>
Subject [jira] Commented: (JAMES-603) Outgoing spooling stuck over old mails when more than 1000 old mails are present in outgoing.
Date Sun, 03 Sep 2006 15:41:23 GMT
    [ http://issues.apache.org/jira/browse/JAMES-603?page=comments#action_12432345 ] 
Stefano Bagnara commented on JAMES-603:

The worst-case scenario: everything is stuck and no new mail is accepted.

I already described how the outgoing spool can end up locked, with every outgoing thread
waiting to obtain the lock.
Now I have experienced something worse, and I think I understand why:
I have 10 spool threads and 10 SMTP workers.
I have 9 emails in the spool to be remotely delivered.
9 of the 10 spool threads lock the 9 emails from the spool and start waiting to lock the outgoing spool.
The 10th spool thread starts an infinite loop in the accept of the main spool: it
finds 9 mails but cannot lock them because they are being processed by the other threads, so
it holds the lock on the main spool indefinitely. (This happens because loadPendingMessages
takes more than 1 second, possibly because the server is already stressing the db with the outgoing
thread looping in the accept.)
The first 10 incoming SMTP connections then get stuck trying to acquire the lock on the main spool
to store their messages, and you are under DoS.
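
To make the difference concrete, here is a minimal sketch (not the James code) of an accept that releases the spool monitor while it waits, instead of looping with the lock held. The class SpoolSketch and its store/unlock/accept methods are hypothetical names used only for illustration; the real spool repository API is different.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory spool: maps a message key to "is it locked?".
final class SpoolSketch {
    private final Object monitor = new Object();
    private final Map<String, Boolean> lockedByKey = new LinkedHashMap<>();

    // Called by an SMTP worker: stores a new, unlocked message.
    void store(String key) {
        synchronized (monitor) {
            lockedByKey.put(key, Boolean.FALSE);
            monitor.notifyAll(); // wake any thread waiting in accept()
        }
    }

    // Called by a spool thread when it is done with a message.
    void unlock(String key) {
        synchronized (monitor) {
            if (lockedByKey.containsKey(key)) {
                lockedByKey.put(key, Boolean.FALSE);
                monitor.notifyAll();
            }
        }
    }

    // Locks and returns the first unlocked message, or returns null
    // if none becomes available within timeoutMillis.
    String accept(long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        synchronized (monitor) {
            while (true) {
                for (Map.Entry<String, Boolean> e : lockedByKey.entrySet()) {
                    if (!e.getValue()) {
                        e.setValue(Boolean.TRUE);
                        return e.getKey();
                    }
                }
                long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMillis <= 0) {
                    return null;
                }
                try {
                    // wait() releases the monitor, so store() is never
                    // blocked by a thread that found only locked messages.
                    monitor.wait(remainingMillis);
                } catch (InterruptedException ex) {
                    Thread.currentThread().interrupt();
                    return null;
                }
            }
        }
    }
}
```

The point of the sketch is only that the thread which finds nothing lockable gives up the monitor while it sleeps, so the SMTP workers can still store incoming mail.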

I clearly remember users reporting similar scenarios on the mailing list over the past months/years,
and maybe we have finally found the problem.

So this bug also affects the main spool, although more rarely, because mails in the main spool
are always acceptable when they are not locked. It happens only when all the available
messages are locked and the accept query takes more than 1 second. But it does happen:
I saw it, and I have the thread dump if anyone wants to look at it.

> Outgoing spooling stuck over old mails when more than 1000 old mails are present in outgoing.
> ---------------------------------------------------------------------------------------------
>                 Key: JAMES-603
>                 URL: http://issues.apache.org/jira/browse/JAMES-603
>             Project: James
>          Issue Type: Bug
>          Components: SpoolManager & Processors, Remote Delivery
>    Affects Versions: 2.3.0rc2
>            Reporter: Stefano Bagnara
>            Priority: Blocker
>             Fix For: 2.3.0
> scenario:
> remote delivery has 6 hours for the third delaytime
> insert into the outgoing spool 1000 messages with a last_updated 5 hours ago and error_message
> start james
> send a new message
> the first remote delivery thread is stuck in the main accept method because getNextPendingMessage
> ALWAYS returns a new pending message, but none of them is ready to be processed. The bad news
> is that after it finishes the 1000 messages from pendingMessages it simply restarts loadPendingMessages
> and tries them again, without waiting.
> So 100% CPU is used until we are able to spool the 1000 "old" messages, and then our james
> returns to normal.
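
As an illustration of "waiting" in the scenario quoted above, the following sketch computes how long a delivery thread could sleep before any pending message becomes ready, rather than reloading the pending list immediately. The class RetryBackoffSketch and the method millisUntilNextReady are hypothetical names, and the assumption that readiness is simply last_updated plus the current delay time is a simplification; this is not how James actually implements it.

```java
import java.util.List;

// Hypothetical helper, not part of James.
final class RetryBackoffSketch {
    // lastUpdatedMillis: last_updated timestamp of each pending message.
    // delayMillis: delay that must elapse after last_updated before a retry.
    // nowMillis: current time.
    // Returns 0 if some message is already ready, -1 if nothing is pending,
    // otherwise the number of milliseconds until the first one is ready.
    static long millisUntilNextReady(List<Long> lastUpdatedMillis,
                                     long delayMillis,
                                     long nowMillis) {
        long earliestReady = Long.MAX_VALUE;
        for (long lastUpdated : lastUpdatedMillis) {
            earliestReady = Math.min(earliestReady, lastUpdated + delayMillis);
        }
        if (earliestReady == Long.MAX_VALUE) {
            return -1; // empty list: nothing to wait for
        }
        return Math.max(0L, earliestReady - nowMillis);
    }
}
```

With the numbers from the scenario (a 6 hour delay and messages last updated 5 hours ago), the thread could sleep for up to one hour, or until a new message is stored, instead of spinning over the same 1000 messages.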

This message is automatically generated by JIRA.
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira


To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
For additional commands, e-mail: server-dev-help@james.apache.org
