james-server-dev mailing list archives

From Eric Charles <e...@apache.org>
Subject Re: IMAP Fetch OOM [WAS: Re: ActiveMQ the cause?]
Date Thu, 14 Oct 2010 10:10:38 GMT
Hi Norman,

I tested with the patch you sent me off-list, and rebuilding the index on a 
large index is now 10 times faster :)
I still haven't managed to trigger the OOM while synchronizing.

Do you think we could have this patch in the coming imap 0.2-M1 release?

For later releases, we could still talk about ways to enhance the FETCH 
performance (full streaming solution for example).

Tks,

Eric


On 12/10/2010 10:27, Eric Charles wrote:
> - read "raw mail" instead of "raw content"
> - to have added value, "streaming from store to socket via listener" 
> goes with "store raw mail".
>
>
> On 12/10/2010 10:10, Eric Charles wrote:
>> Hi Norman,
>>
>> Yes, Iterator<MessageResult> getMessages(MessageRange set, ...) is 
>> really the place where we have to act to reduce the memory consumption.
>>
>> I also thought about some ways to scale better.
>> I hadn't really considered a batch approach, but it can help, even if we 
>> will have to go to the repository (database for example) 10 times in 
>> your example.
>> Also, I'm not sure the gc will run before the end of the batch 
>> processing (it depends on the gc config), so memory could still grow 
>> very fast while serving a huge fetch, without really being garbage collected.
>>
>> But this could be quite a quick fix (if easy to implement?)
>>
>> I was thinking here about "streaming" and "raw content".
>>
>> - Streaming: I worked on a project where the http socket was directly 
>> streamed to the jdbc driver for file upload, and the inverse for 
>> file download. For mail, upload is more smtp and is managed by the jms 
>> queue (possibly with the claim-check pattern if needed). The download 
>> part is imap, and we load everything in memory before writing the 
>> response. We could imagine some kind of listener that would be 
>> notified of each mail read (reading done via a stream). The listener 
>> would also write the response via a stream, without storing it in memory.
>>
>> - "raw content". The store API forces us to work with MailboxMembership, 
>> which clearly isolates the headers from the content,... This is good 
>> for mapping the IMAP RFC, which allows a client to ask for headers only, 
>> for example. With the current API, we "impose" on the store to structure 
>> the mails in content/header/... and if the store cannot structure the 
>> mail (for example MailDir), we force it to parse it, even if we have 
>> to return the raw content in the case of a classical plain FETCH 
>> command... I was wondering if we could add a raw content storage 
>> obligation for all stores, and if we need to return the raw content, 
>> simply use it. This could play nicely with the stream option above.
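
A minimal Java sketch of the listener idea above, under stated assumptions: RawMessageListener and StreamingWriter are hypothetical names for illustration and are not part of the James store API. The listener receives one InputStream per message and copies it to the response stream in fixed-size chunks, so memory use is bounded by the buffer rather than by the mail size. The IMAP response framing (for example the literal size prefix a real FETCH response needs) is deliberately left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical callback: invoked once per message with a stream over its raw bytes. */
interface RawMessageListener {
    void onMessage(long uid, InputStream rawMail) throws IOException;
}

/** Copies each message to the output in fixed-size chunks, never buffering a whole mail. */
final class StreamingWriter implements RawMessageListener {
    private final OutputStream out;
    private final byte[] buffer = new byte[8192]; // reused for every message
    private long bytesWritten;

    StreamingWriter(OutputStream out) {
        this.out = out;
    }

    @Override
    public void onMessage(long uid, InputStream rawMail) throws IOException {
        int n;
        // Read at most one buffer at a time and forward it immediately.
        while ((n = rawMail.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            bytesWritten += n;
        }
        out.flush();
    }

    long bytesWritten() {
        return bytesWritten;
    }
}

final class StreamingWriterDemo {
    public static void main(String[] args) throws IOException {
        // Stand-ins for the store (source) and the client socket (sink).
        byte[] rawMail = new byte[20000];
        ByteArrayOutputStream socket = new ByteArrayOutputStream();

        StreamingWriter writer = new StreamingWriter(socket);
        writer.onMessage(1L, new ByteArrayInputStream(rawMail));

        System.out.println("bytes streamed: " + writer.bytesWritten());
    }
}
```

In this sketch the store would call onMessage while it iterates over the requested range, which only pays off if the store can hand out the raw mail as a stream, as suggested in the "raw content" point.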
>>
>> But yes, the batch approach is quicker to implement.
>>
>> Tks,
>>
>> Eric
>>
>>
>> On 11/10/2010 19:28, Norman Maurer wrote:
>>> Hi there,
>>>
>>> I tried to get my head around the cause of the OOM and I think I found
>>> a good solution. Let me outline the idea...
>>>
>>> The MessageManager interface exposes many methods which return some kind
>>> of Iterator<...> object. So the idea is to build a "batch retrieve
>>> Iterator" which retrieves the needed data in batches, so the GC
>>> has a chance to kick in and free up resources. Here is an example..
>>>
>>> When we call MessageManager.getMessages(...) (which returns
>>> Iterator<MessageResult>) with a MessageRange of 1:1000, we would
>>> fetch the MessageResults 1:100 as a starting point. Then, when
>>> Iterator.hasNext() or Iterator.next() is called and we have no
>>> MessageResult left, we would fetch 101:200, and so on..
>>>
>>> I think this should work and will be much more efficient in terms of
>>> memory. All this could be done in the "store" implementations and
>>> could be configurable, e.g. use 100 as the default batch size and be
>>> able to set it via a setter.
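
A minimal Java sketch of the batching idea described above. BatchLoader and BatchingIterator are hypothetical names for illustration and are not part of the James MessageManager API. A real store would load MessageResult objects from the repository; this sketch loads plain Long uids to stay self-contained.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical loader: returns the items with ids in [from, to], both inclusive. */
interface BatchLoader<T> {
    List<T> load(long from, long to);
}

/** Iterates over the range [first, last] while referencing at most one batch at a time. */
final class BatchingIterator<T> implements Iterator<T> {
    private final BatchLoader<T> loader;
    private final long last;
    private final int batchSize;
    private long nextFrom;
    private Iterator<T> current = Collections.emptyIterator();

    BatchingIterator(BatchLoader<T> loader, long first, long last, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.loader = loader;
        this.nextFrom = first;
        this.last = last;
        this.batchSize = batchSize;
    }

    @Override
    public boolean hasNext() {
        // Load further batches until one yields an element or the range is exhausted.
        while (!current.hasNext() && nextFrom <= last) {
            long to = Math.min(last, nextFrom + batchSize - 1);
            current = loader.load(nextFrom, to).iterator();
            nextFrom = to + 1;
        }
        return current.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }
}

final class BatchingIteratorDemo {
    public static void main(String[] args) {
        List<String> calls = new ArrayList<>();
        // Stand-in for a repository query; records which sub-ranges were requested.
        BatchLoader<Long> loader = (from, to) -> {
            calls.add(from + ":" + to);
            List<Long> batch = new ArrayList<>();
            for (long uid = from; uid <= to; uid++) {
                batch.add(uid);
            }
            return batch;
        };

        Iterator<Long> it = new BatchingIterator<>(loader, 1, 1000, 100);
        long count = 0;
        while (it.hasNext()) {
            it.next();
            count++;
        }
        // prints "1000 messages in 10 batches"
        System.out.println(count + " messages in " + calls.size() + " batches");
    }
}
```

The iterator drops its reference to a batch as soon as the next one is loaded, so a consumed batch becomes eligible for garbage collection, provided the caller does not keep the returned items around.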
>>>
>>> WDYT ?
>>>
>>> Bye,
>>> Norman
>>>
>>>
>>> 2010/10/11 Norman Maurer<norman@apache.org>:
>>>> Hi Eric,
>>>>
>>>> Did you also have a look with wireshark what the exact command and
>>>> argument was which triggered the OOM?
>>>>
>>>> Thx
>>>> Norman
>>>>
>>>> 2010/10/11, Eric Charles<eric@apache.org>:
>>>>> Hi Norman,
>>>>>
>>>>> There were 2 main problems:
>>>>> 1. The amq one which is now resolved tks to your last commit
>>>>> 2. James no longer responding on imap, which is always caused by OOM (I
>>>>> missed some logs the first time).
>>>>>
>>>>> For the second one, analysis of the memory dump shows the oom comes from
>>>>> huge memory usage due to loading of messages, headers,... (in the case
>>>>> of a 10,000 message fetch for example).
>>>>> I don't benefit from Lob streaming on the derby database, but it won't
>>>>> help much because jpaheader, for example, also takes a lot of memory.
>>>>>
>>>>> Tks,
>>>>>
>>>>> Eric
>>>>>
>>>>> On 11/10/2010 13:10, Norman Maurer wrote:
>>>>>> Ok 4/5 is fixed now... Just to keep you updated..
>>>>>>
>>>>>> Bye.
>>>>>> Norman
>>>>>>
>>>>>> 2010/10/11 Norman Maurer<norman@apache.org>:
>>>>>>> Ok, at least you can reproduce it, that's good ;) Did you take a
>>>>>>> thread dump?
>>>>>>>
>>>>>>> Bye,
>>>>>>> Norman
>>>>>>>
>>>>>>>
>>>>>>> 2010/10/11 Eric Charles<eric@apache.org>:
>>>>>>>> It's the same with the latest Thunderbird.
>>>>>>>> I restarted with 'Check for new messages on startup' disabled on
>>>>>>>> all my accounts.
>>>>>>>> If I go quickly from one folder to another, I fall back into the
>>>>>>>> endless 'downloading'/'indexing'...
>>>>>>>> However, if I quietly click on 'Get Mail' folder by folder, it's ok.
>>>>>>>>
>>>>>>>> I think we are still stuck with Bug 1 (Bug 2 and 3 should be resolved
>>>>>>>> if 1 is resolved) for IMAP, when fetching several folders
>>>>>>>> simultaneously.
>>>>>>>> Bug 4 is for amq.
>>>>>>>>
>>>>>>>> Tks,
>>>>>>>>
>>>>>>>> Eric
>>>>>>>>
>>>>>>>>
>>>>>>>> On 10/10/2010 20:03, Eric Charles wrote:
>>>>>>>>> I tried to resync Thunderbird without clicking on any folder.
>>>>>>>>> Still the same behaviour: "downloading xxx of yyy", "www of zzz",...
>>>>>>>>>
>>>>>>>>> Wireshark tells me more: I have never seen such red/black lines in
>>>>>>>>> the tcp stream (one red/black line every 5 to 10 tcp packets:
>>>>>>>>> "segment lost").
>>>>>>>>> 1783    8.626604    91.183.38.48    192.168.1.12    IMAP    [TCP Previous segment lost] Response:
>>>>>>>>> ss.properties?rev=1005079&r1=1005078&r2=1005079&view=diff
>>>>>>>>>
>>>>>>>>> I was wondering if my cable was ok:
>>>>>>>>> - tested plain http via cable: wireshark is green.
>>>>>>>>> - tested thunderbird/james via wifi: same black/red lines in
>>>>>>>>> wireshark.
>>>>>>>>>
>>>>>>>>> I have saved the dump and will analyze it further tomorrow, but a
>>>>>>>>> tcp conversation selected from a "segment lost" seems ok.
>>>>>>>>>
>>>>>>>>> So for now (this may change), I think we have:
>>>>>>>>>
>>>>>>>>> 1. A client is in a state that causes the "segment lost" tcp
>>>>>>>>> errors ==> Bug 1
>>>>>>>>> 2. Client/server conversation loops endlessly ==> Bug 2
>>>>>>>>> 3.1. James finally hangs ==> Bug 3
>>>>>>>>> 3.2. James finally gets oom ==> Bug 3
>>>>>>>>> 4. Manual stop is needed.
>>>>>>>>> 5. After manual stop in state 3.1 or 3.2, there's an activemq
>>>>>>>>> java.io.EOFException: Chunk stream does not exist at page: 0
>>>>>>>>> ==> Bug 4
>>>>>>>>>
>>>>>>>>> So 4 bugs?
>>>>>>>>> I will upgrade my Thunderbird 3.0.3 on linux to the latest version
>>>>>>>>> and see whether bug 1 is resolved.
>>>>>>>>> Bug 4 may be resolved with 5.4.1 and the latest commits for the
>>>>>>>>> james stop procedure.
>>>>>>>>>
>>>>>>>>> Tks,
>>>>>>>>>
>>>>>>>>> Eric
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 10/10/2010 18:31, Eric Charles wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> On James 3 (trunk of 2 weeks ago), I have my INBOX with 10
>>>>>>>>>> subfolders, some of these subfolders having more than 10,000 mails.
>>>>>>>>>> I mainly use one PC, so the IMAP sync is done regularly throughout
>>>>>>>>>> the day.
>>>>>>>>>>
>>>>>>>>>> I also have another PC I synchronize once a week.
>>>>>>>>>> During the IMAP sync of that PC, I randomly selected some
>>>>>>>>>> subfolders and saw (this occurred twice, but not always...):
>>>>>>>>>> - Thunderbird syncs well for some minutes (10?)
>>>>>>>>>> - Afterwards, Thunderbird begins to say "downloading xx of yy
>>>>>>>>>> mails"... When yy is reached, it says "downloading ww of zz" where
>>>>>>>>>> zz is a little greater than yy.
>>>>>>>>>> - I wait, wait, and finally get a timeout, and the mails are no
>>>>>>>>>> longer viewable in Thunderbird.
>>>>>>>>>>
>>>>>>>>>> James is stuck.
>>>>>>>>>> The first time I had an OOM (I think); today, I had no OOM, but
>>>>>>>>>> James was no longer reachable via IMAP, though still accepting
>>>>>>>>>> mails via SMTP.
>>>>>>>>>>
>>>>>>>>>> I stopped it, and when restarting, I got the following exception
>>>>>>>>>> (James was not usable anymore):
>>>>>>>>>> INFO  18:16:37,646 | org.apache.activemq.store.kahadb.plist.PListStore | PListStore:activemq-data/localhost/tmp_storage started
>>>>>>>>>> INFO  18:16:37,648 | org.apache.activemq.broker.BrokerService | Using Persistence Adapter: KahaDBPersistenceAdapter[activemq-data/localhost/KahaDB]
>>>>>>>>>> INFO  18:16:38,248 | org.apache.activemq.store.kahadb.plist.PListStore | PListStore:../data/localhost/tmp_storage started
>>>>>>>>>> ERROR 18:16:38,301 | org.apache.activemq.broker.BrokerService | Failed to start ActiveMQ JMS Message Broker. Reason: java.io.EOFException: Chunk stream does not exist at page: 0
>>>>>>>>>> java.io.EOFException: Chunk stream does not exist at page: 0
>>>>>>>>>>          at org.apache.kahadb.page.Transaction$2.readPage(Transaction.java:454)
>>>>>>>>>>          at org.apache.kahadb.page.Transaction$2.<init>(Transaction.java:431)
>>>>>>>>>>          at org.apache.kahadb.page.Transaction.openInputStream(Transaction.java:428)
>>>>>>>>>>          at org.apache.kahadb.page.Transaction.load(Transaction.java:404)
>>>>>>>>>>          at org.apache.kahadb.page.Transaction.load(Transaction.java:361)
>>>>>>>>>>          at org.apache.activemq.store.kahadb.MessageDatabase$1.execute(MessageDatabase.java:243)
>>>>>>>>>>          at org.apache.kahadb.page.Transaction.execute(Transaction.java:728)
>>>>>>>>>>          at org.apache.activemq.store.kahadb.MessageDatabase.loadPageFile(MessageDatabase.java:230)
>>>>>>>>>>          at org.apache.activemq.store.kahadb.MessageDatabase.open(MessageDatabase.java:309)
>>>>>>>>>>          at org.apache.activemq.store.kahadb.MessageDatabase.load(MessageDatabase.java:353)
>>>>>>>>>>          at org.apache.activemq.store.kahadb.MessageDatabase.doStart(MessageDatabase.java:217)
>>>>>>>>>>          at org.apache.activemq.store.kahadb.KahaDBStore.doStart(KahaDBStore.java:178)
>>>>>>>>>>
>>>>>>>>>> Sounds like https://issues.apache.org/activemq/browse/AMQ-2935.
>>>>>>>>>>
>>>>>>>>>> To solve it, I had to remove the activemq-data directory (btw, 2
>>>>>>>>>> weeks ago it was activemq 5.4.0 with 2 brokers started and
>>>>>>>>>> activemq-data in the bin directory).
>>>>>>>>>>
>>>>>>>>>> I made a test restarting my account from scratch in Thunderbird,
>>>>>>>>>> and it was OK.
>>>>>>>>>>
>>>>>>>>>> Is it because it does an incremental sync and I select different
>>>>>>>>>> folders (just to make things complicated :) ) during the download?
>>>>>>>>>>
>>>>>>>>>> Anyway, it is not easy to reproduce.
>>>>>>>>>> Activemq 5.4.1 may be worth a try, but I'm not sure it is the
>>>>>>>>>> cause...
>>>>>>>>>>
>>>>>>>>>> Tks,
>>>>>>>>>>
>>>>>>>>>> Eric
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>> To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
>>>>>>>>>> For additional commands, e-mail: server-dev-help@james.apache.org
>>>>>>>>>>