james-server-dev mailing list archives

From Eric Charles <e...@apache.org>
Subject Re: IMAP Fetch OOM [WAS: Re: ActiveMQ the cause?]
Date Tue, 12 Oct 2010 08:10:44 GMT
Hi Norman,

Yes, Iterator<MessageResult> getMessages(MessageRange set, ...) is
really the place where we have to act to reduce memory consumption.

I also thought about some ways to scale better.
I hadn't really thought of a batch approach, but it can help, even if we
have to go to the repository (a database, for example) 10 times in your
example.
Also, I'm not sure the GC will run before the end of the batch processing
(it depends on the GC config), so memory could still grow very fast while
serving a huge fetch, without really being garbage-collected.

But this could be quite a quick fix (if it is easy to implement?).
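
To make the batch idea concrete, here is a minimal sketch. It is only an
illustration: BatchIterator and BatchLoader are hypothetical names, not
part of the James API, and the real store query is assumed to sit behind
the loader callback.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical sketch: walks a UID range while referencing one batch at a time. */
final class BatchIterator<T> implements Iterator<T> {

    /** Hypothetical store callback (assumption, not an existing James type). */
    interface BatchLoader<T> {
        List<T> load(long fromUid, long toUid); // both bounds inclusive
    }

    private final BatchLoader<T> loader;
    private final long lastUid;
    private final int batchSize;
    private long nextUid;
    private Iterator<T> current = Collections.emptyIterator();

    BatchIterator(BatchLoader<T> loader, long firstUid, long lastUid, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.loader = loader;
        this.nextUid = firstUid;
        this.lastUid = lastUid;
        this.batchSize = batchSize;
    }

    @Override
    public boolean hasNext() {
        // Go back to the store only when the current batch is exhausted.
        // Loop because a batch can come back empty (UIDs may have gaps).
        while (!current.hasNext() && nextUid <= lastUid) {
            long toUid = Math.min(lastUid, nextUid + batchSize - 1);
            current = loader.load(nextUid, toUid).iterator();
            nextUid = toUid + 1;
        }
        return current.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }
}
```

With a range of 1:1000 and a batch size of 100, this goes to the store 10
times. The iterator only references the current batch, so earlier batches
become eligible for collection once the caller drops them, but when the
GC actually reclaims them still depends on its configuration.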

My own thoughts were about "streaming" and "raw content".

- Streaming: I worked on a project where the HTTP socket was streamed
directly to the JDBC driver for file upload, and the reverse for file
download. For mail, the upload side is mostly SMTP and is managed by a JMS
queue (possibly with the claim-check pattern if needed). The download side
is IMAP, and there we load everything into memory before writing the
response. We could imagine some kind of listener that is notified as each
mail is read (the read being done via a stream). The listener would also
write the response via a stream, without storing it in memory.

- "Raw content": The store API forces us to work with MailboxMembership,
which clearly separates the headers from the content, ... This is a good
match for the IMAP RFC, which allows a client to ask for the headers only,
for example. With the current API, we "impose" on the store that it
structure the mails into content/header/..., and if the store cannot
structure the mail (for example MailDir), we force it to parse the mail,
even if we only have to return the raw content, as for a classic plain
FETCH command... I was wondering whether we could add an obligation for
all stores to keep the raw content, and simply use it whenever we need to
return the raw content. This could play nicely with the streaming option
above.
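
A rough sketch of how the two ideas could fit together, under stated
assumptions: StreamingFetch, RawContentStore, MessageStreamListener and
ResponseWriter are all hypothetical names that do not exist in the James
API, and the response framing covers only a plain BODY[] literal.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch; none of these types are part of the James API. */
final class StreamingFetch {

    /** Notified once per message while the store is reading it. */
    interface MessageStreamListener {
        void onMessage(long sequenceNumber, long size, InputStream rawContent)
                throws IOException;
    }

    /** Assumed store obligation: hand out the raw message bytes as a stream. */
    interface RawContentStore {
        void streamRaw(long firstUid, long lastUid, MessageStreamListener listener)
                throws IOException;
    }

    /** Copies each message to the client through a fixed-size buffer. */
    static final class ResponseWriter implements MessageStreamListener {
        private final OutputStream out;
        private final byte[] buffer = new byte[8192];

        ResponseWriter(OutputStream out) {
            this.out = out;
        }

        @Override
        public void onMessage(long sequenceNumber, long size, InputStream rawContent)
                throws IOException {
            // The literal announces its length up front, so the store has to
            // know the size without loading the content.
            String head = "* " + sequenceNumber + " FETCH (BODY[] {" + size + "}\r\n";
            out.write(head.getBytes(StandardCharsets.US_ASCII));
            int read;
            while ((read = rawContent.read(buffer)) != -1) {
                out.write(buffer, 0, read); // never more than one buffer in memory
            }
            out.write(")\r\n".getBytes(StandardCharsets.US_ASCII));
            out.flush();
        }
    }
}
```

The memory used per message is then bounded by the buffer size rather
than by the message size. Partial fetches (headers only, body sections)
would still need the structured path.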

But yes, the batch approach is quicker to implement.

Tks,

Eric


On 11/10/2010 19:28, Norman Maurer wrote:
> Hi there,
>
> I tried to get my head around the cause of the OOM and I think I found
> a good solution. Let me outline the idea...
>
> The MessageManager interface exposes many methods which return various
> kinds of Iterator<...> objects. So the idea is to build a "batch
> retrieve Iterator" which retrieves the needed data in batches, so that
> the GC has a chance to kick in and free up resources. Here is an example:
>
> When we call MessageManager.getMessages(...) (which returns
> Iterator<MessageResult>) with a MessageRange of 1:1000, we would
> fetch MessageResults 1:100 as a starting point. Then, when
> Iterator.hasNext() or Iterator.next() is called and we have no
> MessageResult left, we would fetch 101:200, and so on.
>
> I think this should work and will be much more efficient in terms of
> memory. All of this could be done in the "store" implementations and
> could be configurable, e.g. use 100 as the default batch size and allow
> it to be changed via a setter.
>
> WDYT ?
>
> Bye,
> Norman
>
>
> 2010/10/11 Norman Maurer<norman@apache.org>:
>> Hi Eric,
>>
>> Did you also check with Wireshark exactly which command and
>> arguments triggered the OOM?
>>
>> Thx
>> Norman
>>
>> 2010/10/11, Eric Charles<eric@apache.org>:
>>> Hi Norman,
>>>
>>> There were 2 main problems:
>>> 1. The AMQ one, which is now resolved thanks to your last commit.
>>> 2. James no longer responding on IMAP, which is always caused by an OOM
>>> (I missed some logs the first time).
>>>
>>> For the second one, analysis of the memory dump shows the OOM comes from
>>> heavy memory usage due to the loading of messages, headers, ... (in the
>>> case of a 10.000-message fetch, for example).
>>> I don't benefit from LOB streaming on the Derby database, but it
>>> wouldn't help much because jpaheader, for example, also takes a lot of
>>> memory.
>>>
>>> Tks,
>>>
>>> Eric
>>>
>>> On 11/10/2010 13:10, Norman Maurer wrote:
>>>> Ok 4/5 is fixed now... Just to keep you updated..
>>>>
>>>> Bye.
>>>> Norman
>>>>
>>>> 2010/10/11 Norman Maurer<norman@apache.org>:
>>>>> Ok, at least you can reproduce it, that's good ;) Did you take a
>>>>> thread dump?
>>>>>
>>>>> Bye,
>>>>> Norman
>>>>>
>>>>>
>>>>> 2010/10/11 Eric Charles<eric@apache.org>:
>>>>>> It's the same with the latest Thunderbird.
>>>>>> I restarted with 'Check for new messages on startup' disabled on all
>>>>>> my accounts.
>>>>>> If I go quickly from one folder to another, I fall back into the
>>>>>> endless 'downloading'/'indexing'...
>>>>>> However, if I quietly click on 'Get Mail' folder by folder, it's ok.
>>>>>>
>>>>>> I think we are still with Bug 1 (Bug 2 and 3 should be resolved if 1 is
>>>>>> resolved) for IMAP, when fetching several folders simultaneously.
>>>>>> Bug 4 is for AMQ.
>>>>>>
>>>>>> Tks,
>>>>>>
>>>>>> Eric
>>>>>>
>>>>>>
>>>>>> On 10/10/2010 20:03, Eric Charles wrote:
>>>>>>> I tried to resync thunderbird without clicking on any folder.
>>>>>>> Still the same behaviour : "downloading xxx on yyy", www on zzz,...
>>>>>>>
>>>>>>> Wireshark tells me more: I never saw such red/black lines in the tcp
>>>>>>> stream (one red/black on every 5/10 tcp packet: "segment lost").
>>>>>>> 1783    8.626604    91.183.38.48    192.168.1.12    IMAP    [TCP Previous
>>>>>>> segment lost] Response:
>>>>>>> ss.properties?rev=1005079&r1=1005078&r2=1005079&view=diff
>>>>>>>
>>>>>>> I was wondering if my cable was right:
>>>>>>> - tested plain http via cable: wireshark is green.
>>>>>>> - tested thunderbird/james via wifi : same black/red lines in
>>>>>>> wireshark.
>>>>>>>
>>>>>>> I have saved the dump and will analyze further tomorrow, but a tcp
>>>>>>> conversation selected from a "segment lost" seems ok.
>>>>>>>
>>>>>>> So for now (this may change), I think we have:
>>>>>>>
>>>>>>> 1. A client is in a state that causes the "segment lost" tcp errors
>>>>>>> ==> Bug 1
>>>>>>> 2. Client/server conversation loops endlessly ==> Bug 2
>>>>>>> 3.1. James finally hangs ==> Bug 3
>>>>>>> 3.2. James finally gets an OOM ==> Bug 3
>>>>>>> 4. Manual stop is needed.
>>>>>>> 5. After a manual stop in state 3.1 or 3.2, there's an activemq
>>>>>>> java.io.EOFException: Chunk stream does not exist at page: 0 ==> Bug 4
>>>>>>>
>>>>>>> So 4 bugs?
>>>>>>> I will upgrade my Thunderbird 3.0.3 on Linux to the latest version and
>>>>>>> see whether bug 1 is resolved.
>>>>>>> Bug 4 may be resolved with 5.4.1 and the latest commits for the James
>>>>>>> stop procedure.
>>>>>>>
>>>>>>> Tks,
>>>>>>>
>>>>>>> Eric
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 10/10/2010 18:31, Eric Charles wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I have on James 3 (trunk of 2 weeks ago) my INBOX with 10 subfolders,
>>>>>>>> some of these subfolders having more than 10.000 mails.
>>>>>>>> I mainly use one PC, so the IMAP sync is done regularly throughout the
>>>>>>>> day.
>>>>>>>>
>>>>>>>> I also have another PC that I synchronize once a week.
>>>>>>>> During the IMAP sync of that PC, I randomly selected some subfolders
>>>>>>>> and saw (this occurred twice, but not always...):
>>>>>>>> - Thunderbird syncs well for some minutes (10?)
>>>>>>>> - Afterwards, Thunderbird begins to say "downloading xx of yy mails"...
>>>>>>>> When yy is reached, it says "downloading ww of zz", where zz is a
>>>>>>>> little greater than yy.
>>>>>>>> - I wait, wait, and finally get a timeout, and the mails are no longer
>>>>>>>> viewable in Thunderbird.
>>>>>>>>
>>>>>>>> James is stuck.
>>>>>>>> The first time I had an OOM (I think); today I had no OOM, but James
>>>>>>>> was no longer reachable via IMAP, though still accepting mails via
>>>>>>>> SMTP.
>>>>>>>>
>>>>>>>> I stopped it, and when restarting, I had the following exception
>>>>>>>> (James was not usable anymore):
>>>>>>>> INFO  18:16:37,646 | org.apache.activemq.store.kahadb.plist.PListStore | PListStore:activemq-data/localhost/tmp_storage started
>>>>>>>> INFO  18:16:37,648 | org.apache.activemq.broker.BrokerService | Using Persistence Adapter: KahaDBPersistenceAdapter[activemq-data/localhost/KahaDB]
>>>>>>>> INFO  18:16:38,248 | org.apache.activemq.store.kahadb.plist.PListStore | PListStore:../data/localhost/tmp_storage started
>>>>>>>> ERROR 18:16:38,301 | org.apache.activemq.broker.BrokerService | Failed to start ActiveMQ JMS Message Broker. Reason: java.io.EOFException: Chunk stream does not exist at page: 0
>>>>>>>> java.io.EOFException: Chunk stream does not exist at page: 0
>>>>>>>>          at org.apache.kahadb.page.Transaction$2.readPage(Transaction.java:454)
>>>>>>>>          at org.apache.kahadb.page.Transaction$2.<init>(Transaction.java:431)
>>>>>>>>          at org.apache.kahadb.page.Transaction.openInputStream(Transaction.java:428)
>>>>>>>>          at org.apache.kahadb.page.Transaction.load(Transaction.java:404)
>>>>>>>>          at org.apache.kahadb.page.Transaction.load(Transaction.java:361)
>>>>>>>>          at org.apache.activemq.store.kahadb.MessageDatabase$1.execute(MessageDatabase.java:243)
>>>>>>>>          at org.apache.kahadb.page.Transaction.execute(Transaction.java:728)
>>>>>>>>          at org.apache.activemq.store.kahadb.MessageDatabase.loadPageFile(MessageDatabase.java:230)
>>>>>>>>          at org.apache.activemq.store.kahadb.MessageDatabase.open(MessageDatabase.java:309)
>>>>>>>>          at org.apache.activemq.store.kahadb.MessageDatabase.load(MessageDatabase.java:353)
>>>>>>>>          at org.apache.activemq.store.kahadb.MessageDatabase.doStart(MessageDatabase.java:217)
>>>>>>>>          at org.apache.activemq.store.kahadb.KahaDBStore.doStart(KahaDBStore.java:178)
>>>>>>>>
>>>>>>>> Sounds like https://issues.apache.org/activemq/browse/AMQ-2935.
>>>>>>>>
>>>>>>>> To solve it, I had to remove the activemq-data directory (btw, 2 weeks
>>>>>>>> ago it was activemq 5.4.0, with 2 brokers started and activemq-data in
>>>>>>>> the bin directory).
>>>>>>>>
>>>>>>>> I made a test restarting my account in Thunderbird from scratch, and
>>>>>>>> it was OK.
>>>>>>>>
>>>>>>>> Is it because it does an incremental sync and I select different
>>>>>>>> folders (just to make things complicated :) ) during the download?
>>>>>>>>
>>>>>>>> Anyway, it is not easy to reproduce.
>>>>>>>> Activemq 5.4.1 may be worth a try, but I'm not sure it is the
>>>>>>>> cause...
>>>>>>>>
>>>>>>>> Tks,
>>>>>>>>
>>>>>>>> Eric
>>>>>>>>
>>>>>>>>
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
>>>>>>>> For additional commands, e-mail: server-dev-help@james.apache.org



