james-server-dev mailing list archives

From Robert Burrell Donkin <robertburrelldon...@gmail.com>
Subject Re: [IMAP] Over-designed /Some thoughts ?
Date Wed, 28 Apr 2010 06:24:03 GMT
On Wed, Apr 28, 2010 at 5:55 AM, Norman Maurer
<norman.maurer@googlemail.com> wrote:

<snip>

>>> I think it would be a good thing to "simplify" the api a bit to make
>>> it a bit easier to understand. So some points which came to my mind:
>>>
>>> 1) UidChangeTracking:
>>>
>>> Is this really necessary ? It does some kind of caching but I don't
>>> see anything else for which it's useful. Why not just fire the events
>>> directly with a shared MailboxEventDispatcher which is the same for
>>> all Mailboxes?
>>
>> i'm not convinced it's needed but beware...
>>
>> this is one of the few areas retained from the design before i started
>> reworking. i had hoped to replace it but never really worked out how
>> to do that without crippling performance or breaking IMAP.
>
> I'm currently testing imap without the UidChangeTracker and so far it
> seems like it's not really slower than before..

it's only slower than the alternatives that are required to make IMAP
work properly ;-)

IIRC UIDChangeTracker tracks UID changes made by concurrent sessions
accessing the same mailbox. the local caching should work for a user's
own changes. it's possible that some of the changes i've made might
have made it redundant by now but i don't trust the functional
concurrency tests.

>>> 2) Global Mailbox caching
>>>
>>> At the moment the Mailbox is cached in a HashMap. The problem with
>>> this is it will never get recycled by the GC. This can generate an
>>> OOM over a long time
>>
>> i run IMAP with approx 1.5G spread over around a hundred mailboxes.
>> i've never had an OOM. so i never bothered changing this.
>
> I think you use Torque right ? Maybe it behaves a bit differently there.

i inherited torque and this is one area i left alone ;-)

> I'm using JPA and it's reproducible by feeding a mailbox with ca. 1
> million emails. You will see the memory usage just grow and grow..
> When I took a heap dump it seemed like the OpenJPA objects were never
> released, because they were held in the HashMap.

for torque the session needs to be held to manage concurrency (mailbox
access needs to be synchronized). for OpenJPA, sounds like the mailbox
structure needs to be there to manage synchronization and caching but
a new OpenJPA object needs to be created each time.
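
A minimal sketch of that split for the OpenJPA case, assuming nothing
about the real James classes: only a small lock is kept per mailbox
name, and the persistence-backed object is loaded fresh for each
operation so it can be garbage collected afterwards. MailboxLockRegistry,
withMailbox and the loader function are hypothetical names used only
for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

// Hypothetical sketch: none of these names come from the James code base.
final class MailboxLockRegistry {
    // Only a small lock object is retained per mailbox name, never the
    // persistence-backed mailbox object itself.
    private final ConcurrentMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    // Runs the operation while holding the mailbox's lock. The loader is
    // asked for a fresh object on every call, so nothing loaded from the
    // store is kept reachable from the map.
    <M, R> R withMailbox(String name, Function<String, M> loader, Function<M, R> operation) {
        ReentrantLock lock = locks.computeIfAbsent(name, n -> new ReentrantLock());
        lock.lock();
        try {
            M mailbox = loader.apply(name);
            return operation.apply(mailbox);
        } finally {
            lock.unlock();
        }
    }

    // Number of mailbox names that currently have a lock entry.
    int trackedMailboxes() {
        return locks.size();
    }
}
```

The map still grows with the number of mailbox names, but no longer
with the objects loaded for them.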

>>> The other problem with this is, the Mailbox should be "tied" to the
>>> MailboxSession. Let me explain why. For example in JCR we could use
>>> the User/Pass which is bound to the MailboxSession to access different
>>> parts of the JCR Repository etc..
>>
>> i thought this too originally but i couldn't work out how to do so
>> without crippling performance or breaking IMAP.
>
> Sure good performance is a must, but I would prefer to have a "good"
> api first ;)

this wasn't an issue of good performance but of being usable at all

when two sessions are accessing the same mailbox, there are a handful
of operations which require caching and concurrency control to
maintain correctness. there are a number of ways that this design
could work. mailbox et al is inherited, and probably not my first
choice.

i would prefer to revise the API by pushing the Mailbox functions into
MailboxManager, and so making it an internal feature which could be
varied by implementations. the namespace handling is problematic, so i
would then model namespace by a Mailbox object which could be passed
in to each method in the API.

>> IIRC these are related issues. the essential function is caching and
>> synchronization. in performance terms, i think much higher
>> performance could be achieved by replacement by something asynchronous
>> and event driven using a blocking queue. this would be a substantial
>> change.
>>
>
> I agree with you here. But as you outlined already, it's not an "easy"
> thing to do without rewriting a lot of stuff.

very little rewriting but hard, and risky for the poorly tested
concurrent use cases. then again, maybe these don't work ATM.

the best place to start would be by creating some more
concurrency tests. there's an application that creates tests in the
package org.apache.james.imap.functional.builder in seda.
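
As a rough illustration of the asynchronous, event-driven replacement
mentioned above, here is a minimal sketch built on a blocking queue.
It is not the James design: QueuedEventDispatcher and its methods are
hypothetical names, and plain strings stand in for real mailbox events.
Sessions post events without waiting for each other, and a single
worker thread delivers them to listeners in order.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Hypothetical sketch: strings stand in for real mailbox events.
final class QueuedEventDispatcher implements AutoCloseable {
    // Marker object that tells the worker thread to stop.
    private static final Object POISON = new Object();

    private final BlockingQueue<Object> queue = new LinkedBlockingQueue<>();
    private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();
    private final Thread worker;

    QueuedEventDispatcher() {
        worker = new Thread(this::drain, "mailbox-events");
        worker.setDaemon(true);
        worker.start();
    }

    void addListener(Consumer<String> listener) {
        listeners.add(listener);
    }

    // Called by any session; the unbounded queue means this never blocks.
    void post(String event) {
        queue.add(event);
    }

    // Single consumer: events reach listeners in the order they were queued.
    private void drain() {
        try {
            while (true) {
                Object next = queue.take();
                if (next == POISON) {
                    return;
                }
                String event = (String) next;
                for (Consumer<String> listener : listeners) {
                    listener.accept(event);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Delivers everything queued so far, then stops the worker.
    @Override
    public void close() {
        queue.add(POISON);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Because one thread does all the delivery, listeners need no locking of
their own for ordering, which is the part that is hard to cover with
concurrency tests. A listener that throws would stop the worker in
this sketch.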

> I even tend to believe we should do something similar to what we have
> in SMTP/POP3.  Just have some kind of LineHandler which pushes data into
> the processor when a CRLF was detected and so not using blocking
> streams as input at all.

the IMAP protocol makes this approach tricky, but in general yes. the
protocol handling foo is intended to address this, and should be quite
close now.
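
A minimal sketch of the line-handler idea, assuming a hypothetical
CrlfLineAccumulator that is fed byte chunks as they arrive and calls a
handler once per complete CRLF-terminated line. It is not the James
protocol handling code, and it ignores IMAP literals (a byte count in
braces followed by raw data), which is one reason the approach is
tricky for IMAP.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

// Hypothetical sketch: not taken from the James code base.
final class CrlfLineAccumulator {
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();
    private final Consumer<String> lineHandler;
    // True when the previous byte was CR, even if it came in an earlier chunk.
    private boolean lastWasCr;

    CrlfLineAccumulator(Consumer<String> lineHandler) {
        this.lineHandler = lineHandler;
    }

    // Accepts whatever bytes the transport delivered; never blocks.
    void feed(byte[] chunk) {
        for (byte b : chunk) {
            if (b == '\n' && lastWasCr) {
                byte[] raw = pending.toByteArray();
                // Drop the trailing CR before handing the line over.
                lineHandler.accept(new String(raw, 0, raw.length - 1, StandardCharsets.US_ASCII));
                pending.reset();
                lastWasCr = false;
            } else {
                pending.write(b);
                lastWasCr = (b == '\r');
            }
        }
    }
}
```

State is kept between calls, so a CRLF split across two chunks still
yields exactly one line.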

- robert

