james-server-dev mailing list archives

From "robert burrell donkin" <robertburrelldon...@gmail.com>
Subject Re: Questions on the Mail and MailRepository interfaces
Date Wed, 16 May 2007 22:01:13 GMT
On 5/16/07, Stefano Bagnara <apache@bago.org> wrote:
> robert burrell donkin wrote:
> >> > The MOST IMPORTANT thing of all is that if I store a message and I
> >> > later retrieve it, every single space, every single header, everything
> >> > is exactly as I wrote it. Even if it was malformed.
> >>
> >> Is this a hard requirement? If yes, then I could just model the entire
> >> mime message as a normal nt:resource node, in which case the JCR
> >> repository would act just like an advanced file system with
> >> transactions and some search features.
> >
> > IMAP is *VERY* sensitive about malformed messages: a MIME message
> > *MUST* be well formed. it's all too easy to crash modern IMAP clients
> > with malformed emails.
>
> I know this, but we have to make sure we know what the original mail
> looked like because in SMTP relaying we need this.
> Then we may want to do any operation for IMAP, but this must not be a
> limit of the JCR repository. (IMO).

exploding the representation seems like the way to go

the initial (top level) representation would include very basic audit
data plus the raw mail

later processes could then explode the representation: parsing the raw
data and adding new nodes containing the results. this process could
be controlled with a fine granularity: some processes may just explode
raw headers from the raw blob.
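
As a rough illustration of the header-only explosion described above, here is a minimal Java sketch. It deliberately does not use the JCR API: `MailNode` and `HeaderExploder` are hypothetical in-memory stand-ins, and the header parsing is simplified (repeated headers such as Received overwrite each other). The point is only that the raw blob stays untouched while a derived "headers" child is added next to it.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a repository node; this is not the JCR API.
final class MailNode {
    final byte[] raw;               // verbatim message bytes, never modified
    final long receivedAtMillis;    // very basic audit data
    final Map<String, Map<String, String>> children = new LinkedHashMap<>();

    MailNode(byte[] raw, long receivedAtMillis) {
        this.raw = raw.clone();
        this.receivedAtMillis = receivedAtMillis;
    }
}

final class HeaderExploder {
    // Adds a "headers" child derived from the raw blob; the raw bytes are left as they are.
    static void explodeHeaders(MailNode node) {
        String text = new String(node.raw, StandardCharsets.ISO_8859_1);
        int end = text.indexOf("\r\n\r\n");
        String block = end >= 0 ? text.substring(0, end) : text;

        Map<String, String> headers = new LinkedHashMap<>();
        String current = null;
        for (String line : block.split("\r\n")) {
            if (line.isEmpty()) {
                continue;
            }
            boolean folded = line.charAt(0) == ' ' || line.charAt(0) == '\t';
            if (folded && current != null) {
                // Continuation of the previous header (simplified unfolding).
                headers.put(current, headers.get(current) + " " + line.trim());
                continue;
            }
            int colon = line.indexOf(':');
            if (colon <= 0) {
                current = null;     // malformed line: skipped here, still present in raw
                continue;
            }
            current = line.substring(0, colon).trim();
            headers.put(current, line.substring(colon + 1).trim());
        }
        node.children.put("headers", headers);
    }
}
```

A later process could add further children (for example one per MIME part) in the same way, without ever rewriting `raw`.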

> > one approach would be to take advantage of the typing available in
> > JCRs to help the server understand mail. malformed MIME could
> > gracefully degrade to RFC822 and malformed RFC822 to a general mail
> > type.
>
> I think we should also be sure we don't lose time parsing a message if
> we don't need to parse it. If I use JAMES as a relay only mail server I
> don't want to waste resources by parsing every message.

but IMAP will be too slow if it has to parse the message every time

again, i think exploding will satisfy both needs
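
One way to picture how both needs could be met is a wrapper that hands out the raw bytes without parsing and caches the exploded form after the first request. The sketch below is an assumption for illustration only: `LazyExploded` and its `parseCount` accessor are hypothetical names, and the parser is passed in as a plain function rather than being any real JAMES or JCR component.

```java
import java.util.function.Function;

// Hypothetical wrapper: keeps the raw bytes and parses them at most once, on first request.
final class LazyExploded<T> {
    private final byte[] raw;
    private final Function<byte[], T> parser;
    private T parsed;           // stays null until something asks for the exploded form
    private int parseCount;     // kept only to make the behaviour observable

    LazyExploded(byte[] raw, Function<byte[], T> parser) {
        this.raw = raw;
        this.parser = parser;
    }

    // Relay-style access: returns the stored bytes, no parsing involved.
    byte[] raw() {
        return raw;
    }

    // IMAP-style access: parses on the first call, then reuses the cached result.
    synchronized T exploded() {
        if (parsed == null) {
            parsed = parser.apply(raw);
            parseCount++;
        }
        return parsed;
    }

    synchronized int parseCount() {
        return parseCount;
    }
}
```

A relay-only server would only ever call `raw()`, so the parser never runs; an IMAP reader pays the parsing cost once per message. In a repository the cached result would be stored as extra nodes rather than held in a field.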

<snip>

> >> > To achieve performance we'll probably have to avoid parsing the mime
> >> > structure at all: we don't need this for most SMTP/POP3 operations.
> >> > Some IMAP operations need this, but this should probably be done on
> >> > demand and not when writing the message to the repository.
> >>
> >> One possible approach, at the expense of storing potentially redundant
> >> data, is that the original message source is stored as a
> >> verbatim binary stream and the message content is automatically
> >> "exploded" when the first client actually needs to parse the
> >> message.
> >
> > i really like this idea :-)
> >
> > there is no need to wait until the first client with SEDA: exploding
> > would just be another task to execute
> >
> > IMAP is write rarely and read regularly. unless MIME messages are
> > parsed and stored as separate parts, performance will be very poor in
> > normal operation.
>
> Once the message is in an IMAP folder imho there is no problem in
> translating it from rawmessage to structured/parsed message.

with a JCR there should be less need for actual message transfer. once
the message has been entered into permanent storage, there is usually
no need for the node to be routinely copied.

the extra data can be exploded on demand

> The important fact is that in SMTP relay we need to store messages and
> to retrieve them as simple streams: a 1GB message should be streamed
> reading and writing without using more than 1MB memory and without having
> to parse it.

+1

but i think that exploding will work for both cases
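
For the relay case, the streaming requirement amounts to copying through a fixed-size buffer so that memory use does not grow with message size. A minimal sketch using only standard `java.io` streams follows; the class name `StreamCopy` and the choice of buffer size are assumptions for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class StreamCopy {
    // Copies the whole stream through one fixed-size buffer and returns the byte count.
    // Memory use is bounded by bufferSize regardless of how large the message is.
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }
}
```

With a 64 KB buffer, for example, a 1GB message would pass through without being parsed or held in memory as a whole, provided the repository can expose the raw blob as a stream.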

> > but again, the key is to be able to access the original, raw data when
> > needed
>
> Is this important also when the message is already in an IMAP repository
> and we don't need to relay it to another SMTP server? Or in this case a
> reconstructed message would suffice?

it's important in the case of malformed mail

one of the major issues i have with IMAP-fetchmail ATM is that if the
messages are malformed then the storage really mangles them and tends
to crash clients. i don't want to lose mail. i also want to be able to
debug cases where the parsing fails.

i think that exploding should satisfy both use cases
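
To make the malformed-mail case concrete, here is a small sketch of classification that degrades from MIME to RFC822 to a general mail type, always keeps the raw bytes, and records why a message was downgraded so the failure can be debugged later. The checks are crude heuristics chosen for illustration, not real RFC validation, and `MailKind`, `Classified` and `MailClassifier` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

enum MailKind { MIME, RFC822, GENERAL }

// Result of classification: the raw bytes are always kept alongside the verdict.
final class Classified {
    final MailKind kind;
    final byte[] raw;
    final String problem;   // null when nothing was found wrong

    Classified(MailKind kind, byte[] raw, String problem) {
        this.kind = kind;
        this.raw = raw;
        this.problem = problem;
    }
}

final class MailClassifier {
    // Heuristic only: a real implementation would rely on a proper parser.
    static Classified classify(byte[] raw) {
        String text = new String(raw, StandardCharsets.ISO_8859_1);
        int end = text.indexOf("\r\n\r\n");
        if (end < 0) {
            return new Classified(MailKind.GENERAL, raw, "no header/body separator");
        }
        String headers = text.substring(0, end);
        for (String line : headers.split("\r\n")) {
            boolean folded = line.startsWith(" ") || line.startsWith("\t");
            if (!folded && line.indexOf(':') <= 0) {
                return new Classified(MailKind.GENERAL, raw, "malformed header line: " + line);
            }
        }
        String lower = "\r\n" + headers.toLowerCase(Locale.ROOT);
        if (lower.contains("\r\nmime-version:")) {
            return new Classified(MailKind.MIME, raw, null);
        }
        return new Classified(MailKind.RFC822, raw, null);
    }
}
```

Only messages classified as MIME would be exploded into parts for IMAP; anything downgraded stays retrievable byte for byte, with the recorded problem available for debugging.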

- robert
