james-server-dev mailing list archives

From "robert burrell donkin" <robertburrelldon...@gmail.com>
Subject Re: Questions on the Mail and MailRepository interfaces
Date Thu, 17 May 2007 16:56:35 GMT
On 5/17/07, Jukka Zitting <jukka.zitting@gmail.com> wrote:
> Hi,
>
> On 5/17/07, robert burrell donkin <robertburrelldonkin@gmail.com> wrote:
> > On 5/16/07, Jukka Zitting <jukka.zitting@gmail.com> wrote:
> > > One possible approach, at the expense of storing potentially redundant
> > > duplicate data, is that the original message source is stored as a
> > > verbatim binary stream and the message content is automatically
> > > "exploded" when the first client actually needs to parse the
> > > message.
> >
> > i really like this idea :-)
>
> There's one major caveat with this approach: redundant information and
> the performance cost of maintaining that.

yes

> Maintaining updates in the raw message stream is in some (many?) cases
> much more expensive than in a fully parsed representation. Consider
> for example a mailet that wants to modify a subject line or add a
> footer to all messages. Such operations would require that we update
> the original message content as well as the individual header property
> or body part in question. Updating the raw message source can in such
> cases easily take an order of magnitude more time than updating the
> parsed representation.

this cost is only required if we choose to update the original

> Note that I believe that it is possible to parse an incoming message
> into a JCR node tree and recreate it back into a byte stream in the
> same O(n) time and O(1) memory as is required to stream the raw
> message source to a traditional spool file.

i suspect that nio -> file will be quicker but let's save this
argument and let the numbers decide

i didn't mean that intermediary spooling would be the only way but an
architecture that could support it would be worthwhile. being able to
use an intermediary spool file enables some designs which would not be
otherwise possible. for example, splitting the processing between two
instances. this would allow the email parsing and processing to be
done as non-root.

> Perhaps we should have two modes for the JCR mail repository
> implementation: one for pure relaying and one for more complex
> processing. The former satisfies the relaying requirements of the SMTP
> spec, while the latter is optimized for message transformations and
> complex access patterns like in IMAP or webmail clients.

i don't think that two modes are necessary and it would be good if
this could be avoided. there is a danger that JAMES is drifting
towards becoming just a collection of unrelated protocol implementations
unless the data set is held together.

there is a case for retaining the original raw contents of a mail even
for rich patterns. this would allow better auditing and error
recovery.

exploding would work well when coupled with a changed flag. the
original message would be retained unaltered whatever the processing.
on demand, the original could be parsed and stored in a rich
representation. if the original cannot be parsed then the mail would
be marked as unparseable.

if the mail is altered then a flag would be marked and the mail would
be reconstructed on demand from the rich representation. if the
message has not been transformed then the raw original can be used
directly.
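
a rough sketch of the changed-flag idea, in java. this is only an
illustration under assumptions: the class name LazyMail and all of its
methods are hypothetical and are not part of the Mail or MailRepository
interfaces. the parser is deliberately naive (no folded headers,
case-sensitive header names, CRLF line endings only) and the whole
message is held in memory, which a real implementation would avoid.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of "retain the original, explode on demand". */
final class LazyMail {
    private final byte[] original;        // raw source, never altered
    private Map<String, String> headers;  // rich representation, built lazily
    private String body;
    private boolean changed;              // rich form has diverged from the original
    private boolean unparseable;          // original could not be parsed

    LazyMail(byte[] rawSource) {
        this.original = rawSource.clone();
    }

    /** Parses the original the first time a client needs the rich form. */
    private void explode() {
        if (headers != null || unparseable) return;
        String text = new String(original, StandardCharsets.ISO_8859_1);
        int split = text.indexOf("\r\n\r\n");
        if (split < 0) { unparseable = true; return; }
        Map<String, String> parsed = new LinkedHashMap<>();
        for (String line : text.substring(0, split).split("\r\n")) {
            int colon = line.indexOf(':');
            // Simplification: folded or malformed header lines mark the mail.
            if (colon <= 0) { unparseable = true; return; }
            parsed.put(line.substring(0, colon), line.substring(colon + 1).trim());
        }
        headers = parsed;
        body = text.substring(split + 4);
    }

    boolean isUnparseable() { explode(); return unparseable; }

    boolean isChanged() { return changed; }

    String getHeader(String name) {
        explode();
        return unparseable ? null : headers.get(name);
    }

    void setHeader(String name, String value) {
        explode();
        if (unparseable) throw new IllegalStateException("original could not be parsed");
        headers.put(name, value);
        changed = true;  // from now on the raw original is stale for delivery
    }

    /** The untouched original, e.g. for auditing or error recovery. */
    byte[] getOriginal() { return original.clone(); }

    /** Raw original if untouched, otherwise rebuilt from the rich form. */
    byte[] toBytes() {
        if (!changed) return original.clone();
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> h : headers.entrySet()) {
            out.append(h.getKey()).append(": ").append(h.getValue()).append("\r\n");
        }
        out.append("\r\n").append(body);
        return out.toString().getBytes(StandardCharsets.ISO_8859_1);
    }
}
```

in this sketch a pure relay never pays for parsing: toBytes() hands back
the stored bytes until some mailet calls setHeader(), and getOriginal()
stays available for auditing either way.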

- robert
