james-server-dev mailing list archives

From "Jukka Zitting" <jukka.zitt...@gmail.com>
Subject Re: Questions on the Mail and MailRepository interfaces
Date Wed, 16 May 2007 14:00:32 GMT

On 5/16/07, Stefano Bagnara <apache@bago.org> wrote:
> Jukka Zitting wrote:
> > A message consists of an envelope and the contained message. In JCR
> > this is represented as the james:mail subclass of the standard nt:file
> > node type (see http://wiki.apache.org/jackrabbit/nt%3afile):
> >
> >    [james:mail] > nt:file
> >    - james:state (STRING)
> >    - james:error (STRING)
> >    - james:sender (STRING)
> >    - james:recipients (STRING) multiple
> >    - james:remotehost (STRING)
> >    - james:remoteaddr (STRING)
> >    - jamesattr:* (UNDEFINED)
> If we move to MessageRepository (JCR based) + EnvelopeRepository (JMS
> based) model then we don't need the state, error, sender, recipients,
> remotehost, remoteaddr, attributes stuff in the message repository.

OK. Currently I'm just trying to store everything specified by the
Mail interface, but modifying the content model won't be a problem. In
fact I placed the envelope information on the nt:file parent node on
purpose to avoid having them mixed with the message stuff in the
content node.

> Instead we may need some IMAP stuff in the MessageRepository (for the
> IMAP stuff you may be interested in this document written by Joachim
> months ago: http://www.joachim-draeger.de/JamesImap/drafts.html )

I'll give it a look...

> > [..]
> > Normal mail messages are represented as a tree of MIME entities or
> > parts. Each entity is individually referenceable (for easy linking and
> > quick access) and contains the associated mail headers as string
> > attributes:
> > [...]
> > I'm still undecided on how deep I should go in pre-parsing the message
> > contents. For example should I parse Date headers and store them as
> > JCR DATE properties to enable efficient date-based queries? Another
> > complex question is how to best handle encryption and digital
> > signature mechanisms like S/MIME...
> I'm not sure at all that the backend should be aware of the
> content/structure of the message.

I guess that depends on the requirements. If you're only interested in
having a dumb message store that just passes messages back and forth
as-is, then not parsing them is a good idea. But if you want to be
able to efficiently search, manage, and manipulate the messages inside
the repository, then understanding the content structure makes a lot
of sense. One requirement I'm trying to meet is the IMAP feature of
selectively downloading parts of a multipart message. I wouldn't want
to have to parse the entire multipart message over and over again to
serve such client requests.
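
To illustrate the selective download case, here is a rough sketch in
plain Java (no JCR involved). MimePart and resolve() are hypothetical
names that stand in for the stored entity nodes, and the section
addressing is a simplified version of IMAP's dotted part numbers. The
point is only that once the tree is stored, a request for part "2.1"
is a short tree walk instead of a full parse of the message.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory stand-in for a pre-parsed MIME entity node. */
final class MimePart {
    final String contentType;
    final byte[] body;                        // empty for multipart containers
    final List<MimePart> children = new ArrayList<>();

    MimePart(String contentType, byte[] body) {
        this.contentType = contentType;
        this.body = body;
    }

    /** Appends a child part and returns this part to allow chaining. */
    MimePart add(MimePart child) {
        children.add(child);
        return this;
    }

    /**
     * Resolves a dotted, 1-based section path such as "2.1" by walking
     * the stored tree. No message parsing happens here. Returns null
     * if the path does not address an existing part.
     */
    MimePart resolve(String section) {
        MimePart current = this;
        for (String step : section.split("\\.")) {
            int index;
            try {
                index = Integer.parseInt(step) - 1;
            } catch (NumberFormatException e) {
                return null;                  // not a numeric path step
            }
            if (index < 0 || index >= current.children.size()) {
                return null;                  // no such child part
            }
            current = current.children.get(index);
        }
        return current;
    }
}
```

In a JCR-backed store the same walk would presumably map to following
child nodes, but that mapping is an assumption and not something the
current content model fixes.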

More generally, I guess the question is whether you see the James mail
repository as just a transient space where the message resides for a
while until it is either forwarded via SMTP or retrieved over POP.
What I'm trying (at least for now) to achieve is a more persistent
mail storage that is actually used as the *endpoint* of the email
delivery and accessed in-place through interfaces like IMAP or a
webmail client. Perhaps there's some reasonable common ground?

> The MOST IMPORTANT thing of all is that if I store a message and later
> retrieve it, every single space, every single header, everything is
> exactly as I wrote it. Even if it was malformed.

Is this a hard requirement? If yes, then I could just model the entire
MIME message as a normal nt:resource node, in which case the JCR
repository would act just like an advanced file system with
transactions and some search features.

Personally I don't see the exact storage requirement as essential, as
the mail specs explicitly allow all sorts of intermediate nodes to
perform various kinds of reformatting on messages while in transit.
Things should be fine as long as the original intended content is
preserved.

> To achieve performance we'll probably have to avoid parsing the mime
> structure at all: we don't need this for most SMTP/POP3 operations. Some
> IMAP operations need this, but this should probably be done on demand
> and not when writing the message to the repository.

One possible approach, at the expense of storing potentially redundant
data, is that the original message source is stored as a
verbatim binary stream and the message content is automatically
"exploded" when the first client that actually needs to parse the

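A minimal sketch of that "verbatim source plus parse on first use"
idea, in plain Java rather than JCR. StoredMessage, raw(), headers()
and parseCount() are hypothetical names, the header parser is
deliberately naive (no folding, no duplicate headers), and the parsed
form lives only in memory here, whereas a real repository would
presumably persist it next to the binary stream.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical holder: verbatim source plus a lazily built header index. */
final class StoredMessage {
    private final byte[] source;              // kept byte-for-byte as received
    private Map<String, String> headers;      // null until first needed
    private int parseCount;                   // for illustration only

    StoredMessage(byte[] source) {
        this.source = source.clone();
    }

    /** SMTP/POP3-style access: returns the original bytes, never parses. */
    byte[] raw() {
        return source.clone();
    }

    /** IMAP-style access: parses on the first call, then reuses the result. */
    synchronized Map<String, String> headers() {
        if (headers == null) {
            headers = parseHeaders(source);
            parseCount++;
        }
        return headers;
    }

    /** Number of times the source has actually been parsed. */
    synchronized int parseCount() {
        return parseCount;
    }

    /** Naive parser: one "Name: value" pair per CRLF-terminated line. */
    private static Map<String, String> parseHeaders(byte[] source) {
        Map<String, String> result = new LinkedHashMap<>();
        String text = new String(source, StandardCharsets.ISO_8859_1);
        for (String line : text.split("\r\n")) {
            if (line.isEmpty()) {
                break;                        // blank line ends the header block
            }
            int colon = line.indexOf(':');
            if (colon > 0) {
                result.put(line.substring(0, colon).trim(),
                           line.substring(colon + 1).trim());
            }
        }
        return result;
    }
}
```

With this shape, a message that is only relayed or fetched whole is
never parsed, and raw() still returns exactly what was stored, even
for malformed input.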

Jukka Zitting
