james-server-dev mailing list archives

From "Robert Burrell Donkin (JIRA)" <server-...@james.apache.org>
Subject [jira] Commented: (MIME4J-5) Mime4j takes really long to parse big messages
Date Tue, 08 Jul 2008 19:36:32 GMT

    [ https://issues.apache.org/jira/browse/MIME4J-5?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12611765#action_12611765 ]

Robert Burrell Donkin commented on MIME4J-5:

Yeah: I had a good think about this one and couldn't find a direct way around this problem.
Jochen also had major problems supporting limitations and this use case.

The recursive parse is an important use case (it's required by some mail protocols, including
IMAP) but an exceptionally rare one. The primary use case should be the simple parsing of flat
MIME messages.

I think that the complete parsing of MIME messages containing deeply nested MIME parts is
bound to be slow and memory intensive. I suspect that it would be possible to get quick
and easy parsing of flat MIME messages in exchange for more complex and slower parsing of
nested ones. I think this is the design we should be looking for: a good, quick, efficient
pull parser with limited recursion. On top of that we build support for less efficient
recursion (of various sorts). This is the opposite of the current situation.

Here's a sketch of the sort of thing I'm talking about (you'll probably come up with something
better). Remove recursion mode from EntityStateMachine. Pull out MimePullParser from MimeTokenStream.
Make this fast without recursion support. Add explicit public EntityStateMachine recurse()
to EntityStateMachine. Support fully featured recursion as a separate operation.
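
To make the shape of that proposal concrete, here is a toy Java sketch. It is not Mime4j code: FlatPullSketch, Part, FlatParser and their methods are hypothetical names invented for illustration, the input is assumed to be a list of already-split lines, and the boundary lookup is deliberately naive (case-sensitive, and it assumes boundary= is the last parameter on its header line). The only point is the split of responsibilities: the parser yields direct children without descending, and recursion happens only when the caller asks for it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; none of these names are Mime4j API. */
final class FlatPullSketch {

    /** One entity: header and body kept as raw, uninterpreted lines. */
    static final class Part {
        final List<String> lines;

        Part(List<String> lines) { this.lines = lines; }

        /** Boundary named in this entity's header, or null if none is found. */
        String boundary() {
            for (String line : lines) {
                if (line.isEmpty()) break; // blank line ends the header
                int i = line.indexOf("boundary=");
                if (i >= 0) {
                    // Naive: takes the rest of the line as the boundary value.
                    return line.substring(i + "boundary=".length()).replace("\"", "");
                }
            }
            return null;
        }

        /** Explicit, opt-in recursion: a fresh flat parser over this entity only. */
        FlatParser recurse() {
            String b = boundary();
            if (b == null) throw new IllegalStateException("not a multipart entity");
            return new FlatParser(lines, b);
        }
    }

    /** Flat pull parser: yields direct children and never descends into them. */
    static final class FlatParser {
        private final List<String> lines;
        private final String delimiter;
        private final String closeDelimiter;
        private int pos;

        FlatParser(List<String> lines, String boundary) {
            this.lines = lines;
            this.delimiter = "--" + boundary;
            this.closeDelimiter = delimiter + "--";
            // Skip the header and any preamble up to the first delimiter line.
            while (pos < lines.size() && !lines.get(pos).equals(delimiter)) pos++;
        }

        /** Returns the next direct child, or null when there are no more. */
        Part next() {
            if (pos >= lines.size() || !lines.get(pos).equals(delimiter)) return null;
            pos++; // step over the delimiter line
            List<String> child = new ArrayList<>();
            while (pos < lines.size()) {
                String line = lines.get(pos);
                if (line.equals(delimiter) || line.equals(closeDelimiter)) break;
                child.add(line); // nested delimiters are just raw lines here
                pos++;
            }
            return new Part(child);
        }
    }
}
```

A caller that only needs the top-level parts loops over next() and never pays for nesting. A caller that needs the full tree (IMAP-style) calls recurse() on whichever parts report a boundary.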

> Mime4j takes really long to parse big messages
> ----------------------------------------------
>                 Key: MIME4J-5
>                 URL: https://issues.apache.org/jira/browse/MIME4J-5
>             Project: Mime4j
>          Issue Type: Bug
>    Affects Versions: 0.3
>            Reporter: Norman Maurer
>            Assignee: Robert Burrell Donkin
>             Fix For: 0.4
>         Attachments: mime4j-2.patch, mime4j-3.patch, mime4j.patch
> From ml:
> Mime4j has general demonstrable performance problems:
> http://buni.org/bugzilla/show_bug.cgi?id=137
> http://blog.buni.org/blog/mbarker/Meldware/2007/01/27/Look-out-Its-behind-you
> I'd suggest a general code review for the "byte at a time + buffered input stream" anti-pattern
> and general refactoring to do things in blocks where possible.
> -Andy
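
For readers unfamiliar with the anti-pattern named in the issue description, the following Java sketch contrasts the two styles on a trivial task (counting line feeds). It is an illustration only and makes no claim about where Mime4j's actual hot spots are; BlockReadSketch and both method names are hypothetical, and the 8192-byte block size is an arbitrary choice.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical illustration; not taken from Mime4j. */
final class BlockReadSketch {

    /** Byte at a time: one read() call per byte, even though the stream is buffered. */
    static int countLineFeedsPerByte(InputStream in) {
        try {
            InputStream buffered = new BufferedInputStream(in);
            int count = 0;
            int b;
            while ((b = buffered.read()) != -1) {
                if (b == '\n') count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Block-wise: one read(byte[]) call per chunk, then a plain array scan. */
    static int countLineFeedsPerBlock(InputStream in) {
        try {
            byte[] block = new byte[8192];
            int count = 0;
            int n;
            while ((n = in.read(block)) != -1) {
                for (int i = 0; i < n; i++) {
                    if (block[i] == '\n') count++;
                }
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Both methods return the same count for the same input. The second one replaces a method call per byte with a method call per block, which is the kind of refactoring the report suggests looking for.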

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
