james-server-dev mailing list archives

From "Oleg Kalnichevski (JIRA)" <server-...@james.apache.org>
Subject [jira] Updated: (MIME4J-5) Mime4j takes really long to parse big messages
Date Tue, 24 Jun 2008 17:00:45 GMT

     [ https://issues.apache.org/jira/browse/MIME4J-5?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oleg Kalnichevski updated MIME4J-5:

    Attachment: mime4j.patch


I am attaching for your consideration a patch that resolves the greatest performance bottleneck
in the mime4j parsing framework by improving the way the parser scans for MIME boundaries
in data streams.
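
To illustrate the general idea only (this is not the code in mime4j.patch; the class and
method names below are hypothetical): instead of calling read() once per byte, a scanner
can fill a buffer in blocks and search that buffer for the boundary, carrying over the last
boundary.length - 1 bytes so that a boundary split across two reads is still found. A
minimal sketch, assuming a plain InputStream and a simple linear search:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not taken from mime4j or from the attached patch.
class BoundaryScanSketch {

    // Returns the stream offset of the first occurrence of boundary, or -1 if absent.
    // I/O errors are rethrown unchecked to keep the sketch short.
    static long indexOfBoundary(InputStream in, byte[] boundary, int blockSize) {
        if (boundary.length == 0) {
            throw new IllegalArgumentException("empty boundary");
        }
        int keep = boundary.length - 1;   // bytes carried over between reads
        byte[] buf = new byte[Math.max(blockSize, boundary.length) + keep];
        int len = 0;                      // number of valid bytes in buf
        long base = 0;                    // stream offset of buf[0]
        try {
            while (true) {
                // One block read instead of many single-byte reads.
                int n = in.read(buf, len, buf.length - len);
                if (n == -1) {
                    return -1;            // everything read so far was already scanned
                }
                len += n;
                int i = indexOf(buf, len, boundary);
                if (i >= 0) {
                    return base + i;
                }
                // Keep only the tail that could be the start of a split boundary.
                int tail = Math.min(keep, len);
                int discard = len - tail;
                System.arraycopy(buf, discard, buf, 0, tail);
                base += discard;
                len = tail;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Naive linear search for pattern within the first len bytes of buf.
    static int indexOf(byte[] buf, int len, byte[] pattern) {
        outer:
        for (int i = 0; i + pattern.length <= len; i++) {
            for (int j = 0; j < pattern.length; j++) {
                if (buf[i + j] != pattern[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] message = "preamble\r\n--frontier\r\nbody".getBytes(StandardCharsets.US_ASCII);
        byte[] boundary = "\r\n--frontier".getBytes(StandardCharsets.US_ASCII);
        // The boundary starts right after the 8-byte preamble.
        System.out.println(indexOfBoundary(new ByteArrayInputStream(message), boundary, 4096)); // prints 8
    }
}
```

A real parser would also have to hand the bytes before the boundary to the consumer and
would likely use a faster search than the nested loop above; the sketch only shows the
block-read-plus-carry-over structure.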

According to my (very unscientific) tests, the patch improves parsing performance by almost
a factor of 10 (about 8.6 times faster in the run below):

52KB file with 3 large attachments, 5000 repetitions: 28.618s before, 3.34s after

I did not want to change too many things in too many places at the same time. I mostly concentrated
on fixing MimeBoundaryInputStream as the first essential step. As a result I had to hack the
Cursor-related code in a quick and dirty way just to make it stick, and the code in the
MimeTokenStream class got somewhat uglier. I am willing to continue improving the mime4j parsing
code if this patch and the overall approach get approved. I could also port the MIME field parsing
code from HttpCore to resolve the last remaining case of one-byte-at-a-time reads in mime4j if you want.

All existing test cases pass for me.

Please review the patch and let me know what you think.


> Mime4j takes really long to parse big messages
> ----------------------------------------------
>                 Key: MIME4J-5
>                 URL: https://issues.apache.org/jira/browse/MIME4J-5
>             Project: Mime4j
>          Issue Type: Bug
>    Affects Versions: 0.3
>            Reporter: Norman Maurer
>            Assignee: Niklas Therning
>             Fix For: 0.4
>         Attachments: mime4j.patch
> From ml:
> Mime4j has general demonstrable performance problems:
> http://buni.org/bugzilla/show_bug.cgi?id=137
> http://blog.buni.org/blog/mbarker/Meldware/2007/01/27/Look-out-Its-behind-you
> I'd suggest a general code review for the "byte at a time + buffered input stream" anti-pattern
> and general refactoring to do things in blocks where possible.
> -Andy

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
For additional commands, e-mail: server-dev-help@james.apache.org
