tika-dev mailing list archives

From "Benjamin Douglas (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (TIKA-640) RFC822Parser should configure Mime4j not to fail reading mails containing more than 1000 chars in one headers text (even if folded)
Date Sun, 08 May 2011 03:45:03 GMT

     [ https://issues.apache.org/jira/browse/TIKA-640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Douglas updated TIKA-640:
----------------------------------

    Attachment: TIKA-640.patch

I'll concede that, since the Metadata structure holds entire fields in strings, emails
should behave no differently. This patch sets the max field length to unlimited, which
should not be a problem in all but the most unusual of circumstances. Setting the max
content length to unlimited (setMaxContentLen(-1)), as suggested by the reporter, is not
necessary, as that is already the default.
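
To illustrate why a per-field limit can trip even when every physical line is short, here
is a minimal, self-contained Java sketch. It is not Mime4j's or Tika's code: the class
FieldLimitSketch, the method unfold, and the convention that a negative limit means
"unlimited" are assumptions made up for this example. It unfolds folded header lines and
applies the limit to the unfolded field.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not the Mime4j or Tika implementation.
final class FieldLimitSketch {

    /**
     * Splits a raw header block into unfolded fields.
     * A line that starts with a space or tab continues the previous field.
     * A negative maxFieldLen is treated as "unlimited" (an assumption of this
     * sketch); otherwise a field whose unfolded length exceeds it is rejected.
     */
    static List<String> unfold(String rawHeaders, int maxFieldLen) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = null;
        for (String line : rawHeaders.split("\r?\n")) {
            if (line.isEmpty()) {
                break; // a blank line ends the header block
            }
            boolean continuation = line.charAt(0) == ' ' || line.charAt(0) == '\t';
            if (continuation && current != null) {
                // Folded line: trim it and join it to the current field with one space.
                current.append(' ').append(line.trim());
            } else {
                if (current != null) {
                    fields.add(current.toString());
                }
                current = new StringBuilder(line);
            }
            // The limit applies to the whole unfolded field, not to one physical line.
            if (maxFieldLen >= 0 && current.length() > maxFieldLen) {
                throw new IllegalStateException(
                        "Header field exceeds " + maxFieldLen + " characters");
            }
        }
        if (current != null) {
            fields.add(current.toString());
        }
        return fields;
    }

    public static void main(String[] args) {
        // Build one header folded over many short lines.
        StringBuilder raw = new StringBuilder("X-Long: start\r\n");
        for (int i = 0; i < 40; i++) {
            raw.append(" chunk-of-folded-header-text-").append(i).append("\r\n");
        }
        List<String> fields = unfold(raw.toString(), -1); // unlimited: accepted
        System.out.println("Unfolded length: " + fields.get(0).length());
        try {
            unfold(raw.toString(), 1000); // limited: rejected
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

In this sketch no single line comes close to 1000 characters, yet the unfolded field does,
which is the situation the issue title describes with "(even if folded)".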

> RFC822Parser should configure Mime4j not to fail reading mails containing more than 1000 chars in one headers text (even if folded)
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: TIKA-640
>                 URL: https://issues.apache.org/jira/browse/TIKA-640
>             Project: Tika
>          Issue Type: Wish
>          Components: parser
>    Affects Versions: 0.9
>         Environment: All
>            Reporter: Jens Wilmer
>              Labels: mail, rfc822parser
>         Attachments: TIKA-640.patch
>
>   Original Estimate: 5m
>  Remaining Estimate: 5m
>
> The standard configuration of Mime4j accepts only 1000 characters per line and 1000 characters
> per header. The streaming approach of Tika should not need these limitations; when they are
> exceeded, an exception is thrown and none of the data read is available.
> Solution:
> Replace all occurrences of:
> Parser parser = new RFC822Parser();
> by:
> MimeEntityConfig config = new MimeEntityConfig();
> config.setMaxLineLen(-1);
> config.setMaxContentLen(-1);
> Parser parser = new RFC822Parser(config);

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
