lucene-solr-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: Making tika process mail attachments eludes me
Date Mon, 01 Apr 2013 18:54:03 GMT
: I believe that the handling of the multipart MIME lacks some error checking, and 
: it is probably related to the content outside the MIME boundaries (in my 
: example, the text "This is a multi-part message in MIME format."):
: 
: I really hope that some SOLR developer can have a look, we cannot be the only 
: ones having this problem. And I've spent almost twenty hours debugging this.

I am largely unfamiliar with the MailEntityProcessor, but IIRC it has not 
received much love over the years due to the lack of automated tests -- I 
believe all of the existing tests are disabled by default because they 
require an external IMAP server.

If anyone is interested in helping to contribute some tests that could 
be automated by using some sort of mock IMAP server library, that would go 
a long way towards being able to verify correctness & make improvements 
(even for people like me who are not familiar with the code and haven't 
thought very hard about MIME encapsulation in over 15 years).


-Hoss
