tika-dev mailing list archives

From "Nick Burch (JIRA)" <j...@apache.org>
Subject [jira] Updated: (TIKA-361) Update OutlookExtractor to match new POI API
Date Wed, 16 Jun 2010 21:28:22 GMT

     [ https://issues.apache.org/jira/browse/TIKA-361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Burch updated TIKA-361:
----------------------------

    Attachment: outlook.patch

Updated patch which inserts more of the from/to/cc information into the metadata. Includes
some new metadata keys for this, and updates the Mbox parser to generate these keys too.
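
As a rough illustration of the idea above (not the contents of outlook.patch), the sketch
below copies from/to/cc and date values into a plain metadata map. The class name, the
method names and the key strings are all hypothetical, chosen only for this example; the
actual keys added by the patch may differ.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

public class OutlookMetadataSketch {
    // Hypothetical key names, for illustration only; not taken from the patch.
    static final String FROM = "Message-From";
    static final String TO = "Message-To";
    static final String CC = "Message-Cc";
    static final String DATE = "date";

    // Copies whichever header values are present into an ordered metadata map.
    static Map<String, String> toMetadata(String from, String to, String cc, Instant date) {
        Map<String, String> metadata = new LinkedHashMap<>();
        putIfPresent(metadata, FROM, from);
        putIfPresent(metadata, TO, to);
        putIfPresent(metadata, CC, cc);
        if (date != null) {
            // Instant.toString() yields an ISO-8601 UTC timestamp.
            metadata.put(DATE, date.toString());
        }
        return metadata;
    }

    // Skips null or empty values so that absent headers produce no key at all.
    private static void putIfPresent(Map<String, String> metadata, String key, String value) {
        if (value != null && !value.isEmpty()) {
            metadata.put(key, value);
        }
    }

    public static void main(String[] args) {
        Map<String, String> metadata = toMetadata(
                "sender@example.org", "recipient@example.org", null,
                Instant.parse("2010-06-16T21:28:22Z"));
        metadata.forEach((key, value) -> System.out.println(key + ": " + value));
    }
}
```

Using the same key constants in both the Outlook and Mbox code paths would let consumers
read sender and recipient information without knowing which parser produced it.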

> Update OutlookExtractor to match new POI API
> --------------------------------------------
>
>                 Key: TIKA-361
>                 URL: https://issues.apache.org/jira/browse/TIKA-361
>             Project: Tika
>          Issue Type: New Feature
>    Affects Versions: 0.6
>            Reporter: Nick Burch
>         Attachments: outlook.patch
>
>
> OutlookExtractor currently uses POIChunkParser, which is a somewhat internal class, and
> has recently undergone a large number of changes.
> The attached patch changes OutlookExtractor to use the more stable MAPIMessage for text
> extraction, which allows it to continue extracting with the latest POI code in svn.
> The changes in POI's svn also allow for easy access to a few more bits of the message.
> The patch adds date support, but possibly a few others will be wanted in future as well.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.