tika-dev mailing list archives

From "Nick Burch (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-1186) Missing sender mail address in Outlook 2010
Date Tue, 22 Oct 2013 10:25:44 GMT

    [ https://issues.apache.org/jira/browse/TIKA-1186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13801685#comment-13801685 ]

Nick Burch commented on TIKA-1186:

Currently, Apache POI pretty much only uses the variable sized property values in Outlook
files. In part, this is because much of the early work was done by reverse engineering and
comparing with the bits of the spec that were open, and the variable sized properties are
much easier to spot in a hex dump!
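
To illustrate why they are easy to spot: in the .msg container, each variable sized property
is stored in its own stream named "__substg1.0_" followed by eight hex digits, the first four
being the property id and the last four the property type. The sketch below only decodes such
a name. It is not POI code; the class and method names are hypothetical, and the example name
assumes the sender address is stored as the Unicode string property 0x0C1F, which may not be
the case for every file.

```java
import java.util.Locale;

public class SubstgName {
    /** Hypothetical holder for the two halves of a "__substg1.0_" stream name. */
    static final class PropertyTag {
        final int id;    // MAPI property id, e.g. 0x0C1F
        final int type;  // MAPI property type, e.g. 0x001F (Unicode string)

        PropertyTag(int id, int type) {
            this.id = id;
            this.type = type;
        }
    }

    static final String PREFIX = "__substg1.0_";

    /** Splits the eight hex digits after the prefix into property id and type. */
    static PropertyTag parse(String streamName) {
        if (!streamName.startsWith(PREFIX)
                || streamName.length() < PREFIX.length() + 8) {
            throw new IllegalArgumentException(
                    "Not a variable sized property stream: " + streamName);
        }
        String hex = streamName.substring(PREFIX.length(), PREFIX.length() + 8);
        int id = Integer.parseInt(hex.substring(0, 4), 16);
        int type = Integer.parseInt(hex.substring(4, 8), 16);
        return new PropertyTag(id, type);
    }

    public static void main(String[] args) {
        // Assumed example name: property 0x0C1F stored as a Unicode string.
        PropertyTag tag = parse("__substg1.0_0C1F001F");
        System.out.println(String.format(Locale.ROOT,
                "id=0x%04X type=0x%04X", tag.id, tag.type));
    }
}
```

Fixed sized properties have no stream of their own. They are packed as binary entries in a
shared properties stream, so they do not stand out by name in the same way.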

There's some work on fixed sized properties in POI now, but it's not finished, and properly
supporting them will probably need a rework of a lot of the code.

For now, can you try running HMEFDumper against the file, and see if the strings you want
show up in the MAPI Properties dump section?

> Missing sender mail address in Outlook 2010
> -------------------------------------------
>                 Key: TIKA-1186
>                 URL: https://issues.apache.org/jira/browse/TIKA-1186
>             Project: Tika
>          Issue Type: Improvement
>          Components: metadata, parser
>    Affects Versions: 1.4
>         Environment: Windows 7, 32 bit, CLI version
>            Reporter: Christian Leubner
>            Priority: Minor
>         Attachments: b.msg, b.msg
> When extracting metadata with Tika from an Outlook 2010 message file the quite important
> information "sender mail address" is not extracted, but only the "Message-Recipient-Address".
> However, for the exact identification of a sender/author the mail address is the most important
This message was sent by Atlassian JIRA
