tika-dev mailing list archives

From "Daniel Gibby (JIRA)" <j...@apache.org>
Subject [jira] [Created] (TIKA-1130) Text extract leaves out text
Date Wed, 05 Jun 2013 21:57:21 GMT
Daniel Gibby created TIKA-1130:

             Summary: Text extract leaves out text
                 Key: TIKA-1130
                 URL: https://issues.apache.org/jira/browse/TIKA-1130
             Project: Tika
          Issue Type: Bug
    Affects Versions: 1.3, 1.2
         Environment: OpenJDK x86_64
            Reporter: Daniel Gibby
            Priority: Critical

When Tika parses a Microsoft Word .docx file (application/vnd.openxmlformats-officedocument.wordprocessingml.document),
some portions of the document's text are missing from the extracted output.

I have a .docx file that reproduces the problem. Please contact me to receive it.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
